Building Agentic Commerce #3: Trust Scores — How Agents Decide Who to Buy From

When an AI agent evaluates merchants, it doesn't read reviews or recognize logos. It reads trust scores — 12 machine-verifiable signals that determine search ranking, checkout eligibility, and payment friction. Here's how the system works.

Executive summary

Third post in the 'Building Agentic Commerce' series. A developer guide to the 12-component trust scoring system: how scores are calculated, how agents use them for ranking and gating, how eIDAS QTSP verification provides the highest trust boost, and how the KYAI policy engine enforces merchant thresholds.

Published

2026-04-06

14 min

Author

Integration Architecture Team

Implementation architects

The integration architecture team focuses on practical rollout patterns for stores adopting MCP-compatible commerce surfaces.


Category

developer-guide

building-agentic-commerce, trust-scores, eidas, kyai, merchant-verification, developer-guide, AI Agents, security, ranking

When you buy something online, you rely on brand recognition, star ratings, friend recommendations, and gut instinct. An AI agent has none of that. It needs machine-verifiable, quantitative signals to decide: should I buy from this merchant? How much friction should I accept? Is this product listing trustworthy? This is the problem trust scores solve — and in this post, we'll show you exactly how AgenticMCPStores computes, serves, and enforces them across every agent interaction.

This is Part 3 of the 'Building Agentic Commerce' series. Part 1 covered multi-protocol checkout (MCP + x402 + ACP). Part 2 covered agent discovery via NLWeb and llms.txt. Part 4 will cover x402 stablecoin payments in depth.

Essential insight

Why Agents Need Trust Scores

Human shoppers evaluate trust subconsciously. You see a polished website, recognize a brand, check a Trustpilot badge, and decide in seconds. AI agents operate differently: they process JSON responses, not visual cues. A merchant with a beautiful storefront and a merchant with a broken product feed look identical unless there's a structured trust signal attached to the data. Trust scores provide that signal. Every merchant in AgenticMCPStores has a score between 0 and 1.0, computed from 12 components that are recalculated every 6 hours. This score determines three things: search ranking, checkout eligibility, and payment friction level.

The 12 Trust Components

The trust score is not a single metric — it's a weighted composite of 12 independent signals. Each component measures a different dimension of merchant reliability. The system was designed so that no single component can dominate the score, and gaming one metric doesn't significantly improve the total.

Original 8 Components (76% weight)

These eight components were part of the initial trust framework and focus on catalog quality and transaction reliability:

• catalog_completeness (11%): Percentage of products with complete metadata (title, description, price, images, categories). A store with 200 products where 180 have full metadata scores 0.90.
• catalog_freshness (11%): How recently the catalog was synced. Freshness decays over time, so a sync from 2 hours ago scores higher than one from 48 hours ago. Recalculated every 6 hours.
• price_accuracy (12%): Deviation between declared prices and actual checkout prices. If a product lists $29.99 but charges $34.99 at checkout, this component drops. The highest-weighted original component because price trust is foundational.
• availability_accuracy (8%): Match between declared stock status and real availability. If 'in_stock' products consistently fail at checkout, this score degrades.
• policy_coverage (8%): Completeness of return, shipping, and refund policies. Agents need clear, machine-readable policies to make purchase decisions on behalf of users.
• checkout_success_rate (11%): Ratio of completed checkouts to initiated checkouts. A high abandonment rate (caused by errors, not user choice) signals infrastructure problems.
• fulfillment_rate (8%): Orders successfully fulfilled versus orders placed. Tracks actual delivery performance over a rolling 30-day window.
• dispute_rate (7%): Chargebacks and disputes per transaction. This is an inverse metric: lower is better. A dispute rate above 2% triggers automatic review.

V2 Components (24% weight)

Four additional components were added to capture agent-specific quality signals:

• agent_satisfaction_rate (8%): Ratio of CHECKOUT_COMPLETE to CHECKOUT_START events from the AgentRequestMetric table. Unlike human conversion rates, this measures whether agents that start buying actually finish, which makes it a proxy for API reliability.
• response_latency (5%): Average API response time for agent requests. Faster responses score higher. Currently uses a 0.8 synthetic default for merchants without enough data.
• review_sentiment (5%): Normalized average product rating (averageRating / 5). Bridges human feedback into the agent trust model.
• data_consistency (6%): Delta between sync data and live API responses. Catches merchants whose catalog sync diverges from their actual storefront. Currently uses a 0.85 synthetic default.
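Three of the components described so far come with an explicit formula in the text: catalog_completeness (complete products over total products), agent_satisfaction_rate (CHECKOUT_COMPLETE over CHECKOUT_START), and review_sentiment (averageRating / 5). A minimal sketch of those three, assuming plain counts and ratings as inputs. The function names, the zero-denominator handling, and the clamping to the 0 to 1 range are illustrative assumptions, not the platform's implementation:

```typescript
// Hypothetical helpers: names and input shapes are illustrative only.

/** Share of products with complete metadata, e.g. 180 of 200 gives 0.9. */
function catalogCompleteness(completeProducts: number, totalProducts: number): number {
  if (totalProducts <= 0) return 0; // assumption: an empty catalog scores 0
  return Math.min(1, completeProducts / totalProducts);
}

/** CHECKOUT_COMPLETE events divided by CHECKOUT_START events. */
function agentSatisfactionRate(checkoutComplete: number, checkoutStart: number): number {
  if (checkoutStart <= 0) return 0; // assumption: no started checkouts scores 0
  return Math.min(1, checkoutComplete / checkoutStart);
}

/** Normalized average rating: averageRating / 5, clamped to the 0..1 range. */
function reviewSentiment(averageRating: number): number {
  return Math.min(1, Math.max(0, averageRating / 5));
}
```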

Weight Distribution

The 12 weights sum to exactly 1.0. The final score is a simple weighted average:

```
trustScore = Σ(component_score × component_weight)
           = (catalog_completeness × 0.11)
           + (catalog_freshness × 0.11)
           + (price_accuracy × 0.12)
           + (availability_accuracy × 0.08)
           + (policy_coverage × 0.08)
           + (checkout_success_rate × 0.11)
           + (fulfillment_rate × 0.08)
           + (dispute_rate × 0.07)
           + (agent_satisfaction_rate × 0.08)
           + (response_latency × 0.05)
           + (review_sentiment × 0.05)
           + (data_consistency × 0.06)
```

Scores are recomputed every 6 hours by a scheduled job. The trust-component-calculator.ts service handles the math.
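The formula translates directly into a small function. The sketch below copies the weights from the formula above; the type names and `computeTrustScore` are hypothetical, and it assumes every component has already been normalized to the 0 to 1 range with higher meaning better (so an inverse metric such as dispute_rate is expected to arrive already inverted):

```typescript
// Weights copied from the formula above; everything else is an illustrative sketch.
const TRUST_WEIGHTS = {
  catalog_completeness: 0.11,
  catalog_freshness: 0.11,
  price_accuracy: 0.12,
  availability_accuracy: 0.08,
  policy_coverage: 0.08,
  checkout_success_rate: 0.11,
  fulfillment_rate: 0.08,
  dispute_rate: 0.07,
  agent_satisfaction_rate: 0.08,
  response_latency: 0.05,
  review_sentiment: 0.05,
  data_consistency: 0.06,
} as const;

type TrustComponent = keyof typeof TRUST_WEIGHTS;
type ComponentScores = Record<TrustComponent, number>;

/** Weighted sum of the 12 normalized component scores. */
function computeTrustScore(scores: ComponentScores): number {
  let total = 0;
  for (const name of Object.keys(TRUST_WEIGHTS) as TrustComponent[]) {
    total += scores[name] * TRUST_WEIGHTS[name];
  }
  return total;
}
```

Because no single weight exceeds 0.12, a component that falls to zero while the other eleven stay at 1.0 lowers the total by at most 0.12.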

eIDAS QTSP: The Verification Boost

Trust scores measure behavioral reliability, but they don't verify identity. A merchant can have a perfect catalog and still be fraudulent. This is where eIDAS QTSP verification enters the picture. Merchants who complete government-certified identity verification through a Qualified Trust Service Provider (InfoCert) receive a trust score boost that's additive to their computed score.

Three verification levels exist:

```typescript
const VERIFICATION_TRUST_BOOST = {
  BASIC: 0.05,     // Personal ID, MDL, EUDI PID
  STANDARD: 0.10,  // Business Registration
  QUALIFIED: 0.18, // eIDAS QTSP (InfoCert KYB)
};
```

The QUALIFIED level (+0.18) is the highest because it requires government-supervised Know Your Business verification via InfoCert, with legal standing across 30 EU/EEA countries under eIDAS Regulation 910/2014. A merchant with a behavioral score of 0.72 and QTSP verification would have an effective score of 0.90, a significant advantage in agent search ranking.

Important detail: boosts are not additive across verification types. If a merchant has both a VDC (Verifiable Digital Credential) boost and a KYApay identity boost, the system uses the maximum of the two, not the sum. This prevents boost stacking:

```typescript
// vdc-trust-boost.service.ts
const effectiveBoost = Math.max(kyapayBoost, vdcBoost);
```
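Putting the two rules together (the boost is added to the behavioral score, but only the largest boost counts) gives a sketch like the following. The function name `effectiveTrustScore` is hypothetical, and capping the result at 1.0 is an assumption made here because scores are described as lying between 0 and 1.0; the post does not say how an overflow is handled:

```typescript
/**
 * Behavioral score plus the single largest verification boost.
 * Illustrative sketch, not the platform's implementation.
 */
function effectiveTrustScore(
  behavioralScore: number,
  kyapayBoost: number,
  vdcBoost: number,
): number {
  // Boost types do not stack: take the maximum, not the sum.
  const effectiveBoost = Math.max(kyapayBoost, vdcBoost);
  // Assumption: the effective score is capped at 1.0.
  return Math.min(1, behavioralScore + effectiveBoost);
}

// The example from the text: 0.72 behavioral + 0.18 QUALIFIED boost, roughly 0.90.
const qualifiedExample = effectiveTrustScore(0.72, 0.18, 0);

// A second, smaller boost changes nothing because only the maximum is used.
const noStackingExample = effectiveTrustScore(0.72, 0.18, 0.10);
```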

How Trust Scores Affect Search Ranking

When an agent calls search_products or uses AgentFinder semantic search, trust scores directly influence which results appear and in what order.

Hard Eligibility Filter

Before any ranking happens, a hard filter removes ineligible merchants:

```typescript
// agent-search.service.ts
store: {
  verificationLevel: { in: ['BASIC', 'STANDARD', 'PREMIUM'] },
  trustScore: { gte: 0.3 },
  suspendedAt: null
}
```

Merchants with a trust score below 0.3 are invisible to agents. They don't appear in search results, product listings, or discovery endpoints. This is a hard gate: no amount of text relevance can override it.
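The same three conditions can be expressed as an in-memory predicate, which is handy for unit tests or for checking a store record before it reaches the database query. This is a sketch only: the `StoreRecord` shape and the function name are assumptions, and the level names are copied from the filter above.

```typescript
// Hypothetical record shape mirroring the three fields used by the filter.
interface StoreRecord {
  verificationLevel: string | null;
  trustScore: number;
  suspendedAt: Date | null;
}

const ELIGIBLE_LEVELS = ['BASIC', 'STANDARD', 'PREMIUM'];
const MIN_SEARCH_TRUST_SCORE = 0.3;

/** True when a store passes all three hard eligibility conditions. */
function isEligibleForAgentSearch(store: StoreRecord): boolean {
  return (
    store.verificationLevel !== null &&
    ELIGIBLE_LEVELS.includes(store.verificationLevel) &&
    store.trustScore >= MIN_SEARCH_TRUST_SCORE &&
    store.suspendedAt === null
  );
}
```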

Trust-Weighted Ranking Formula

For merchants that pass the filter, the final relevance score blends text match quality with trust:

```typescript
// nlweb-agentfinder.service.ts
relevanceScore = Math.min(1, relevanceScore * 0.7 + trustWeight * 0.3)
// where trustWeight = trustScore / 5
```

70% of the ranking comes from text/semantic relevance (how well the product matches the query). 30% comes from trust. This means a highly relevant product from a low-trust merchant can be outranked by a moderately relevant product from a high-trust merchant.
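As a standalone function, the blend looks like this. The sketch reproduces the 0.7 / 0.3 split and the divisor of 5 exactly as they appear in the excerpt; the function name and the `trustDivisor` parameter are hypothetical, and the sketch makes no claim about the scale on which that service stores `trustScore`.

```typescript
/**
 * Blend text relevance with trust, following the excerpt above.
 * trustDivisor defaults to 5 because that is the value shown in the excerpt.
 */
function blendRelevance(
  textRelevance: number,
  trustScore: number,
  trustDivisor: number = 5,
): number {
  const trustWeight = trustScore / trustDivisor;
  // 70% text or semantic relevance, 30% trust, capped at 1.
  return Math.min(1, textRelevance * 0.7 + trustWeight * 0.3);
}
```

Holding text relevance fixed, a higher trust score always yields a higher or equal blended score, which is what lets a trusted merchant overtake a slightly more relevant but less trusted one.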

Status Tiers Based on Score

Trust scores map to four operational statuses:

• ACTIVE (score ≥ 0.5): Full visibility in search results. Top ranking positions.
• DEPRIORITIZED (0.3 ≤ score < 0.5): Appears in search but ranked lower. Agents see these results but prefer ACTIVE merchants.
• HIDDEN (0.2 ≤ score < 0.3): Removed from search results entirely. Direct links still work.
• SUSPENDED (score < 0.2): Auto-suspension triggered. All MCP tools return errors. The merchant must resolve the issue to reactivate.

When a merchant drops below 0.3, a security event is logged and a GDPR Article 22 notification is sent (because automated scoring affects commercial access).
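The tier boundaries reduce to a short mapping function. The thresholds come from the list above; the type and function names are hypothetical.

```typescript
type MerchantStatus = 'ACTIVE' | 'DEPRIORITIZED' | 'HIDDEN' | 'SUSPENDED';

/** Map a merchant trust score to its operational status tier. */
function statusForScore(score: number): MerchantStatus {
  if (score >= 0.5) return 'ACTIVE';
  if (score >= 0.3) return 'DEPRIORITIZED';
  if (score >= 0.2) return 'HIDDEN';
  return 'SUSPENDED';
}
```

Checking the thresholds from highest to lowest keeps each boundary value in the upper tier, matching the "≥" on the lower bound of every range.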

KYAI Policy Engine: Trust at Checkout

Trust scores don't just affect discovery — they gate checkout. The KYAI (Know Your AI) policy engine evaluates every purchase intent against a chain of rules, each with a priority level. Two rules specifically use trust scores:

Agent Trust Gate (Priority 10)

```typescript
// agent-trust-gate.rule.ts
const MINIMUM_TRUST_SCORE = 0.3;

// Decision: BLOCK if
//   intent.agent.trustTier === 'BLOCKED'
//   intent.agent.trustScore < 0.3
```

This is a hard gate. Agents with a trust score below 0.3 are blocked from initiating any checkout. The first BLOCK in the rule chain terminates evaluation, so no subsequent rule can override it.

Minimum Buyer Trust (Priority 15)

```typescript
// min-buyer-trust.rule.ts
const DEFAULT_MIN_BUYER_TRUST = 0.3;

// Decision: FRICTION if
//   intent.agent.trustScore < merchant_threshold
```

This rule is softer: it applies friction (additional verification steps) rather than blocking. Merchants can configure their own minimum buyer trust threshold. A premium merchant might require agents with a score of 0.6+ while a marketplace merchant accepts 0.3+.

The KYAI engine processes rules in priority order (1 through 15). The possible outcomes are:

• ALLOW: Proceed normally
• FRICTION: Additional verification required (e.g., confirm intent, provide user authorization)
• REVIEW: Flagged for manual merchant review
• BLOCK: Transaction denied

Multiple FRICTION rules accumulate, and the worst decision wins. But one BLOCK terminates the chain immediately.
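The evaluation loop described above can be sketched as follows. Everything here is illustrative: the interfaces, the `merchantMinBuyerTrust` field, and the rule objects are hypothetical stand-ins for the two rules discussed in this section. Two assumptions are worth calling out: the two BLOCK conditions of the trust gate are combined with a logical OR, and the severity order ALLOW < FRICTION < REVIEW < BLOCK is inferred from the order in which the outcomes are listed.

```typescript
type Decision = 'ALLOW' | 'FRICTION' | 'REVIEW' | 'BLOCK';

interface PurchaseIntent {
  agent: { trustScore: number; trustTier: string };
  merchantMinBuyerTrust?: number; // hypothetical merchant-configured threshold
}

interface PolicyRule {
  name: string;
  priority: number; // lower numbers are evaluated first
  evaluate(intent: PurchaseIntent): Decision;
}

// Assumed severity order; the text only states that the worst decision wins.
const SEVERITY: Record<Decision, number> = { ALLOW: 0, FRICTION: 1, REVIEW: 2, BLOCK: 3 };

const agentTrustGate: PolicyRule = {
  name: 'agent-trust-gate',
  priority: 10,
  evaluate: (intent) =>
    intent.agent.trustTier === 'BLOCKED' || intent.agent.trustScore < 0.3 ? 'BLOCK' : 'ALLOW',
};

const minBuyerTrust: PolicyRule = {
  name: 'min-buyer-trust',
  priority: 15,
  evaluate: (intent) =>
    intent.agent.trustScore < (intent.merchantMinBuyerTrust ?? 0.3) ? 'FRICTION' : 'ALLOW',
};

/** Run rules in priority order; stop at the first BLOCK, otherwise keep the worst outcome. */
function evaluateIntent(intent: PurchaseIntent, rules: PolicyRule[]): Decision {
  const ordered = [...rules].sort((a, b) => a.priority - b.priority);
  let worst: Decision = 'ALLOW';
  for (const rule of ordered) {
    const decision = rule.evaluate(intent);
    if (decision === 'BLOCK') return 'BLOCK'; // first BLOCK terminates the chain
    if (SEVERITY[decision] > SEVERITY[worst]) worst = decision;
  }
  return worst;
}
```

With these two rules, an agent scoring 0.4 that buys from a merchant requiring 0.6 passes the trust gate but receives FRICTION, while an agent scoring 0.2 is blocked before the second rule runs.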

The Trilateral Trust Model

AgenticMCPStores uses a trilateral trust model with three separate actors, each with their own score:

• Merchant Trust: 12 components, updated every 6 hours, minimum threshold 0.3
• Buyer Agent Trust: 8 components, updated hourly, minimum threshold 0.5
• Seller Agent Trust: 8 components, updated hourly, minimum threshold 0.6

The buyer and seller agent thresholds are higher because agent actions have more immediate impact: a compromised agent could make unauthorized purchases or list fraudulent products at scale. The asymmetry between buyer (0.5) and seller (0.6) reflects the greater damage potential of seller-side agent compromise.
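One way to keep the three profiles side by side is a small lookup table. The numbers below are the ones listed above; the type names, field names, and `meetsMinimum` helper are hypothetical and only illustrate the comparison.

```typescript
type TrustActor = 'MERCHANT' | 'BUYER_AGENT' | 'SELLER_AGENT';

interface ActorTrustProfile {
  componentCount: number;
  refreshIntervalHours: number;
  minimumScore: number;
}

// Values taken from the list above.
const TRUST_PROFILES: Record<TrustActor, ActorTrustProfile> = {
  MERCHANT: { componentCount: 12, refreshIntervalHours: 6, minimumScore: 0.3 },
  BUYER_AGENT: { componentCount: 8, refreshIntervalHours: 1, minimumScore: 0.5 },
  SELLER_AGENT: { componentCount: 8, refreshIntervalHours: 1, minimumScore: 0.6 },
};

/** True when the score reaches the minimum threshold for that actor type. */
function meetsMinimum(actor: TrustActor, score: number): boolean {
  return score >= TRUST_PROFILES[actor].minimumScore;
}
```

The same score can therefore pass for one actor and fail for another: 0.55 clears the buyer agent threshold but not the seller agent threshold.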

Accessing Trust Scores via API

Trust scores are available through both REST and MCP interfaces.

REST API Endpoints:

```
GET  /trust/scores/:storeId           → Current score + 12 components
GET  /trust/scores/:storeId/history   → Score history (days param)
POST /trust/scores/:storeId/compute   → Force recalculation (admin)
GET  /trust/health                    → Network-wide stats
GET  /trust/recommendations/:storeId  → Actionable improvement tips
```

MCP Tool: Trust data is embedded in the `get_merchant_profile` tool response. There's no separate trust tool: agents get trust information as part of merchant discovery, which aligns with the principle that trust should be contextual, not isolated.
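A thin client for the two read endpoints might look like the sketch below. Only the paths come from the endpoint list above. The base URL, the `TrustScoreResponse` shape, the assumption that the history window is passed as a `days` query parameter, and the injected `FetchLike` function are all placeholders, since the post does not document hosts, authentication, or response bodies.

```typescript
// Assumed response shape; the real payload is not documented in this post.
interface TrustScoreResponse {
  trustScore: number;
  components: Record<string, number>;
}

// Minimal fetch-like signature so the sketch does not depend on a specific runtime.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

/** Build the URL for GET /trust/scores/:storeId. */
function trustScoreUrl(baseUrl: string, storeId: string): string {
  return `${baseUrl.replace(/\/+$/, '')}/trust/scores/${encodeURIComponent(storeId)}`;
}

/** Build the URL for GET /trust/scores/:storeId/history with a days query parameter. */
function trustHistoryUrl(baseUrl: string, storeId: string, days: number): string {
  return `${trustScoreUrl(baseUrl, storeId)}/history?days=${days}`;
}

/** Fetch the current score and components for one store. */
async function fetchTrustScore(
  fetchImpl: FetchLike,
  baseUrl: string,
  storeId: string,
): Promise<TrustScoreResponse> {
  const res = await fetchImpl(trustScoreUrl(baseUrl, storeId));
  if (!res.ok) {
    throw new Error(`Trust score request failed with status ${res.status}`);
  }
  return (await res.json()) as TrustScoreResponse;
}
```

In a runtime with a global `fetch`, it can be passed in as `fetchImpl`; in tests, a stub that returns a canned response keeps the call offline.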

Production Tips

1. Sync your catalog frequently: catalog_freshness (11% weight) decays over time. A daily sync keeps this component near 1.0.
2. Complete product metadata: catalog_completeness (11%) checks titles, descriptions, prices, images, and categories. Missing images alone can drop this component to 0.8.
3. Fix checkout errors promptly: checkout_success_rate (11%) is one of the highest-weighted components. A broken payment integration can tank your score in 6 hours.
4. Get eIDAS QTSP verified: the +0.18 boost is the single largest improvement available. It's especially impactful for new merchants who haven't built behavioral history yet.
5. Monitor the /trust/recommendations endpoint: it returns specific, actionable suggestions ranked by impact. Start with the highest-weight components.
6. Keep prices consistent: price_accuracy (12%) is the highest-weighted original component. Any discrepancy between listed and checkout price degrades trust fast.
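To decide locally which tip to act on first, one simple heuristic is to rank components by how much score they could still add: weight × (1 − current component score). This is an assumption made for illustration, not a description of how the /trust/recommendations endpoint ranks its suggestions; the function and type names are hypothetical, and the weights are the ones published earlier in this post.

```typescript
// Component weights as published in this post.
const COMPONENT_WEIGHTS: Record<string, number> = {
  catalog_completeness: 0.11,
  catalog_freshness: 0.11,
  price_accuracy: 0.12,
  availability_accuracy: 0.08,
  policy_coverage: 0.08,
  checkout_success_rate: 0.11,
  fulfillment_rate: 0.08,
  dispute_rate: 0.07,
  agent_satisfaction_rate: 0.08,
  response_latency: 0.05,
  review_sentiment: 0.05,
  data_consistency: 0.06,
};

interface Opportunity {
  component: string;
  potentialGain: number; // score points gained if the component reached 1.0
}

/** Rank the supplied components by remaining headroom, largest first. */
function rankOpportunities(scores: Record<string, number>): Opportunity[] {
  return Object.keys(COMPONENT_WEIGHTS)
    .filter((component) => component in scores) // ignore components with no data
    .map((component) => ({
      component,
      potentialGain: COMPONENT_WEIGHTS[component] * (1 - scores[component]),
    }))
    .sort((a, b) => b.potentialGain - a.potentialGain);
}
```

A heavily weighted component that is already near 1.0 can rank below a lighter component with a poor score, which is why weight alone is not enough to prioritize.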

What's Next

In Part 4 of Building Agentic Commerce, we'll dive deep into x402 stablecoin payments — how agents pay with USDC on-chain, the HTTP 402 flow, multi-chain support (Base, Ethereum, Solana, Polygon, Arbitrum), and the async settlement pipeline. Trust scores play a role there too: the KYAI engine can require higher trust for high-value x402 transactions.

Frequently asked questions

What is the minimum trust score for a merchant to appear in agent search results?

0.3. Merchants with a trust score below 0.3 are completely invisible to agents — they don't appear in search results, product listings, or any discovery endpoint. Between 0.2 and 0.3, merchants are HIDDEN (only accessible via direct links). Below 0.2, auto-suspension is triggered.

How often are trust scores recalculated?

Every 6 hours for merchant trust scores. The trust-component-calculator service runs as a scheduled job and recomputes all 12 components for every active merchant. Agent trust scores (buyer and seller) are updated hourly due to their higher-impact nature.

Can merchants see their own trust score components?

Yes. The REST endpoint GET /trust/scores/:storeId returns the overall score plus all 12 individual component scores. The /trust/recommendations/:storeId endpoint provides actionable suggestions for improvement, ranked by impact on the total score.

How does the eIDAS QTSP trust boost compare to other verification methods?

BASIC verification (personal ID) provides a +0.05 boost. STANDARD verification (business registration) provides +0.10. QUALIFIED verification (eIDAS QTSP via InfoCert) provides +0.18 — the highest available. The boost is additive to the behavioral score but not stackable with other boost types (the system uses max, not sum).

What happens if a merchant's trust score drops suddenly?

If the score drops below 0.3, the merchant is removed from search results and a security event is logged. Below 0.2, auto-suspension is triggered. In both cases, a GDPR Article 22 notification is sent because automated scoring affects commercial access. The merchant dashboard shows real-time trust metrics and alerts for score changes.
