Agentic Customer Experience and the Architecture of Programmable Transactions

On August 14, AGI House hosted a dinner sponsored by Forethought on the future of agentic customer experience. Guests included leading researchers like Christopher Manning and Sebastian Thrun, operators such as Nikita Shamgunov (Databricks) and Russ d'Sa (Founder of LiveKit), and founders and executives from Inworld, xAI, Pika, Character.AI, Cresta, and others. Discussion revolved around how agentic commerce will emerge, the constraints of current systems, and the technical pathways toward making transactions programmable.

1 Execution Reliability Is the Core Limitation

The central bottleneck in agentic commerce is not intent recognition but reliable execution. Large language models can already parse natural language with sufficient accuracy. What breaks down is the ability to carry out multi-step transactional workflows at production-grade reliability.

Three technical levers for solving the execution problem:

1.1 Substrate of Execution

One source of fragility lies in the substrate of execution. Modern SaaS platforms expose APIs designed for machine consumption, but many legacy systems still force interaction through GUIs. This produces hybrid architectures where agents call APIs when available and fall back to browser automation elsewhere. Browser automation, however, is inherently brittle. A multi-step sequence with a per-step success rate of 0.9 completes only 35 percent of the time across ten steps, and UI changes can break flows without notice.
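
The 35 percent figure follows from simple compounding, assuming each step succeeds or fails independently of the others. The sketch below is only an illustration of that arithmetic; the function names are hypothetical and do not come from the discussion.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step in a sequence succeeds,
    assuming steps are independent (an assumption, not a measured fact)."""
    return per_step ** steps


def required_per_step(target: float, steps: int) -> float:
    """Per-step success rate needed to reach a target end-to-end rate,
    under the same independence assumption."""
    return target ** (1.0 / steps)


# Ten steps at 0.9 each: roughly 35 percent end-to-end.
print(round(chain_success(0.9, 10), 3))  # 0.349

# Per-step rate needed for 99 percent end-to-end over ten steps.
print(required_per_step(0.99, 10))
```

Read in reverse, the same formula shows why per-step gains matter so much: a ten-step flow needs each step to succeed about 99.9 percent of the time to reach 99 percent overall.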

The long-term solution is standardized machine-native protocols, such as the emerging Model Context Protocol (MCP), which would allow agents to negotiate, execute, and settle transactions with guarantees similar to those TCP/IP provides in networking. Until such protocols consolidate, execution reliability will remain uneven.

1.2 Model Design

A second dimension of reliability comes from model design. Retrieval-augmented generation is effective for open-ended reasoning, but brittle in transactional contexts. Hallucinating a product ID or misfiring on a payment API erodes trust irreparably. In practice, fine-tuned vertical models consistently outperform generic RAG systems in domains such as travel booking, procurement, or IT service management.

The emerging pattern is orchestration: a general-purpose model handles broad instruction parsing, then delegates execution-critical steps to fine-tuned agents with embedded guardrails. This specialization is not optional in commerce. Precision is mandatory whenever money or compliance is at stake.
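
A minimal sketch of this orchestration pattern might look like the following. Everything here is hypothetical: `parse_instruction` is a keyword-matching stand-in for the general-purpose model, `purchase_agent` stands in for a fine-tuned specialist, and `KNOWN_PRODUCT_IDS` is an invented catalog. The point is only the shape: the specialist validates against ground truth and refuses to act on an ID it cannot verify.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical catalog used as ground truth by the guardrail.
KNOWN_PRODUCT_IDS = {"SKU-1001", "SKU-1002"}


class GuardrailError(ValueError):
    """Raised when a specialist refuses an unverifiable request."""


@dataclass
class Task:
    intent: str   # label assigned by the general-purpose parser
    params: dict  # structured arguments extracted from the instruction


def parse_instruction(text: str) -> Task:
    """Stand-in for the general model: simple keyword routing only."""
    lowered = text.lower()
    if "buy" in lowered or "order" in lowered:
        sku = next(
            (tok.strip(".,") for tok in text.split()
             if tok.upper().startswith("SKU-")),
            "",
        )
        return Task("purchase", {"product_id": sku.upper()})
    return Task("general", {"text": text})


def purchase_agent(task: Task) -> str:
    """Stand-in for a fine-tuned specialist with an embedded guardrail."""
    product_id = task.params.get("product_id", "")
    if product_id not in KNOWN_PRODUCT_IDS:
        # Fail loudly rather than transact on a possibly hallucinated ID.
        raise GuardrailError(f"unknown product id: {product_id!r}")
    return f"order placed for {product_id}"


def general_agent(task: Task) -> str:
    return "handled conversationally"


SPECIALISTS: Dict[str, Callable[[Task], str]] = {"purchase": purchase_agent}


def orchestrate(text: str) -> str:
    """Parse broadly, then delegate execution-critical work to a specialist."""
    task = parse_instruction(text)
    handler = SPECIALISTS.get(task.intent, general_agent)
    return handler(task)


print(orchestrate("Please order SKU-1001."))
```

In this sketch a request for an unknown ID such as `SKU-9999` raises `GuardrailError` instead of placing an order, which is the behavior the guardrail exists to enforce.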

1.3 Continuous Feedback Loops

Finally, execution reliability depends on continuous feedback loops. Every transaction produces structured signals: completion rates, error categories, resolution latency, and customer satisfaction. Without reintegrating this data, agent systems plateau. With feedback, they evolve asymptotically toward reliability.

For example, a browser sequence that fails repeatedly should automatically trigger a request for API integration. A pricing agent should continuously refine its negotiation strategies based on which offers are accepted or rejected. Enterprises that close these loops build compounding performance advantages over those that treat agents as static systems.
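
The first example above can be sketched as a small feedback tracker. This is an illustration under stated assumptions, not a description of any guest's system: the `Outcome` fields, the flow names, and the threshold values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Outcome:
    flow: str       # hypothetical flow name, e.g. "vendor_portal_checkout"
    substrate: str  # "api" or "browser"
    success: bool


class FeedbackLoop:
    """Tracks outcomes per flow and flags brittle browser flows."""

    def __init__(self, failure_threshold: float = 0.3, min_runs: int = 5):
        # Both defaults are arbitrary illustrative values.
        self.failure_threshold = failure_threshold
        self.min_runs = min_runs
        self._runs: Dict[Tuple[str, str], List[bool]] = defaultdict(list)

    def record(self, outcome: Outcome) -> None:
        self._runs[(outcome.flow, outcome.substrate)].append(outcome.success)

    def failure_rate(self, flow: str, substrate: str) -> float:
        runs = self._runs.get((flow, substrate), [])
        return 0.0 if not runs else 1 - sum(runs) / len(runs)

    def integration_requests(self) -> List[str]:
        """Browser flows with enough runs and a failure rate above the
        threshold; each is a candidate for an API integration request."""
        flagged = []
        for (flow, substrate), runs in self._runs.items():
            if substrate != "browser" or len(runs) < self.min_runs:
                continue
            if self.failure_rate(flow, substrate) > self.failure_threshold:
                flagged.append(flow)
        return sorted(flagged)


loop = FeedbackLoop()
for ok in [True, False, False, True, False, True]:
    loop.record(Outcome("vendor_portal_checkout", "browser", ok))
for ok in [True] * 6:
    loop.record(Outcome("invoice_lookup", "api", ok))

print(loop.integration_requests())
```

The minimum-run check keeps a single early failure from triggering a request; in practice the threshold and window would be tuned against real completion data.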

Taken together, substrate stability, model specialization, and feedback integration form the three technical levers for solving the execution problem. Until all three mature, agentic commerce will remain constrained to narrow workflows. Once they converge, agents will be able to transact end-to-end with reliability that exceeds human operators.

That reliability imperative is already reshaping how companies are organized.

2 Organizational Restructuring to Flatter Organization Charts

Several guests said that engineering capacity is no longer a bottleneck. Code generation has freed engineers from routine integration work, so they can be redeployed into customer experience and embedded directly in the teams that curate and tune agent workflows.

This produces flatter organization charts, with ratios of PMs to engineers inverting and in some cases exceeding 1:1. Engineers move closer to the customer surface, while PMs act as system designers: defining workflow boundaries, selecting models, and integrating performance feedback. As agents blur the traditional lines between support, sales, and success, Customer Experience organizations consolidate under single leadership. CX stops being a back-office function and becomes the primary product surface.

3 Adoption Dynamics and the Build-versus-Buy Shift

3.1 Enterprise vs Consumer Adoption

Enterprise adoption is structurally easier because the problems are narrower, repeatable, and already well-instrumented. Support flows, procurement approvals, and IT tasks follow deterministic patterns with clear state transitions and logged outcomes. This makes them ideal for fine-tuned agents, which can be measured on metrics like resolution latency, error category, or customer satisfaction. Enterprises also operate in controlled software environments (Salesforce, ServiceNow, Workday) where agents can be restricted to a finite stack, further improving reliability. ROI is immediate: replacing human hours in a call center produces quantifiable savings, so adoption has a direct economic rationale.

Consumer adoption is harder to predict because it requires rewiring human behavior rather than just optimizing workflows. Habits like shopping, browsing, or media consumption are open-ended and less structured, so agents must succeed in noisy and heterogeneous environments. But the potential upside is larger. When agents collapse multiple state transitions into a single loop, they reduce latency and drop-off. Just as infinite scroll transformed media consumption by removing friction, "infinite transactions" could transform commerce.

3.2 The Economics Reversal

The economics of adoption amplify this divide. Historically, enterprises bought SaaS because building was too costly. With agents, the equation reverses. Low-cost model fine-tuning and code generation make it feasible for enterprises to replace horizontal SaaS with internal stacks tuned to their own data.

Examples of this shift:

Coho replaced Intercom with an in-house support agent, saving $55,000 annually while gaining tighter integration
Databricks built R2D2, an internal AI agent managing over 140 SaaS tools

The common thread is that defensibility shifts from distribution to proprietary data. Without network effects or differentiated datasets, horizontal SaaS vendors risk being displaced by bespoke agents. For consumers, the same principle applies at platform scale: whoever owns the feedback loops on browsing, checkout, and payments will accumulate the data advantage that powers retention.

4 Outlook

The near-term architecture of agentic commerce will remain hybrid: browser automation to bridge legacy systems, APIs where available, and fine-tuned sub-agents orchestrated by general LLMs. Enterprises are likely to standardize on these patterns over the next two years. Within three years, protocol layers for agent-to-agent transactions may consolidate, laying the foundation for a truly agent-native economy.

At every stage, the key variable is reliability. Gains in per-step execution, error recovery, and feedback integration will dictate how quickly workflows shift from human-mediated to fully agentic. The demand is clear. Enterprises want efficiency, and consumers want seamless experiences.

Commerce is likely to be among the first domains to become truly agent-native because it is structured, repeatable, and directly monetizable. Each transaction generates feedback signals that agents can use to improve. This makes commerce both a proving ground and a wedge: the domain where reliability will be solved first, and from which the architecture will spread to other sectors.
