In the race to bring artificial intelligence into the enterprise, a small but well-funded startup is making a bold claim: The problem holding back AI adoption in complex industries has never been the models themselves.
Contextual AI, a two-and-a-half-year-old company backed by investors including Bezos Expeditions and Bain Capital Ventures, on Monday unveiled Agent Composer, a platform designed to help engineers in aerospace, semiconductor manufacturing, and other technically demanding fields build AI agents that can automate the kind of knowledge-intensive work that has long resisted automation.
The announcement arrives at a pivotal moment for enterprise AI. Three years after ChatGPT ignited a frenzy of corporate AI initiatives, many organizations remain stuck in pilot programs, struggling to move experimental projects into full-scale production. Chief financial officers and business unit leaders are growing impatient with internal efforts that have consumed millions of dollars but delivered limited returns.
Douwe Kiela, Contextual AI’s chief executive, believes the industry has been focused on the wrong bottleneck. “The model is almost commoditized at this point,” Kiela said in an interview with VentureBeat. “The bottleneck is context — can the AI actually access your proprietary docs, specs, and institutional knowledge? That’s the problem we solve.”
Why enterprise AI keeps failing, and what retrieval-augmented generation was supposed to fix
To understand what Contextual AI is attempting, it helps to understand a concept that has become central to modern AI development: retrieval-augmented generation, or RAG.
When large language models like those from OpenAI, Google, or Anthropic generate responses, they draw on knowledge embedded during training. But that knowledge has a cutoff date, and it cannot include the proprietary documents, engineering specifications, and institutional knowledge that are the lifeblood of most enterprises.
RAG systems attempt to solve this by retrieving relevant documents from a company’s own databases and feeding them to the model alongside the user’s question. The model can then ground its response in actual company data rather than relying solely on its training.
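In its simplest form, that pipeline can be sketched in a few lines. This is an illustrative toy, not Contextual AI's system: a naive keyword-overlap retriever stands in for production vector search, and the "model call" is just prompt assembly.

```python
# Toy RAG pipeline: retrieve relevant documents, then feed them to the
# model alongside the user's question. Keyword overlap stands in for a
# real retriever; the prompt string stands in for an actual LLM call.

def retrieve(query, documents, k=2):
    """Rank documents by naive keyword overlap and return the top k."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def build_grounded_prompt(query, documents):
    """Assemble retrieved passages plus the question into one prompt."""
    passages = "\n".join(
        f"[{i + 1}] {doc}" for i, doc in enumerate(retrieve(query, documents))
    )
    return (
        "Answer using only the passages below and cite them by number.\n\n"
        f"{passages}\n\nQuestion: {query}"
    )

docs = [
    "Spec 4.2: the sensor operating range is -40C to 85C.",
    "HR policy: vacation requests require manager approval.",
    "Release notes: firmware 2.1 fixes the sensor calibration drift.",
]
prompt = build_grounded_prompt("What is the sensor operating range?", docs)
print(prompt)
```

The retrieval step filters the company's corpus down to what is relevant, so the model answers from grounded passages rather than from memory; production systems replace the keyword scorer with embedding-based search.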
Kiela helped pioneer this approach during his time as a research scientist at Facebook AI Research and later as head of research at Hugging Face, the influential open-source AI company. He holds a Ph.D. from Cambridge and serves as an adjunct professor in symbolic systems at Stanford University.
But early RAG systems, Kiela acknowledges, were crude.
“Early RAG was pretty crude — grab an off-the-shelf retriever, connect it to a generator, hope for the best,” he said. “Errors compounded through the pipeline. Hallucinations were common because the generator wasn’t trained to stay grounded.”
When Kiela founded Contextual AI in June 2023, he set out to solve these problems systematically. The company developed what it calls a “unified context layer” — a set of tools that sit between a company’s data and its AI models, ensuring that the right information reaches the model in the right format at the right time.
The approach has earned recognition. According to a Google Cloud case study, Contextual AI achieved the highest performance on Google’s FACTS benchmark for grounded, hallucination-resistant results. The company fine-tuned Meta’s open-source Llama models on Google Cloud’s Vertex AI platform, focusing specifically on reducing the tendency of AI systems to invent information.
Inside Agent Composer, the platform that promises to turn complex engineering workflows into minutes of work
Agent Composer extends Contextual AI’s existing platform with orchestration capabilities — the ability to coordinate multiple AI tools across multiple steps to complete complex workflows.
The platform offers three ways to create AI agents. Users can start with pre-built agents designed for common technical workflows like root cause analysis or compliance checking. They can describe a workflow in natural language and let the system automatically generate a working agent architecture. Or they can build from scratch using a visual drag-and-drop interface that requires no coding.
What distinguishes Agent Composer from competing approaches, the company says, is its hybrid architecture. Teams can combine strict, deterministic rules for high-stakes steps — compliance checks, data validation, approval gates — with dynamic reasoning for exploratory analysis.
“For highly critical workflows, users can choose completely deterministic steps to control agent behavior and avoid uncertainty,” Kiela said.
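The hybrid pattern Kiela describes can be illustrated with a short sketch. Everything here is hypothetical — the rule values, function names, and stubbed model call are invented for the example and are not Contextual AI's API: a deterministic compliance gate runs as plain code, and only orders that pass it reach the open-ended reasoning step that would, in a real agent, prompt a model.

```python
# Hybrid agent sketch: a deterministic gate for the high-stakes step,
# dynamic reasoning only for what passes. All names and rules are
# illustrative assumptions, not a real product API.

APPROVED_SUPPLIERS = {"acme", "globex"}  # assumed policy for the demo

def compliance_gate(order):
    """Deterministic step: hard-coded rules, no model, no uncertainty."""
    if order["supplier"].lower() not in APPROVED_SUPPLIERS:
        return {"status": "rejected", "reason": "unapproved supplier"}
    if order["amount"] > 100_000:
        return {"status": "needs_approval", "reason": "amount over limit"}
    return {"status": "passed", "reason": None}

def reasoning_step(order):
    """Dynamic step: a real agent would prompt a model here; stubbed out."""
    return f"Analyze order of {order['amount']} from {order['supplier']}"

def run_agent(order):
    gate = compliance_gate(order)
    if gate["status"] != "passed":
        return gate  # deterministic outcome; the model is never consulted
    return {"status": "passed", "analysis": reasoning_step(order)}

result = run_agent({"supplier": "Acme", "amount": 5_000})
print(result["status"])
```

The design point is that failure modes of the probabilistic component are fenced off: a hallucinating model can never override the compliance rule, because rejected orders short-circuit before any model is called.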
The platform also includes what the company calls “one-click agent optimization,” which takes user feedback and automatically adjusts agent performance. Every step of an agent’s reasoning process can be audited, and responses come with sentence-level citations showing exactly where information originated in source documents.
From eight hours to 20 minutes: what early customers say about the platform’s real-world performance
Contextual AI says early customers have reported significant efficiency gains, though the company acknowledges these figures come from customer self-reporting rather than independent verification.
“These come directly from customer evals, which are approximations of real-world workflows,” Kiela said. “The numbers are self-reported by our customers as they describe the before-and-after scenario of adopting Contextual AI.”
The claimed results are nonetheless striking. An advanced manufacturer reduced root-cause analysis from eight hours to 20 minutes by automating sensor data parsing and log correlation. A specialty chemicals company reduced product research from hours to minutes using agents that search patents and regulatory databases. A test equipment maker now generates test code in minutes instead of days.
Keith Schaub, vice president of technology and strategy at Advantest, a semiconductor test equipment company, offered an endorsement. “Contextual AI has been an important part of our AI transformation efforts,” Schaub said. “The technology has been rolled out to multiple teams across Advantest and select end customers, saving meaningful time across tasks ranging from test code generation to customer engineering workflows.”
The company’s other customers include Qualcomm, the semiconductor giant; ShipBob, a tech-enabled logistics provider that claims to have achieved 60 times faster issue resolution; and Nvidia, the chip maker whose graphics processors power most AI systems.
The eternal enterprise dilemma: should companies build their own AI systems or buy off the shelf?
Perhaps the biggest challenge Contextual AI faces is not competing products but the instinct among engineering organizations to build their own solutions.
“The biggest objection is ‘we’ll build it ourselves,’” Kiela acknowledged. “Some teams try. It sounds exciting to do, but is exceptionally hard to do this well at scale. Many of our customers started with DIY, and found themselves still debugging retrieval pipelines instead of solving actual problems 12-18 months later.”
The alternative — off-the-shelf point solutions — presents its own problems, the company argues. Such tools deploy quickly but often prove inflexible and difficult to customize for specific use cases.
Agent Composer attempts to occupy a middle ground, offering a platform approach that combines pre-built components with extensive customization options. The system supports models from OpenAI, Anthropic, and Google, as well as Contextual AI’s own Grounded Language Model, which was specifically trained to stay faithful to retrieved content.
Pricing starts at $50 per month for self-serve usage, with custom enterprise pricing for larger deployments.
“The justification to CFOs is really about increasing productivity and getting them to production faster with their AI initiatives,” Kiela said. “Every technical team is struggling to hire top engineering talent, so making their existing teams more productive is a huge priority in these industries.”
The road ahead: multi-agent coordination, write actions, and the race to build compound AI systems
Looking ahead, Kiela outlined three priorities for the coming year: workflow automation with actual write actions across enterprise systems rather than just reading and analyzing; better coordination among multiple specialized agents working together; and faster specialization through automatic learning from production feedback.
“The compound effect matters here,” he said. “Every document you ingest, every feedback loop you close, those improvements stack up. Companies building this infrastructure now are going to be hard to catch.”
The enterprise AI market remains fiercely competitive, with offerings from major cloud providers, established software vendors, and scores of startups all chasing the same customers. Whether Contextual AI’s bet on context over models will pay off depends on whether enterprises come to share Kiela’s view that the foundation model wars matter less than the infrastructure that surrounds them.
But there is a certain irony in the company’s positioning. For years, the AI industry has fixated on building ever-larger, ever-more-powerful models — pouring billions into the race for artificial general intelligence. Contextual AI is making a quieter argument: that for most real-world work, the magic isn’t in the model. It’s in knowing where to look.
