Intel and SambaNova have unveiled a joint blueprint for agentic AI infrastructure that combines GPUs, SambaNova RDUs (Reconfigurable Dataflow Units) and Intel Xeon 6 processors to address the limitations of GPU-only inference systems as AI agents move from experimentation into production. The architecture is designed for enterprises, cloud providers and sovereign AI programmes, and will be available in the second half of 2026.
The collaboration splits the inference pipeline into distinct roles. GPUs will handle the prefill stage, preparing the key-value (KV) caches; SambaNova RDUs will manage high-throughput decoding; and Xeon 6 CPUs will act as the host and execution layer for agentic tasks such as orchestration, API calls, code execution and result validation.
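To make that split concrete, the following minimal Python sketch shows how work might flow between the three layers. It is an illustration under assumed names, not Intel or SambaNova software: every class and function here is hypothetical, and a real system would move the KV cache across a device interconnect rather than as an in-process object.

```python
from dataclasses import dataclass

# All names below are hypothetical illustrations of the role split
# described in the article, not Intel or SambaNova software.

@dataclass
class KVCache:
    """Key-value cache built during prefill and read during decode."""
    prompt: str
    entries: list[str]


class PrefillWorker:
    """Stands in for the GPU layer: one compute-dense pass over the prompt."""

    def prefill(self, prompt: str) -> KVCache:
        # A real prefill runs the model over every prompt token at once.
        return KVCache(prompt=prompt, entries=prompt.split())


class DecodeWorker:
    """Stands in for the RDU layer: token-by-token generation."""

    def decode(self, cache: KVCache, max_tokens: int) -> str:
        # Each decode step re-reads the cache, so this stage is bound
        # by memory bandwidth rather than raw compute.
        return " ".join(f"<tok{i}>" for i in range(max_tokens))


class AgentHost:
    """Stands in for the Xeon 6 host: orchestrates the two stages."""

    def __init__(self) -> None:
        self.prefill_worker = PrefillWorker()
        self.decode_worker = DecodeWorker()

    def run(self, prompt: str) -> str:
        cache = self.prefill_worker.prefill(prompt)   # GPU stage
        output = self.decode_worker.decode(cache, 8)  # RDU stage
        if not output:                                # host-side validation
            raise RuntimeError("empty generation")
        return output


if __name__ == "__main__":
    print(AgentHost().run("summarise the quarterly infrastructure report"))
```

The design point the sketch captures is the hand-off: prefill builds the cache once, decode consumes it repeatedly, and the host stays free for orchestration and validation.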
Why the architecture matters
This design reflects a broader shift in AI infrastructure. As coding agents and other agentic systems become more common, enterprises are discovering that no single chip architecture is ideal for every stage of inference. The new blueprint aims to improve latency, efficiency and software compatibility by giving each compute layer a specific job instead of forcing one system to do everything.
SambaNova says the system is built to run in existing air-cooled data centres and supports widely used software tools, which could make deployment easier for organisations that want production-scale AI without rebuilding their infrastructure from scratch. Intel adds that Xeon 6 remains central to enterprise data centres because the x86 ecosystem gives it broad compatibility with existing workloads and software.
Agentic AI moves to production
The collaboration is especially relevant because agentic AI is no longer limited to demos and pilots. Coding agents are now being used to compile code, call APIs and coordinate workflows, which means the infrastructure needs to be faster, more efficient and better coordinated across multiple compute types.
In that context, the Intel-SambaNova approach is meant to solve a practical problem rather than just showcase hardware. GPUs remain well suited to the compute-heavy prefill stage at the start of inference, but the decoding stage is constrained more by memory bandwidth and latency than by raw compute, and the agent execution layer is dominated by general-purpose host work. By separating those tasks, the companies are betting that enterprises can improve throughput and reduce cost without sacrificing compatibility with existing x86 environments.
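To illustrate the kind of work that host-side execution layer performs (running code, calling APIs, validating results), here is a hedged Python sketch. The action schema and every function name are assumptions made for illustration, not part of any Intel or SambaNova software.

```python
import subprocess
import sys

# Hypothetical sketch of a host-side agent loop; the action schema and
# function names are illustrative, not from Intel or SambaNova software.

def execute_action(action: dict) -> dict:
    """Dispatch one agent action on the host CPU."""
    if action["type"] == "run_code":
        # Execute a short snippet in a subprocess and capture its output.
        proc = subprocess.run(
            [sys.executable, "-c", action["code"]],
            capture_output=True, text=True, timeout=10,
        )
        return {"ok": proc.returncode == 0, "output": proc.stdout.strip()}
    if action["type"] == "call_api":
        # A real agent would issue an HTTP request here; stubbed out.
        return {"ok": True, "output": f"called {action['endpoint']}"}
    return {"ok": False, "output": f"unknown action type: {action['type']}"}


def run_workflow(actions: list[dict]) -> None:
    """Coordinate a workflow: execute each step, validate, stop on failure."""
    for step, action in enumerate(actions, start=1):
        result = execute_action(action)
        status = "ok" if result["ok"] else "FAILED"
        print(f"step {step} [{status}]: {result['output']}")
        if not result["ok"]:
            break


if __name__ == "__main__":
    run_workflow([
        {"type": "run_code", "code": "print(2 + 2)"},
        {"type": "call_api", "endpoint": "https://example.com/build"},
    ])
```

Loops like this are serial, branch-heavy and dominated by I/O and process management, which is why the blueprint assigns them to the CPU host rather than to an accelerator.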
Enterprise fit and market context
For Intel, the deal reinforces the continued relevance of Xeon in AI data centres, even as attention often focuses on GPUs and accelerators. For SambaNova, it gives its RDUs a clearer place in a production architecture aimed at agentic workloads that demand low latency and high throughput.
The timing also reflects a larger industry trend. Enterprises are increasingly looking for heterogeneous systems that balance performance and efficiency instead of relying on one dominant chip category. That makes this collaboration less about rivalry and more about how the AI stack is being rebuilt for a phase where agentic systems must work reliably at scale.
