Nvidia’s reported $20 billion licensing deal with AI chip startup Groq signals a decisive move in the battle over AI infrastructure. By integrating Groq’s Language Processing Unit (LPU) technology into a new inference-focused processor, Nvidia reinforces its dominance while giving key customers such as OpenAI one less reason to explore rival hardware.
For years, Nvidia’s GPUs, including the H100 and Blackwell series, have led the AI training market. Training demands enormous computational power, and Nvidia has excelled there. But inference, the stage where trained models generate real-time responses, presents a different challenge. GPUs, built for parallel processing at scale, can hit latency bottlenecks during model “decoding,” the step-by-step generation of output tokens, which becomes especially costly in multi-step reasoning and agentic workloads.
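To see why decoding is latency-bound rather than throughput-bound, consider a minimal sketch: each output token depends on every token before it, so per-step delays add up across the whole response, and an agent that chains many such responses multiplies the cost again. The `ToyModel` below is an illustrative stand-in, not Nvidia or Groq code.

```python
import random
import time

class ToyModel:
    """Stand-in for a real model: predict_next sleeps to mimic the
    fixed cost of one forward pass during decoding."""

    def predict_next(self, tokens: list[int]) -> int:
        time.sleep(0.01)  # pretend this is one forward pass (~10 ms)
        return random.randint(1, 99)

def generate(model: ToyModel, max_new_tokens: int = 20) -> list[int]:
    tokens: list[int] = [0]  # start-of-sequence token
    start = time.perf_counter()
    for _ in range(max_new_tokens):
        # Each new token depends on all previous tokens, so these
        # steps cannot run in parallel: total latency is the SUM of
        # per-step latencies, not their maximum.
        tokens.append(model.predict_next(tokens))
    elapsed = time.perf_counter() - start
    print(f"{max_new_tokens} tokens in {elapsed:.2f}s "
          f"({elapsed / max_new_tokens * 1000:.1f} ms/token)")
    return tokens

if __name__ == "__main__":
    generate(ToyModel())
```

Because the steps are strictly sequential, cutting per-step hardware latency shortens end-to-end response time roughly linearly, which is the pitch behind inference-specialized chips.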
Groq’s LPU architecture is optimized specifically for ultra-low-latency inference. Its deterministic execution model, with work scheduled in advance rather than dynamically on chip, avoids the contention that drives GPU latency spikes, making it well suited for the coming wave of agent-driven AI systems expected to mature over the next few years. By licensing the technology rather than acquiring the company outright, Nvidia sidesteps potential antitrust scrutiny while securing exclusive access to a critical innovation.
The agreement reportedly includes the recruitment of key Groq talent, including founder Jonathan Ross, a former Google TPU engineer. The move also appears to have disrupted discussions between OpenAI and competing chipmakers such as Cerebras. In effect, Nvidia has strengthened customer retention by offering a direct solution to inference performance concerns.
The partnership deepens an already complex financial relationship. Nvidia has previously indicated plans to invest heavily in OpenAI, creating a reinforcing cycle: Nvidia funds AI development, and OpenAI deploys that capital on Nvidia hardware. This tight integration strengthens Nvidia’s strategic position across both training and inference.
Attention now turns to Nvidia’s upcoming GTC conference in San Jose, where CEO Jensen Huang is expected to unveil a hybrid compute system combining Nvidia GPUs with Groq-licensed LPUs. If successful, the platform could address the “last mile” of AI performance, delivering both massive training power and near-instant inference response in a unified stack.
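How such a unified stack would present itself to developers remains speculative. As a purely hypothetical sketch, a scheduler might route throughput-bound training jobs to GPU pools and latency-bound decode loops to LPU-style accelerators; every name below (`DevicePool`, `route`, the pool labels) is invented for illustration, not a product API.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Workload(Enum):
    TRAINING = auto()   # throughput-bound: large parallel batches
    DECODING = auto()   # latency-bound: sequential token generation

@dataclass(frozen=True)
class DevicePool:
    name: str
    devices: int

# Hypothetical device pools; labels are illustrative only.
GPU_POOL = DevicePool("gpu-training", devices=8)
LPU_POOL = DevicePool("lpu-inference", devices=16)

def route(kind: Workload) -> DevicePool:
    """Dispatch throughput-bound work to GPUs and latency-bound
    decoding to LPU-style accelerators."""
    return GPU_POOL if kind is Workload.TRAINING else LPU_POOL

if __name__ == "__main__":
    print(route(Workload.TRAINING).name)  # gpu-training
    print(route(Workload.DECODING).name)  # lpu-inference
```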