
Confusing Success: Who's Really Winning the AI Stack?

Victoria · 6 min read

Last week's Q1 earnings from Amazon and Google left investors with a problem they couldn't quite name. By every observable signal, the hyperscalers are winning. AWS backlog ended the quarter at $364 billion, and that figure does not include the Anthropic deal worth more than $100 billion on top. Google Cloud's backlog doubled year-on-year to $460 billion. Management at both companies described enterprise AI adoption that is broad, accelerating, and pulling core cloud services along with it.

Andy Jassy's framing on the AWS call was unambiguous: AI is no longer a side workload. AWS is being chosen for AI "due to its broad range of capabilities and strong security performance," and crucially "customers want their inference to be near their applications and data — and most of it is stored in AWS." Management described a flywheel: "customers that are deploying and seeing the benefits of AI are also accelerating their transition to the cloud," with "a strong correlation between AI spend and core non-AI services growth."

So the hyperscalers are winning. And yet — free cash flow is being eaten alive. Stock-market reaction was muted-to-negative for the cloud names.

If the hyperscalers are this dominant in demand, why isn't the market sure they are winning? And who is? This is the question every investor with AI exposure is now sitting with. The naive answer: the labs. Rotate from infrastructure to the model layer. Yes, but...

The apparent answer: value moved to the labs

The data here is genuinely stunning, and SemiAnalysis has done the cleanest job of laying it out.

"This year Anthropic's ARR has exploded from $9B to over $44B today, their gross margins on their inference infrastructure have increased from 38% to over 70% over the same period."

The frontier labs went from approximately zero margin capture to capturing the entire incremental dollar of AI value, in roughly twelve months. Token production costs have collapsed. Inference cost per million tokens fell roughly 280x in two years. The drivers compound across three layers: algorithmic efficiency, hardware efficiency and facility efficiency.
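To make the arithmetic concrete, here is a minimal sketch in Python. The 38%-to-70% margin figures and the roughly 280x cost decline come from the SemiAnalysis numbers above; every other figure (the price per million tokens, the split of gains across layers) is a hypothetical we chose purely for illustration. The point is mechanical: if the price a lab charges holds roughly steady while serving cost falls, gross margin expands, and efficiency gains at the algorithmic, hardware, and facility layers multiply rather than add.

# Illustrative sketch only. The 38% / 70% margins and the ~280x cost
# decline are from SemiAnalysis; every other number is hypothetical.

def gross_margin(price_per_mtok: float, cost_per_mtok: float) -> float:
    """Gross margin = (price - cost) / price."""
    return (price_per_mtok - cost_per_mtok) / price_per_mtok

price = 10.00      # hypothetical price charged per million tokens ($)
cost_then = 6.20   # hypothetical serving cost that implies a 38% margin
cost_now = 3.00    # hypothetical serving cost that implies a 70% margin

print(f"margin then: {gross_margin(price, cost_then):.0%}")  # 38%
print(f"margin now:  {gross_margin(price, cost_now):.0%}")   # 70%

# Per-layer efficiency gains compound multiplicatively.
# A purely illustrative split that lands on ~280x:
algorithmic, hardware, facility = 10.0, 7.0, 4.0
print(f"combined cost reduction: {algorithmic * hardware * facility:.0f}x")  # 280x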

Why doesn't competition compete this away? Two reasons, both load-bearing. First, frontier quality is sticky: open-source alternatives like Kimi K2.6 at $0.95/$4 exert almost no downward pressure on Opus pricing because they aren't actually substitutes for real knowledge work. Second, every frontier lab is compute-constrained. None of them can serve the entire market. 

By inspection, the lab layer is the place to be. So the natural conclusion: rotate. Sell the cloud names whose free cash flow is shrinking, lighten on the infrastructure trade that's already had its run, and concentrate exposure in the labs and the application companies running on top of them. We think this conclusion is incomplete. Here's why.

The situation is engineered

Anthropic's 70% gross margin exists because three of the most important actors in the AI stack are independently choosing not to extract their share.

1. Nvidia: the central bank of AI

Nvidia is practicing voluntary restraint, and SemiAnalysis names it explicitly:

"Nvidia's position in the AI compute stack is already under increasing antitrust scrutiny, given its dominance across GPUs, interconnect, and software. In this environment, aggressively repricing systems to fully capture the value delivered risks drawing further attention, particularly if it results in outsized margin expansion while downstream AI labs are also generating significant profits."

The framing SemiAnalysis offers: "Nvidia is actively supporting the development of the broader ecosystem, ensuring long-term demand expansion rather than maximizing near-term extraction... By taking the oxygen out of the room, Nvidia aims to ensure it remains the main protagonist in the AI era for the foreseeable future."

2. TSMC: the fairest and most just company in the world

One layer up, the same dynamic plays out — for different reasons.

N3 is the tightest constraint in the entire AI compute system. Every major accelerator roadmap — Nvidia, Broadcom, Annapurna (Amazon), MediaTek, AMD — has converged on N3 for this year and next. And yet TSMC's pricing remains relatively stable. Customers would pay more. 

TSMC isn't raising prices. Their strategy has long been to "protect profitability through downcycles"; the flipside is that the same "policy also blunts upside during upcycles." Long-term relationship durability and ecosystem stability are the priorities.

3. The hyperscalers: silicon as cloud lock-in, not profit center

Why don't Amazon and Google reprice TPU and Trainium?

They're not running a chip business. They're running a cloud business that happens to use proprietary silicon as a strategic differentiator. The margin Google and Amazon optimize is the multi-year cloud contract, not the per-chip markup. They want TPU and Trainium to look as cheap as possible relative to renting Nvidia capacity, because cheap custom silicon is the bait that pulls workloads onto GCP and AWS — where they then consume storage, networking, databases, and the rest of the cloud stack for the next decade.

Andy Jassy described this flywheel explicitly on the AWS call: AI workloads pull along core services, and "customers want their inference to be near their applications and data — and most of it is stored in AWS." Translated into silicon strategy: every Trainium hour booked is a customer one step deeper into the AWS data gravity well. Charging more for Trainium would defeat the purpose.
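A rough back-of-the-envelope comparison shows why. The sketch below uses entirely hypothetical numbers of our own (hours sold, markups, contract size); nothing in it reflects actual AWS or Google Cloud economics. It only illustrates the shape of the trade-off: a one-off markup on silicon versus the margin on the multi-year cloud contract that cheap silicon pulls in.

# Hypothetical back-of-the-envelope comparison. None of these figures
# are disclosed AWS or Google Cloud economics; they are placeholders
# chosen only to show the shape of the trade-off.

chip_hours_sold = 1_000_000      # hypothetical accelerator hours for one customer
extra_markup_per_hour = 0.30     # hypothetical $ uplift if silicon were repriced
silicon_repricing_profit = chip_hours_sold * extra_markup_per_hour

annual_core_cloud_spend = 5_000_000  # hypothetical pulled-along spend (storage, DBs, networking)
core_cloud_margin = 0.30             # hypothetical blended margin on that spend
contract_years = 5                   # hypothetical contract length

flywheel_profit = annual_core_cloud_spend * core_cloud_margin * contract_years

print(f"one-off profit from repricing silicon: ${silicon_repricing_profit:,.0f}")  # $300,000
print(f"profit from the pulled-along cloud contract: ${flywheel_profit:,.0f}")     # $7,500,000

# Under these assumptions the cloud contract dwarfs the chip markup,
# which is why keeping custom silicon cheap is the rational choice.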

Google's external TPU sales are the interesting wrinkle here. As Google noted on its Q1 call, it's beginning to sell TPUs to others rather than just using them internally. That's the early signal of a strategy shift — but for now, the pricing logic remains cloud-flywheel-first.

The labs are sitting on a stack of three subsidies, each granted for its own reason. None of these is charity. The most powerful actors in the AI stack have voluntarily delayed their value capture.

In our opinion, these are the regime-change triggers worth monitoring.

Trigger 1: Nvidia shifts from cost-plus to value-based pricing

Once enterprise AI ROI is broadly accepted, the optics of repricing improve. The signal to watch is language from Jensen and Nvidia IR around pricing philosophy, plus any meaningful softening in the antitrust posture (a friendlier FTC, a more permissive policy environment, or a clear divestiture-free outcome from any pending review).

Trigger 2: TSMC moves on pricing or capacity

The likely path is not headline price increases but long-term agreements with prepayments and guaranteed capacity commitments. Either way, more value flows to TSMC and less remains downstream.

Trigger 3: Google or Amazon decide custom silicon is a chip business

This is the genuinely disruptive scenario, and Google's nascent external TPU sales are the early signal. If either hyperscaler decides its custom silicon is strategic enough to monetize as a standalone product rather than as a cloud lock-in tool, that forces Nvidia to reprice in self-defense.

Stay diversified and keep monitoring the situation!

