Semiconductors & Advanced Manufacturing
The AI boom is minting money — but not necessarily for the chip companies.
Three years into the AI hardware supercycle, the conventional wisdom was that Nvidia and the chip supply chain were the permanent winners. This week's analysis from Dylan Patel of SemiAnalysis, one of the semiconductor industry's most closely read researchers, complicates that picture significantly. The profits are migrating up the stack, toward the companies that sell AI rather than the ones that build the hardware to run it. Meanwhile, the data center power crisis is moving from warning sign to hard constraint.
Where the AI Money Is Actually Going: Model Labs Are Capturing the Boom
Patel's analysis this week is the signal you need to understand the industry right now. His core finding: from 2023 to 2025, the infrastructure layer — Nvidia, data center operators, power companies — captured essentially all the value from AI. In 2026, that's reversed. The model labs are the new winners.
The clearest example is Anthropic (the AI safety company behind the Claude models). Patel reports that Anthropic's annualized revenue has exploded from $9B to over $44B this year. More striking is what's happened to its gross margins — the percentage of revenue left after the cost of actually running the AI — which expanded from 38% to over 70% over the same period. That kind of margin improvement, at that scale, at that speed, is exceptional by any standard.
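To make the scale concrete, here is a back-of-the-envelope sketch in Python using only the figures quoted above. The article gives "over $44B" and "over 70%", so the end values are treated as lower bounds. The variable names are illustrative, and nothing here is taken from Patel's own model.

```python
# Figures as reported in the article: annualized revenue in $B, gross margin as a fraction.
# The end-of-period values are quoted as "over", so treat the results as lower bounds.
start_revenue_b, end_revenue_b = 9.0, 44.0
start_margin, end_margin = 0.38, 0.70


def gross_profit(revenue_b: float, margin: float) -> float:
    """Gross profit in $B: revenue left after the cost of running the models."""
    return revenue_b * margin


revenue_multiple = end_revenue_b / start_revenue_b
start_gp = gross_profit(start_revenue_b, start_margin)
end_gp = gross_profit(end_revenue_b, end_margin)
gp_multiple = end_gp / start_gp

print(f"Revenue multiple: {revenue_multiple:.1f}x")
print(f"Gross profit: ${start_gp:.1f}B -> ${end_gp:.1f}B ({gp_multiple:.1f}x)")
```

On these numbers revenue grows a little under 5x, but annualized gross profit grows roughly 9x, because the margin expansion compounds with the revenue growth.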
What's driving it? Better hardware and better software compounding together. Nvidia's Blackwell GPUs (the current generation of AI chips, succeeding the widely discussed Hopper-generation H100s) can generate 30x more tokens per second on frontier AI workloads than last year's Hopper chips. A "token" is the unit of text a model reads and writes, roughly three-quarters of an English word. ASICs (Application-Specific Integrated Circuits: custom chips designed for a single task, like Google's TPUs or Amazon's Trainium, as opposed to the general-purpose GPUs Nvidia sells) are showing similar efficiency gains.
The paradox Patel surfaces: TSMC (the Taiwanese foundry — a semiconductor manufacturing company that fabricates chips for designers like Nvidia, Apple, and AMD, rather than designing chips itself) and Nvidia have not raised prices to capture this value surge. Patel calls it "venting vast value into every vertical of the ecosystem." In other words, the two most irreplaceable players in the AI supply chain are effectively subsidizing everyone else's margins.
GPU rental prices — what companies pay to lease Nvidia hardware in the cloud — have recovered from a 2025 trough, with H100 one-year contract rates up 40% from the October 2025 bottom. But Patel's analysis suggests even these recovered prices dramatically underprice the value being extracted from AI workloads. SemiAnalysis itself is illustrative: Patel discloses his firm now spends at a $10.95M annual rate on Claude tokens — and says the productivity return more than justifies it.
The broader implication: inference providers (companies that run AI models as a service, sitting between the hardware and the end customer), neoclouds (GPU rental companies that aren't the big hyperscalers), and AI labs are all seeing widening margins. Something has to give — either TSMC and Nvidia eventually reprice, or their shareholders ask why they aren't.
Cloud Infrastructure Hits a New Gear
The raw spending numbers leave no room for debate about AI demand. Cloud infrastructure spending — what companies pay to rent computing from AWS, Google Cloud, Microsoft Azure, and their peers — reached $129B in Q1 2026, according to Synergy Research. That's the ninth consecutive quarter of growth, and Synergy reports it's the highest growth rate since Q4 2021, meaning the AI-driven cloud boom is accelerating.
AWS (Amazon Web Services, Amazon's cloud computing division and the largest cloud provider globally) crossed $150B in annualized revenue — a number that would rank it as a standalone Fortune 50 company. AWS's CEO made an interesting observation alongside the results: memory price increases are driving more companies toward managed cloud services rather than building their own infrastructure. When components get expensive, renting wins.
Google is making its hardware ambitions explicit. The company announced it will sell TPUs (Tensor Processing Units — Google's custom AI chips, designed specifically for the matrix math that underlies neural networks) to a "select group" of external customers for deployment in their own data centers — a significant expansion of a product line previously kept internal. Google also raised its capex (capital expenditure — investment in physical infrastructure like data centers and hardware) forecast.
Microsoft, meanwhile, brought 1GW of new data center capacity online in the latest quarter alone. For perspective: 1GW is roughly the output of a large nuclear reactor. Microsoft appears on pace to double its total data center footprint within two years.
Power Is the New Chip Shortage
If 2023's binding constraint was chip supply, 2026's is electricity. The question of who pays for AI power — and whether the grid can physically deliver it — has moved from theoretical concern to active business crisis.
PJM, the grid operator that manages electricity transmission across 13 U.S. states and Washington D.C. (covering the densely industrialized mid-Atlantic and Midwest corridor where much of U.S. data center capacity sits), reported 220GW of new grid connection requests under a reformed interconnection process. To calibrate that number: the entire U.S. generating capacity is roughly 1,200GW. A queue of 220GW of new connection requests represents an extraordinary bet on electrification — much of it AI-driven.
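A quick sketch of that calibration, using the article's own rough figures. The 1GW-per-reactor conversion is the rule of thumb quoted earlier in this issue, and the constant names are illustrative. Interconnection requests are not commitments to build, so this sizes the queue, not the eventual capacity.

```python
# Figures as reported in the article, in gigawatts.
QUEUE_GW = 220.0         # new PJM grid connection requests
US_CAPACITY_GW = 1200.0  # the article's rough figure for total U.S. generating capacity
REACTOR_GW = 1.0         # rule of thumb: one large nuclear reactor is about 1GW

queue_share = QUEUE_GW / US_CAPACITY_GW
reactor_equivalents = QUEUE_GW / REACTOR_GW

print(f"Queue as a share of U.S. capacity: {queue_share:.1%}")
print(f"Large-reactor equivalents: {reactor_equivalents:.0f}")
```

On these inputs, the queue in a single regional grid is equivalent to a bit over 18% of all U.S. generating capacity.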
Companies are moving to secure power on their own terms. Mara — originally a Bitcoin mining company now pivoting to AI infrastructure — acquired a 505MW gas plant in Ohio for approximately $1.5B, with plans to build a 200MW data center on the same site. The logic is straightforward: in an environment where utility grid connection queues stretch years into the future, owning generation is a competitive moat.
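The deal's rough unit economics can be sketched from the three numbers above. This assumes the 505MW and 200MW figures are directly comparable nameplate values, and it ignores cooling overhead, plant availability, and fuel costs, none of which the article reports. The variable names are illustrative.

```python
# Figures as reported in the article.
plant_mw = 505.0          # gas plant capacity
price_usd = 1.5e9         # "approximately $1.5B"
datacenter_mw = 200.0     # planned data center on the same site

cost_per_mw_usd = price_usd / plant_mw
load_share = datacenter_mw / plant_mw
headroom_mw = plant_mw - datacenter_mw

print(f"Acquisition cost: ${cost_per_mw_usd / 1e6:.2f}M per MW")
print(f"Data center load: {load_share:.1%} of plant capacity")
print(f"Headroom: {headroom_mw:.0f} MW")
```

Under those assumptions the plant costs just under $3M per MW, and the planned data center would draw about 40% of its capacity, leaving roughly 305MW for expansion or sale to the grid.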
On the engineering side, liquid cooling, which circulates water or refrigerant through server racks to remove heat far more efficiently than blowing air, is shifting from premium option to operational necessity. Where traditional air-cooled racks might handle 10-20kW of heat per rack, high-density AI clusters now demand 100kW or more per rack. Physics wins: air can't keep up. Companies like Danfoss are at the center of this infrastructure buildout.
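Why air loses can be illustrated with the basic heat-transport relation Q = ρ · V · c_p · ΔT, solved for the volume flow V needed to carry away a rack's heat. The sketch below is an illustration under stated assumptions, not a cooling design: the fluid properties are textbook approximations near room temperature, and the 10K coolant temperature rise is an assumed value, not a figure from the article.

```python
# Textbook approximations near room temperature (assumptions, not from the article).
AIR_DENSITY = 1.2        # kg/m^3
AIR_CP = 1005.0          # J/(kg*K)
WATER_DENSITY = 1000.0   # kg/m^3
WATER_CP = 4186.0        # J/(kg*K)

RACK_HEAT_W = 100_000.0  # the article's 100kW high-density rack
DELTA_T_K = 10.0         # assumed coolant temperature rise across the rack


def required_flow_m3_per_s(heat_w: float, density: float, cp: float, delta_t_k: float) -> float:
    """Volume flow needed to remove heat_w, from Q = density * flow * cp * delta_t."""
    return heat_w / (density * cp * delta_t_k)


air_flow = required_flow_m3_per_s(RACK_HEAT_W, AIR_DENSITY, AIR_CP, DELTA_T_K)
water_flow = required_flow_m3_per_s(RACK_HEAT_W, WATER_DENSITY, WATER_CP, DELTA_T_K)
flow_ratio = air_flow / water_flow

print(f"Air:   {air_flow:.2f} m^3/s")
print(f"Water: {water_flow * 1000:.2f} L/s")
print(f"Air needs about {flow_ratio:.0f}x the volume flow of water")
```

With these inputs a single 100kW rack needs on the order of 8 cubic meters of air per second, versus a little over 2 liters of water per second. The ratio of roughly 3,500x comes from water's far higher heat capacity per unit volume and does not depend on the assumed temperature rise.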
Memory's Breakout Year
Samsung Electronics — the South Korean conglomerate and world's largest producer of DRAM (Dynamic Random-Access Memory, the short-term working memory chips in virtually every computing device) and NAND flash storage — reported that its Q1 2026 operating profit already exceeds its entire full-year 2025 total. That's a dramatic reversal for a company that was mired in a deep memory downcycle just 18 months ago.
The mechanism is straightforward: AI models are extraordinarily memory-hungry, and getting hungrier. Patel reports that memory prices have risen 6x over the past year. The center of this boom is HBM (High Bandwidth Memory — a specialized, high-speed memory type that gets physically stacked directly on top of AI chips using advanced packaging techniques, enabling much faster data transfer than conventional memory). Every AI GPU ships with as much HBM as its maker can procure, and supply remains tight.
AWS's CEO drew the same connection alongside the company's results: memory price inflation is one of the forces pushing enterprise customers toward managed cloud services rather than building their own AI infrastructure.
China Builds Its Own Path
One item from the week that deserves attention: China's National Supercomputing Center announced plans for a machine called LineShine, targeting 2 exaflops of performance and built entirely from Chinese-made hardware. (An exaflop is 10¹⁸ floating-point operations per second, a unit so large it's easier to think of it as "a lot"; the current world record is roughly in this range.) U.S. export controls have cut off China's access to Nvidia's advanced chips and TSMC's leading-edge manufacturing processes. A domestically built 2-exaflop system would be a meaningful signal of how far China has progressed in filling those gaps with its own chip industry.
The Trend to Watch
Patel's value-capture analysis is the framework that makes sense of everything else this week. If model labs are absorbing most of the economic surplus from AI while foundries and chip designers hold prices steady, the pressure for repricing will eventually become irresistible — either through TSMC and Nvidia raising prices, or through their competitors filling the gap. The moment that inflection arrives, it cascades through the entire stack: cloud compute costs, inference pricing, and what enterprises actually pay to use AI. Watch Nvidia's next major pricing move and TSMC's contract renegotiation cycle with hyperscaler customers as the leading indicators.
TL;DR
- The AI boom's profits are migrating to model labs: Anthropic's annualized revenue is up nearly 5x this year, with gross margins now above 70%, while chip giants TSMC and Nvidia haven't repriced despite being indispensable. Patel calls this the central paradox of the current moment.
- Cloud spending is at its fastest growth rate since 2021: AWS crossed $150B in annualized revenue, overall cloud spending hit $129B in Q1, and Microsoft brought 1GW of new capacity online in the latest quarter alone. AI demand is accelerating, not plateauing.
- Power is the new bottleneck: 220GW of grid connection requests at PJM and Mara's $1.5B gas plant acquisition signal that electricity supply is now the binding constraint on AI infrastructure buildout, with liquid cooling becoming essential at AI-scale heat densities.
- Memory prices are up 6x in a year: Samsung's Q1 2026 operating profit already exceeds its full-year 2025 total, and the surge shows no signs of stopping. The AI hardware boom is very much still a hardware boom, just centered on memory rather than processors.
Compiled from 2 sources · 21 items
- Data Center Dynamics (20)
- Dylan Patel (1)