AI Infrastructure
Articles (40)

Three Bets Against Nvidia's Inference Margin, One Shared Dependency
OpenAI, Qualcomm, and Etched are betting against Nvidia's inference margin. Escaping it means queuing for the same TSMC packaging and memory. Most ASIC challengers die on software, not silicon.

One GPU in Orbit, a Million Satellites on Paper: Inside the Orbital Data Center Filing Arms Race
Since Starcloud flew the first H100 last November, SpaceX, Blue Origin, Starcloud, and a five-month-old startup have filed with the FCC for constellations totaling more than a million satellites. The physics hasn't moved as fast as the paperwork.

The President Says It's Fine. The Order Says It Isn't. The Models Are Still Dark.
An export-control order blacked out two Anthropic frontier models worldwide. The President has since softened; the order hasn't. A hosted model enforces a nationality rule only by going dark.

Reconstructing FP64: How Supercomputing's Establishment Is Adapting Science to AI Silicon
Two papers, Matsuoka's FP8-emulation preprints and the Dongarra 'Ride the Wave' paper, point to a field adapting scientific computing to AI silicon it no longer controls.

Project Prometheus Raised $12B to Train an "Artificial General Engineer." The Training Data Doesn't Exist Yet
Jeff Bezos and Vik Bajaj's startup now has $18.2 billion and roughly 150 employees. What it doesn't have is an internet of manufacturing data, so the corpus will have to be manufactured... much of it on supercomputers.

AWS quietly retired the fat tree. Fifty-year-old graph theory took its place.
By April 2026, Amazon's random-graph fabric had become the default for most new AWS datacenters. The efficiency claims behind it are still Amazon's own, with no independent benchmark yet.

Britain's Sovereign-Compute Day: A £2bn AMD Bet Meets a £1.1bn State Plan
Britain is funding homegrown silicon for machines that, for now, run on an American vendor's chips. Whether that buys sovereignty or quietly rebrands dependence is the question the spending leaves open.

The Electrician Bottleneck: Skilled Trades Increasingly Gate the AI Supercomputer Buildout
The GPU supply chain has the industry's attention. But the constraint that increasingly decides when an AI factory energizes is no longer the chip. It is power delivery, and the licensed electricians who commission it.

Co-Packaged Optics Has Two Front Doors. Only One Fixes the Scale-Up Bottleneck.
At Computex 2026, Wiwynn and eight ecosystem partners showed a full optical scale-up rack built around compute-side optics. It is a useful moment to separate two technologies the supercomputing industry keeps filing under one acronym.

Inside Meta's 83,000-GPU AI Supercomputer: Why It Runs the Silicon at 80% Power on Purpose
Meta's first end-to-end account of running a 150 MW, 83,000-GB200 cluster shows that when power is the ceiling, the cluster, not the chip, is what you optimize.

The Pentagon's FY2027 Budget Asks Congress for $46 Billion in Sovereign AI Infrastructure
Multi-year mandatory funding for a mix of government-owned, contractor-operated, and commercial-surge compute... reversing the July 2025 White House AI Action Plan that told DoW to lean on hyperscalers.

The 800V DC Rack Transition: How Rubin Ultra Is Rewiring the Supercomputing Industry's Last 50 Feet
NVIDIA has published the architecture. OCP and the supplier alliance have published the spec and the timeline. The colocation operators have published, so far, very little.

Thermodynamic Computing's First Silicon Is Back from the Fab. The Power Math Comes Next.
Normal Computing's CN101 is in characterization. Extropic has a prototype platform, an MIT-co-authored arXiv preprint, and an ETH Zurich hackathon in June. After two years as a manifesto, thermodynamic computing is producing the kind of artifacts readers can evaluate.

AI Training Power Demand Is Outrunning Grid Build Times. xAI Bet It Could Outrun Regulators Too.
xAI operates 46 unpermitted gas turbines at its Southaven power plant. A federal court ruling will determine whether the turbine-first playbook is viable.

Nebius's $50 Billion Sells Out. Public Science Gets None of It.
Nebius's $50B backlog: 94% to Meta and Microsoft, zero to NAIRR, CloudBank, or DOE Genesis. The largest neocloud sells out before science can access it.

MRC Gives Open Ethernet Its First 75,000-GPU Production Proof Point
The 50-author MRC paper gives Ethernet its first multi-vendor, open-spec, production-trace answer to the one argument InfiniBand had left at frontier-training scale.

Apple's Mac Shortage Signals Memory Supply Chain Has Reorganized Around Data Center AI
Apple cut Mac memory ceilings and delayed M5 Ultra by four months as HBM production for data center AI consumes edge LPDDR5X allocation.

Orbital Compute in 2026: What Has Flown, What Is Slideware, and What the Physics Allows
Hardware has reached orbit and SpaceX has filed for a million-satellite constellation. Thermal physics, launch cadence, and bandwidth still push gigawatt orbital AI to post-2030, at best.

When the Grid Says No: Denmark and the New Shape of the Power Question
Energinet didn't pause grid connections in a contested metro or a zoning fight - it paused them because 60 GW of queued demand met a 7 GW national peak, and the math stopped working.

The training stack is starting to optimize itself
Anthropic’s 2.9× to 51.9× training-optimization curve signals that AI training infrastructure is becoming machine-optimizable, raising rebound demand and control-plane risks for HPC operators.

DeepSeek V4-Pro on Ascend 950PR: The Two-Stack AI Reality
DeepSeek V4-Pro runs on Huawei Ascend 950PR as the State Department pivots export controls from chip access to model IP. Together, the two developments describe two parallel AI stacks.

HFAC Clears 16-Bill Chip Export Package on 150-Day Allied Clock
Sixteen bills cleared, silicon-level verification on deck, and an industry that hasn't spoken.

Sweden's Mimer Buys an AI Services Layer, Not a New Flagship
EuroHPC signed a EUR 29.76M contract with the relaunched Bull to deploy the Mimer AI Factory in Linköping, Sweden. At roughly a tenth the budget of Italy's IT4LIA, Mimer is less a new flagship machine and more a services envelope.

VAST Data's $30B mark is a bet on the middle layer of AI, not storage
VAST Data closed a $1B Series F at a $30B post-money, 3.3x its 2023 mark and roughly 1.7x Everpure's public cap. Here's what the math and the customer list actually signal.

Vera Rubin's Memory Stack Is Korean. How Three Vendors Got There Tells You Why It Will Stay That Way.
Samsung, SK hynix, and Micron converged on SOCAMM2 mass production within six weeks for NVIDIA's Vera Rubin. Korean suppliers now control both memory tiers.

Slingshot Held Performance Under AI Traffic Patterns That Collapsed InfiniBand by 5x on Production Exascale
ISC 2026 research on LUMI, Leonardo, CRESCO8: Slingshot held performance; InfiniBand collapsed 5x under Incast, the AI gradient-sync traffic pattern.

HBM Allocation, Not HBM Supply, Is the 2026 AI Infrastructure Story
HBM scarcity has moved beyond semiconductor supply into system planning. Accelerator availability, server bill-of-materials, cluster economics, and 2026 data center buildouts are all being rewritten around memory - not compute.

Copper Kings Buy the Fiber Layer: Credo and Molex Lock Down Silicon Photonics in 48 Hours
Two acquisitions, two days apart, at adjacent layers of the same stack. Credo's $873M cash-and-stock deal for DustPhotonics and Molex's Teramount buy tell the market the copper-era interconnect champions have decided the AI factory's fiber layer isn't something they're willing to source.

DOE's SYNAPS-I Platform Targets Unified AI Analysis Across Seven Beamline Facilities
DOE's SYNAPS-I targets unified AI analysis across seven beamline facilities. Can it coordinate deployment, or will it fragment like existing implementations?

Anthropic Locks 3.5 GW of Google TPU Capacity as Commercial AI Pre-Purchases Infrastructure Scientific Computing Will Need
Broadcom will supply Anthropic with 3.5 GW of Google TPU capacity through 2031, roughly 23-35x the power of DOE's largest planned science supercomputer.

UCCL-EP vs. NCCL EP: Portability or Consolidation for MoE Communication?
Two new expert-parallel efforts point to different futures for MoE systems: one built for heterogeneous fleets, the other folded into NVIDIA’s stack.

NVIDIA's $4 Billion Photonics Bet Is an Admission: The AI Buildout Has a Materials Problem
NVIDIA's $4B investment in Lumentum and Coherent signals that indium phosphide scarcity and power-equipment lead times are gating a $2.52T AI spending forecast.

IBM's Arm Partnership Bets on Dual-Architecture Enterprise AI — But the Benchmarks Aren't There Yet
IBM's Arm collaboration introduces Telum II and Spyre for enterprise AI, but lacks benchmarks, named customers, and CUDA compatibility disclosure.

AI Costs Are Falling 1,000x. It Is Not Enough.
AI inference costs have fallen 1,000x, yet agentic workloads still cost hundreds of dollars a day. Anthropic blocking OpenClaw from subscriptions proves consumer pricing cannot absorb real infrastructure economics.

FlatAttention Claims 4× Speedup Over FlashAttention-3 — But on What Hardware?
FlatAttention claims 4× speedup over FlashAttention-3 on unnamed tile-based accelerators. No code, no hardware vendor, no deployment path yet.

d-Matrix Acquires GigaIO's Data Center Business, Betting That Inference Is a Systems Problem
d-Matrix acquired GigaIO's data center business, gaining FabreX PCIe memory fabric and SuperNODE rack-scale technology to build a vertically integrated AI inference platform around its Corsair accelerator.

Samsung Bets Vertical Integration Can Close TSMC's Silicon Photonics Lead
Samsung launched a silicon photonics foundry platform at OFC 2026, targeting 2028 PIC mass production and 2029 CPO services, pitching vertical integration of HBM, logic, packaging, and photonics against TSMC's production lead.

DOE Drops $293M in Genesis Mission Funding - And the Real Test Begins
The first competitive funding call under the Genesis Mission signals a shift from presidential ambition to operational reality. But questions about new money, missing partners, and timeline pressure linger.

The shadow TOP500: private AI superclusters are redefining supercomputing
xAI's Cortex 2, Meta's mega-clusters, and the $100B NVIDIA-OpenAI deal represent computing installations that dwarf anything on the official rankings. The supercomputing world hasn't reckoned with what that means.

NVIDIA's Vera Rubin Is a Capex Grenade - and Every Hyperscaler's 2027 Budget Knows It
The Blackwell-to-Rubin transition is a forcing function for the entire data center industry.