AI Infrastructure: The Backbone of the Intelligence Boom
Why AI Infrastructure Is the Real Arms Race
While ChatGPT and humanoid robots dominate headlines, a quieter revolution is underway beneath the surface: the race to build, optimize, and own the infrastructure that powers AI. From chips to cooling, data centres to orchestration software, the battle for long-term defensibility in AI is being fought in steel, silicon, and megawatts.
Investors and founders are now looking beyond the model, toward the new software, silicon, and systems required to support exponential compute.
Key Trends We’re Watching
Inference: The New Frontier
The AI hype used to centre on training models, but today the real action (and cost) is in inference: the process of running models at scale for billions of end users. As large language models become embedded in products, startups that optimize inference speed and cost stand to win big.
- Groq’s LPUs deliver ultra-fast token streaming for real-time AI experiences.
- Platforms like Baseten, Modal, and OctoML are pioneering cost-aware, low-latency inference serving with built-in model operations.
Inference optimization is the next big moat; watch for breakthroughs in runtime acceleration and model compression.
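To make the cost lever concrete, here is a minimal sketch of one such compression technique: post-training dynamic quantization in PyTorch. The toy two-layer model stands in for the Linear-heavy blocks that dominate LLM inference; none of this reflects any specific vendor’s stack.

```python
import time

import torch
import torch.nn as nn

# Toy stand-in for the Linear-heavy blocks that dominate LLM inference.
model = nn.Sequential(
    nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 4096),
).eval()

# Post-training dynamic quantization: Linear weights become int8,
# activations stay float, and no retraining is required.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 4096)
with torch.inference_mode():
    for name, m in [("fp32", model), ("int8", quantized)]:
        start = time.perf_counter()
        for _ in range(50):
            m(x)
        ms = (time.perf_counter() - start) / 50 * 1e3
        print(f"{name}: {ms:.2f} ms per forward pass")
```

Smaller weights mean less memory traffic per token, which is exactly where serving platforms squeeze out cost.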
The Cooling & Power Challenge
As AI clusters balloon in density, from 8 kW per rack in 2021 to a projected 30 kW+ by 2027, conventional air cooling is becoming obsolete. Liquid and immersion cooling systems, championed by innovators like ZutaCore, are now table stakes. Two-phase cooling faces PFAS regulatory risks, pushing the industry toward new materials and direct-to-chip designs capable of handling 120 kW per rack for training workloads like GPT-scale models.
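A back-of-envelope sketch (with illustrative constants we chose, not vendor data) shows why those densities break air cooling: the airflow required scales linearly with rack power and quickly becomes impractical.

```python
# Airflow needed to remove rack heat with air: Q = m_dot * c_p * dT,
# so m_dot = Q / (c_p * dT). Constants are illustrative assumptions.
AIR_DENSITY = 1.2   # kg/m^3, air at ~20 C
AIR_CP = 1005.0     # J/(kg*K), specific heat of air
DELTA_T = 10.0      # K, assumed inlet-to-outlet temperature rise

def cfm_required(rack_kw: float) -> float:
    """Cubic feet per minute of air needed to carry away rack_kw of heat."""
    mass_flow_kg_s = rack_kw * 1000 / (AIR_CP * DELTA_T)
    m3_per_s = mass_flow_kg_s / AIR_DENSITY
    return m3_per_s * 35.315 * 60  # m^3/s -> cubic feet per minute

# 2021-era racks, the 2027 projection, and GPT-scale training racks:
for kw in (8, 30, 120):
    print(f"{kw:>3} kW rack -> ~{cfm_required(kw):,.0f} CFM of airflow")
```

Moving from roughly 1,400 CFM per rack to over 20,000 is why the industry is going to liquid rather than bigger fans.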
But the cooling challenge is sparking innovation far beyond Earth.
Emerging players like StarCloud are exploring space-based data centres, leveraging direct solar energy for power and the cold vacuum of space as a near-free cooling medium. By hosting compute in orbit, these systems could bypass terrestrial limits on grid power, cooling costs, and environmental constraints while delivering unprecedented energy efficiency. Jeff Bezos recently discussed space-based data centres; see the Reuters piece under Read More.
Cooling and power efficiency, from liquid immersion on Earth to radiative cooling in orbit, will determine which operators can scale AI sustainably and competitively over the next decade.
AI-Native DevOps & ModelOps
Traditional DevOps tools can’t handle the complexity of AI models. The next wave of infrastructure is AI-native orchestration: platforms that combine deployment, cost optimization, and real-time monitoring for large models.
Companies like Modal, OctoML, and Baseten are leading this shift, merging DevOps and ModelOps into a unified stack that allows engineers to deploy foundation models with minimal latency and maximum observability.
Expect rapid consolidation as infrastructure players merge compute, orchestration, and monitoring into end-to-end AI infra suites.
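To illustrate the "unified stack" idea, here is a hypothetical sketch of one object owning deployment config, a cost guardrail, and latency telemetry. All names are invented for illustration; this mirrors no specific vendor’s SDK.

```python
import time
from dataclasses import dataclass, field

@dataclass
class ModelDeployment:
    """Hypothetical unified deploy/monitor handle for one hosted model."""
    model: str
    gpu: str
    max_cost_per_hour: float                       # cost-optimization guardrail
    latencies_ms: list[float] = field(default_factory=list)

    def infer(self, prompt: str) -> str:
        start = time.perf_counter()
        reply = f"[{self.model}] echo: {prompt}"   # stand-in for a real GPU call
        self.latencies_ms.append((time.perf_counter() - start) * 1e3)
        return reply

    def p95_latency_ms(self) -> float:
        """Real-time monitoring hook: p95 latency over recorded calls."""
        xs = sorted(self.latencies_ms)
        return xs[int(0.95 * (len(xs) - 1))] if xs else 0.0

dep = ModelDeployment(model="llama-3-8b", gpu="A100", max_cost_per_hour=3.50)
for _ in range(20):
    dep.infer("Summarize this support ticket.")
print(f"p95 latency: {dep.p95_latency_ms():.3f} ms")
```

The point is the interface, not the toy internals: deployment, cost, and observability live behind one handle instead of three separate tools.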
Verticalized AI Infrastructure
Just as cloud software became verticalized by industry, AI infrastructure is following suit, with tailored stacks for healthcare, finance, defense, and beyond:
- Healthcare: HIPAA-compliant AI for diagnostics and imaging.
- Security: Edge-based inference for real-time threat detection.
- Enterprise: Private model hosting with audit trails for regulated industries.
Vertical AI stacks offer defensibility and high margins: the new “application layer” for infra founders.
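As one concrete example of the "audit trails" requirement above, here is a minimal, hypothetical sketch: a wrapper that logs a hash of every prompt and response to an append-only file, so regulated deployments can attest to usage without storing raw text. All names are ours, not any product’s API.

```python
# Hypothetical audit-trail wrapper for private model hosting: each call
# is appended to a JSONL log with content hashes rather than raw text.
import hashlib
import json
import time

AUDIT_LOG = "inference_audit.jsonl"   # assumed append-only in deployment

def sha256(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()

def audited(model_fn):
    def wrapper(user_id: str, prompt: str) -> str:
        response = model_fn(prompt)
        record = {
            "ts": time.time(),
            "user": user_id,
            "prompt_sha256": sha256(prompt),
            "response_sha256": sha256(response),
        }
        with open(AUDIT_LOG, "a") as f:
            f.write(json.dumps(record) + "\n")
        return response
    return wrapper

@audited
def run_model(prompt: str) -> str:
    return f"echo: {prompt}"          # stand-in for a hosted private LLM

print(run_model("analyst-42", "Summarize the Q3 risk findings."))
```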
Custom Silicon on the Rise
While Nvidia dominates today’s landscape, the next decade belongs to custom silicon: chips built for LLM inference, training, and edge optimization.
Startups like Tenstorrent, Mythic, Etched.ai, and Zettascale (formerly Exa) are designing next-generation AI accelerators optimized for extreme efficiency and tailored compute workloads. Zettascale’s focus on modular chiplet architectures and ultra-dense interconnects aims to push performance beyond the limits of monolithic GPUs, reducing both latency and power draw.
Meanwhile, big tech giants such as Google (TPU), Amazon (Trainium and Inferentia), and Meta (MTIA) are doubling down on in-house silicon to cut inference costs and secure their supply chains.
The chip layer will be one of the fiercest battlegrounds of the AI economy. From Zettascale’s specialized architectures to hyperscaler-built processors, custom AI chips will define efficiency, cost, and access to compute: the foundation of long-term defensibility in the intelligence era.
Compute Supply & Demand
- Global data center demand: 60 GW (2023) → 171–219 GW by 2030 (+19–22% CAGR).
- High case: up to 298 GW (+27% CAGR).
- AI-ready centers: ~70% of all capacity by 2030.
- Generative AI: ~40% of total global load.
AI workloads are the primary growth driver; global compute demand will triple by the decade’s end.
Sources: McKinsey Data Center Demand Model, Exhibits 1–2
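For readers who want to check the compounding math, the sketch below recomputes the growth rates implied by the endpoints above. McKinsey’s own base year, rounding, and scenario definitions differ from this naive 2023→2030 window, so expect rough agreement rather than exact matches.

```python
# Approximate sanity check of the growth figures above. McKinsey's base
# year and scenario definitions differ from this naive seven-year window,
# so these are reconstructions, not exact reproductions.
def cagr(start_gw: float, end_gw: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoints."""
    return (end_gw / start_gw) ** (1 / years) - 1

START_GW, YEARS = 60.0, 7   # 60 GW in 2023, projected out to 2030
for end_gw in (171, 219, 298):
    rate = cagr(START_GW, end_gw, YEARS)
    print(f"{end_gw} GW by 2030 -> ~{rate:.0%} implied CAGR")
```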
- U.S. deficit risk: >15 GW by 2030.
- Vacancy rates: <1% in Northern Virginia (world’s largest data hub).
- Colocation pricing: +35% (2020–2023).
- Constraints: grid power, transformers, labor shortages.
- Policy limits: Ireland halted new data-center grid connections until 2028.
The race to expand capacity is constrained by power and infrastructure, not capital.
Sources: CBRE — North America Data Center Trends H2 2023, JLL — Data Centers 2024 Global Outlook, EirGrid — Ireland Capacity Outlook 2022–2031, CBRE — Global Data Center Trends 2024
Hyperscaler Control
- Hyperscalers: AWS, Azure, Google, and Baidu control >50% of AI-ready capacity.
- By 2030: 60–65% of AI workloads hosted on hyperscalers; 35–40% private.
- GPU clouds: CoreWeave → 45k GPUs, 28 global sites (2024).
- Partnerships: hyperscalers + colos expanding in tandem.
The cloud giants will remain the backbone of global AI compute.
Sources: McKinsey Exhibit 3; Data Center Frontier — CoreWeave, Chirisa Tap Bloom Energy for Illinois AI Data Center Project (Jul 2024)
Infrastructure Shift
- Facility scale: 30 MW (2013) → 200 MW+ (2024).
- Rack density: 8 kW (2021) → 17 kW (2023) → 30 kW+ (2027).
- Training workloads: up to 120 kW/rack (e.g., GPT-scale models).
- Cooling evolution: RDHX 40–60 kW; direct-to-chip 60–120 kW; liquid immersion 100–150+ kW (PFAS concerns).
- Electrical redesign: 12V → 48V systems (25% efficiency gain), larger UPS, modular units.
AI compute density is rewriting mechanical and electrical engineering standards.
Sources: Electronics Cooling — Will PFAS Be the Death of Two-Phase Cooling? (Jun 2024); Power Electronics News — 48V: The New Standard for High-Density, Power-Efficient Data Centers (Aug 2016)
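The 48V shift follows directly from Ohm’s law: for the same delivered power, quadrupling voltage cuts current four-fold and resistive losses sixteen-fold. A quick sketch makes the point; the path resistance is an assumed, illustrative figure, not a measured one.

```python
# Why 12V -> 48V matters: distribution loss is I^2 * R, and I = P / V,
# so 4x the voltage means 1/16th the resistive loss at equal power.
POWER_W = 30_000        # a 30 kW rack
R_PATH_OHM = 0.001      # assumed 1 milliohm busbar/cable path, illustrative

def distribution_loss_w(power_w: float, volts: float) -> float:
    current_a = power_w / volts
    return current_a ** 2 * R_PATH_OHM

for volts in (12, 48):
    loss = distribution_loss_w(POWER_W, volts)
    print(f"{volts:>2} V bus: {POWER_W / volts:,.0f} A, "
          f"{loss:,.0f} W lost ({loss / POWER_W:.1%} of load)")
```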
Location Trends
- Old hubs: Northern Virginia and Santa Clara face grid strain.
- New builds: Indiana, Iowa, Wyoming (cheap power + land).
- Nuclear colocations: Amazon + Talen Energy (2024).
- Future trend: on-site microgrids, fuel cells, SMRs, and space-based compute.
- Emerging frontier: players like StarCloud (orbital) and Microsoft (Project Natick, subsea) are exploring space-based and deep-sea data centers to eliminate cooling costs, with orbital systems harnessing solar power directly.
Power availability, not proximity, defines tomorrow’s data map.
Sources: McKinsey Exhibit 4; Nuclear Newswire — Amazon Buys Nuclear-Powered Data Center from Talen (Mar 2024); Microsoft Research — Project Natick Underwater & Space Edge Computing
Investment Wave
- $1T+ needed across the global ecosystem by 2030.
- $250B+ for cooling, power, and modular systems.
- Major moves:
  - Blackstone × Digital Realty: $7B JV (Frankfurt, Paris, Virginia)
  - Supermicro expansion — U.S. & Taiwan (Apr 2019)
  - HCLTech × Schneider Electric — APAC sustainability builds (Jul 2023)
Every layer of the data centre stack (construction, power, and hardware) is becoming an investable frontier.
Sources: Blackstone (2023), HCLTech (2023), McKinsey (2024)
Why This Matters for Founders & LPs
Founders: Build the Rails, Not Just the Models
The next wave of AI giants won’t just build models; they’ll build the infrastructure that models run on.
From custom silicon to cooling, orchestration, and energy systems, this is where the hardest problems and highest margins live.
Founders who solve for latency, power density, and inference efficiency will own the foundation every future product depends on.
Infrastructure startups may be capital-intensive, but they’re deeply defensible. Once you’re embedded in an enterprise’s AI stack, you don’t get ripped out.
This is your chance to own the rails of the intelligence economy, where switching costs are massive, and scale compounds advantage.
LPs: The New Industrial Base of Intelligence
For investors, AI infrastructure represents the new oil, refineries, and highways of the digital age.
It’s not glamorous; it’s foundational. These are assets that generate predictable, compounding revenue streams while underpinning trillion-dollar ecosystems.
Over $1 trillion will be needed by 2030 to expand global compute, cooling, and power capacity.
Funds that move early into data centre construction, clean energy partnerships, and silicon startups will capture the hardest-to-replicate moats of the next decade.
Like railroads in the 1800s and telecom in the 1990s, the infrastructure being built today will determine who owns the future of AI.
Every model, every LLM, every humanoid robot depends on this backbone of electricity, heat, and steel.
Final Takeaway
AI’s bottleneck is no longer algorithms; it’s infrastructure.
The next trillion-dollar winners will be those who scale compute fastest, cleanest, and smartest.
Whether you’re a founder building the rails or an LP financing them, this is the moment to stake your claim in the infrastructure of intelligence.
AI Infra Company Spotlights
| Layer | Company | Focus Area | Notable Highlights |
|---|---|---|---|
| Chips & Compute | Nvidia | GPU acceleration | Dominates >80% of AI training compute; CUDA ecosystem moat |
| Chips & Compute | Zettascale (Exa) | Modular chiplet AI processors | Space-efficient architecture targeting zettascale compute with lower latency and power draw |
| Chips & Compute | Tenstorrent | AI inference and training silicon | Founded by chip veteran Jim Keller; focus on scalability and open hardware |
| Chips & Compute | Etched.ai | LLM-specific ASICs | Fixed-function accelerators for transformer workloads with 10x speedup claims |
| Chips & Compute | Mythic | Analog compute chips | Ultra-efficient edge AI for vision and robotics |
| Inference & ModelOps | Groq | Language Processing Units (LPUs) | Real-time token streaming for low-latency inference |
| Inference & ModelOps | Baseten | Model deployment & orchestration | Simplified serverless platform for production LLMs |
| Inference & ModelOps | Modal | Cloud-native inference serving | Developer-friendly compute orchestration platform |
| Inference & ModelOps | OctoML | Cost optimization for inference | Model compilation & runtime acceleration for multi-cloud environments |
| Cooling & Power Systems | ZutaCore | Liquid & two-phase cooling | Direct-on-chip liquid cooling with 50%+ energy savings |
| Cooling & Power Systems | Submer | Immersion cooling systems | Used by hyperscalers for dense compute racks |
| Cooling & Power Systems | Schneider Electric | Power distribution & efficiency | Partnering with HCLTech and hyperscalers on sustainable builds |
| Data Centers & Power Infrastructure | CoreWeave | GPU cloud infrastructure | 45,000+ GPUs across 28 global sites; low-latency AI hosting |
| Data Centers & Power Infrastructure | Digital Realty × Blackstone JV | Hyperscale data center expansion | $7B joint venture in Frankfurt, Paris, Virginia |
| Data Centers & Power Infrastructure | Brookfield Renewable | Clean energy data center power | Partnering with hyperscalers for sustainable AI capacity |
| Data Centers & Power Infrastructure | StarCloud | Space-based data centers | Using orbital solar power and the vacuum of space for natural cooling and near-zero energy waste |
| Vertical Infra Platforms | Lambda Labs | AI training & cloud | GPU cloud with custom hardware for AI researchers |
| Vertical Infra Platforms | Voltage Park | GPU compute leasing | Nonprofit cloud backed by Jed McCaleb’s Navigation Fund; rents large GPU fleets for LLM training |
| Vertical Infra Platforms | Cerebras Systems | Wafer-scale AI chips & clusters | Massive single-chip compute optimized for AI workloads |
Read More:
- McKinsey: How Data Centers and the Energy Sector Can Sate AI’s Hunger for Power (Jun 2024)
- CBRE: Global Data Center Trends 2024
- JLL: Data Centers 2024 Global Outlook
- Data Center Frontier: Energy-Sector Partnerships for AI Infrastructure (2024)
- Microsoft Research: Project Natick Underwater & Space Edge Computing
- Reuters: Data Centres in Space? Jeff Bezos says it’s possible.
- Brookfield: AI Infrastructure takes Centre Stage