
AI’s Hidden Battlefield: Data Centers, Power, and the Race to Scale Compute

AI Infrastructure: The Backbone of the Intelligence Boom

Why AI Infrastructure Is the Real Arms Race

While ChatGPT and humanoid robots dominate headlines, a quieter revolution is underway beneath the surface: the race to build, optimize, and own the infrastructure that powers AI. From chips to cooling, data centers to orchestration software, the battle for long-term defensibility in AI is being fought in steel, silicon, and megawatts.

Investors and founders are now looking beyond the model, toward the new software, silicon, and systems required to support exponentially growing compute demand.


Key Trends We’re Watching

Inference: The New Frontier

The AI hype used to be all about training models, but today the real action (and cost) is in inference: the process of running models at scale for billions of end users. As large language models become embedded in products, startups that optimize inference speed and cost stand to win big.

  • Groq’s LPUs deliver ultra-fast token streaming for real-time AI experiences.
  • Platforms like Baseten, Modal, and OctoML are pioneering cost-aware, low-latency inference serving with built-in model operations.

Inference optimization is the next big moat; watch for breakthroughs in runtime acceleration and compression tech.
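To make the latency-versus-cost tradeoff concrete, here is a minimal Python sketch of dynamic batching, one of the core techniques behind cost-aware serving stacks: hold each request for a few milliseconds so the accelerator can process a group at once, trading a small latency hit for much higher throughput. Everything here (the queue sizes, the timings, the run_model stand-in) is illustrative, not any vendor’s actual API.

```python
import time
from queue import Queue, Empty

MAX_BATCH = 8       # largest batch the accelerator handles efficiently
MAX_WAIT_S = 0.005  # how long a request may wait for batch-mates (5 ms)

def collect_batch(requests: Queue) -> list:
    """Gather up to MAX_BATCH requests, waiting at most MAX_WAIT_S."""
    batch = [requests.get()]                # block for the first request
    deadline = time.monotonic() + MAX_WAIT_S
    while len(batch) < MAX_BATCH:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(requests.get(timeout=remaining))
        except Empty:
            break
    return batch

def run_model(batch: list) -> list:
    """Stand-in for a real forward pass: fixed overhead plus per-request cost."""
    time.sleep(0.020 + 0.002 * len(batch))  # hypothetical: 20 ms fixed + 2 ms/request
    return [f"completion for {prompt!r}" for prompt in batch]

# Demo: eight queued prompts served as one batch (~36 ms of accelerator time)
# instead of eight sequential calls (~176 ms). That gap is the margin
# inference platforms compete on.
q: Queue = Queue()
for i in range(8):
    q.put(f"prompt {i}")
print(run_model(collect_batch(q)))
```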

The Cooling & Power Challenge

As AI clusters balloon in density, from 8 kW per rack in 2021 to a projected 30 kW+ by 2027, conventional air cooling is becoming obsolete. Liquid and immersion cooling systems, championed by innovators like ZutaCore, are now table stakes. Two-phase cooling faces PFAS regulatory risks, pushing the industry toward new materials and direct-to-chip designs capable of handling 120 kW per rack for training workloads like GPT-scale models.
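The arithmetic behind that shift is straightforward: removing heat takes mass flow, and water carries roughly 3,500 times more heat per unit volume than air. A back-of-the-envelope sketch with illustrative values (a 10 K coolant temperature rise; not a design spec):

```python
# Back-of-the-envelope: why 120 kW racks force liquid cooling.
# Heat removal: Q = m_dot * c_p * delta_T  (illustrative values only)

RACK_POWER_W = 120_000   # training rack, GPT-scale workload
CP_WATER = 4186          # J/(kg*K), specific heat of water
DELTA_T = 10             # K, assumed inlet/outlet temperature rise

flow_kg_s = RACK_POWER_W / (CP_WATER * DELTA_T)
print(f"water flow: {flow_kg_s:.2f} kg/s (~{flow_kg_s * 60:.0f} L/min)")

# The same job with air (c_p ~1005 J/(kg*K), density ~1.2 kg/m^3):
CP_AIR, RHO_AIR = 1005, 1.2
air_m3_s = RACK_POWER_W / (CP_AIR * DELTA_T * RHO_AIR)
print(f"air flow: {air_m3_s:.1f} m^3/s")  # ~10 m^3/s per rack, impractical
```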

But the cooling challenge is sparking innovation far beyond Earth.
Emerging players like StarCloud are exploring space-based data centers, leveraging direct solar energy for power and the cold of space as a near-free radiative cooling sink. By hosting compute in orbit, these systems could bypass terrestrial limits on grid power, cooling costs, and environmental constraints while delivering unprecedented energy efficiency. Jeff Bezos has recently discussed the prospect of space-based data centers as well.

Cooling and power efficiency, from liquid immersion on Earth to radiative cooling in orbit, will determine which operators can scale AI sustainably and competitively over the next decade.
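Radiative cooling in orbit is governed by the Stefan-Boltzmann law: with no air, heat leaves only as radiation, P = εσAT⁴, and that equation sizes the radiators. A rough, single-sided sizing sketch with illustrative temperature and emissivity values:

```python
# Radiator sizing in orbit: the only heat path is radiation, P = eps*sigma*A*T^4.
SIGMA = 5.67e-8     # W/(m^2*K^4), Stefan-Boltzmann constant
EMISSIVITY = 0.9    # illustrative high-emissivity radiator coating
T_RADIATOR = 350    # K (~77 C), illustrative coolant-loop temperature

flux = EMISSIVITY * SIGMA * T_RADIATOR ** 4   # ~766 W/m^2 per radiating side
area_m2 = 120_000 / flux
print(f"radiator area: ~{area_m2:.0f} m^2 per 120 kW rack (single-sided)")
# Solar input and the ~3 K sky background are ignored here; real designs
# radiate from both faces and must also reject absorbed sunlight.
```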

AI-Native DevOps & ModelOps

Traditional DevOps tools can’t handle the complexity of AI models. The next wave of infrastructure is AI-native orchestration: platforms that combine deployment, cost optimization, and real-time monitoring for large models.

Companies like Modal, OctoML, and Baseten are leading this shift, merging DevOps and ModelOps into a unified stack that allows engineers to deploy foundation models with minimal latency and maximum observability.
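As a flavor of what “maximum observability” means in practice, here is a minimal, vendor-neutral Python sketch that wraps a model call and reports latency percentiles, the kind of signal these platforms surface automatically. The generate function and its 50 ms forward pass are hypothetical stand-ins, not any platform’s real API.

```python
import time
import statistics
from functools import wraps

latencies_ms: list[float] = []

def observed(fn):
    """Record wall-clock latency for every call to the wrapped function."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            latencies_ms.append((time.perf_counter() - start) * 1000)
    return wrapper

@observed
def generate(prompt: str) -> str:
    time.sleep(0.05)  # hypothetical stand-in for a model forward pass
    return f"completion for {prompt!r}"

for i in range(20):
    generate(f"request {i}")

print(f"p50 = {statistics.median(latencies_ms):.1f} ms, "
      f"p99 = {statistics.quantiles(latencies_ms, n=100)[98]:.1f} ms")
```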

Expect rapid consolidation as infrastructure players merge compute, orchestration, and monitoring into end-to-end AI infra suites.

Verticalized AI Infrastructure

Just as cloud software became verticalized by industry, AI infrastructure is following suit, with tailored stacks for healthcare, finance, defense, and beyond:

  • Healthcare: HIPAA-compliant AI for diagnostics and imaging.
  • Security: Edge-based inference for real-time threat detection.
  • Enterprise: Private model hosting with audit trails for regulated industries.

Vertical AI stacks offer defensibility and high margins: the new “application layer” for infra founders.


Custom Silicon on the Rise

While Nvidia dominates today’s landscape, the next decade belongs to custom silicon: chips built for LLM inference, training, and edge optimization.

Startups like Tenstorrent, Mythic, Etched.ai, and Zettascale (formerly Exa) are designing next-generation AI accelerators optimized for extreme efficiency and tailored compute workloads. Zettascale’s focus on modular chiplet architectures and ultra-dense interconnects aims to push performance beyond the limits of monolithic GPUs, reducing both latency and power draw.

Meanwhile, Big Tech giants such as Google (TPU), Amazon (Trainium and Inferentia), and Meta (MTIA) are doubling down on in-house silicon to cut inference costs and secure their supply chains.

The chip layer will be one of the fiercest battlegrounds of the AI economy. From Zettascale’s specialized architectures to hyperscaler-built processors, custom AI chips will define efficiency, cost, and access to compute: the foundation of long-term defensibility in the intelligence era.

Compute Supply & Demand


Global data center demand: 60 GW (2023) → 171–219 GW by 2030 (+19–22% CAGR).
High case: up to 298 GW (+27% CAGR).
AI-ready centers: ~70% of all capacity by 2030.
Generative AI: ~40% of total global load.

AI workloads are the primary growth driver. Global compute demand will roughly triple by the decade’s end.
 
Sources: McKinsey Data Center Demand Model, Exhibits 1–2
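A quick sanity check of the headline numbers, assuming the 2023 base of 60 GW quoted above:

```python
# Growth multiples implied by the demand range above.
base_2023_gw = 60
low_2030_gw, high_2030_gw = 171, 219

low_x = low_2030_gw / base_2023_gw    # 2.85x
high_x = high_2030_gw / base_2023_gw  # 3.65x
print(f"2030 demand = {low_x:.2f}x to {high_x:.2f}x the 2023 base")
# Roughly a tripling of global demand, consistent with the takeaway above.
```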
 


U.S. deficit risk: >15 GW by 2030.
Vacancy rates: <1% in Northern Virginia (world’s largest data hub).
Colocation pricing: +35% (2020–2023).
Constraints: grid power, transformers, labor shortages.
Policy limits: Ireland halted new grid connections until 2028.

The race to expand capacity is constrained by power and infrastructure, not capital.

Sources: CBRE — North America Data Center Trends H2 2023; JLL — Data Centers 2024 Global Outlook; EirGrid — Ireland Capacity Outlook 2022–2031; CBRE — Global Data Center Trends 2024

Hyperscaler Control

Hyperscalers: AWS, Azure, Google, Baidu control >50% of AI-ready capacity.
By 2030: 60–65% of AI workloads hosted on hyperscalers; 35–40% private.
GPU Clouds: CoreWeave → 45k GPUs, 28 global sites (2024).
Partnerships: hyperscalers + colos expanding in tandem.

The cloud giants will remain the backbone of global AI compute.

Sources: McKinsey Exhibit 3; Data Center Frontier — CoreWeave, Chirisa Tap Bloom Energy for Illinois AI Data Center Project (Jul 2024)

Infrastructure Shift

Facility scale: 30 MW (2013) → 200 MW+ (2024).
Rack density: 8 kW (2021) → 17 kW (2023) → 30 kW+ (2027).
Training workloads: up to 120 kW/rack (e.g., GPT-scale models).
Cooling evolution: RDHX 40–60 kW; Direct-to-Chip 60–120 kW; Liquid immersion 100–150+ kW (PFAS concerns).
Electrical redesign: 12V → 48V systems (25% efficiency gain), larger UPS, modular units.
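The 12V → 48V move is Ohm’s-law arithmetic: at fixed power, quadrupling the voltage cuts current by 4x and conduction loss (I²R) by 16x. A sketch with illustrative numbers; the 25% figure above is a system-level gain that also depends on conversion stages:

```python
# Why 48V distribution wins: same power, lower current, much lower I^2*R loss.
POWER_W = 12_000        # illustrative: one shelf of accelerators
BUS_RESISTANCE = 0.002  # ohms, illustrative busbar/cable resistance

for volts in (12, 48):
    amps = POWER_W / volts
    loss_w = amps ** 2 * BUS_RESISTANCE
    print(f"{volts:>2} V bus: {amps:6.0f} A, conduction loss {loss_w:7.1f} W")

# 12 V: 1000 A and 2000 W lost; 48 V: 250 A and 125 W lost, a 16x cut
# in distribution loss before counting converter efficiencies.
```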

AI compute density is rewriting mechanical and electrical engineering standards.

Sources: Electronics Cooling — Will PFAS Be the Death of Two-Phase Cooling? (Jun 2024); Power Electronics News — 48V: The New Standard for High-Density, Power-Efficient Data Centers (Aug 2016)

Location Trends

Old hubs: Northern Virginia, Santa Clara = grid strain.
New builds: Indiana, Iowa, Wyoming (cheap power + land).
Nuclear colocations: Amazon + Talen Energy (2024).
Future trend: on-site microgrids, fuel cells, SMRs, and space-based platforms.
Emerging frontier: StarCloud is exploring orbital data centers that harness solar power directly and shed heat by radiation, while Microsoft’s Project Natick has tested sealed undersea data centers cooled by the surrounding ocean.

Power availability, not proximity, defines tomorrow’s data map.

Sources: McKinsey Exhibit 4; Nuclear Newswire — Amazon Buys Nuclear-Powered Data Center from Talen (Mar 2024); Microsoft Research — Project Natick Underwater & Space Edge Computing

Investment Wave

$1T+ needed across the global ecosystem by 2030.
$250B+ for cooling, power, and modular systems.
Major moves:

  • Blackstone × Digital Realty: $7B JV (Frankfurt, Paris, Virginia)
  • Supermicro expansion: U.S. & Taiwan (Apr 2019)
  • HCLTech × Schneider Electric: APAC sustainability builds (Jul 2023)

Every layer of the data center stack (construction, power, and hardware) is becoming an investable frontier.

Sources: Blackstone (2023), HCLTech (2023), McKinsey (2024)
 

Why This Matters for Founders & LPs

Founders: Build the Rails, Not Just the Models

The next wave of AI giants won’t just build models; they’ll build the infrastructure that models run on.
From custom silicon to cooling, orchestration, and energy systems, this is where the hardest problems and highest margins live.

Founders who solve for latency, power density, and inference efficiency will own the foundation every future product depends on.

Infrastructure startups may be capital-intensive, but they’re deeply defensible. Once you’re embedded in an enterprise’s AI stack, you don’t get ripped out.

This is your chance to own the rails of the intelligence economy, where switching costs are massive and scale compounds advantage.

LPs: The New Industrial Base of Intelligence

For investors, AI infrastructure represents the new oil, refineries, and highways of the digital age.
It’s not glamorous; it’s foundational. These are assets that generate predictable, compounding revenue streams while underpinning trillion-dollar ecosystems.

Over $1 trillion will be needed by 2030 to expand global compute, cooling, and power capacity.
Funds that move early into data center construction, clean-energy partnerships, and silicon startups will capture the hardest-to-replicate moats of the next decade.

Like railroads in the 1800s and telecom in the 1990s, the infrastructure being built today will determine who owns the future of AI.

Every model, every LLM, every humanoid robot depends on this backbone of electricity, heat, and steel.

Final Takeaway

AI’s bottleneck is no longer algorithms; it’s infrastructure.
The next trillion-dollar winners will be those who scale compute fastest, cleanest, and smartest.
Whether you’re a founder building the rails or an LP financing them, this is the moment to stake your claim in the infrastructure of intelligence.

AI Infra Company Spotlights

Chips & Compute
  • Nvidia (GPU acceleration): Dominates >80% of AI training compute; CUDA ecosystem moat
  • Zettascale, formerly Exa (modular chiplet AI processors): Space-efficient architecture targeting zettascale compute with lower latency and power draw
  • Tenstorrent (AI inference and training silicon): Founded by chip veteran Jim Keller; focus on scalability and open hardware
  • Etched.ai (LLM-specific ASICs): Fixed-function accelerators for transformer workloads with 10x speedup claims
  • Mythic (analog compute chips): Ultra-efficient edge AI for vision and robotics

Inference & ModelOps
  • Groq (Language Processing Units): Real-time token streaming for low-latency inference
  • Baseten (model deployment & orchestration): Simplified serverless platform for production LLMs
  • Modal (cloud-native inference serving): Developer-friendly compute orchestration platform
  • OctoML (cost optimization for inference): Model compilation & runtime acceleration for multi-cloud environments

Cooling & Power Systems
  • ZutaCore (liquid & two-phase cooling): Direct-on-chip liquid cooling with 50%+ energy savings
  • Submer (immersion cooling systems): Used by hyperscalers for dense compute racks
  • Schneider Electric (power distribution & efficiency): Partnering with HCLTech and hyperscalers on sustainable builds

Data Centers & Power Infrastructure
  • CoreWeave (GPU cloud infrastructure): 45,000+ GPUs across 28 global sites; low-latency AI hosting
  • Digital Realty × Blackstone JV (hyperscale data center expansion): $7B joint venture in Frankfurt, Paris, Virginia
  • Brookfield Renewable (clean energy data center power): Partnering with hyperscalers for sustainable AI capacity
  • StarCloud (space-based data centers): Using orbital solar power and the vacuum of space for natural cooling and near-zero energy waste

Vertical Infra Platforms
  • Lambda Labs (AI training & cloud): GPU cloud with custom hardware for AI researchers
  • Voltage Park (decentralized AI compute leasing): Backed by the Ethereum Foundation; renting compute for LLM training
  • Cerebras Systems (wafer-scale AI chips & clusters): Massive single-chip compute optimized for AI workloads

