🔮 Will genAI cause a compute crunch?
Last year, Google reached a milestone where its spending on compute exceeded its spending on people. This is a watershed moment.
For millennia, until the 1600s, human and animal sweat did most of the work in our economies: muscle power was, in Vaclav Smil’s terms, the “prime mover”. Today machines have largely taken over that role.
Of course, Google’s cross-over point doesn’t mean human mental efforts have been supplanted by machines. But it does signify the shape the future could take: increasingly high spends on software that relies on computing power. The sweat is now on the computers’ brow.
This is a fundamental shift. And some companies understand it far better than others. If you ask Sam Altman or Satya Nadella or Sundar Pichai what they would do with a thousand or a million times more compute, they will know the answer. That scale is a strategic resource for them. But to what extent is that true for the bosses of the large firms that make up the bulk of the economy? To what extent is it true for you and your organisations?
Several months ago, I took this question to my friend François Candelon, at the time Global Director of the BCG Henderson Institute (now a partner at the private equity firm Seven2). We agreed that we were both hearing concerns from senior leaders about whether compute would remain affordable, especially in the face of growing demands from AI. That led us to launch this piece of research.
So for the past few months, a team from the BCG Henderson Institute (Riccarda Joas and David Zuluaga Martínez) and Exponential View’s Nathan Warren have been working on this question: to what extent will the boom in generative AI impact the availability of affordable computing power?
We built a model with bullish global demand projections and realistic supply constraints. What we found is that the fears are not grounded in reality. Instead of being anxious about a lack of computing power, executives should be gearing up for an abundance of compute. Like Sundar, Sam and Satya, they should all have a clear-sighted view of how their business will change as computation becomes much more widely available.
In today’s Part 1 we will make the argument for why there will be enough compute. Next week, in Part 2, we will propose a framework for how to leverage abundant computing power.
What follows in today’s email is an excerpt of the paper for readers of Exponential View. You can access the full paper here.
Many thanks to François Candelon, Riccarda Joas and David Zuluaga Martínez for collaborating on this research with us!
Introduction
The race for AI supremacy has turned into an arms race for computing power. As generative AI models balloon in complexity, demand for specialised hardware has outstripped the pace of Moore’s Law, threatening the digital economy’s foundations. Tech titans are plotting grand schemes: Microsoft and OpenAI’s reported $100bn supercomputer project leads the pack—if realised. Not to be outdone, Elon Musk unveiled Colossus, boasting 100,000 Nvidia processors (with plans to double), while Oracle flexed its muscle with a zettascale cluster sporting 131,072 Blackwell GPUs. Google’s 2023 pivot to prioritise spending on computing infrastructure over personnel underscores this compute-intensive AI era. The question remains: can supply keep pace with insatiable demand?
We argue that a nuanced understanding of genAI’s computational demands—specifically, the distinctions between model training and model inference—reveals a less dire outlook. Even under aggressive assumptions about genAI’s growth and compute intensity, our quantitative model indicates that genAI workloads will account for only about 34% of global data centre-based AI computing supply by 2028. Thus, the rise of genAI is unlikely to disrupt the long-standing regime of affordable and widely available computing power.
While other factors, such as the energy required to power data centres, could pose significant constraints, genAI’s computational demands alone are unlikely to outpace the world’s capacity to produce the necessary hardware.
The historical development of computing power
For the past five decades, concerns about computing power supply have been minimal, thanks to two synergistic factors: Moore’s Law and large-scale digitisation. Since around 1970, the number of transistors per chip has approximately doubled every two years, allowing for exponentially more computations per chip. Simultaneously, computing hardware has proliferated globally through data centres, personal computers, smartphones, and a myriad of devices, resulting in an estimated 60% compound annual growth rate in total computing supply since the 1970s.
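To make those rates concrete, here is a quick back-of-the-envelope sketch in Python. It is purely illustrative; the only inputs are the two-year doubling and the ~60% compound annual growth rate quoted above.

```python
# Back-of-the-envelope arithmetic for the growth rates described above.
# Inputs: transistor counts doubling every two years, and ~60% compound
# annual growth in total computing supply since around 1970.

years = 2023 - 1970                 # roughly five decades

moore_annual = 2 ** (1 / 2)         # doubling every 2 years ≈ 1.41x, i.e. ~41% per year per chip
moore_total = 2 ** (years / 2)      # cumulative per-chip growth factor
supply_total = 1.60 ** years        # total supply at ~60% compound annual growth

print(f"Moore's Law annual factor:      ~{moore_annual:.2f}x per year")
print(f"Per-chip growth since 1970:     ~{moore_total:.1e}x")
print(f"Total supply growth since 1970: ~{supply_total:.1e}x")
```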
This abundance has fostered an environment where computing power is both affordable and readily accessible. The prevalence of inefficient or “bad code,” which a 2018 Stripe report estimated costs companies $85 billion annually, suggests that businesses have historically not faced significant computing supply constraints. Much like oil in the early 20th century, computing power has been valuable yet sufficiently plentiful to permit a degree of inefficiency without dire consequences.
Understanding genAI’s computational demands
The central concern is whether genAI will disrupt this equilibrium of ample, affordable computing power. Addressing this requires dissecting the different computational needs associated with genAI, which can be broadly categorised into three types: model training, fine-tuning, and inference.
Training a frontier model is a one-off, highly concentrated investment in cutting-edge chips, and fine-tuning an existing model costs a small fraction of that. Inference—the ongoing work of answering queries—is where aggregate usage piles up, yet it relies less on specialised hardware. This suggests that fears of an impending computing power scarcity may be overstated. The pivotal question becomes: how extensive will genAI model inference demand be, and how computationally intensive will it become?
Modelling a bullish scenario for genAI inference demand
To explore this question, we constructed a quantitative model projecting a bullish scenario for genAI inference demand against a moderate supply forecast up to 2028. Our model aggregates demand from businesses, governments, and individual consumers, focusing on workloads that require AI chips in data centres. Supply is estimated from the availability of hardware that can handle AI workloads (e.g. GPUs); some inference can run on less specialised hardware, but we take the conservative approach of leaving that capacity out.
Our findings indicate that even under aggressive assumptions about the growth and intensity of genAI demand, the aggregate global demand for genAI model inference will reach only about 34% of the total available data centre computing power for AI by 2028. This suggests that the rise of genAI is unlikely to outstrip the global capacity for producing the necessary computing hardware.
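The full paper lays out the model in detail; structurally, though, the comparison reduces to aggregating projected inference demand across segments and dividing by projected AI data-centre supply. The sketch below shows that skeleton only; the field names are ours, not the paper’s, and no numbers are implied.

```python
from dataclasses import dataclass

# Structural sketch only: segment names and fields are illustrative,
# not the paper's actual model inputs.

@dataclass
class Scenario2028:
    business_demand: float    # genAI inference demand from businesses (FLOPs)
    government_demand: float  # genAI inference demand from governments (FLOPs)
    consumer_demand: float    # genAI inference demand from consumers (FLOPs)
    ai_dc_supply: float       # AI-capable data-centre compute available (FLOPs)

    def demand_share(self) -> float:
        """Share of AI data-centre supply absorbed by genAI inference."""
        total = self.business_demand + self.government_demand + self.consumer_demand
        return total / self.ai_dc_supply
```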
Key demand assumptions
For digital advertising—a significant potential driver of genAI demand—we independently model a scenario where all ads on Meta’s platforms by 2028 incorporate genAI-powered, personalised images and captions.
We further assume that the intensity of genAI use will grow at a rate comparable to historical mobile data traffic, approximately 60% annually. For individual consumers, we estimate their inference demand at about 15% of total business demand, aligning with ratios observed in services like Microsoft Office 365.
Most of the demand is driven by agentic workflows, which are at the upper end of what is feasible with today’s technology. The crucial point is that even with our very aggressive assumptions, there will be sufficient supply, and genAI inference—reaching ~2e30 FLOPs by 2028—would only take up ~34% of likely global supply.
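As a rough illustration of how these assumptions compound, the toy projection below grows a hypothetical 2023 business-and-government inference base at ~60% a year and adds a consumer share of ~15%. The base figure is a placeholder back-solved to land near the ~2e30 FLOPs headline; it is not taken from the paper.

```python
# Toy demand projection mirroring the stated assumptions.
# The 2023 base figure is a back-solved placeholder, NOT the paper's input.

base_2023 = 1.7e29        # hypothetical 2023 business + government inference demand, FLOPs
intensity_growth = 0.60   # ~60%/yr, mirroring historical mobile data traffic growth
consumer_ratio = 0.15     # consumer demand ≈ 15% of business demand

business_2028 = base_2023 * (1 + intensity_growth) ** 5
total_2028 = business_2028 * (1 + consumer_ratio)

print(f"Illustrative 2028 genAI inference demand: {total_2028:.1e} FLOPs")  # ≈ 2e30
```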
Supply projections
On the supply side, we adopt moderate growth projections to test the robustness of our conclusions. We estimate the current baseline computing power available for genAI inference based on the quantity of state-of-the-art GPUs in use. Nvidia, a leading GPU manufacturer, shipped approximately 3.8 million data centre GPUs in 2023—a 42% increase from 2022. Accounting for average utilisation rates and technological capabilities, we calculate a baseline supply of roughly 7e28 FLOPs for 2023.
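For intuition, that 2023 baseline can be roughly reconstructed as below. The per-GPU throughput (~1 PFLOP/s, roughly H100-class dense FP16) and the 60% utilisation figure are our illustrative assumptions, not necessarily those used in the paper.

```python
# Illustrative reconstruction of the ~7e28 FLOPs 2023 baseline.
# Per-GPU throughput and utilisation are assumptions; the calculation also
# counts only 2023 shipments, whereas the paper's installed-base treatment may differ.

gpus_shipped_2023 = 3.8e6          # Nvidia data-centre GPUs shipped in 2023
flops_per_gpu = 1e15               # ~1 PFLOP/s, roughly H100-class dense FP16 (assumed)
utilisation = 0.60                 # assumed average utilisation
seconds_per_year = 365 * 24 * 3600 # ≈ 3.15e7

annual_supply = gpus_shipped_2023 * flops_per_gpu * utilisation * seconds_per_year
print(f"Implied 2023 baseline: {annual_supply:.1e} FLOPs")  # ≈ 7e28
```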
Looking ahead, we reference industry analyses suggesting that AI computing power will increase by about 60 times by the end of 2025 compared to early 2023 levels. We anticipate this growth rate will moderate to around 60% annually through 2028, resulting in a supply of approximately 4e30 FLOPs—about 57 times the 2023 figure.
It’s important to note that supply could exceed these estimates due to intensified competition and new market entrants. Companies like AMD have introduced new chips designed for genAI workloads, and hyperscalers are developing specialised chips optimised for inference, such as Google’s TPUs and Microsoft’s Maia 100 chip. This diversification could enhance the total computing supply beyond our moderate projections.
Breaking points beyond computing hardware
Our analysis leads to the conclusion that even with exponential growth in genAI’s computational demands and aggressive adoption rates, the established regime of affordable and widely available computing power is unlikely to collapse due to genAI alone. However, several factors beyond chip production could disrupt this equilibrium, above all the energy required to power and cool data centres.
Encouragingly, advancements in energy-efficient hardware, such as specialised chips designed for inference with lower power consumption, are emerging. Data centres are also adopting innovative cooling systems and leveraging AI to optimise energy use. Additionally, on-site energy generation, including renewable sources, could alleviate pressure on energy grids.
Conclusion
Our exploration suggests that genAI’s rise will not, by itself, outpace the global capacity to produce the required computing hardware. The longstanding regime of affordable and accessible computing power is poised to continue, even under the most aggressive genAI demand scenarios. Therefore, businesses should not prepare for a scarcity of computing resources but should instead focus on leveraging the anticipated abundance of computing power to gain competitive advantages.