By Brad Casemore in AI — Aug 21, 2024

Exploring the Iron Law of Cloud-Era Datacenter Infrastructure

On more than one occasion in the recent past, I’ve discussed why I believe Nvidia’s impressive sprint to nosebleed heights on the stock market has set the stage for a descent to a more modest plateau. In the end, Nvidia’s inevitable decline in market valuation will result from its outsize success rather than from strategic or tactical missteps.

Despite the relentless hype about genAI, the technology does not represent a revolutionary epoch in the history of information technology. GenAI remains a potentially useful, productivity-enhancing technology, more for some industries and professions than for others, but it has been egregiously oversold as a miraculous panacea that will imminently remedy all the problems of society and industry.

In the longer term? Well, AI might eventually have broader and deeper applicability, especially if it evolves beyond rudimentary probabilistic models and gets closer to approximating the fever dream of reasoning and thinking machines posited by artificial general intelligence (AGI). That prospect, if it ever comes, is more than a decade out, discouraging some observers while comforting others. Unfortunately for the would-be purveyors of AGI, savvy investors and responsible enterprise IT departments generally aren’t deploying capital today for a speculative payoff that’s more than a decade away from coming to fruition. Far-sightedness has its limits, particularly when money is involved. You can only tell loan sharks so many times that you'll pay them tomorrow before a tire iron makes an unwelcome appearance.

In our immediate circumstances, we are still living in what we might call the Cloud Era, which began in about 2007 and will likely continue for another decade. In this context, genAI represents not a self-sustaining market in its own right, but more a booster rocket for accelerated cloud growth. Relatively new and pre-existing workloads were already migrating to the major clouds (AWS, Microsoft Azure, Google Cloud) at a brisk pace; genAI is another migratory class of application destined for the cloud. Yes, you might object and say that some genAI is running in enterprise on-premises environments, but the trend toward cloud deployments is well established and is set to predominate for years to come.

As it happens, I now have some third-party data to support my thesis, and while the uncharitable among us might accuse me of indulging confirmation bias, I think you’ll find that the preponderance of evidence supports the following verdict: Cloud still rules, and genAI is not a major new narrative in the history of information technology; instead, genAI is but a new chapter in the unfolding tale of cloud supremacy. All empires decline and fall, as history demonstrates, but, even in the accelerated blur of tech years, cloud empires are nowhere near their death throes.

GenAI Serves at the Pleasure of the Cloud

Let’s now consider some data, which come from two market-research firms. First, there is this assessment from Synergy Research, contained in a recent press release:

The number of large data centers operated by hyperscale companies recently passed the 1,000 milestone, and new data from Synergy Research Group shows that they now account for 41% of the worldwide capacity of all data centers. Just over half of that hyperscale capacity is now in own-built, owned data centers with the balance being in leased facilities. With non-hyperscale colocation capacity accounting for another 22% of capacity, that leaves on-premises data centers with just 37% of the total. This is in stark contrast to six years ago, when almost 60% of data center capacity was in on-premises facilities. Looking ahead to 2029, hyperscale operators will account for over 60% of all capacity, while on-premises will drop to just 20%. Over that period, the total capacity of all data centers will continue to rise rapidly, driven primarily by hyperscale capacity growing almost threefold over the next six years. While colocation share of total capacity will slowly decrease, colocation capacity will continue to rise steadily. On-premises share of the total will drop by almost three percentage points per year, though the actual capacity of on-premises data centers will remain relatively stable.
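As a quick sanity check, the figures in that release can be run through some back-of-envelope arithmetic. What follows is only a sketch: the 2029 shares are rounded bounds (Synergy says "over 60%" and "just 20%"), and treating on-premises capacity as exactly flat is my simplifying assumption, based on the release calling it "relatively stable."

```python
# Current shares of worldwide data center capacity, in percent (from Synergy).
shares_now = {"hyperscale": 41, "colocation": 22, "on_premises": 37}

# 2029 projections. Synergy says "over 60%" and "just 20%", so these are
# rounded bounds rather than exact figures.
hyperscale_2029 = 60
on_prem_2029 = 20
years = 6  # "over the next six years"

# The three current shares should account for all capacity.
total_now = sum(shares_now.values())

# Average annual decline in on-premises share, in percentage points.
on_prem_drop_per_year = (shares_now["on_premises"] - on_prem_2029) / years

# ASSUMPTION: on-premises capacity stays exactly flat in absolute terms.
# If so, total capacity grows by the ratio of the old share to the new share.
total_growth = shares_now["on_premises"] / on_prem_2029

# Implied growth multiple for absolute hyperscale capacity.
hyperscale_growth = hyperscale_2029 * total_growth / shares_now["hyperscale"]

print(f"shares sum to {total_now}%")
print(f"on-premises share falls {on_prem_drop_per_year:.2f} points per year")
print(f"total capacity grows {total_growth:.2f}x")
print(f"hyperscale capacity grows {hyperscale_growth:.2f}x")
```

Under those assumptions the numbers hang together: an on-premises decline of roughly 2.8 points a year and hyperscale growth of about 2.7x, in line with "almost three percentage points per year" and "almost threefold."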

When Synergy refers to “on-premises datacenters” (gratuitous aside: I spell “datacenters” as one word but Synergy does not. Eventually everybody will contract it into one word, but, alas, we’re not there yet), they mean on-premises enterprise datacenters. Aside from the expansion of cloud hegemony, with the rewards reaped by the Big Three of the cloud, a significant role will continue to be played by co-location and interconnection-oriented providers such as Equinix and Digital Realty, through edge environments and in emerging markets where cloud providers have yet to build their own regions and availability zones (or have yet to build a cogent business case for doing so).

Despite some regional differences, Synergy found the overall trend was globally consistent. John Dinsdale, Chief Analyst at Synergy Research Group, makes the following observation:

“In 2012 enterprises spent twelve times as much on their data center hardware and software as they did on cloud infrastructure services, while today they spend three times more on cloud services than they do on their own data center infrastructure,” said John Dinsdale, a Chief Analyst at Synergy Research Group. “Add to that the huge growth in SaaS and consumer-oriented digital services such as social networking, e-commerce and online gaming, and the result is the burgeoning growth in hyperscale data centers. Enterprises are also choosing to house an ever-growing proportion of their data center gear in colocation facilities, further reducing the need for on-premises data center capacity. The rise of generative AI technology and services will only exacerbate those trends over the next few years, as hyperscale operators are better positioned to run AI operations than most enterprises.”

I could cavil here and advise that either accentuate or intensify might have been a better word choice than exacerbate, which pejoratively suggests that these trends are undesirable, akin to a chronic skin condition. Still, I find it exceedingly difficult to take issue with the conclusion Dinsdale and the Synergy team have reached.

Another Brick in the Cloud Datacenter

When I was at IDC, not that long ago, we were seeing similar trends, and I’m reasonably confident that, if I had more spare time and searched more diligently, I could find similar data to Synergy’s from IDC. These are secular trends, and all the major market researchers are documenting, qualifying, and quantifying them in their own ways.

Now that we’ve considered Synergy’s findings, let’s turn our attention to data from Dell’Oro Group, which has revised upward its datacenter capex forecast to indicate a 24-percent compound annual growth rate (CAGR) by 2028, driven heavily by demand for AI-related datacenter infrastructure.

Quote from Dell’Oro:

“AI has the potential to generate more than a trillion dollars in AI-related infrastructure spending in cloud and enterprise data centers over the next five years,” said Baron Fung, Senior Research Director at Dell’Oro Group. “AI infrastructure, which includes servers with GPU or custom accelerators, along with dedicated networking, storage, and facilities, are highly capital-intensive. While the industry continues to assess the potential return on AI-related investments, major efforts have been underway in the ecosystem in achieving long-term sustainable capex growth,” explained Fung.

Dell’Oro also forecasts that the top four US-based cloud service providers (Amazon, Google, Meta, and Microsoft) will account for half of global datacenter capex as early as 2026.

Taken together, these data points, and probably many others besides, suggest the growing domination of the major cloud providers, with genAI providing a significant supporting role in consolidating the cloud imperium.

Nonetheless, even empires have problems. A problem for the cloud overlords is represented by the rising costs of datacenters and datacenter infrastructure associated with genAI. A lot of money and opportunity cost is allocated to genAI, and while the technology still has many true believers, it is no longer heresy to question whether genAI will fulfil the industry’s exalted expectations.

On recent post-earnings conference calls hosted by the cloud giants, we have witnessed the deepening consternation of Wall Street analysts and institutional investors, aghast at the rising tide of capex outlays in pursuit of genAI. Investors and market watchers would be less concerned if genAI services were proving indispensable to enterprise buyers and consumers, but the commercial reception and early results have been tentative. Meanwhile, as the capex spending on genAI infrastructure soars, CFOs reach for the Pepto-Bismol (or for other benumbing elixirs) before they head into meetings with the Wall Street cognoscenti. Investors are counting on genAI to deliver tangible returns on investment, not to mount an existential theatrical production called Waiting for genAI.

A significant portion of spending on genAI infrastructure is allocated to GPUs and AI accelerators. This is why and how Nvidia climbed to the top of the tech-stock charts. Consequently, the cloud giants consider that particular line item an area where they must contain or, ideally, reduce capital expenditures. Nvidia faces its own market pressures from Wall Street analysts and investors, which means it’s not inclined to help the hyperscalers with their dilemma, at least not to the degree that the hyperscalers require assistance. Besides, GPUs and AI accelerators have now become integral genAI infrastructure for hyperscale cloud providers, and that means they now want to have such technology customized and tailored to accord with the business and operational requirements of their datacenters, which are the engines of their business empire.

Succeed at Your Peril

This dynamic creates an interesting paradox, one which we might even be able to develop into an iron law of cloud infrastructure: A third-party infrastructure supplier, such as Nvidia, can achieve exceptional revenue growth and enviable profitability selling a relatively unique form of IT infrastructure to the cloud giants, but the relationship is inherently volatile and unsustainable. As soon as the infrastructure in question becomes essential and material to the business growth and long-term competitive success of the hyperscale cloud customer, the latter seeks a means of reducing or eliminating its reliance on the technology supplier by designing the technology in question in-house, thus achieving control over the availability, reliability, suitability, and optimization of the technology in question.

The Nvidias of the world can win big money at the cloud era’s picks-and-shovels concession, but not indefinitely. Once success is meaningfully attained at scale, the stage is set for an inevitable rollback. The cloud giants have a track record along these lines already, as evidenced by the in-house design and sourcing strategies they’ve set for other types of datacenter infrastructure, including servers and datacenter switches.

I believe this pattern will repeat itself for the next several years with other emerging forms of IT infrastructure that gradually become more valuable to hyperscalers. For a time, the hyperscalers will be content to benefit from the creativity, innovation, and R&D initiatives of third-party suppliers, deploying third-party vendors’ products in areas that are relatively nascent revenue generators. In time, however, as the revenues increase in the businesses predicated on the newer types of infrastructure, the calculus will shift toward design/build rather than buy. We can expect such a scenario to recur across technologies and markets within cloud domains for the next several years.

What form will these technologies take? I’m not omniscient, nor do I prophesy in the ambiguous style of Nostradamus, but I’d offer that cooling technologies, off-grid energy technologies, and other forms of specialized processing (beyond the GPU) are reasonable candidates. In the grand old world of networking, I expect more networking silicon to be designed by cloud giants rather than sourced from third parties. Timelines for each potentiality depend on business exigencies and the concomitant ordering of priorities.

What’s important to remember is that a vendor’s loss of cloud patronage should not be viewed as a stigma on their character or a dishonor to their reputation. If you supply a technology that becomes critical to the delivery of a lucrative cloud service or product, you should expect that the cloud giant will ultimately reduce its dependence on your technology, eventually reaching a point where it replaces that technology wholesale. Success, not failure, leads a vendor to such a fateful paradox. Put another way, a downfall only happens after great heights have been scaled.

It’s not personal. It’s just business during the cloud era.
