Generative AI is exploding, and demand shows no signs of slowing. Both on-premises vendors and public cloud providers have seen a boom in demand for their AI wares, a surge likely to continue for at least the next five years.

The cloud is, of course, a core beneficiary of companies' interest in AI. However, this growth spurt may not continue as many believe. CIOs and CFOs are complaining loudly and often about unexpectedly high cloud expenses, often about 2.5 times more than they anticipated. With cloud AI on the horizon, they worry about even larger, more unpredictable cloud bills in the future. Everyone wants cost-effective alternatives.

The real opportunity in AI may not lie in public clouds, at least not as they are currently positioned. Despite laying the groundwork and touting their AI readiness, public cloud providers risk losing the market they helped create if they remain tone-deaf to their customers' concerns and to shifts in the marketplace.

The high costs of AI systems

AI workloads are expensive. This is especially true of workloads that involve large language models and other compute-intensive systems. Training a single advanced AI model can cost tens of millions of dollars, with ongoing costs for fine-tuning, retraining, and inferencing. Public cloud providers have the massive infrastructure to handle these tasks but at an increasingly unsustainable price for many enterprises.

Alastair Edwards, chief analyst at Canalys, highlights the dilemma organizations are facing. As enterprises move past the experimental and training phases of AI adoption into production-scale inferencing, the financial costs start to outweigh the benefits.

Cloud computing offers well-known economic benefits, including pay-as-you-go pricing and on-demand elasticity. As AI use cases grow and scale across an organization, however, those economics quickly lose their luster when companies face around-the-clock usage of hundreds or thousands of GPUs or other resources needed for AI. It's not that companies don't see the benefit of using public cloud providers; it's the growing disparity between costs and benefits.

Further compounding the issue, energy costs are rising worldwide at the same time AI systems demand ever-increasing power for training, cooling, and deployment. A report from IDC suggests that corporate spending on compute and storage hardware for AI deployments grew by 37% in the first half of 2024. Notably, a growing portion of that spending is being redirected outside of public cloud providers. Public clouds are still capturing the lion's share of early-stage AI investments; IDC estimates that AI-enabled systems in cloud and shared environments accounted for 65% of server spend on AI during the first half of 2024. However, as enterprises transition to deploying AI at scale, most find that the economics of sticking with hyperscalers don't work.

The rise of colocation and microclouds

A new ecosystem of AI infrastructure providers has emerged to fill the growing gaps in cost competitiveness that public clouds are leaving behind. Colocation services, GPU-as-a-service specialists, and hybrid cloud providers offer enterprises an attractive middle ground. These alternatives allow businesses to maintain better control over their AI workloads while sidestepping the runaway expenses of running these systems exclusively on public clouds.

CoreWeave and Foundry, two upstarts in the GPU-as-a-service market, have invested heavily in GPU capacity and offer pay-as-you-go models that rival those of the hyperscalers. Even legacy players like Rackspace are getting in on the action by launching their own GPU-as-a-service offerings, while colocation providers are also seeing renewed interest.

Unlike traditional public clouds, these offerings are often built from the ground up to handle the unique demands of modern AI infrastructure: high-density GPU configurations, liquid cooling systems, and energy-efficient designs. More importantly, they allow enterprises to shift to ownership or shared-resource models that cut costs over the long term.
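The ownership argument comes down to simple break-even arithmetic. A rough sketch is below; every number in it (hourly rental rate, hardware price, colocation operating cost) is an illustrative assumption, not a vendor quote, but the shape of the calculation is what matters.

```python
# Hypothetical break-even sketch: on-demand GPU rental vs. owned/colocated
# hardware. All prices are illustrative assumptions, not real quotes.

HOURLY_RATE = 4.00        # assumed on-demand price per GPU-hour
HARDWARE_COST = 30_000.0  # assumed purchase price per GPU
MONTHLY_OPEX = 300.0      # assumed colo power/space/ops per GPU per month

def breakeven_months(utilization: float) -> float:
    """Months until owning beats renting, at a given utilization (0-1)."""
    hours_per_month = 730 * utilization          # ~730 hours in a month
    rental_per_month = HOURLY_RATE * hours_per_month
    savings_per_month = rental_per_month - MONTHLY_OPEX
    if savings_per_month <= 0:
        return float("inf")  # renting stays cheaper at this utilization
    return HARDWARE_COST / savings_per_month

for u in (0.10, 0.50, 1.00):
    print(f"{u:.0%} utilization -> break-even in {breakeven_months(u):.1f} months")
```

The point the sketch illustrates: at the low, bursty utilization typical of experimentation, renting never stops being cheaper, but at the around-the-clock utilization of production inferencing, owned or colocated hardware can pay for itself within a year under these assumptions.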

Betting on the wrong business model

Public cloud providers are positioning themselves as the natural home for building and deploying AI workloads. Naturally, the focus at AWS re:Invent 2024 was again on generative AI and how the AWS cloud supports generative AI solutions. Early-stage AI experimentation and pilots have driven a short-term spike in cloud revenue as organizations flock to hyperscalers to train complex models and rapidly test new use cases.

Training AI models on public cloud infrastructure is one thing; deploying those systems at scale is another. By betting on AI, public cloud vendors are relying heavily on consumption-based pricing models. Yes, it's easy to spin up resources in the cloud, but the cracks in this model are becoming harder to ignore. As companies shift from experimentation to production, consumption pricing simply doesn't translate into cost efficiency for long-running, GPU-heavy AI workloads.

Ironically, cloud providers—who helped create today’s AI gold rush—are in danger of pricing themselves out of their market. The very users they worked so hard to attract are finding that colocation services, GPU-as-a-service providers (microclouds), and other hybrid infrastructure models offer a more sustainable balance between cost, control, and flexibility. If public cloud vendors don’t adjust their business models to address these concerns, they risk being sidelined by players more attuned to AI’s unique demands and economics at scale.

Most days I assume they see this coming, but then I wonder. Public cloud vendors failed to notice other seismic shifts in the market in a timely manner, such as multicloud, finops, and now AI optimization. It’s easy to say, “You can’t turn an ocean liner on a dime,” but is “full speed ahead” the right strategy when you’re about to hit the dock?