It's common practice in the IT industry to use the word “lean” to describe processes that need to be more efficient and cost-effective. Generative AI is no exception. In case you haven't noticed, the systems some enterprises want to run cost millions of dollars to operate and draw enormous amounts of power from the grid. No wonder many enterprises are asking AI architects to provide a more efficient, or lean, solution.
Naturally, enterprises look to public cloud providers to help them fast-track into generative AI. After all, public clouds offer complete ecosystems at the press of a dashboard button, and the large cloud providers have indeed seen revenue climb from this initial wave of AI spending. However, countless enterprises have found that the cloud can carry higher operating costs than traditional systems in their own data centers. Despite this, the focus remains on the cloud, so companies are exploring ways to spend their cloud dollars more effectively. This is where the concept of lean AI comes into play.
How does lean AI work?
Lean AI is a strategic approach to artificial intelligence that emphasizes efficiency, cost-effectiveness, and minimal resource consumption while delivering maximum business value. Many lean AI methods are borrowed from lean methodologies initially used in manufacturing and product development.
Lean AI focuses on optimizing the development, deployment, and operation of AI systems. It employs smaller models, iterative development practices, and resource-efficient techniques to reduce waste. By prioritizing agile, data-driven decision-making and continuous improvement, lean AI allows enterprises to harness the power of AI in a sustainable and scalable manner. This ensures that AI initiatives are both impactful and economically feasible.
Today, enterprises are realizing that bigger is not necessarily better. The changing landscape of enterprise AI is marked by small language models (SLMs) and a wave of open source advancements. This evolution is a direct response to the considerable costs and resource demands of generative AI systems built on large language models (LLMs). Many enterprises now want to reassess the balance between costs and business value.
The challenges with LLMs
Large language models such as OpenAI’s GPT-4 and Meta’s Llama have demonstrated extraordinary capabilities in understanding and generating human language. Yet these strengths come with challenges that have become increasingly difficult for enterprises to justify. The computational demands and corresponding cloud costs of these models are very high, straining budgets and limiting broader adoption. There’s also the issue of energy consumption, which imposes a financial burden and carries significant environmental implications.
Operational latency presents another hurdle, especially for applications that require real-time responsiveness. And let’s not overlook the complexity of managing and maintaining these behemoth models, which demand specialized expertise and infrastructure that isn’t readily available to all organizations.
Shifting to SLMs
This backdrop has accelerated the adoption of SLMs for generative AI deployments in both cloud and non-cloud environments, where they are increasingly viewed as practical alternatives. SLMs are designed to be significantly more efficient in their computational resource requirements and energy consumption, which means lower operational costs and a more attractive return on investment for AI initiatives. Faster training and deployment cycles also make SLMs appealing to enterprises that need agility and responsiveness in a fast-paced market.
Most enterprises will not run massive, general-purpose LLMs for every workload; suggesting they will is unrealistic. Instead, they will build more tactically focused AI systems to solve specific use cases, such as equipment maintenance, transportation logistics, and manufacturing optimization, areas where lean AI approaches can return immediate business value.
SLMs also sharpen customization. These models can be finely tuned for specific tasks and industry domains, yielding specialized applications that produce measurable business outcomes. Whether in customer support, financial analysis, or healthcare diagnostics, these leaner models prove their effectiveness.
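To make the customization point concrete, here is a minimal sketch of parameter-efficient fine-tuning with LoRA adapters, using Hugging Face’s transformers and peft libraries. The model name and hyperparameters are illustrative assumptions, not recommendations.

```python
# Minimal LoRA fine-tuning setup for a small open model.
# Assumptions: transformers and peft are installed; the model name
# (TinyLlama) and hyperparameters are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # an example small model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains small adapter matrices instead of the full network,
# which is what makes domain tuning affordable on a single GPU.
lora = LoraConfig(
    r=8,                    # adapter rank; larger = more capacity, more memory
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections (Llama-style)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of total weights
```

From there, a standard training loop over a domain dataset, such as support tickets or maintenance logs, yields a task-specialized model at a fraction of the cost of full fine-tuning.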
The open source advantage
The open source community has been a driving force behind the advancement and adoption of SLMs. Meta’s latest iteration, Llama 3.1, offers a range of sizes that deliver robust capabilities without excessive resource demands. Other models, such as Stanford’s Alpaca and Stability AI’s StableLM, demonstrate that the performance of smaller models can rival or even surpass that of their larger counterparts, especially in domain-specific applications.
Cloud platforms and tools from Hugging Face, IBM’s watsonx.ai, and others are making these models more accessible and reducing entry barriers for enterprises of all sizes. This democratization of AI capabilities is a game-changer. More organizations can incorporate advanced AI without relying on proprietary, often prohibitively expensive solutions.
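As a rough illustration of how low the barrier has become, a few lines of Python can download and run a small open model locally. The model name below is an assumption; any comparably sized instruction-tuned SLM would work the same way.

```python
# Running a small open model locally with Hugging Face transformers.
# The model name is illustrative; substitute any SLM your license allows.
from transformers import pipeline

generator = pipeline("text-generation", model="Qwen/Qwen2-0.5B-Instruct")
result = generator(
    "List three ways to cut cloud spend on AI workloads:",
    max_new_tokens=80,
)
print(result[0]["generated_text"])
```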
The enterprise pivot
From an enterprise perspective, the advantages of embracing SLMs are multifaceted. These models allow businesses to scale their AI deployments cost-effectively, an essential consideration for startups and midsize enterprises that need to maximize their technology investments. Enhanced agility becomes a tangible benefit as shorter deployment times and easier customization align AI capabilities more closely with evolving business needs.
Data privacy and sovereignty (perennial concerns in the enterprise world) are better addressed with SLMs hosted on-premises or within private clouds. This approach satisfies regulatory and compliance requirements while maintaining robust security. Additionally, the reduced energy consumption of SLMs supports corporate sustainability initiatives. That’s still important, right?
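Here is one hedged sketch of that pattern: an SLM served inside the private network behind an OpenAI-compatible endpoint (servers such as vLLM and Ollama expose one), with a standard client pointed at it so prompts and data never leave the premises. The host, port, and model name are assumptions.

```python
# Calling a self-hosted SLM through an OpenAI-compatible endpoint.
# Assumes a server such as vLLM or Ollama is already running on the
# private network; the URL and model name are illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # in-house server, not a public API
    api_key="not-needed",                 # local servers typically ignore this
)
response = client.chat.completions.create(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # whatever the server hosts
    messages=[{"role": "user", "content": "Summarize this incident report."}],
)
print(response.choices[0].message.content)
```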
The pivot to small language models, bolstered by open source innovation, reshapes how enterprises approach AI. By mitigating the cost and complexity of large generative AI systems, SLMs offer a viable, efficient, and customizable path forward. This shift enhances the business value of AI investments and supports sustainable and scalable growth. I believe that when it comes to sustainable and affordable enterprise AI, we will soon live in a small, small world.