Artificial intelligence (AI) is reshaping industries across the globe, offering the potential to drive efficiency, innovation, and enhanced customer experiences. Yet, for many enterprises, the path to successfully implementing in-house AI remains unclear. While generative AI services provided by major tech companies have captivated attention, enterprises are increasingly considering self-hosting AI solutions as a means to better control their data, customize models, and integrate AI more deeply into business processes. However, the journey toward in-house AI is fraught with challenges, from infrastructure needs to model selection and deployment strategies. This article explores the current state of in-house AI, informed by feedback from industry experts and enterprises, as well as the practical steps required to navigate this rapidly evolving landscape.
The Shift Toward Self-Hosting AI: Realism Over Hype
As AI adoption grows, many enterprises are taking a more pragmatic view of the technology. According to feedback from 292 enterprises, collected by industry experts, 164 said they believe the true benefits of AI will come from self-hosting rather than from public generative AI services. These organizations recognize that while public AI services such as ChatGPT offer impressive capabilities, they may not provide the level of control, customization, or security that businesses need. This shift toward self-hosted AI is not without its challenges, however: of the 164 enterprises in favor of self-hosting, only 105 felt they had a clear understanding of what such an undertaking would involve, and just 47 expressed confidence in their ability to implement it effectively.
One of the reasons enterprises are gravitating toward self-hosting is the realization that public AI services often fail to meet specific business needs. The promise of AI driving operational efficiencies and boosting profitability has been tempered by the reality that most AI services are designed for general-purpose use cases, and may not align with the unique demands of a business. As one CIO explained, “You can’t buy hardware in anticipation of your application needs. You have to start with what you want AI to do, and then ask what AI software is needed.” This shift in mindset underscores the need for enterprises to approach AI deployment from a strategic perspective, focusing on the business outcomes they hope to achieve, rather than getting caught up in the hype surrounding AI technology.
Despite the challenges, the demand for self-hosted AI is evident. Network vendors such as Cisco and Juniper, along with AI model providers, are actively promoting self-hosting solutions, eager to capitalize on the growing interest. For enterprises that are serious about building in-house AI capabilities, the key lies in addressing the infrastructure, data management, and model selection issues that currently complicate the path to implementation.
The Infrastructure Challenge: GPUs, Networks, and Cluster Management
Building a robust infrastructure for in-house AI is one of the most significant challenges enterprises face. Many companies mistakenly believe that AI deployments begin with purchasing GPU chips and data center equipment. However, enterprises with experience in AI hosting emphasize that hardware should not be the starting point; as the CIO quoted earlier advised, hardware purchases should follow, not anticipate, application needs. Instead, enterprises must first define their AI goals, identify the right AI software, and then plan the necessary infrastructure around those requirements.
When it comes to AI infrastructure, GPUs are often at the center of the conversation. Enterprises tend to think of Nvidia GPUs as the gold standard for AI processing, but in practice they typically buy servers with GPUs included from vendors such as Dell, HPE, and Supermicro. The number of GPUs deployed varies widely: some organizations commit to fewer than 100, while others invest in upwards of 600. Many enterprises report that their initial GPU investments were either too large or too small, leading to further adjustments as they fine-tune their AI hosting strategies. Ultimately, most enterprises expect to deploy between 200 and 400 GPUs, with only a small percentage planning to use more than 450.
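To see why sizing is so easy to get wrong, consider a rough capacity calculation. The Python sketch below estimates a GPU count from a target inference load; every figure in it is an illustrative assumption rather than a vendor benchmark, and real sizing would rest on measured throughput for the chosen model.

```python
import math

# Back-of-envelope GPU sizing for an inference cluster.
# Every figure here is an illustrative assumption, not a vendor benchmark.

def estimate_gpu_count(
    peak_requests_per_sec: float,    # expected peak inference load
    tokens_per_request: int,         # average tokens generated per request
    tokens_per_sec_per_gpu: float,   # measured throughput of the chosen model on one GPU
    target_utilization: float = 0.6, # headroom for bursts, failover, and maintenance
) -> int:
    """Return a rough count of GPUs needed to serve peak load."""
    required_tokens_per_sec = peak_requests_per_sec * tokens_per_request
    effective_per_gpu = tokens_per_sec_per_gpu * target_utilization
    return math.ceil(required_tokens_per_sec / effective_per_gpu)

# Hypothetical workload: 50 requests/s, 400 tokens each,
# 1,500 tokens/s per GPU, 60% target utilization.
print(estimate_gpu_count(50, 400, 1_500))  # -> 23
```

Small changes to any of these inputs move the answer by dozens of GPUs, which is exactly the adjustment cycle enterprises report going through.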
In addition to GPU selection, enterprises must make critical decisions about their networking infrastructure. One long-running debate is whether to use Ethernet or InfiniBand for AI workloads. That debate appears to have been resolved in favor of Ethernet, with most enterprises opting for high-speed Ethernet networks to support AI deployments. Specifically, enterprises recommend 800G Ethernet with Priority Flow Control (PFC) and Explicit Congestion Notification (ECN), which together provide the lossless behavior and congestion management that large-scale AI workloads demand. As a result, enterprises are increasingly designing AI clusters around separate, fast cluster networks, distinct from their general-purpose data center networks, to ensure optimal performance.
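Switch-side PFC and ECN configuration is vendor-specific and beyond the scope of this article, but as a small host-side illustration, a Linux server's ECN behavior can be inspected through the kernel's sysctl interface. The snippet below is a minimal sketch assuming a Linux host; it reads, rather than changes, the setting.

```python
# Inspect the host-side TCP ECN setting on a Linux server.
# Switch-side PFC/ECN configuration is vendor-specific and not shown here.
from pathlib import Path

ECN_SYSCTL = Path("/proc/sys/net/ipv4/tcp_ecn")  # Linux-only kernel interface

def ecn_mode() -> str:
    value = ECN_SYSCTL.read_text().strip()
    return {
        "0": "disabled",
        "1": "enabled for all connections",
        "2": "enabled when requested by the peer (default)",
    }.get(value, f"unknown ({value})")

if __name__ == "__main__":
    print(f"TCP ECN mode: {ecn_mode()}")
```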
The complexity of AI infrastructure extends beyond hardware and networking. Enterprises that plan to host multiple AI applications must also consider how to manage these applications across different clusters. In some cases, organizations may need to deploy multiple AI clusters to handle different workloads, such as customer support chatbots, financial analysis tools, and business intelligence applications. Ensuring that these models operate independently and securely within their respective clusters is critical, as mixing AI applications can lead to unintended consequences, such as a customer support chatbot inadvertently accessing sensitive financial data.
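One way to enforce that separation at the application layer is to give each workload its own dedicated inference endpoint and reject any request that crosses a boundary. The sketch below illustrates the idea; the application names and cluster URLs are hypothetical.

```python
# Sketch: one dedicated inference endpoint per AI application, so a
# workload can never reach another application's cluster or data.
# Application names and URLs are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class ClusterEndpoint:
    name: str
    base_url: str  # each cluster exposes its own isolated inference API

ENDPOINTS = {
    "support_chatbot":    ClusterEndpoint("cluster-a", "http://cluster-a.internal/v1"),
    "financial_analysis": ClusterEndpoint("cluster-b", "http://cluster-b.internal/v1"),
    "business_intel":     ClusterEndpoint("cluster-c", "http://cluster-c.internal/v1"),
}

def route(application: str) -> ClusterEndpoint:
    """Resolve an application to its dedicated cluster; reject anything else."""
    try:
        return ENDPOINTS[application]
    except KeyError:
        raise PermissionError(f"no cluster registered for {application!r}")
```

Keeping the routing table explicit makes the isolation auditable: there is no code path by which the support chatbot can query the finance cluster.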
Choosing the Right AI Model: LLMs vs. SLMs
Selecting the right AI model is another key consideration for enterprises embarking on self-hosted AI deployments. Many organizations initially assume they need to host large language models (LLMs) like those behind public AI services such as ChatGPT, but the reality is more nuanced. Approximately one-third of enterprises pursuing self-hosted AI still plan to host an LLM, while two-thirds now believe that open-source or specialized AI models are more suitable for their needs.
The rise of small language models (SLMs) offers a compelling alternative to LLMs. SLMs have far fewer parameters and are typically trained on specialized data for specific business use cases. This makes them a better fit for enterprises with targeted AI applications, such as presale marketing, post-sale customer support, or business analytics. Unlike LLMs, which demand massive amounts of training data and computational resources, SLMs are more efficient, reduce the risk of "hallucinations" (generating inaccurate or nonsensical responses), and produce more relevant results for specialized tasks.
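As an illustration of how lightweight SLM hosting can be, the sketch below loads a small open model with the Hugging Face transformers library. The model name is just one example of a small instruction-tuned model; an enterprise would substitute whatever its own evaluation selects.

```python
# Minimal sketch of serving a small, specialized model locally with the
# Hugging Face transformers library (requires `transformers` and `accelerate`).
# The model name is illustrative; substitute whatever your evaluation selects.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/Phi-3-mini-4k-instruct",  # an example small instruction-tuned model
    device_map="auto",                         # place the model on a GPU if one is available
)

prompt = "Summarize the top three drivers of last quarter's support-ticket volume:"
result = generator(prompt, max_new_tokens=120, do_sample=False)
print(result[0]["generated_text"])
```

A model of this size can run on a single modest GPU, which is precisely where the hosting-cost savings cited below come from.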
Enterprises that have adopted SLMs report positive outcomes. Fourteen organizations using specialized SLMs found that these models not only delivered better results but also saved money by reducing hosting costs. The success of SLMs in specific use cases suggests that enterprises should carefully evaluate their AI requirements before committing to large-scale LLMs. For many, the right approach may be to work with trusted AI vendors to identify models that are tailored to their industry and mission, rather than defaulting to the largest or most popular models.
The Role of AI in Business Analytics and Customer Support
In-house AI deployments are most commonly applied in two key areas: business analytics and customer support. Enterprises that prioritize AI for business analytics are often driven by the need to gain deeper insights into their operations, identify trends, and make data-driven decisions. IBM’s watsonx platform is a popular choice for enterprises in this space, offering robust AI tools for business intelligence and predictive analytics. Other enterprises are turning to Meta’s Llama, which has quickly gained traction as a leading AI model for analytics applications, surpassing competitors like BLOOM and Falcon.
Customer-facing AI chatbots are another common application for in-house AI, particularly in healthcare and other industries with high volumes of customer interactions. Enterprises implementing chatbots for customer support are increasingly drawn to SLMs, as these models offer more focused and accurate responses, reducing the likelihood of errors or irrelevant answers. For businesses that rely heavily on customer engagement, such as retail or telecommunications, AI chatbots provide a scalable way to handle routine inquiries, freeing human agents to focus on more complex issues.
These applications illustrate the growing role of AI in transforming how businesses operate, both in terms of improving internal processes and enhancing the customer experience. However, they also highlight the importance of selecting the right AI model and infrastructure to meet specific business needs. Enterprises that approach AI with a clear understanding of their objectives and a strategic plan for implementation are more likely to succeed in realizing the full benefits of the technology.
Testing, Iteration, and Continuous Improvement
The final piece of the AI puzzle for enterprises is the importance of testing and continuous improvement. AI deployments are not static—they require ongoing testing, refinement, and updates to ensure that models remain accurate and relevant as business conditions change. Enterprises that have successfully implemented AI emphasize the need for rigorous testing at every stage of the deployment process. From selecting models to configuring infrastructure, enterprises should take the time to assess their options, experiment with different setups, and validate performance before committing to a full-scale rollout.
In addition to initial testing, enterprises must also be prepared to update their AI models regularly. AI systems are only as good as the data they are trained on, and as business needs evolve, so too must the AI models that support them. Regular testing helps ensure that models are aligned with current business goals, comply with regulatory requirements, and continue to deliver value. As one enterprise leader put it, “AI isn’t out for your job, but like you, it needs refresher courses as things change.”
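One practical way to operationalize that advice is a golden-set regression check that runs after every model or data update. The sketch below assumes a generic query_model function standing in for whatever inference API the enterprise exposes; the prompts and expected answers are placeholders.

```python
# Golden-set regression check, run after every model or data update.
# `query_model` stands in for whatever inference API the enterprise exposes;
# the prompts and expected answers below are placeholders.
from typing import Callable

GOLDEN_SET = [
    ("What is your refund window?", "30 days"),
    ("Which plan includes priority support?", "Enterprise"),
]

def regression_pass_rate(query_model: Callable[[str], str]) -> float:
    """Fraction of golden answers still found in the model's responses."""
    hits = sum(
        expected.lower() in query_model(prompt).lower()
        for prompt, expected in GOLDEN_SET
    )
    return hits / len(GOLDEN_SET)

# Example gate in a deployment pipeline:
# assert regression_pass_rate(my_model_client) >= 0.95, "model regressed"
```

Gating deployments on a check like this turns "regular testing" from a good intention into an enforced step in the release process.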
In conclusion, enterprises looking to embrace in-house AI face a complex but promising landscape. While there are many challenges to overcome, including infrastructure planning, model selection, and ongoing testing, the potential benefits of self-hosted AI are significant. By taking a strategic approach and focusing on practical outcomes, enterprises can build AI systems that drive innovation, improve efficiency, and enhance their ability to compete in an increasingly AI-driven world. The key to success lies in careful planning, continuous testing, and a willingness to adapt as the technology—and the business environment—continues to evolve.