Language Models: Going from Large to Small. An Enterprise push for performance, security, deployment, and customizability
Present-day Enterprise AI Adoption
Organizations integrating Generative AI often take two initial approaches. One is simply offering an enterprise ChatGPT that can answer questions about their own proprietary data, which can provide an intuitive productivity boost. The other is integrating large language models (LLMs) into business software, such as function-specific chatbots, virtual agents, and content generators.
A common and practical way organizations can leverage LLMs in these applications is through a Retrieval Augmented Generation (RAG) architecture. This enables LLM-powered chatbots and agents to retrieve relevant proprietary information from internal databases (e.g. vector database or general enterprise search software) and tailor their responses using prompt engineering techniques.
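The retrieve-then-prompt flow described above can be sketched in a few lines. This is a minimal, illustrative example: the `embed`, `retrieve`, and `build_prompt` names are made up for this sketch, and the bag-of-words "embedding" stands in for the learned embedding model and vector database a production RAG system would use.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call a
    # learned embedding model and store vectors in a vector database.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Prompt engineering step: ground the model in retrieved context.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

The final prompt, with proprietary context inlined, is what gets sent to the LLM; the model never needs direct access to the internal database.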
Building Specialized AI Applications: A Healthcare Case Study
To illustrate the potential of advanced AI integration, let's consider the development of an AI medical scribe agent. This agent would perform charting and note-taking tasks, allowing healthcare providers to focus more on patient care. Building such a specialized AI application involves several key steps: grounding a capable language model in a medical knowledge base, retrieving patient-specific data in real time, and generating notes in the structured formats clinicians expect.
This approach demonstrates how AI can be tailored to specific industry needs, combining vast knowledge bases with real-time data retrieval and customized outputs.
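One way a scribe agent can deliver "customized outputs" is to constrain generation to a structured clinical format rather than free text. The sketch below uses the standard SOAP (Subjective, Objective, Assessment, Plan) note layout; the `SoapNote` class and its fields are illustrative, not a reference to any particular product.

```python
from dataclasses import dataclass

@dataclass
class SoapNote:
    # SOAP is a widely used clinical note format. In a real scribe
    # agent, these fields would be populated by the language model
    # from the visit transcript and retrieved patient records.
    subjective: str = ""
    objective: str = ""
    assessment: str = ""
    plan: str = ""

    def render(self) -> str:
        # Emit the note in the section order clinicians expect.
        return "\n".join(
            f"{name.upper()}: {value}" for name, value in vars(self).items()
        )
```

Validating the model's output against a schema like this, instead of accepting free-form text, makes downstream charting integrations far more reliable.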
On the topic of healthcare, last year we wrote about the potential for LLMs to impact various health and biotech areas such as patient consultation, precision medicine, and protein folding. We've seen much of that progress materialize since that article. Hospitals and primary care providers are adopting AI assistants across practice settings. Startups are aggregating patient prescription history, lab test results, and wearable device data to provide personalized health goals and advice. Many of these solutions have been implemented with a RAG architecture, which in turn raises challenges around model security, deployment, and fine-tuning.
Challenges in Implementing LLM Solutions
While RAG architectures offer significant advantages, they also present several challenges. One major issue is the computational overhead and latency introduced by the retrieval step. In RAG systems, searching through large databases to find relevant information can be time-consuming and resource-intensive, especially for real-time applications. Additionally, prompts may then be directed to an external third-party LLM for processing and response. This can lead to slower response times and increased costs, particularly when dealing with large-scale deployments or high-frequency queries.
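The retrieval overhead on repeated or high-frequency queries can be partially mitigated with caching. The sketch below is a simplified illustration, assuming a hypothetical `cached_retrieve` function where a `time.sleep` stands in for an expensive vector-database search; production systems would also cache embeddings and batch calls to the external LLM.

```python
import functools
import time

@functools.lru_cache(maxsize=1024)
def cached_retrieve(query: str) -> tuple[str, ...]:
    # Stand-in for an expensive vector-database lookup; the sleep
    # simulates retrieval latency. Results are memoized per query.
    time.sleep(0.05)
    return ("context for: " + query,)

t0 = time.perf_counter()
cached_retrieve("refund policy")   # cold call: hits the "database"
cold = time.perf_counter() - t0

t0 = time.perf_counter()
cached_retrieve("refund policy")   # warm call: served from the cache
warm = time.perf_counter() - t0
```

Caching helps most when query distributions are skewed toward a small set of frequent questions, which is common in function-specific chatbots.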
Another challenge is the potential for hallucinations when combining retrieved information with the language model's generated text. LLMs may produce responses that contradict or misinterpret the retrieved data, leading to unreliable outputs. This risk is unacceptable in industries like healthcare, law, or finance, where accuracy is critical.
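One common mitigation is to check generated answers against the retrieved context before returning them. The token-overlap heuristic below is a deliberately crude sketch (the `grounding_score` function and 0.6 threshold are illustrative assumptions); production systems typically use NLI-based fact-checking models or citation verification instead.

```python
def grounding_score(answer: str, context: str) -> float:
    # Fraction of answer tokens that also appear in the retrieved
    # context: a rough proxy for how grounded the answer is.
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

def is_grounded(answer: str, context: str, threshold: float = 0.6) -> bool:
    # Flag answers whose content diverges too far from the context,
    # so they can be blocked or routed to human review.
    return grounding_score(answer, context) >= threshold
```

In regulated settings, answers that fail the check would be withheld or escalated rather than shown to the user.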
Security concerns pose another significant hurdle, especially in highly regulated industries like healthcare. Hospitals and other healthcare providers cannot simply adopt off-the-shelf solutions like ChatGPT due to strict data privacy regulations and the sensitive nature of patient information. These organizations need to implement robust security measures and often develop custom solutions to mitigate potential security issues. We plan to cover these security challenges and solutions in depth in a future post.
Evolution Towards AI Agents and Small Language Models
As the industry grapples with these challenges, we're observing a shift towards more sophisticated AI agent architectures. Nvidia, with its NIM (Nvidia Inference Microservices) architecture, is driving this evolution with industry partnerships with companies like Hippocratic AI, which was presented during GTC 2024. Small language models (SLMs) will be a key component in enabling and improving these solutions.
Small Language Models offer several advantages that directly address the challenges faced by traditional LLM implementations: lower computational cost and latency, making real-time applications more practical; on-premises and edge deployment, keeping sensitive data in-house to satisfy strict privacy regulations; and easier fine-tuning, allowing models to be customized for domain-specific tasks.
The race to develop and release SLMs has intensified, with tech giants like Meta, OpenAI, Apple, and Microsoft rapidly iterating on their offerings—Llama 3.1 8B, GPT-4o mini, DCLM-7B, and Phi-3-mini, respectively. These companies promise improved performance, cost-efficiency, and customizability, which could reshape the AI landscape and make advanced AI capabilities more accessible to a broader range of organizations, applications, and edge devices.
Specialized Expertise and Industry Collaboration
Implementing SLM-based solutions requires technical knowledge and close collaboration with subject matter experts (SMEs). My experience at Palantir underscores this critical point.
At Palantir, working closely with client SMEs was essential to building valuable data-augmented applications. The same principle applies to AI development. AI engineers must engage intensively with domain experts at every level of the application stack, from data preparation to model fine-tuning to application design.
This collaboration ensures that the AI solution is not just technically sound but practically valuable and tailored to specific industry needs. It bridges the gap between technical capabilities and real-world application, much like I did at Palantir with data solutions.
The challenge lies in finding AI professionals who can effectively engage with SMEs and domain experts willing to dive deep into the AI development process. Organizations that foster this collaborative environment will be well-positioned to develop genuinely transformative AI solutions, presenting promising opportunities for investors in the AI landscape.
Our Continued Investment Focus
As investors in AI and data, we see a significant opportunity to focus on the infrastructure and tools that enable organizations to build, deploy, and manage SLMs effectively. This focus, in turn, can accelerate AI adoption in Good AI's investment verticals of healthcare, automation, and enterprise solutions.
At Good AI, we have a front-row seat to how our portfolio companies and industry partners are integrating AI into their workplaces and products. If you're a founder advancing the SLM or AI frontier, particularly in areas that support our focus verticals, please reach out. We'd love to hear from you.