Navigating the World of Open-Source Models
We have now spent more than a year living with generative AI tools, and the technology that felt like a black box at its introduction, for all the potential it offered, is now far better understood.
Enterprises spent 2023 experimenting with readily available toolsets to see how large language models (LLMs) could impact their business, both internally and externally. While tools developed by Big Tech players like OpenAI, Google, and Anthropic have dominated the market, we are now seeing a rebalancing: for certain applications, enterprises feel that flexibility and customization are the way to go.
Today, I will be exploring the world of open-source LLMs: their benefits and opportunities, and potential selection criteria for enterprises.
Breaking Free from Constraints
Open-source LLMs have gained considerable ground in the past few months. Mistral AI released Mixtral, its own competitive open-source LLM, in December. We announced our own partnership with them last week to embed Mistral AI's highly efficient foundation models into broader generative AI architectures.
Open-source models offer faster deployment, greater cost-efficiency, and unique value propositions. Hugging Face alone hosts over 350k models, 75k datasets, and 150k demo apps, all open source and publicly available on a platform where teams can easily collaborate and build machine learning (ML) models together. Open-source AI fuels rapid innovation through collaboration and shared code. But this very openness demands proactive attention to ethical issues like bias, fairness, and explainability to ensure responsible development.
When to opt for open source LLMs
Enterprise decisions on open source versus proprietary LLMs involve various factors. Both options offer benefits but come with shortcomings addressed by the alternative. Let's explore some key considerations.
Customisation
Open-source LLMs are far and away the favorable option if an enterprise wants to create LLMs for bespoke applications and use cases. The limited experiments we have run on our local machines have allowed us to download multiple models and modify their parameters to suit our needs.
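As a minimal sketch of what this local customization can look like, assuming the Hugging Face transformers library is installed (the model name and every parameter value below are illustrative, not recommendations):

```python
def build_generation_config(temperature: float = 0.7,
                            max_new_tokens: int = 128,
                            top_p: float = 0.9) -> dict:
    """Collect the sampling parameters we want to override locally."""
    return {
        "do_sample": True,
        "temperature": temperature,
        "max_new_tokens": max_new_tokens,
        "top_p": top_p,
    }

def generate(prompt: str,
             model_name: str = "mistralai/Mistral-7B-Instruct-v0.2"):
    # Heavy step: downloads the open model's weights from the
    # Hugging Face Hub on first use, then runs fully on local hardware.
    from transformers import pipeline
    generator = pipeline("text-generation", model=model_name)
    return generator(prompt, **build_generation_config())

# Usage (downloads several GB of weights on first call):
# print(generate("Summarize our returns policy in one sentence:"))
```

Because the weights are local, an enterprise can go further than parameter tweaks: the same checkpoint can be fine-tuned on proprietary data, something a closed API rarely permits.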
Cost
Open-source models are easily accessible out of the box. This is advantageous to enterprises that are now weighing costs (to set up, operate, and change) when integrating generative AI into their business offerings. Proprietary models come with licensing fees and other cost levers that can be detrimental, particularly for use cases where the desired results are very well defined.
Compliance
Open-source products are inherently crowd-developed and crowd-supported. This can create hurdles when it comes to regulatory compliance. With the breakneck speed at which data and AI regulations are being ratified, open-source models fall into a grey area that must be clarified before enterprises can readily adopt them for large-scale projects in the future.
Support
Post deployment support and improvement is a consideration that must be given thought along the same lines as compliance.
Open-source models benefit from a wide community devoted to improving them in the open market. But this does not necessarily guarantee regular support or continued compliance with regulations. To address this shortfall, enterprises will need to train internal resources on the models in question to reduce their reliance on community enthusiasts.
Liquid Neural Networks: A Paradigm Shift in Architecture for Language Models
Amidst the ongoing competition between open and closed models for market share and relevance, a notable paradigm shift is emerging in language models, moving beyond the transformer architecture.
Today’s Gen AI models are based on the transformer architecture. While we can build large AI models with it, explainability, hallucinations, and compute inefficiencies raise questions about its long-term viability. Liquid models (as the name suggests) use probabilistic synapses, another fascinating way of mimicking brain function, going beyond purely weighted operations.
Training that is 10-20x faster means 10-20x less demand for compute.
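Liquid models trace back to liquid time-constant (LTC) networks in the research literature, where a neuron's effective time constant depends on its input rather than being fixed. A rough conceptual sketch in NumPy, not Liquid AI's actual implementation (all weights, sizes, and constants here are toy values):

```python
import numpy as np

def ltc_step(x, I, W_in, W_rec, b, tau, A, dt=0.01):
    """One Euler step of a liquid time-constant (LTC) cell.

    The synaptic activation f also modulates the effective time
    constant, so the dynamics adapt to the input -- the 'liquid'
    behaviour that sets this family apart from fixed-weight layers.
    """
    z = I @ W_in + x @ W_rec + b
    f = 1.0 / (1.0 + np.exp(-z))          # synaptic activation in (0, 1)
    dx = -(1.0 / tau + f) * x + f * A     # input-dependent decay toward A
    return x + dt * dx

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 8
W_in = 0.5 * rng.normal(size=(n_in, n_hidden))
W_rec = 0.5 * rng.normal(size=(n_hidden, n_hidden))
b = np.zeros(n_hidden)
tau = np.ones(n_hidden)   # base time constants
A = np.ones(n_hidden)     # per-neuron equilibrium targets

# Drive the cell with a toy sinusoidal input signal.
x = np.zeros(n_hidden)
for t in range(100):
    I = np.sin(0.1 * t) * np.ones(n_in)
    x = ltc_step(x, I, W_in, W_rec, b, tau, A)
```

In real liquid networks the activation f is itself a trained network and the ODE is solved with more careful integrators; the point of the sketch is only the input-dependent time constant in the dx term.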
Excitingly, we recently announced a significant partnership with Liquid AI, enabling us to bring their advanced AI models to a broader audience and accelerate the global AI transformation.
To conclude, Small will be the new Big, as we said in our recent TechnoVision report.
Will Gen AI live up to the massive amount of hype it has generated? The short answer is yes. While current large language models will continue to thrive, there is also a growing need for smaller, more cost-efficient models. These models will keep shrinking to run on low-footprint installations with limited processing capabilities, including on the edge or on smaller enterprise architectures.
In 2024, new AI platforms will also increasingly battle hallucinations by combining generative AI models with high-quality information from knowledge graphs. In support of all this, platforms will arise that provide tools for enterprises to leverage generative AI without the need for deep internal technical expertise.