The Tricky Game of Monetising AI Models
The AI market is a crowded one, with giants like OpenAI, Anthropic, Cohere, Meta, and Google continuously refining their offerings.
However, the key question is: Which models will the developers choose to integrate into their application layers? Will it be GPT-4o, Claude 3.5 Sonnet, Command R+, Llama 3, or Gemini 1.5 Flash? And how long can AI startups and tech giants continue to impress developers by offering their AI models for dirt-cheap prices—or even free in some cases?
The struggle is real
“If you’re only selling models for the next little while, it’s gonna be a really tricky game,” said Cohere founder Aidan Gomez in a recent interview. By selling models, he meant selling API access to those AI models. OpenAI, Anthropic, Google, and Cohere offer this service to developers, and they all face a similar problem.
“It’s gonna be like a zero-margin business because there’s so much price dumping. People are giving away the models for free. It’ll still be a big business, it’ll still be a pretty high number because people need this tech — it’s growing very quickly — but the margins, at least now, are gonna be very tight,” he explained.
Interestingly, OpenAI made only about $510 million from API services, compared with $1.9 billion from ChatGPT subscriptions.
Gomez hinted that Cohere might offer more than just LLMs in the future. “I think the discourse in the market is probably right to point out that value is occurring both beneath, at the chip layer—because everyone is spending insane amounts of money on chips to build these models in the first place—and above, at the application layer.”
Selling it to big tech instead
AI startups that struggle to sell their models to customers may well end up selling their companies to big tech instead. As AIM previously highlighted, ‘in the end, all AI startups will be acquired’. That trend appears to be unfolding even as you read this.
Recently, Adept was acqui-hired by Amazon, Inflection by Microsoft, and Character.ai by Google. “There will be a culling of the space, and it’s already happening. It’s dangerous to make yourself a subsidiary of your cloud provider. It’s not good business,” said Gomez.
Sully Omar, co-founder and CEO of Cognosys, echoed similar sentiments: “It won’t be long until we see options like ‘login’ with OpenAI/Anthropic/Gemini. In the next 6-8 months, we’re likely to see products that use AI at a scale 100 times greater than today.”
He added that from a business standpoint, it doesn’t make sense to upsell customers on AI fees. “I’d rather charge based on the value provided,” he said.
Omar noted that the current system, which relies on API keys, is cumbersome for most users. “About 90% of users don’t understand how they work. It’s much easier for them to sign in to ChatGPT, pay for compute to OpenAI/Gemini, and then use our app or service at a lower price,” he explained.
He also criticised the credits-based pricing model, suggesting that it is ineffective as it requires constantly managing margins on top of LLM fees.
The end of APIs?
The rise of LLMs has ignited another debate: Will generative AI lead to more APIs or the end of APIs?
“The AI model market is mirroring the early days of cloud computing, where infrastructure (IaaS) was a low-margin game. As cloud providers realised, value creation shifted towards higher-margin services like SaaS and PaaS, layering specialised applications on top of core infrastructure,” said Pradeep Sanyal, AI and ML leader at Capgemini.
“AI startups must move beyond selling raw models to offering differentiated, application-focused solutions,” he explained.
OpenAI recently announced the launch of fine-tuning for GPT-4o, addressing a highly requested feature from developers. As part of the rollout, the company is offering 1 million training tokens per day for free to all organisations until September 23.
The cost for fine-tuning GPT-4o is set at $25 per million tokens. For inference, the charges are $3.75 per million input tokens and $15 per million output tokens. Additionally, GPT-4o mini fine-tuning is available to developers across all paid usage tiers.
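For developers trying the new rollout, starting a fine-tuning job is a short script against the OpenAI Python SDK. Below is a minimal sketch, assuming the v1 SDK, an OPENAI_API_KEY set in the environment, a hypothetical train.jsonl of chat-formatted examples, and the GPT-4o snapshot name used at launch; treat the file name and snapshot as placeholders rather than a definitive recipe.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a JSONL file of chat-formatted training examples (hypothetical file name)
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# Create the fine-tuning job on a GPT-4o snapshot (assumed launch snapshot name)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",
)

# At the announced $25 per million training tokens, a 1M-token dataset
# trained for 3 epochs would cost roughly 3 x $25 = $75.
print(job.id, job.status)
```

The free 1 million training tokens per day mentioned above would offset part of that training bill during the promotional window; inference on the fine-tuned model is billed separately at the stated input and output rates.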
This development comes after Google recently cut Gemini 1.5 Flash pricing: the input price fell 78% to $0.075 per million tokens and the output price 71% to $0.30 per million tokens for prompts under 128K tokens, with the reductions cascading to the >128K-token tier and to context caching.
Enjoy the full story here.
India Will Have 100 AI Unicorns in the Next Decade
While AI companies in Silicon Valley are figuring out how to monetise their AI APIs, India’s AI ecosystem is on fire. This year alone, our feeds have been filled with announcements of AI startups raising funds or launching new products and solutions. It comes as a surprise, then, that there is still just one AI unicorn in India – Krutrim.
Can that change? Rahul Agarwalla, managing partner at SenseAI Ventures, definitely believes so. Read on.
AI Bytes