What’s Next for Data + AI in 2025? 10 Predictions

According to industry experts, 2024 was destined to be a banner year for generative AI. Operational use cases were rising to the surface, technology was reducing barriers to entry, and artificial general intelligence was obviously right around the corner.

So… did any of that happen? Well, sort of. 

In this week’s newsletter, I share what leading futurist and investor Tomasz Tunguz thinks about the state of data and AI for 2025, plus a few predictions of my own.

1. We’re living in a world without reason (Tomasz)

Three years into our AI future, we’re starting to see businesses create value in some of the areas we expect — but not all of them. According to Tomasz, AI tools can be summed up in three categories:

1. Prediction: AI copilots that can complete a sentence, correct code errors, etc.

2. Search: tools that leverage a corpus of data to answer questions

3. Reasoning: a multi-step workflow that can complete complex tasks

While AI copilots and search have seen modest success (particularly the former) among enterprise orgs, reasoning models still appear to be lagging behind—due in large part to model accuracy. Research is happening in this area, buuut don’t expect that to change dramatically in 2025. 

2. Process > Tooling (Barr)

A new tool is only as good as the process that supports it. Over the years, data teams have sometimes found themselves in a state of perpetual tire-kicking. But as the enterprise landscape inches closer to production-ready AI, the time for tire-kicking is quickly coming to an end.

Let’s consider the example of data quality for a moment. You could have the most sophisticated data quality platform on the market — the most advanced automations, the best copilots — but if you can’t get your organization onboarded quickly, all you’ve really got is a line item on your budget.

Over the next 12 months, I expect data teams to lean into proven end-to-end solutions over patchwork toolkits in order to prioritize more critical challenges like data quality ownership, incident management, and long-term domain enablement. The solutions that deliver on those priorities will win the day for AI-readiness.

3. AI is driving ROI — but not revenue (Tomasz)

Like any data product, GenAI’s value comes in one of two forms: reducing costs or generating revenue. On the revenue side, you might have something like AI SDRs or enrichment machines. According to Tomasz, these tools can generate a lot of sales pipeline… but not a lot of sales.

“Not many companies are closing business from it. It’s mostly cost reduction. Klarna cut two-thirds of their head count. Microsoft and ServiceNow have seen 50–75% increases in engineering productivity.”

Expect cost-cutting use cases to continue to bubble to the surface in 2025.

4. AI adoption is slower than expected — but leaders are biding their time (Tomasz)

In contrast to the tsunami of “AI strategies” that were being embraced a year ago, leaders today seem to have taken a unanimous step backward from the technology.

“There was a wave last year when people were trying all kinds of software just to see it. Their boards were asking about their AI strategy. But now there’s been a huge amount of churn in that early wave.”

It’s not that the technology isn’t valuable in theory — it’s that organizations haven’t figured out how to leverage it effectively in practice. Tomasz believes that the next wave of adoption will be different from the first because leaders will be more informed about what they need — and where to find it.

5. Small data is the future of AI (Tomasz)

The open source versus managed debate is a tale as old as… something old. But when it comes to AI, that question gets a whole lot more complicated.

At the enterprise level, it’s not simply a question of control or interoperability (though those can certainly play a part); it’s a question of operational cost. While Tomasz believes that the largest B2C companies will use off-the-shelf models, he expects B2B to trend toward their own proprietary and open-source models instead.

“In B2B, you’ll see smaller models on the whole, and more open source on the whole. That’s because it’s much cheaper to run a small open source model.”

6. The lines are blurring for analysts and data engineers (Barr)

When it comes to scaling pipeline production, there are generally two challenges that data teams will run into: analysts who don’t have enough technical experience and data engineers who don’t have enough time.

As we look to how data teams might evolve, there are two major developments that, I believe, could drive consolidation of engineering and analytical responsibilities in 2025: increased demand and improvements in automation.

The argument is simple — as demand increases, pipeline automation will naturally evolve to meet demand. As pipeline automation evolves to meet demand, the barrier to creating and managing those pipelines will decrease. The skill gap will decrease and the ability to add new value will increase. Sounds like a nice future.

7. Synthetic data matters — but it comes at a cost (Tomasz)

There are approximately 21–25 trillion tokens (roughly, words) on the internet right now. The AI models in production today have used all of them. In order for AI to continue to advance, it requires a far greater corpus of data to be trained on.

As training data becomes more scarce, companies like OpenAI believe that synthetic data will be an important part of how they train their models in the future. And we’ll hear a lot more about synthetic data in 2025.

But is synthetic data a long-term solution? Probably not.

A little artificial flavoring is okay — but if the diet of synthetic training data continues into perpetuity without new organic data being introduced, that model will eventually fail (or at the very least, have noticeably worse nail beds).

8. The unstructured data stack will emerge (Barr)

The idea of leveraging unstructured data in production isn’t new by any means — but in the age of AI, unstructured data has taken on a whole new role.

When it comes to generative AI, enterprise success depends largely on the panoply of unstructured data that’s used to train, fine-tune, and augment it. But according to a report by IDC, only about half of an organization’s unstructured data is currently being analyzed.

As more organizations look to operationalize AI in 2025, enthusiasm for unstructured data — and the burgeoning “unstructured data stack” — will continue to grow. If 2024 was about exploring the potential of unstructured data — 2025 will be all about realizing its value. The question is… what tools will rise to the surface?

9. Agentic AI is great for conversation — but not deployment (Tomasz)

We’ve seen a lot of success around AI copilots in 2024 (just ask GitHub, Snowflake, the Microsoft paperclip, etc.), but what about AI agents?

While “agentic AI” has had a fun time wreaking havoc on customer support teams, it looks like that’s all it’s destined to be in the near term. While these early AI agents are an important step forward, the accuracy of these workflows is still poor.

At present, 75–90% accuracy is state of the art for AI. But if you chain three steps at 75–90% accuracy each, the errors compound and your ultimate accuracy lands around 50% (anywhere from roughly 42% to 73%). We’ve trained elephants to paint with more accuracy than that.
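
To make that arithmetic concrete, here is a minimal Python sketch. It assumes each step succeeds or fails independently and that the workflow only succeeds if every step does, which is a simplification of how real agents behave. The helper name `chained_accuracy` is hypothetical and used only for illustration.

```python
from math import prod


def chained_accuracy(step_accuracies):
    """Overall success rate of a multi-step workflow.

    Assumption: steps are independent and ALL of them must succeed,
    so the per-step accuracies simply multiply together.
    """
    return prod(step_accuracies)


# Three steps at the low and high ends of the 75-90% range.
low = chained_accuracy([0.75] * 3)
high = chained_accuracy([0.90] * 3)

print(f"3 steps at 75% each: {low:.0%}")
print(f"3 steps at 90% each: {high:.0%}")
```

Under those assumptions, every extra step multiplies in another chance to fail, which is why longer agentic workflows degrade so quickly.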

Far from being a revenue driver for organizations, most AI agents would be actively harmful if released into production. In 2025, it will be important to be able to talk about agentic AI—but most teams won’t be deploying one any time soon.

10. Pipelines are expanding — but quality coverage isn’t (Tomasz)

Each year, Monte Carlo surveys real data professionals about the state of their data quality. This year, we turned our gaze to the shadow of AI, and the message was clear. Data quality risks are evolving — but data quality management isn’t.

“We’re seeing teams build out vector databases or embedding models at scale. SQLite at scale. All of these 100 million small databases. They’re starting to be architected at the CDN layer to run all these small models. iPhones will have machine learning models. We’re going to see an explosion in the total number of pipelines but with much smaller data volumes.”

But the more pipelines expand, the more difficult data quality becomes. The risk of data quality issues increases in direct proportion to the volume and complexity of your pipelines. The more pipelines you have (and the more complex they become), the more opportunities you’ll have for things to break, and the less likely you’ll be to find and resolve them.

+++

Anything we missed? I’m all ears.

Sarah Levy

Co-Founder & CEO of Euno: Govern data models everywhere ✩

2d

Interesting as always, Barr Moses! AI models are only as reliable as the semantic models behind them. If those are inconsistent, the results are unreliable. P.S. Totally agree on point two—processes > tooling, 100%.

Ajay Patel

✅ 45K Subs | Solving Product problems through Data and AI

2d

The blurring lines between roles and the focus on small data will definitely shift the landscape and focus. Excited to see how organizations adapt in 2025!

Ann Kuss

CEO @ Outstaff Your Team | Tech HR Expert | HR Mentor at Projector | Helping businesses grow faster with pre-vetted talent ✅

1w

I completely agree that the slowdown in AI adoption isn’t a sign of disinterest but rather a shift toward more thoughtful integration. Last year’s rush to implement AI felt like experimentation, but at our company, we took a cautious approach. For us, finding AI tools that protect candidate data and meet GDPR standards was a challenge, but with some thorough research and testing, we made it work. It’s all about balancing innovation with responsibility.

Sajjad Masud

SaaS Executive | AI Product Leader | Investor, Advisor, Board | PLG, GTM

1w

Data quality is a foundational need for AI. Thanks for sharing Barr Moses

Gerri C.-Global Events Manager

Data, AI, & Security Events|Tech Conferences| Connector|IT Community|Business Development|Sponsorships|CDO Magazine

2w

2024-It was the BEST of times, It was the WORST of times🤩
