Mastering data and AI: turning science fiction into fact

"I propose to consider the question, 'Can machines think?'"

It’s been 74 years since Alan Turing posed this question, and even longer since Walter Pitts and Warren McCulloch created the first mathematical model of a neural network.

Today, it’s easy to look around at the plethora of AI tools right at our fingertips and forget that this widespread availability of AI chatbots, virtual assistants, and GenAI powerhouses was once a figment of science fiction’s imagination. 🧠

Of course, we didn’t get here overnight. Humans have had to do a lot of thinking to turn fiction into reality. In this edition of The Source, we want to focus on the challenges in AI/ML and shine a light on education and skill-building opportunities.

After all, the answer to Turing’s question rests on a fundamental truism: machines can only learn if humans learn first. So, let’s learn!


Lowering the beginner’s barrier into AI/ML and data science

It seems nearly everyone is pursuing an AI project these days. But bringing those projects to production is tough: one widely cited statistic suggests that nearly 87% of AI projects never actually make it to production. What’s more, organizations are not seeing business value as quickly as they would hope. Take generative AI, for example: a 2024 Deloitte report estimates that only 18–36% of organizations are actually seeing the expected benefits from their GenAI projects.

Choosing a stack to develop and train your models on is difficult, in part due to the sheer number of tools available. Integrations and package dependencies make matters more difficult, especially for new entrants in the space. 

We recently launched a new stack of tools designed to make it easier than ever to get started with data science. Our Data Science Stack (DSS) spares new data scientists the countless confused hours usually spent setting up environments and configuring tools, leaving more time for working with data. ⏰

DSS is packed with design features and tools that make it easier for beginners to get started. It comes standard with popular data science tools, like MicroK8s, JupyterLab and MLflow, as well as ready-to-use ML frameworks, PyTorch and TensorFlow. A range of design features help to speed up deployment, keep manual management to a minimum and enable simple scaling. DSS also handles your package dependencies and is compatible with a wide range of machine hardware, with simplified GPU configuration that makes effective use of the available computational power.

You can explore this brand-new beginner-friendly tool for yourself.


Solving skills transfer with the Canonical AI Summer Camp 🌴

There are persistent barriers to skills transfer too. A broad survey by SnapLogic found that finding skilled specialists for projects is exceedingly difficult. According to this survey, “More than half (51%) of companies in the US and UK do not have the right in-house AI talent to execute their strategy. In the UK, this in-house skill shortage is considerably more acute, with 73% lacking the needed talent compared to 41% in the US.”

The AI skills gap is a monumental barrier that the open source community is uniquely positioned to help solve, given the free flow of information, guides, and documentation, and close-knit collaboration.

Canonical is also playing its part in this ongoing effort: we recently launched our AI Summer Camp series, a collection of five essential training workshops led by our in-house AI and ML experts.

In this short summer school series, our AI experts went over a wide range of topics, including:

🏕️ How to build LLMs with open source

🏕️ Kubeflow vs. MLflow

🏕️ Vector databases for generative AI applications

🏕️ AI on public cloud with open source

🏕️ AI on private cloud: why is it relevant in the hyperscaler era?

Did you miss out? Catch up on the AI Summer Camp series by watching it on demand.

We’ll share future Summer Camp dates, and other exciting training workshops, in our future newsletters, so stay subscribed.


Learn from industry leaders at Data and AI Masters

In October we’ll also be hosting our Data and AI Masters event: a fully online event offering an up-close look at the most important trends and shifts in AI and data management at scale. By joining, you’ll be able to take part in hands-on workshops and learn from AI professionals and industry leaders from around the world.

Come along, say hi, and connect with others.

If you want to read more about the event and see what’s in store, check out our pre-event blog.

👉 View a full list of our speakers


Addressing operational burdens with solutions like Charmed OpenSearch

Of course, the problems don’t stop once you hire a room full of AI/ML masters. Operational challenges, such as performance, system uptime, cloud readiness, and scaling, are consistent blockers for the database engineers supporting AI projects. Even something as simple as tool choice poses issues. 🛠️

For instance, working with a tool like OpenSearch in production-grade use cases, which process vast amounts of data across extensive data infrastructure, can be challenging. Automating the deployment, provisioning, management, and orchestration of production OpenSearch clusters is also highly complex.

We’re making OpenSearch easier for everyone with Charmed OpenSearch. It builds on OpenSearch with additional enterprise-grade capabilities that help you spend less time on operational tasks and more time on high-value data and analytics projects.
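As a taste of the kind of configuration that production OpenSearch work involves, here is a sketch of a k-NN (vector search) index mapping of the sort used for generative AI applications. The index structure follows OpenSearch's k-NN plugin conventions, but the field names and dimension are illustrative assumptions.

```python
import json

# Hypothetical k-NN index mapping for vector search in OpenSearch.
# Field names ("embedding", "text") and the dimension are assumptions
# for illustration only.
knn_index_body = {
    "settings": {"index": {"knn": True}},  # enable the k-NN plugin
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 384,  # must match the embedding model's output size
            },
            "text": {"type": "text"},  # the original document text
        }
    },
}

# With the opensearch-py client, this body would be passed to
# client.indices.create(index="docs", body=knn_index_body).
print(json.dumps(knn_index_body, indent=2))
```

Keeping mappings like this consistent across clusters, alongside provisioning and upgrades, is exactly the kind of operational toil Charmed OpenSearch is designed to absorb.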

Read more in our recent announcement.


Solving security challenges with new technologies

Learning how to build AI models is only half the battle: ensuring security will always be a concern. McKinsey reports that 51% of organizations using AI consider cybersecurity to be the highest risk they need to mitigate.

We’ve been doing a lot of work to address the security challenges in AI, and have been at the vanguard of confidential computing. Confidential computing protects data in use, ensuring the integrity of your AI workloads at run-time across public and private clouds. We have a great set of resources for you to discover confidential computing, including an introductory whitepaper and a blog series. 

Be sure to check them out and learn about Canonical’s solutions for confidential AI.


Grab our hands-on guide to deploying TensorFlow Lite on edge devices

Many experts agree that the value of AI increases at the edge. Being able to process data from devices in near real-time, for instance, has many compelling benefits for businesses, but deploying AI at the edge isn’t easy. Tools like TensorFlow Lite help, but however powerful, they still pose a number of challenges for AI engineers. Our deep dive into TensorFlow Lite demystifies edge deployment, offering a comprehensive exploration of managing dependencies and updates for these models, and showing how containerization with Ubuntu Core and Snapcraft can streamline the process.
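To make the workflow concrete, here is a hedged sketch of the core TensorFlow Lite loop: convert a Keras model to the TFLite format, then run it with the lightweight interpreter, as you would on an edge device. The tiny two-output model is purely an illustrative assumption; a real deployment would convert a trained model.

```python
import numpy as np
import tensorflow as tf

# Toy model, assumed purely for illustration: 4 inputs, 2 outputs.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])

# Convert to the compact TFLite flatbuffer format used on edge devices.
tflite_bytes = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# On-device side: load the flatbuffer with the lightweight interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Run a single inference on a dummy input.
x = np.ones((1, 4), dtype=np.float32)
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
y = interpreter.get_tensor(out["index"])
print(y.shape)  # (1, 2)
```

In practice the `.tflite` flatbuffer would be written to disk and shipped inside a snap, which is where Ubuntu Core and Snapcraft take over dependency and update management.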


This edition is contributed by Matthew de Klerk, a content specialist at Canonical.

From edge devices to large search and analytics applications, Canonical is here to help you master data and AI, so that one day you might answer Turing’s question for yourself.

We hope you enjoyed this edition of The Source. 

📌 Subscribe to our newsletter and ensure you never miss an edition

🌐 Visit us at canonical.com
