AI Assistants 101 - the busy person's crash course
In what is already the 9th edition in the Autonomous AI-Assistants (or Agents) series, this episode is meant to give you an overview of AI and some of the major concepts behind this most powerful architecture devised by humans to date. It also dives into the concept of Autonomous Agents and describes what agents can do that a single LLM cannot. If you want to catch up on the concept of AI-Assistants, read this article. The other articles can be found here.
"Autonomous agents are AI-powered programs that are able to create tasks by themselves, finish tasks, reprioritize, and repeat this process until they achieve their goal."
Spring 2024 is here, and it’s not just the flowers that are blooming. It’s the season of AI agent frameworks, each one promising to “disrupt everything” more than the last. When doing research for my last post, I found 43 platforms that promise to be an AI agent framework, in one way or another.
And now I can’t scroll through my feed without stumbling upon the next flashy demo on a GitHub repo with thousands of stars, all gained overnight.
These repos promise that anyone, even your grandma who still uses Internet Explorer, can now build an entire app from a single prompt. And for some reason, the large majority of these demos are some variation of the classic snake game.
Although this type of excitement isn’t rare in the tech and AI world, not every piece of AI-related news gets equal attention. In fact, there are only a few announcements that truly capture everyone’s interest. It’s always the new model releases from giants like OpenAI or Anthropic. And, of course, after the release of AutoGPT, AI agents have proven that they too can steal the spotlight.
However, unlike OpenAI’s model releases, which usually garner positive reactions from the general public, AI agents have a very polarizing effect. People divide themselves into two groups. People in the first group are either terrified of AI agents, already envisioning a future where they’re replaced by Terminator-like robots, or they wholeheartedly believe that AI agents will make them rich by increasing their productivity.
(In one of my previous articles, I wrote about people’s fear of an AI-automation-driven job apocalypse.)
Yes, you might be surprised, but these two belong to the same group of people. A group that usually overvalues AI agents, at least slightly.
On the other hand, there are people who simply choose to ignore AI agents because they don’t see how they differ from, let’s say, chatting with ChatGPT or using an app built on top of LangChain with simple RAG. For them, this is all just hype fueled by greedy influencers and even greedier companies.
So, who is right? What exactly are AI agents, and what can you really do with them? Let’s explore all of that in this article. Don’t worry if you’re not a machine learning expert. This is a gentle introduction to AI agents for everyone who’s interested.
The Rise of AI
The 1950s marked the birth of artificial intelligence as we know it. In 1950, Alan Turing, often called the father of computer science, published a groundbreaking paper that asked the big question: “Can machines think?” Many thought that the answer was no. To give the question a practical form, Turing proposed an imitation test (now known as the Turing Test) where a machine tries to fool a human into thinking it’s also human through normal conversation.
A few years later, a group of scientists got together at Dartmouth College for a summer workshop that would change the world. They set out to build machines that could think like us. This historic meeting, led by computer scientist John McCarthy, kicked off the field of AI. They believed they just needed 2 months and 10 men to build a “smart” machine.
“We propose that a 2-month, 10-man study of artificial intelligence…
Right around that time, a new concept of “symbolic AI” was born.
Symbolic AI was all about representing knowledge through abstract symbols and manipulating them according to strict rules, kind of like a super-advanced version of Aristotle’s logic. McCarthy and his pals believed that by combining enough of these symbols and rules, they could create machines that could reason, plan, and solve problems just like humans.
This approach led to some pretty impressive systems in the 1960s and 1970s, like Dendral, which could identify unknown molecules, and MYCIN, which could interpret lab results to help diagnose infections.
However, symbolic AI soon ran into some roadblocks. Turns out, the real world is a messy, complicated place that doesn’t always fit neatly into strict logical rules. Imagine trying to write down every single rule for making a sandwich! As symbolic AI tackled more ambitious problems, its limitations became clearer.
In the mid-1970s, the field hit a bit of a rough patch known as the “first AI winter.” Funding dried up, progress slowed, and people started to lose faith in the grand promises of human-like AI. It was clear that symbolic logic alone wasn’t going to cut it. The world needed a new approach.
Embracing Uncertainty
As the limitations of symbolic AI became clearer in the 1970s, researchers started exploring new ways to handle the uncertainty and complexity of the real world. Two key ideas emerged during this time: the use of probability and the rise of machine learning.
Let’s start with probability. In the 1980s, Bayesian networks hit the scene, allowing AI systems to “reason” about uncertainty using the language of probability. Instead of relying on strict logical rules, these networks could learn from data and make educated guesses when faced with incomplete information.
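To make "reasoning about uncertainty" a bit more concrete, here is a minimal sketch of a single Bayes' rule update, the basic step that Bayesian networks chain together. It is not a full network, and the faulty-machine scenario and all of its numbers are made up purely for illustration.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    # Total probability of seeing the evidence, whether or not H is true.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of machines are faulty; the alarm fires for
# 90% of faulty machines and for 5% of healthy ones.
belief = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(f"P(faulty | alarm) = {belief:.3f}")  # prints 0.154
```

Even with a fairly reliable alarm, the updated belief is only about 15%, because faulty machines are rare to begin with. This kind of educated guess is exactly what strict logical rules struggle to express.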
Meanwhile, machine learning was also making a comeback. In the 1980s, a training technique called backpropagation was popularized and breathed new life into neural networks, allowing them to learn complex patterns from data.
This shift towards probabilistic and learning-based approaches changed the game for AI agents. Instead of just reasoning with abstract symbols, agents could now learn from experience.
This new paradigm powered breakthroughs in two key areas of machine learning: reinforcement learning and deep learning. Reinforcement learning is all about teaching agents to make smart decisions through trial and error, using rewards as feedback.
Deep learning, on the other hand, uses neural networks with many layers to learn rich, detailed representations of data, allowing agents to tackle complex tasks like image recognition and natural language processing.
These breakthroughs led to an expanded definition of AI agents. It was no longer only about “reaching a goal successfully”. The new definition also included the environment, which an agent perceives and from which it learns about the world.
So, what can AI agents actually do?
To clarify, this article focuses on Agents that use Large Language Models (LLMs) as their “brain”. While there are various types of agents, such as multimodal and visual agents, LLMs stand out due to their special capabilities.
Regardless of whether they are closed or open source, all LLMs possess varying levels of “Reflection” and “Common-Sense Reasoning” abilities, with some outperforming others. These crucial capabilities enable LLM Agents to make plans, engage in self-reflection, and continuously refine themselves, all stemming from the unique properties of LLMs.
Beyond the intrinsic abilities of LLMs, there are 5 other important characteristics of an agent:
1. Ability to make Autonomous Actions.
Agents can perform tasks independently.
2. Memory
Adding memory to an agent allows personalization, enabling it to understand and adapt to individual preferences. And as our preferences evolve throughout our lives, an agent with memory can learn and adjust. This is essential for building long-term relationships between agents and users.
3. Reactivity
To interact with their environment, agents must be able to perceive and process the available information. This reactivity enables agents to respond to changes, make informed decisions, and provide relevant outputs based on the input they receive. By analyzing and interpreting the data within their environment, agents can offer context-aware help.
4. Proactivity
Agents are not only capable of “planning”, “writing tasks” and prioritizing, but they can also take proactive steps to accomplish these tasks by using tools, such as searching the internet, scraping Reddit or running a code interpreter. At the moment, this is mostly done through API calls and function calling.
5. Social Ability
Agents can collaborate with other agents or humans. They can delegate work, and they are capable of “sticking to their defined roles in the conversations”. This social ability enables agents to work collectively towards common goals, distribute workloads, and maintain coherent communication.
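A few of the characteristics above (reacting to input, calling tools, and keeping memory) can be sketched in a handful of lines. This is only a toy illustration: the `Agent` class, the two tools and the `decide` method are hypothetical names, and `decide` is a hard-coded stand-in for what would be an LLM call with function calling in a real agent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


def search_web(query: str) -> str:
    """Hypothetical stand-in for a real search tool."""
    return f"(stub) results for '{query}'"


def add_numbers(text: str) -> str:
    """Hypothetical stand-in for a code interpreter: sums whole numbers."""
    return str(sum(int(token) for token in text.split()))


@dataclass
class Agent:
    tools: Dict[str, Callable[[str], str]]
    memory: List[str] = field(default_factory=list)

    def decide(self, observation: str) -> Tuple[str, str]:
        """Stub for the LLM: returns a tool name and the argument to pass."""
        if observation.startswith("calc:"):
            return "add_numbers", observation[len("calc:"):]
        return "search_web", observation

    def step(self, observation: str) -> str:
        tool_name, argument = self.decide(observation)    # reactivity
        result = self.tools[tool_name](argument)          # proactivity: tool call
        self.memory.append(f"{observation} -> {result}")  # memory
        return result


agent = Agent(tools={"search_web": search_web, "add_numbers": add_numbers})
print(agent.step("calc: 2 3"))
print(agent.step("agent frameworks"))
print(len(agent.memory), "interactions remembered")
```

In a real framework, the tool descriptions would be passed to the model and the model itself would choose which one to call. The dictionary lookup here only mimics that dispatch step.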
What CAN AI agents do that humans CAN’T?
The key advantage of AI agents lies in their ability to process information at a massive scale. As AI researcher Stuart Russell puts it, AI systems can do things “not because of the depth of understanding but because of its scale.”
For example, let’s say you need to go through 100,000 customer reviews to identify common issues with a product. With an average reading speed of 200 words per minute (and an average review length of around 150 words), it would take one human around 52 days of nonstop reading to get through all of them. Additionally, a human would need many more days to analyze, summarize and extract all the important information. An AI agent could do the same job in a matter of minutes. On top of that, AI agents can easily provide ANY type of output you need within minutes, whether it’s a newsletter, JSON or an email.
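For the curious, here is how the 52-day estimate works out. The review count, review length and reading speed come from the example above. The 8-hours-a-day variant is my own added assumption, included to show how much longer a realistic reading schedule would take.

```python
REVIEWS = 100_000
WORDS_PER_REVIEW = 150
WORDS_PER_MINUTE = 200

total_words = REVIEWS * WORDS_PER_REVIEW   # 15,000,000 words
minutes = total_words / WORDS_PER_MINUTE   # 75,000 minutes
hours = minutes / 60                       # 1,250 hours
days_nonstop = hours / 24                  # reading around the clock
days_8h = hours / 8                        # assumption: 8 hours of reading per day

print(f"Nonstop reading: about {days_nonstop:.0f} days")
print(f"At 8 hours a day: about {days_8h:.0f} days")
```

So the 52 days only hold if the reader never sleeps. At a regular working pace, it is closer to five months of doing nothing but reading.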
Or suppose you were asked to “imagine your life in the next 5 years.” You’d probably be able to imagine a couple of possible life paths, each consisting of 4–5 big milestones (e.g. getting married, moving to Europe). But if you asked agents to work together on this task, you’d get a lot more potential life paths, and each would contain far more details and diverse milestones.
What CAN AI agents do that a single LLM CAN’T?
“So this is just GPT-4 with RAG?” or “Isn’t this the same as chaining together a couple of prompts?” are some of the questions that I get a lot. This proves to me that people don’t understand the benefits of AI agents when compared to, let’s say, better prompting of a single LLM.
So let’s look at 2 main reasons why AI agents perform better than one LLM:
1. Improved accuracy
Andrew Ng has shared in this lecture that an agentic workflow with “dumber” models like GPT-3.5 significantly outperforms zero-shot prompting of “smart” models like GPT-4.
Improved accuracy arises from iterations that give agents an opportunity to “fact-check” and “review” their answers, which leads to fewer hallucinations.
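The "fact-check and review" iteration can be pictured as a simple draft, review, revise loop. The sketch below is only an illustration of that idea, not how any particular framework does it: `refine` is a hypothetical helper, and the three stub functions stand in for what would be separate LLM calls.

```python
from typing import Callable, Optional


def refine(
    task: str,
    draft: Callable[[str], str],
    review: Callable[[str, str], Optional[str]],
    revise: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Draft an answer, then let a reviewer request fixes for a few rounds."""
    answer = draft(task)
    for _ in range(max_rounds):
        feedback = review(task, answer)
        if feedback is None:  # the reviewer found nothing left to fix
            break
        answer = revise(answer, feedback)
    return answer


# Hypothetical stubs standing in for LLM calls.
def draft_stub(task: str) -> str:
    return "The Turing Test was proposed in 1960."


def review_stub(task: str, answer: str) -> Optional[str]:
    return "The year should be 1950." if "1960" in answer else None


def revise_stub(answer: str, feedback: str) -> str:
    return answer.replace("1960", "1950")


print(refine("When was the Turing Test proposed?", draft_stub, review_stub, revise_stub))
```

A zero-shot prompt would have stopped at the first draft. The extra review round is what gives the workflow a chance to catch the wrong year.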
2. Offloading decision-making
Let’s imagine that I want to create a blog about Mediterranean culture, but I have no idea where to start since I’ve never had a blog before. To begin, I’d probably need to find answers to questions such as, “What steps are required to start and run a successful blog?” and “What is the first step?”
Alternatively, I could create a team of AI agents and give them the task of breaking down the process of blogging into smaller subtasks. Not only that, but these agents should also be capable of prioritizing all the subtasks, which means that I’d have more time and energy to spend on strategizing and other important mental tasks.
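As a rough picture of what "breaking down and prioritizing" means in practice, here is a minimal sketch built on a priority queue. The subtasks and their priorities are invented for this example. In a real agent team, they would come from LLM calls rather than a hard-coded dictionary.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical decomposition an agent might return (lower number = do sooner).
SUBTASKS: Dict[str, List[Tuple[int, str]]] = {
    "start a blog about Mediterranean culture": [
        (2, "choose a blogging platform"),
        (1, "define the audience and niche"),
        (3, "draft the first three posts"),
    ],
}


def plan(goal: str) -> List[str]:
    """Return the subtasks for a goal, ordered by priority."""
    queue: List[Tuple[int, str]] = []
    for priority, task in SUBTASKS.get(goal, []):
        heapq.heappush(queue, (priority, task))
    ordered: List[str] = []
    while queue:
        _, task = heapq.heappop(queue)  # always pops the most urgent task
        ordered.append(task)
    return ordered


for step, task in enumerate(plan("start a blog about Mediterranean culture"), start=1):
    print(step, task)
```

Using a queue rather than a plain sorted list matters once agents start adding new subtasks or reprioritizing while the work is in progress, which is the loop described in the definition at the top of this article.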
Well, that's it for now. If you like my article, subscribe to my newsletter or connect with me on LinkedIn.
Signing off - Marco