Our thoughts on GenAI - Part 2: Theoretical foundations of Agentic AI

Foreword

Continuing our series of articles on AI agents, this article focuses on the theoretical foundations behind agents and delves into some of the seminal papers that have spurred the current wave of innovation. Much of the discussion centers on the “Agentic patterns” popularized by Prof. Andrew Ng. We hope that an intuitive understanding of how agents work under the hood will make it easier to see how they can be applied to solve business problems.

How do we define AI Agents?

Let’s first define the current crop of AI agents. An AI agent can be defined as a software entity designed to perceive its environment, make decisions, and take actions autonomously to achieve specific goals. Key characteristics include:

- Autonomy: Ability to operate independently without constant human intervention

- Adaptability: Ability to adjust behavior based on changes in environment

- Goal-oriented: Designed to accomplish specific objectives

- Learning capability: Ability to improve their performance over time through experience

- Interaction with environment: Ability to use sensors to perceive their surroundings and actuators to effect changes
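To make the perceive, decide, act cycle concrete, here is a minimal sketch using a toy thermostat. The `ThermostatAgent` class, the `run` helper, and the dictionary-based environment are all hypothetical names we chose for illustration. The sketch covers autonomy, goal orientation, and interaction with the environment; it does not attempt to show learning.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatAgent:
    """Toy agent (hypothetical): keeps a room near a target temperature."""
    target: float                                  # the goal the agent works toward
    tolerance: float = 0.5                         # acceptable distance from the goal
    history: list = field(default_factory=list)    # record of past observations

    def perceive(self, environment: dict) -> float:
        # "Sensor": read the current temperature from the environment.
        reading = environment["temperature"]
        self.history.append(reading)
        return reading

    def decide(self, reading: float) -> str:
        # Pick an action from the reading alone, with no human in the loop.
        if reading < self.target - self.tolerance:
            return "heat"
        if reading > self.target + self.tolerance:
            return "cool"
        return "idle"

    def act(self, action: str, environment: dict) -> None:
        # "Actuator": change the environment by one degree per step.
        delta = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
        environment["temperature"] += delta


def run(agent: ThermostatAgent, environment: dict, steps: int) -> list:
    """Run the perceive -> decide -> act loop and return the actions taken."""
    actions = []
    for _ in range(steps):
        reading = agent.perceive(environment)
        action = agent.decide(reading)
        agent.act(action, environment)
        actions.append(action)
    return actions


room = {"temperature": 18.0}
agent = ThermostatAgent(target=21.0)
actions = run(agent, room, steps=5)
print(actions, room["temperature"])
```

The agent heats the room until the reading falls within the tolerance band around the target and then idles, which is the goal-seeking behavior described in the list above in its simplest form.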

What are Agentic design patterns?

Prof. Andrew Ng outlines four key design patterns for Agentic systems:

- Reflection: LLMs review their own output and iterate on improvements

- Tool Use: LLMs are endowed with tools, such as web search, code execution, and other functions, to help access information, execute tasks, or analyze data

- Planning: LLMs develop and follow a sequence of actions to complete a task. For example, an LLM might first outline a plan for analyzing market trends, then gather relevant data, and finally generate a detailed report

- Multi-agent collaboration: Multiple AI agents collaborate on a task and generate results of higher quality than a single agent working alone

Let’s dive further into each of these design patterns.

Reflection

Imagine you are writing a piece on some topic. Do you write the entire thing in one go, or do you find yourself writing a bit, reflecting on it, then editing or adding the rest iteratively? The same intuition holds for AI. Compared with single-shot generation, output quality can improve significantly when a reviewer (a second LLM, or the same model prompted to critique its own work) reviews the generated content and provides feedback, and the first LLM is then re-prompted to regenerate its response.

The idea of "reflection" in AI agents, meaning the ability to self-improve through iterative feedback, draws from multiple research fields. A notable paper that brought this concept into the realm of large language models is Self-Refine: Iterative Refinement with Self-Feedback by Aman Madaan et al., published in 2023.

This paper introduced a framework, Self-Refine, which enables language models to enhance their outputs by generating initial responses, reviewing them, and making refinements in cycles. Key contributions include:

- Establishing a framework for self-guided refinement in language models

- Demonstrating its benefits across tasks like writing, reasoning, and coding

- Showing how self-refinement often outperforms single-shot generation

While self-reflection isn’t new and has roots in cognitive science and early AI work, this paper has been pivotal in adapting it to today’s LLM-driven AI, influencing research on reflective agents and adaptive workflows.
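The generate, review, refine cycle described above can be sketched as a short loop. This is our own minimal illustration, not the Self-Refine authors' code: `self_refine` and the three `toy_*` functions are hypothetical, and the toy functions are simple rule-based stand-ins for what would be LLM calls in a real system.

```python
from typing import Callable


def self_refine(task: str,
                generate: Callable[[str], str],
                critique: Callable[[str, str], str],
                refine: Callable[[str, str, str], str],
                max_rounds: int = 3) -> tuple[str, int]:
    """Draft once, then alternate feedback and refinement.

    Stops when the critic returns an empty string or after max_rounds.
    Returns the final draft and the number of refinement rounds used.
    """
    draft = generate(task)
    rounds = 0
    while rounds < max_rounds:
        feedback = critique(task, draft)
        if not feedback:          # empty feedback means "good enough"
            break
        draft = refine(task, draft, feedback)
        rounds += 1
    return draft, rounds


# Hypothetical stand-ins for LLM calls, kept rule-based so the sketch runs offline.
def toy_generate(task: str) -> str:
    return "agents are software that act on goals"


def toy_critique(task: str, draft: str) -> str:
    if not draft[0].isupper():
        return "capitalize the first letter"
    if not draft.endswith("."):
        return "end with a period"
    return ""


def toy_refine(task: str, draft: str, feedback: str) -> str:
    if feedback == "capitalize the first letter":
        return draft[0].upper() + draft[1:]
    if feedback == "end with a period":
        return draft + "."
    return draft


final, rounds = self_refine("Define an AI agent in one sentence.",
                            toy_generate, toy_critique, toy_refine)
print(final, rounds)
```

Passing the three steps in as functions keeps the loop independent of any particular model: the same model can fill all three roles, or a separate reviewer model can be swapped in for `critique`. The `max_rounds` cap is a design choice on our part so the loop always terminates.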


Tool Use

For AI to be ubiquitous in our lives, it has to perform tasks, and to perform tasks it needs access to tools. Developers are focusing heavily on giving agents access to tools for internet search and for integration with productivity applications and other enterprise applications aimed at specific verticals.

The idea of AI agents using tools (pulling in data from the web, accessing a calendar, or using other apps to get things done) was explored influentially in Toolformer: Language Models Can Teach Themselves to Use Tools, a paper published by researchers at Meta AI in 2023. The paper introduced an approach in which a language model learns to call external APIs as needed, so that it can take specific actions rather than just generate text.

In Toolformer, the model learns to recognize when using an external tool might help it perform a task more accurately, like looking up facts, doing translations, or making calculations. This idea has since become foundational to how AI agents interact with the world around them and handle more complex, realistic scenarios.
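A small sketch can show what "calling a tool from inside generated text" looks like mechanically. The bracketed call syntax, the `->` result marker, and the names `calculator`, `TOOLS`, and `execute_tool_calls` are our assumptions, loosely modeled on the inline API calls shown in the paper. The model output here is a hard-coded string; in Toolformer the model itself learns when to emit such calls.

```python
import ast
import operator
import re

# Binary operators the toy calculator accepts.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calculator(expression: str) -> str:
    """Evaluate +, -, *, / on numeric literals without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expression!r}")

    tree = ast.parse(expression.strip(), mode="eval")
    return f"{walk(tree.body):.2f}"


# Registry of available tools (hypothetical; a real agent would have more).
TOOLS = {"Calculator": calculator}

# Matches inline calls of the form [ToolName(argument)].
CALL = re.compile(r"\[(\w+)\((.*?)\)\]")


def execute_tool_calls(text: str) -> str:
    """Run every known tool call found in the text and splice in its result."""
    def run(match: re.Match) -> str:
        name, argument = match.group(1), match.group(2)
        tool = TOOLS.get(name)
        if tool is None:
            return match.group(0)      # leave calls to unknown tools untouched
        return f"[{name}({argument}) -> {tool(argument)}]"

    return CALL.sub(run, text)


# Stand-in for text a tool-aware model might produce.
model_output = "Out of 1400 participants, 400 [Calculator(400 / 1400)] passed the test."
augmented = execute_tool_calls(model_output)
print(augmented)
```

The point of the sketch is the division of labor: the language model decides where a tool would help and what to ask it, while ordinary code executes the call and feeds the result back into the text.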


Planning

AI adoption would be constrained if one had to write detailed prompts explaining every step of each relatively complex task. Thankfully, developers have been able to get LLMs to break down large tasks into smaller ones and execute them step by step, unlocking a vast number of complex enterprise use cases where Agentic workflows can be put to work.

The paper Chain-of-Thought Prompting Elicits Reasoning in Large Language Models by Wei et al. (2022) is a landmark study underpinning planning in AI agents. It shows that prompting models to write out their intermediate reasoning steps improves their performance on multi-step problems.

Key Contributions:

- Chain-of-Thought Prompting: Encouraging models to articulate their thought process leads to better reasoning and performance on complex tasks.

- Structured Approach: This method allows AI to break down problems into manageable steps, fostering clearer and more coherent reasoning

- Broader Implications: By integrating planning into their processes, AI agents can handle more complex challenges, enhancing their effectiveness in various applications

This work laid much of the groundwork for developing AI agents capable of multi-step decision-making and planning.
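The difference between a standard few-shot prompt and a chain-of-thought prompt is easiest to see side by side. The sketch below builds both; the exemplar wording is adapted from the tennis-ball example used in Wei et al. (2022), while the function names and the `completion` string (a stand-in for a model response) are our own assumptions.

```python
import re

# One worked exemplar, adapted from the example in Wei et al. (2022).
EXEMPLAR_QUESTION = (
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?"
)
EXEMPLAR_REASONING = (
    "Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11."
)
EXEMPLAR_ANSWER = "11"


def standard_prompt(question: str) -> str:
    """Few-shot prompt whose exemplar shows only the final answer."""
    return (f"Q: {EXEMPLAR_QUESTION}\n"
            f"A: The answer is {EXEMPLAR_ANSWER}.\n\n"
            f"Q: {question}\nA:")


def chain_of_thought_prompt(question: str) -> str:
    """Few-shot prompt whose exemplar spells out the intermediate steps."""
    return (f"Q: {EXEMPLAR_QUESTION}\n"
            f"A: {EXEMPLAR_REASONING} The answer is {EXEMPLAR_ANSWER}.\n\n"
            f"Q: {question}\nA:")


def extract_answer(completion: str):
    """Pull the number after 'The answer is', or return None if absent."""
    match = re.search(r"The answer is (-?\d+(?:\.\d+)?)", completion)
    return match.group(1) if match else None


question = ("The cafeteria had 23 apples. If they used 20 to make lunch "
            "and bought 6 more, how many apples do they have?")
prompt = chain_of_thought_prompt(question)

# Hypothetical model completion, hard-coded so the sketch runs offline.
completion = ("The cafeteria had 23 apples originally. They used 20 to make "
              "lunch. So they had 23 - 20 = 3. They bought 6 more apples, "
              "so they have 3 + 6 = 9. The answer is 9.")
print(prompt)
print(extract_answer(completion))
```

The only change between the two prompts is the worked reasoning in the exemplar answer. That small change is what nudges the model to produce its own step-by-step breakdown before committing to a final answer.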


Multi-Agent Collaboration

The way a team of human specialists coordinates to deliver a project is being codified for AI. Specialized agents that focus on narrow tasks and collaborate to complete a larger piece of work have been delivering progressively better outputs on complex problems.

One of the earliest influential papers on multi-agent collaboration in AI using large language models (LLMs) is Communicative Agents for Software Development by Qian et al. (2023). This paper introduced ChatDev, an open-source system of agents that simulates a virtual software company.

Key Contributions:

- Task Decomposition: It showed how to break down complex software tasks into manageable subtasks assigned to different specialized agents.

- Role Assignment: The paper defined roles for agents, such as software engineer, product manager, designer, and QA engineer.

- Agent Interaction: It explored how agents communicate and collaborate to achieve complex goals.

- Enhanced Performance: Findings indicated that multi-agent systems outperform single-agent approaches.

- Scalability: The framework enabled the handling of more complex projects by distributing tasks among specialized agents.

This paper helped formalize the multi-agent collaboration design pattern for LLM-based AI agents in software development, a pattern also seen in frameworks like AutoGen, CrewAI, and LangGraph.
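Role assignment and agent-to-agent handoff can be sketched in a few lines. This is a deliberately simplified, linear pipeline of our own design rather than ChatDev's actual architecture, which has agents hold multi-turn conversations. The `Agent`, `Message`, and `run_pipeline` names are hypothetical, and each agent's `respond` function is a hard-coded stand-in for an LLM call with a role-specific prompt.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Message:
    sender: str
    content: str


@dataclass
class Agent:
    role: str
    respond: Callable[[str], str]   # stand-in for an LLM call with a role prompt

    def handle(self, incoming: str) -> Message:
        return Message(self.role, self.respond(incoming))


def run_pipeline(task: str, agents: list[Agent]) -> list[Message]:
    """Pass the task through the agents in order.

    Each agent sees only the previous agent's output; the full transcript
    is returned so the handoffs can be inspected.
    """
    transcript = [Message("user", task)]
    for agent in agents:
        transcript.append(agent.handle(transcript[-1].content))
    return transcript


# Hypothetical team with fixed, rule-based behavior so the sketch runs offline.
team = [
    Agent("product manager",
          lambda text: f"Spec: {text}; must handle empty input"),
    Agent("software engineer",
          lambda spec: "def word_count(s): return len(s.split())"),
    Agent("QA engineer",
          lambda code: "PASS" if "split()" in code else "FAIL: rewrite"),
]

transcript = run_pipeline("Build a word counter", team)
for message in transcript:
    print(f"{message.sender}: {message.content}")
```

Even in this toy form, the structure mirrors the contributions listed above: the task is decomposed into stages, each stage belongs to a role, and the agents coordinate only through the messages they pass along.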


In the next article, we will expand on the enterprise applications of Agentic AI and the burgeoning infrastructure stack that powers such applications. We will outline the areas of particular interest to Ideaspring and where we believe emergents have a significant role to play.

In the meantime, we would love to hear your thoughts. Please reach out to us over email to further the conversation or to pitch us your newest venture in GenAI - we are available at satrajit@ideaspringcap.com
