The New New Thing: Agentic Systems

Databricks Week 5

Michael Lewis's The New New Thing centers on the internet revolution. It seemed a fitting title since we are smack in the middle of the AI revolution, however puke-inducing that may be to say. The epicenter of this revolution isn't the large language models (LLMs) themselves, but the ecosystem developing around them. Agents are the part of that ecosystem that excites me the most.

Agentic systems are where developers are pushing the boundaries of problem-solving. These systems, characterized by their ability to autonomously perceive, reason, and act, represent a significant leap forward from traditional AI approaches. Several frameworks are building scaffolding around agentic systems, including OpenAI, LlamaIndex, DSPy, and LangChain. Today, I’m going to focus on LangChain.

What is an agentic system?

An agentic system is a construct that allows LLMs to operate with a high degree of autonomy, make decisions, and take actions to fulfill requests. Unlike traditional AI models that execute linearly within narrow, predefined parameters, agentic systems can perceive and build context, formulate strategies, and adapt their behavior based on observations. The core components of these systems include the following (a toy loop combining them is sketched after the list):

  • Perception: The ability to gather and interpret information from the environment.
  • Reasoning: The capacity to analyze information, make inferences, and formulate plans.
  • Action: The capability to execute decisions and interact with the environment.
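
To make that loop concrete, here is a toy, framework-free sketch of a perceive-reason-act cycle. The perceive, reason, and act functions are hypothetical stand-ins: in a real agent they would wrap data sources, an LLM call, and tool execution respectively.

from dataclasses import dataclass

@dataclass
class Decision:
    done: bool
    action: str = ""
    answer: str = ""

def perceive(history: list[str]) -> str:
    # Gather information from the environment (a database, API, sensor, etc.)
    return f"{len(history)} observation(s) so far"

def reason(goal: str, observation: str) -> Decision:
    # Analyze the observation and plan the next step (normally an LLM call)
    if observation.startswith("0"):
        return Decision(done=False, action="query_failure_log")
    return Decision(done=True, answer=f"Finished: {goal}")

def act(action: str) -> str:
    # Execute the chosen action (normally a tool call) and return the result
    return f"result of {action}"

def run_agent(goal: str, max_steps: int = 5) -> str:
    history: list[str] = []
    for _ in range(max_steps):
        observation = perceive(history)       # perceive
        decision = reason(goal, observation)  # reason
        if decision.done:
            return decision.answer
        history.append(act(decision.action))  # act
    return "Gave up after hitting the step limit"

print(run_agent("count today's failures"))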

To draw an analogy with software engineering, if a non-agentic LLM call is a monolithic script, agentic systems are more like microservices. They are dynamic, interconnected, and capable of orchestrating complex workflows autonomously, much like how microservices collaborate to deliver sophisticated applications without incurring excessive technical debt.

Compared to standard LLM usage, which often involves single-turn interactions or simple chain-of-thought processes, agentic systems can maintain long-running, multi-step workflows. They can juggle multiple objectives, prioritize tasks, and even recursively improve their own performance. This capability necessitates a step change in complexity, requiring additions such as memory and tools.

Agents are the near-term future of AI

Current AI systems, while powerful, often struggle with tasks requiring long-term planning, contextual understanding of multiple systems, and adaptive decision-making. Agentic systems address these limitations by introducing planning and autonomous tool calling.

Consider retrieval-augmented generation (RAG) as a basic agent. RAG systems dynamically fetch relevant information to augment their knowledge base before generating responses. This process mirrors the basic cycle of an agent: perceive (retrieve information), reason (process the retrieved data), and act (generate a response). However, agentic systems extend this concept by chaining together multiple tools and even spawning sub-agents for complex tasks.
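
As a rough sketch, that cycle can be expressed as a LangChain Expression Language (LCEL) chain. This assumes a retriever and an LLM object named model already exist (for example, a vector search retriever and the ChatDatabricks model used later in this post); it is illustrative rather than production-ready.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    # Concatenate the retrieved documents into a single context string
    return "\n\n".join(doc.page_content for doc in docs)

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}  # perceive
    | prompt                                                                 # reason
    | model                                                                  # reason/act
    | StrOutputParser()
)

rag_chain.invoke("How many failures have occurred today?")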

Here are a few examples where agentic systems could have a significant impact:

  • In finance, agents could autonomously manage investment portfolios, adapting strategies based on real-time market conditions.
  • In manufacturing, agents could assess inventories or failures in real-time and place orders or send alerts to key stakeholders.
  • In software development, agents could automate code generation, testing, and even architectural decisions.

The primary challenges in agent development lie in integrating the right tools, ensuring reliability, and monitoring/debugging performance. While we are still some distance from fully realizing the examples above, the foundational scaffolding is in place, and many companies are actively building upon it.

How is LangChain accelerating agentic workflows?

LangChain has rapidly emerged as a leading framework for developing agentic systems. It provides a comprehensive toolkit for building agentic applications, focusing on four core abstractions (a short example combining them follows the list):

  • Chains: Allow for the sequential execution of LLM calls and other operations.
  • Agents: Enable dynamic selection of tools and actions based on the task at hand.
  • Memory: Facilitate the retention and retrieval of information across interactions.
  • Tools: Provide interfaces for agents to interact with external systems and data sources.
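
Here is a minimal sketch that combines these pieces: a custom tool, conversation memory, and a ReAct agent wired together with an executor. It assumes an LLM object named model (such as the ChatDatabricks instance created in the SQL agent example below) and pulls a standard ReAct chat prompt from the LangChain hub.

from langchain import hub
from langchain_core.tools import tool
from langchain.memory import ConversationBufferMemory
from langchain.agents import AgentExecutor, create_react_agent

# A hypothetical tool: any plain Python function can be exposed to the agent
@tool
def get_sensor_reading(sensor_id: str) -> str:
    """Return the latest reading for a sensor."""
    return f"Sensor {sensor_id}: 72.4 C"

prompt = hub.pull("hwchase17/react-chat")  # a standard ReAct prompt that accepts chat history
memory = ConversationBufferMemory(memory_key="chat_history")  # retain prior turns

# The agent chooses actions; the executor runs the loop, tools, and memory
agent = create_react_agent(llm=model, tools=[get_sensor_reading], prompt=prompt)
agent_executor = AgentExecutor(
    agent=agent, tools=[get_sensor_reading], memory=memory, verbose=True
)

agent_executor.invoke({"input": "What is sensor A7 reading right now?"})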

In addition to the core LangChain tooling, the team has also released LangGraph, which introduces cyclic graphs and allows for more complex, non-linear agent behaviours. This enables agents that can revisit previous steps, handle concurrent tasks, and manage intricate decision trees. They have also introduced LangSmith, a paid SaaS platform for debugging, monitoring, and optimizing LangChain applications.
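
As a hedged sketch of what a cycle looks like in LangGraph, the toy graph below drafts an answer and loops back to the drafting node until a simple check passes. The node logic is a hypothetical stub; in practice it would be an LLM call or a sub-agent.

from typing import TypedDict
from langgraph.graph import StateGraph, END

class State(TypedDict):
    question: str
    draft: str
    attempts: int

def draft_node(state: State) -> dict:
    # Stub for an LLM call that (re)drafts an answer
    return {"draft": f"Draft answer to: {state['question']}",
            "attempts": state["attempts"] + 1}

def should_continue(state: State) -> str:
    # Loop back to the drafting node until a quality check passes (here, two attempts)
    return END if state["attempts"] >= 2 else "draft"

graph = StateGraph(State)
graph.add_node("draft", draft_node)
graph.set_entry_point("draft")
graph.add_conditional_edges("draft", should_continue)  # this edge creates the cycle

app = graph.compile()
app.invoke({"question": "What was our uptime yesterday?", "draft": "", "attempts": 0})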

Keep up the good work, LangChain!

What about Databricks?

I want to avoid being overly promotional about Databricks, but it’s worth highlighting the impressive suite of tools that the platform offers. I’ve been blown away in my first month by how easy it is for our customers to set up and govern models, perform vector searches, and monitor LLM applications. Here is a small sampling of what Databricks provides in this space:

  • Playground: Facilitates rapid prototyping and testing of different LLMs and agents.
  • Model Serving: Streamlines the deployment and scaling of any model.
  • Review App: Enables collaborative assessment of agents with feedback gathering.
  • Experiment Tracking: Provides comprehensive tracing, logging, and versioning.
  • Evaluation: Offers a robust set of tools for automatically quantifying model performance.

These tools are closely aligned with LangChain's philosophy, offering complementary capabilities that can significantly enhance the development and deployment of agentic systems. I am particularly enthusiastic about the evaluation and feedback-gathering tools, as this is a personal pain point of mine. Shipping user interfaces to gather feedback is cumbersome, but without that feedback, how can you effectively fine-tune or prompt engineer?
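
For example, MLflow's LLM evaluation can score a small set of pre-computed agent responses against reference answers without any extra UI work. This is a hedged sketch with a hypothetical, hard-coded evaluation set; in practice the predictions would come from logged inference tables or Review App feedback.

import mlflow
import pandas as pd

# A tiny, hypothetical evaluation set of agent answers and reference answers
eval_df = pd.DataFrame({
    "inputs": ["How many failures occurred today?", "What was our uptime yesterday?"],
    "predictions": ["There were 3 failures today.", "Uptime yesterday was 99.2%."],
    "ground_truth": ["3 failures", "99.2% uptime"],
})

with mlflow.start_run():
    results = mlflow.evaluate(
        data=eval_df,
        predictions="predictions",        # column holding the agent's answers
        targets="ground_truth",           # column holding the reference answers
        model_type="question-answering",  # enables MLflow's built-in QA metrics
    )
    print(results.metrics)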

An example of a SQL agent

LangChain isn't perfect, but it dramatically streamlines tool development. The following example demonstrates how you can deploy a SQL agent that interprets natural language queries, formulates appropriate SQL statements, and executes them against a database.

from langchain_community.agent_toolkits import create_sql_agent
from langchain_community.utilities import SQLDatabase
from langchain_community.chat_models import ChatDatabricks

# Connect to a Unity Catalog schema to give the agent a database to query
db = SQLDatabase.from_databricks(catalog='main', schema='iot_data')

# Use a Databricks model serving endpoint as the agent's LLM
model = ChatDatabricks(endpoint='databricks-dbrx-instruct')

# Build a ReAct-style SQL agent that can inspect the schema and run queries
agent_executor = create_sql_agent(llm=model, db=db, verbose=True)

agent_executor.invoke({"input": "How many failures have occurred today?"})

This agent can handle complex queries, join multiple tables, and even explain its reasoning via the default ReAct framework. While it still requires a decent amount of prompting and guardrails, it can be a game-changer for non-developer analysts if deployed correctly.

An example of a Databricks Genie agent

Databricks Genie Spaces are powerful environments for deploying customized text-to-SQL engines. When integrated with LangChain, they can significantly enhance the capabilities of agentic systems. Here's an example that wraps a Genie Space API call and passes it to a ReAct agent as a tool. Note the similarities with the SQL agent above.

from src.genie import GenieTool, genie_prompt
from langchain.agents import create_react_agent, Tool, AgentExecutor

# Wrap the Genie Space API client as a LangChain tool
genie_tool = Tool(
    name="Genie Tool",
    func=GenieTool().query,
    description="Useful for querying daily plant operational data"
)

# Build a ReAct agent that decides when to call the Genie Space
agent = create_react_agent(
    prompt=genie_prompt,
    tools=[genie_tool],
    llm=model
)

agent_executor = AgentExecutor(agent=agent, tools=[genie_tool], verbose=True)

agent_executor.invoke({"input": "What was our uptime yesterday?"})

The best part about this approach is that you are abstracting the curation of a schema into Genie, with its own set of instructions and metadata. This example showcases why the microservices analogy can be powerful in the LLM space.

An example of a compound agentic system

Compound agentic systems involve multiple agents working together to solve complex problems. These systems can handle tasks that require diverse skills and knowledge domains and route requests to different tools. Here's an example of using a SQL agent as a base agent, but providing both a Genie tool and a RAG retriever (see my previous post or this demo).

from langchain.tools.retriever import create_retriever_tool
from src.compound import compound_prompt_template

# Expose the vector search index from the earlier RAG setup as a tool;
# vector_search_as_retriever, genie_tool, model, and db are reused from above
retriever_tool = create_retriever_tool(
    retriever=vector_search_as_retriever,
    name="Documentation Search",
    description="Provides plant documentation and manuals"
)

# Reuse the SQL agent as the base, but hand it the retriever and Genie tools too;
# the default ReAct output parser handles the agent's intermediate steps
agent = create_sql_agent(
    llm=model,
    db=db,
    prompt=compound_prompt_template,
    extra_tools=[retriever_tool, genie_tool]
)

agent.invoke({"input": "How do I repair this piece of machinery?"})

I'm not going to pretend that real deployments will be this easy, but I hope you can see the scaffolding forming. LangGraph, Databricks, and other frameworks promise to keep pushing this space forward, so keep your seatbelt on.

Key Challenges in Bringing Agentic Systems to Production

The worst thing about potential is that it provides limited value until it is actually realized. Agentic systems have enormous potential, but there are still numerous hurdles to overcome before we see them widely adopted across companies.

  1. Scalability: As the complexity of tasks increases, ensuring that agentic systems can scale efficiently becomes crucial. Performing multiple tool calls adds significant latency, so be cautious and monitor closely.
  2. Reliability: Guaranteeing consistent behaviour across various scenarios and inputs is challenging, especially for autonomous systems. A single poor response to a C-level executive can sideline a project, so don't underestimate the time required for development and evaluation.
  3. Complexity: LLMs are non-deterministic and complex on their own, and this complexity is compounded when we start layering them on top of each other with different tools. I strongly advocate for both inference monitoring and detailed trace debugging, whether using MLflow or LangSmith (see the sketch after this list).
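
As a minimal sketch of the MLflow option, recent MLflow versions can autolog LangChain traces so each agent invocation (including intermediate tool calls) is captured without changing the agent code. This assumes the agent_executor from the SQL agent example above.

import mlflow

# Enable LangChain autologging so agent runs are traced automatically
mlflow.langchain.autolog()

with mlflow.start_run():
    agent_executor.invoke({"input": "How many failures have occurred today?"})
    # The resulting trace (LLM calls, tool calls, intermediate steps) can be
    # inspected in the MLflow UI for debugging.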

In Closing

Agentic AI systems should excite even the most pessimistic among us. They extend LLMs and help mitigate their weaknesses. These systems mark the threshold where LLMs start to behave like software, offering unprecedented levels of autonomy and problem-solving capability.

Frameworks like LangChain and Databricks are at the forefront of this revolution, providing impressive tooling for accelerated innovation. Agentic AI systems offer tremendous potential to improve the world, but they need to be carefully engineered and deeply considered.

Much like the steam engine, the internet, or Crocs, agentic AI is here to stay. I encourage you to dive in and deploy a couple of agents for testing and feedback gathering. While I may be an excitable person, the potential of these systems is undeniably promising.

