Developing Agentic Capabilities for LLMs to automate business workflows and create smart assistants
In the few years since the rise of transformer-based models, Large Language Models have demonstrated a striking ability to engage in human-like conversations, improving business productivity and challenging traditional search engines for many everyday queries.
However, this ability is limited to the information on which they have been trained, or publicly available information via internet searches, while corporate data remains inaccessible.
Retrieval Augmented Generation (RAG) [1] attempts to address this limitation by allowing a model to respond based on a document database, thus providing justified answers and significantly reducing hallucinations. However, the scope of a RAG system is still restricted to answering questions: it does not perform tasks autonomously by using the different tools at its disposal (searching the internet, searching a database, rephrasing or clarifying user questions...) to accomplish complex tasks.
Letting the model act through tools, and learn by doing, is a revolution in the making for Large Language Models (LLMs): it is called developing “agentic capabilities.”
1. What is an Agent?
To build an effective agent, we can tap into the intrinsic capabilities of LLMs, which extend beyond mere Natural Language Processing (NLP) tasks like text understanding and generation to include genuine reasoning abilities. Recent advancements in Generative AI focus on enhancing LLMs with agentic capabilities, enabling them to plan and execute tasks.
Agentic capabilities involve using a range of tools to achieve specific goals. To accomplish complex tasks, an agent must select the right tools for the job and use them effectively. This requires the agent to understand the task at hand, identify the necessary tools, and apply them in the correct order.
For example, suppose a user submits this query: “Respond to this customer complaint after reviewing our contract and researching relevant civil code articles.”
The agent's task list might look like this:
1. Clarify the user's question to ensure understanding
2. Analyze the attached customer complaint document
3. Retrieve the client's contract from the company's document database
4. Research relevant civil code articles
5. Generate a response to the customer complaint
By breaking down complex tasks into manageable steps and using the right tool for each step, the agent can respond effectively to the customer complaint and provide a high-quality solution.
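The task list above can be sketched as a small plan executor in Python. This is a minimal illustration under stated assumptions, not a description of any particular product: the tool functions, the PLAN list, and run_plan are hypothetical names, the tools return placeholder text instead of calling real systems, and the plan is hard-coded here whereas a real agent would ask the LLM to produce it.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool signature: a tool reads the shared context and returns text.
Tool = Callable[[Dict[str, str]], str]


def clarify_question(ctx: Dict[str, str]) -> str:
    return f"Clarified: {ctx['query']}"


def analyze_complaint(ctx: Dict[str, str]) -> str:
    return "Summary of the attached complaint (placeholder)"


def retrieve_contract(ctx: Dict[str, str]) -> str:
    return "Contract text (placeholder)"


def research_civil_code(ctx: Dict[str, str]) -> str:
    return "Relevant civil code articles (placeholder)"


def draft_response(ctx: Dict[str, str]) -> str:
    # The last step depends on the outputs of the earlier steps.
    sources = [ctx["complaint"], ctx["contract"], ctx["articles"]]
    return "Draft based on: " + "; ".join(sources)


TOOLS: Dict[str, Tool] = {
    "clarify_question": clarify_question,
    "analyze_complaint": analyze_complaint,
    "retrieve_contract": retrieve_contract,
    "research_civil_code": research_civil_code,
    "draft_response": draft_response,
}

# Assumed plan: (key under which the result is stored, name of the tool to run).
PLAN: List[Tuple[str, str]] = [
    ("clarified", "clarify_question"),
    ("complaint", "analyze_complaint"),
    ("contract", "retrieve_contract"),
    ("articles", "research_civil_code"),
    ("response", "draft_response"),
]


def run_plan(query: str, plan: List[Tuple[str, str]], tools: Dict[str, Tool]) -> Dict[str, str]:
    """Run each step in order, storing every result in a shared context."""
    ctx: Dict[str, str] = {"query": query}
    for output_key, tool_name in plan:
        if tool_name not in tools:
            raise KeyError(f"Unknown tool: {tool_name}")
        ctx[output_key] = tools[tool_name](ctx)
    return ctx


if __name__ == "__main__":
    result = run_plan("Respond to this customer complaint", PLAN, TOOLS)
    print(result["response"])
```

Keeping every intermediate result in one context dictionary is a design choice made for this sketch: it lets a later step, such as drafting the response, reuse what the earlier steps produced.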
2. What is a tool?
A tool is a predefined workflow designed to accomplish a specific task, which can be accessed and utilized by an agent to achieve a particular goal or respond to a user request.
In this sense, a tool is a self-contained module or component that performs a specific function or set of functions, and can be combined with other tools to accomplish more complex tasks. Tools can be thought of as building blocks that an agent can use to construct a solution to a user's request.
Examples of tools might include:
By accessing and utilizing these tools, an agent can perform complex tasks and respond to user requests in a more efficient and effective manner.
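To make the idea of tools as building blocks concrete, here is a minimal Python sketch, assuming a very simple text-in, text-out interface. The Tool dataclass, the two example tools, the in-memory corpus, and the helper functions are all hypothetical and chosen only for illustration. The sketch shows two things: how each tool can carry a description that an agent could read when choosing what to use, and how tools can be chained so that the output of one feeds the next.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Tool:
    """A self-contained unit of work with a name and a description."""
    name: str
    description: str
    run: Callable[[str], str]


def search_documents(query: str) -> str:
    # Toy in-memory corpus standing in for a real document database.
    corpus = {
        "contract": "Clause 4: delivery within 30 days.",
        "policy": "Refunds within 14 days.",
    }
    hits = [text for key, text in corpus.items() if key in query.lower()]
    return " ".join(hits) or "No matching document."


def summarize(text: str) -> str:
    # Naive summary: keep only the first sentence.
    return text.split(".")[0] + "."


REGISTRY: Dict[str, Tool] = {
    tool.name: tool
    for tool in [
        Tool("search_documents", "Look up company documents by keyword.", search_documents),
        Tool("summarize", "Shorten a text to its first sentence.", summarize),
    ]
}


def describe_tools(registry: Dict[str, Tool]) -> str:
    """Render the tool list as text, e.g. for inclusion in an agent prompt."""
    return "\n".join(f"- {t.name}: {t.description}" for t in registry.values())


def chain(registry: Dict[str, Tool], names: List[str], user_input: str) -> str:
    """Combine tools by feeding each tool's output into the next one."""
    result = user_input
    for name in names:
        result = registry[name].run(result)
    return result


if __name__ == "__main__":
    print(describe_tools(REGISTRY))
    print(chain(REGISTRY, ["search_documents", "summarize"], "Find the contract"))
```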
3. The benefit of developing an Agent Capability: a better “grounding” of the LLM in reality
This agentic augmentation of the LLM has a triple benefit:
Moreover, by acting as an agent, the LLM receives new feedback from its actions and can thus improve its reasoning capabilities by learning from its confrontation with the outside world.
Agentic capabilities can be considered a first step toward grounding the LLM in the real world.
4. The example of TRICE, a two-step approach to Agent Capability Building
Several research papers are currently being published reflecting the growing interest in Agents. The “Toolformer” paper [2] explains how LLMs can “teach themselves to use external tools” via simple APIs.
The CRAFT methodology (Customizing LLMs by Creating and Retrieving from Specialized Toolsets [3]) aims at creating tool sets specifically curated for some given tasks and equipping LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
A recent research experiment [4] introduces a new methodology, TRICE (Tool leaRning wIth exeCution fEedback), which leverages the LLM's grounding in the real world through the use of tools. The LLM is trained to use tools in two steps:
The experiment shows that this methodology outperforms other methodologies like “Toolformer” and gives a hint at how an agentic LLM can potentially evolve towards general complex task-solving capabilities when properly leveraged through the grounding dimension of its role as agent. In particular, for mathematical reasoning, which is one of the most challenging use cases for the classical use of the LLM, the TRICE approach uses the Calculator tool, as does the “Toolformer” experiment, yet outperforms “Toolformer” in this exercise.
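The notion of execution feedback with a Calculator tool can be illustrated with a short Python sketch. It is not the training procedure of the TRICE paper: it only shows, under simplifying assumptions, how executing a tool call can produce a signal that separates good candidate answers from bad ones. The Calculator(...) call syntax, the function names, and the 0/1 reward are all hypothetical.

```python
import ast
import operator
import re
from typing import List, Optional

# Arithmetic operators the toy calculator accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str) -> float:
    """Evaluate a basic arithmetic expression without using eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)


# Assumed call format inside a model answer; nested parentheses are not handled.
CALL_PATTERN = re.compile(r"Calculator\(([^)]*)\)")


def execute_candidate(candidate: str) -> Optional[float]:
    """Run the first Calculator call found in a candidate answer, if any."""
    match = CALL_PATTERN.search(candidate)
    if match is None:
        return None
    try:
        return calculator(match.group(1))
    except (ValueError, SyntaxError, ZeroDivisionError):
        return None


def execution_reward(candidate: str, gold: float, tol: float = 1e-6) -> float:
    """Return 1.0 when the executed tool call matches the reference answer."""
    result = execute_candidate(candidate)
    return 1.0 if result is not None and abs(result - gold) <= tol else 0.0


def rank_candidates(candidates: List[str], gold: float) -> List[str]:
    """Order candidates so that those confirmed by execution come first."""
    return sorted(candidates, key=lambda c: execution_reward(c, gold), reverse=True)


if __name__ == "__main__":
    answers = [
        "The total is Calculator(12 * 7 + 1)",
        "The total is Calculator(12 * 7)",
        "The total is 84",
    ]
    print(rank_candidates(answers, gold=84)[0])
```

In this sketch, an answer that states the right number without calling the tool earns no reward, because the signal rewards correct tool use rather than the final text alone. A training loop could use such a ranking as a preference signal, but that part is outside the scope of this illustration.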
5. The Agent Capability: overcoming the traditional limitations of LLMs in order to build automatic workflows
The difference in performance between “Toolformer” and “TRICE” is to some extent acknowledged in both papers, which point to the intrinsic limitations of the LLM and to how each methodology managed to overcome them. In the “Toolformer” methodology, the main limitations come from two factors:
For business use, the agentic capabilities of the LLM enable the creation of business workflows by making better use of the LLM's reasoning capabilities while overcoming its traditional limitations (length of its context window, lack of proper memory, complex prompt engineering, limited zero-shot capabilities, hallucinations...).
LightOn has implemented its own version of agentic capabilities in Paradigm, as an augmented version of the “task-builder” and chat with docs (RAG) capabilities already available in the previous version of Paradigm (see the upcoming blog post).
[1] Retrieval Augmented Generation: the LLM's instructions are augmented with information available in a private database
[2] “BOLAA: BENCHMARKING AND ORCHESTRATING LLM-AUGMENTED AUTONOMOUS AGENTS”, arXiv:2308.05960v1 [cs.AI] 11 Aug 2023
[3] “CRAFT: CUSTOMIZING LLMS BY CREATING AND RETRIEVING FROM SPECIALIZED TOOLSETS”, arXiv:2309.17428v1 [cs.CL] 29 Sep 2023
[4] “Making Language Models Better Tool Learners with Execution Feedback”, arXiv:2305.13068v1 [cs.CL] 22 May 2023