Meet Mr. Prompty! How to Make Your AI Think and Act Like a Human - ReAct Prompt Engineering

In this article, we delve into another exciting Prompt Engineering technique called ReAct, which builds on the scenarios discussed in my previous articles.

Large language models (LLMs) have demonstrated immense capabilities in natural language processing, but they also suffer from critical limitations. Two of the most concerning are their tendency to hallucinate facts and to make logical errors during reasoning. Researchers have proposed methods to mitigate these weaknesses, and a recently published approach called ReAct offers a novel solution by treating LLMs as intelligent agents that can act as well as think.

LLMs like GPT-3 and PaLM learn statistical patterns from massive text datasets during training. This allows them to generate remarkably human-like text through prompt engineering techniques. However, their predictions are based solely on recognizing similarities to the training data, not on any true understanding or reasoning. As a result, LLMs will confidently generate false information and make obvious mistakes in logic.

Prior methods attempted to improve reasoning by providing examples of step-by-step deduction for the LLM to follow. But without any way to verify facts, hallucination still corrupted the process. ReAct expands on this by situating the LLM in a simulated environment it can interact with through textual actions. Taking actions produces real observations, grounding the agent in reality.

ReAct and the Human Behavior

ReAct interleaves actions with verbal reasoning through few-shot prompting: humans write example trajectories of thoughts, actions, and observations for a given task, and the LLM learns in context to emulate this behavior, interleaving steps of deduction with information-gathering actions. An environment module executes the actions and returns observations. Studies show that ReAct reduces hallucination compared to reasoning alone, and that it outperforms imitation and reinforcement learning baselines in decision-making tasks. Some logical errors remain, but hybrid approaches that alternate between ReAct and pure reasoning mitigate this.
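To make that loop concrete, here is a minimal sketch in Python of how a ReAct-style agent could be wired together. This is an illustrative sketch, not the paper's implementation; `call_llm` and `execute_action` are hypothetical stand-ins for your model API and environment module.

```python
# Minimal ReAct loop sketch (illustrative, not the paper's code).
# `call_llm` and `execute_action` are hypothetical placeholders for
# a real model API and a real environment module.

def call_llm(prompt: str, stop: list[str]) -> str:
    """Hypothetical wrapper around whatever LLM API you use."""
    raise NotImplementedError

def execute_action(action: str) -> str:
    """Hypothetical environment module: runs an action, returns an observation."""
    raise NotImplementedError

def react_loop(task: str, few_shot_examples: str, max_steps: int = 8) -> str:
    transcript = f"{few_shot_examples}\n\nTask: {task}\n"
    for _ in range(max_steps):
        # Ask the model for the next Thought and Action, stopping before
        # it can invent its own Observation.
        step = call_llm(transcript, stop=["Observation:"])
        transcript += step
        if "Action: Finish" in step:  # the agent signals completion
            break
        action = step.split("Action:")[-1].strip()
        # The observation comes from the environment, not the LLM.
        transcript += f"\nObservation: {execute_action(action)}\n"
    return transcript
```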

ReAct connects to other research on reasoning, acting, and agent modeling with LLMs. Chain-of-thought prompting (Wei et al., 2022) showed that LLMs can verbalize their reasoning, but the resulting chains suffer from ungrounded facts. WebGPT (Nakano et al., 2021) and SayCan (Ahn et al., 2022) use LLMs to act in environments, but without explicit reasoning modeling. ReAct combines reasoning modeling with environment interaction for a more human-like approach.

Applying ReAct

Imagine you're playing a video game where you have a character that can explore a virtual world, like Minecraft. Your character can walk around, pick up objects, place objects, and do other actions. Now also imagine that your character can think out loud, saying what their plan is step-by-step to accomplish some goal. For example, if the goal is to build a house, your character might think:

First I need to get some wood logs. There are probably trees in the forest I can chop down. Let me walk over there and try chopping down a tree to get wood.

Then your character would actually walk to the forest, chop down a tree, and get wood. The next thought might be:

OK now I have wood logs. Next I need to build the walls and floor of the house. I'll place the wood blocks in a square shape for each.

And your character would start placing the wood blocks on the ground in a square shape for the floor.

This combination of thinking out the plan while also taking actions in the game world is the core idea behind ReAct. The "thoughts" allow the character to strategize and track progress, while the actions let them actually accomplish goals. You could write out an example prompt with some thoughts, actions, and observations like:

Thought: I need to find some wood logs to build a house. There are probably trees in the forest.
Action: Walk to the forest
Observation: Arrived in forest area with many tall trees.
Action: Chop down a tree
Observation: Tree falls over and breaks into 4 wood logs.
Thought: Now I have wood logs. Next I'll build the floor of the house.
Action: Place wood blocks in 5x5 square on ground.
Observation: Wood floor created.

And so on, interleaving thoughts, actions, and observations from the environment. The AI agent would learn from these examples to think and act in this coordinated way.
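In practice, "learning from these examples" is just few-shot prompting: the example traces are concatenated ahead of the new task, and the model continues in the same format. A minimal sketch (the function name and layout are illustrative, mirroring the prompt shown below):

```python
# Few-shot ReAct prompting: example traces are simply concatenated
# ahead of the new task, and the model continues in the same format.

def build_react_prompt(task: str, examples: list[str]) -> str:
    shots = "\n\n".join(examples)
    return (
        f"Task: {task}\n\n"
        f"EXAMPLES:\n{shots}\n\n"
        "Please use the same approach and work on the Task."
    )
```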

Let's see an example:

Task: You want to book a flight from New York to Los Angeles next Friday. Find a flight under $300 and book it.

EXAMPLES:
Title: Build a House

Thought: I need to find some wood logs to build a house. There are probably trees in the forest. 

Action: Walk to the forest  
Observation: Arrived in forest area with many tall trees. 

Action: Chop down a tree
Observation: Tree falls over and breaks into 4 wood logs 

Thought: Now I have wood logs. Next I'll build the floor of the house.
Action: Place wood blocks in 5x5 square on ground. 
Observation: Wood floor created.

Title: Make a Sandwich  

Thought: I'm hungry and want to make a sandwich. I'll need to get bread, meat, cheese, and veggies from the kitchen. 

Action: Walk to kitchen 
Observation: Entered kitchen area; fridge, counter, breadbox, and cabinets are visible.
Action: Open fridge 
Observation: Fridge contains milk, juice, cheese, lettuce, tomato, ham, and turkey. 

Thought: I found ingredients in fridge. Next I need to get the bread and assemble the sandwich. 

Action: Take bread from breadbox 
Observation: Got 2 slices of bread 

Action: Take ham, cheese, lettuce, and tomato from fridge 
Observation: Retrieved ingredients 

Action: Place ham and cheese between bread slices. Put lettuce and tomato on top. 
Observation: Sandwich assembled.


Please use the same approach and work on the Task.

The LLM will respond:

Title: Book a Flight

Thought: I want to book a flight from New York to Los Angeles next Friday under $300. I'll need to search for available flights and compare prices.

Action: Search for flights 
Observation: Found information about flights from New York to Los Angeles on 09/08/2023. There are also cheap flights from New York to Los Angeles International with prices starting at $108 for one-way flights and $211 for round trips.

Thought: I found some flights that meet my criteria. Next, I’ll need to book the flight. 
Action: Book the flight 
Observation: Flight booked successfully.        
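One caveat: in this single-shot demo the model produced the whole trace at once, inventing the observations itself. In a real ReAct system you would stop generation at each `Observation:` line and execute the action for real. As a first step, the trace can be parsed into structured steps; a small sketch (the regex simply follows the Thought/Action/Observation format above, nothing here comes from the paper's code):

```python
import re

# Parse a ReAct trace into (kind, text) steps so that Action lines
# can be routed to real tools instead of being imagined by the model.
STEP_RE = re.compile(r"^(Thought|Action|Observation):\s*(.+)$", re.MULTILINE)

def parse_trace(trace: str) -> list[tuple[str, str]]:
    return [(m.group(1), m.group(2).strip()) for m in STEP_RE.finditer(trace)]

trace = """Thought: I want to book a flight under $300.
Action: Search for flights
Observation: Found flights starting at $108 one-way."""

for kind, text in parse_trace(trace):
    print(f"{kind:<12}{text}")
```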

Below is another example from the paper:


Figure: Comparison of (a) Act-only and (b) ReAct prompting to solve an AlfWorld (Shridhar et al., 2020b) game. In both domains, in-context examples are omitted from the prompt; only the task-solving trajectories generated by the model (Act, Thought) and the environment (Obs) are shown.

and the related performance results:

Table 1: PaLM-540B prompting results on HotpotQA and Fever. Figure (right): PaLM-540B prompting results as a function of the number of CoT-SC samples used.


CoT vs ReAct

Chain of Thought (CoT) and ReAct are both advanced prompt engineering techniques. CoT is a prompting technique that allows large language models (LLMs) to generate reasoning traces. On the other hand, ReAct is a framework that combines both reasoning and action generation into one output. This allows the model to better synchronize thoughts with actions, and to interact with external tools to retrieve additional information that leads to more reliable and factual responses. The best approach overall is a combination of ReAct and CoT that allows for the use of both internal knowledge and externally obtained information during reasoning.

Combining Chain of Thought (CoT) and ReAct improves the performance of LLMs by allowing them to generate both reasoning traces and task-specific actions in an interleaved manner. This lets the model induce, track, and update action plans, and even handle exceptions, while the action step lets it interface with and gather information from external sources such as knowledge bases or environments. The result is more reliable and factual responses. The authors show that ReAct can outperform several state-of-the-art baselines on language and decision-making tasks, and that it also improves the human interpretability and trustworthiness of LLMs.
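One way to realize this combination, sketched after the paper's idea of backing off from one method to the other: run ReAct first, and if it fails to finish within its step budget, fall back to chain-of-thought with self-consistency (majority voting over sampled answers). `run_react` and `run_cot` are hypothetical placeholders here.

```python
# Hybrid ReAct -> CoT-SC fallback, sketched after the paper's idea:
# try grounded ReAct first; if it cannot finish within its step budget,
# fall back to chain-of-thought with self-consistency (majority vote).
from collections import Counter

def run_react(question: str, max_steps: int = 7) -> str | None:
    """Hypothetical: returns an answer, or None if no Finish action occurs."""
    raise NotImplementedError

def run_cot(question: str) -> str:
    """Hypothetical: one sampled chain-of-thought answer."""
    raise NotImplementedError

def answer(question: str, n_samples: int = 21) -> str:
    result = run_react(question)
    if result is not None:
        return result
    # CoT-SC: sample several reasoning chains and majority-vote the answer.
    votes = Counter(run_cot(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]
```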


ReAct and the Real World

In the context of ReAct, an “environment” refers to the external world or system with which the Large Language Model (LLM) interacts. This could be a database, a web search engine, a physical sensor, or any other source of information that is not part of the LLM’s internal knowledge. Observations are pieces of information or data that the LLM receives from this environment.


For example, in a scenario where the LLM is interacting with a web search engine, the search results that it receives would be considered observations. The key point here is that these observations are not generated by the LLM itself. Instead, they are produced by the environment and then passed to the LLM. This allows the LLM to access and utilize information that was not included in its training data, thereby greatly expanding its knowledge and capabilities.

In summary, when we say “observations are not generated by the LLM but by the environment”, we mean that the LLM is receiving information from an external source rather than producing it internally. This is a crucial aspect of how ReAct enhances the performance and versatility of LLMs.
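In code, the environment is simply whatever answers the agent's actions. Below is a toy sketch of a lookup-table "knowledge base" environment; it is purely illustrative (a real environment would wrap a search API, database, or sensor, and the `Search[...]` action syntax is just one common convention).

```python
# Toy environment: observations come from an external lookup table,
# not from the LLM itself. A real environment would wrap a search
# API, a database, or a sensor instead.

class LookupEnvironment:
    def __init__(self, facts: dict[str, str]):
        self.facts = facts

    def step(self, action: str) -> str:
        """Execute a Search[query] action and return an observation."""
        if action.startswith("Search[") and action.endswith("]"):
            query = action[len("Search["):-1]
            return self.facts.get(query, "No results found.")
        return "Unknown action."

env = LookupEnvironment({"capital of France": "Paris is the capital of France."})
print(env.step("Search[capital of France]"))  # -> Paris is the capital of France.
```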

Conclusion

In conclusion, ReAct Prompting stands as a promising stride in the realm of prompt engineering, offering a compelling solution to the challenges posed by chain-of-thought reasoning. While it may not be flawless, primarily due to its occasional reasoning errors, ReAct successfully addresses the critical issue of hallucination, fostering a more reliable and interpretable interaction with language models.
