Meet Mr. Prompty! How to Make Your AI Think and Act Like a Human - ReAct Prompt Engineering
In this article, we will delve into another exciting prompt engineering technique, ReAct, which builds on the scenarios discussed in my previous articles.
Large language models (LLMs) have demonstrated immense capabilities in natural language processing, yet their ability to reason and their ability to act have largely been studied as separate problems.
ReAct and Human Behavior
ReAct interleaves actions with verbal reasoning, mirroring the way human intelligence seamlessly combines task-oriented actions with inner speech. In this sense, ReAct connects to a broader line of research on reasoning, acting, and agent modeling with LLMs.
Applying ReAct
Imagine you're playing a video game where you have a character that can explore a virtual world, like Minecraft. Your character can walk around, pick up objects, place objects, and do other actions. Now also imagine that your character can think out loud, saying what their plan is step-by-step to accomplish some goal. For example, if the goal is to build a house, your character might think:
First I need to get some wood logs. There are probably trees in the forest I can chop down. Let me walk over there and try chopping down a tree to get wood.
Then your character would actually walk to the forest, chop down a tree, and get wood. The next thought might be:
OK now I have wood logs. Next I need to build the walls and floor of the house. I'll place the wood blocks in a square shape for each.
And your character would start placing the wood blocks on the ground in a square shape for the floor.
This combination of thinking out the plan while also taking actions in the game world is the core idea behind ReAct. The "thoughts" allow the character to strategize and track progress, while the actions let them actually accomplish goals. You could write out an example prompt with some thoughts, actions, and observations like:
Thought: I need to find some wood logs to build a house. There are probably trees in the forest.
Action: Walk to the forest
Observation: Arrived in forest area with many tall trees.
Action: Chop down a tree
Observation: Tree falls over and breaks into 4 wood logs
Thought: Now I have wood logs. Next I'll build the floor of the house.
Action: Place wood blocks in 5x5 square on ground.
Observation: Wood floor created.
And so on, interleaving thoughts, actions, and observations from the environment. The AI agent would learn from these examples to think and act in this coordinated way.
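This interleaved loop can be sketched in code. In the minimal Python sketch below, `mock_llm` and `environment` are hypothetical stand-ins for a real language model and a real game world; the scripted responses simply replay the house-building example:

```python
import re

# Hypothetical stand-in for a real LLM: returns the next Thought/Action
# line given the trajectory so far (here, a fixed script).
def mock_llm(trajectory: str) -> str:
    script = [
        "Thought: I need wood logs. Action: Walk to the forest",
        "Action: Chop down a tree",
        "Thought: Now I have logs. Action: Finish",
    ]
    # Pick the next scripted step based on how many actions were taken.
    step = trajectory.count("Action:")
    return script[min(step, len(script) - 1)]

# Stand-in for the environment: returns an Observation for each Action.
def environment(action: str) -> str:
    observations = {
        "Walk to the forest": "Arrived in forest area with many tall trees.",
        "Chop down a tree": "Tree falls over and breaks into 4 wood logs.",
    }
    return observations.get(action, "Nothing happens.")

def react_loop(task: str, max_steps: int = 5) -> str:
    trajectory = f"Task: {task}\n"
    for _ in range(max_steps):
        step = mock_llm(trajectory)
        trajectory += step + "\n"
        action = re.search(r"Action: (.+)", step).group(1)
        if action == "Finish":
            break
        # Feed the environment's Observation back into the trajectory.
        trajectory += f"Observation: {environment(action)}\n"
    return trajectory

print(react_loop("Build a house"))
```

The key design point is the feedback loop: each Observation is appended to the trajectory, so the model's next Thought can react to what actually happened.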
Let's see an example:
Task: You want to book a flight from New York to Los Angeles next Friday. Find a flight under $300 and book it.
EXAMPLES:
Title: Build a House
Thought: I need to find some wood logs to build a house. There are probably trees in the forest.
Action: Walk to the forest
Observation: Arrived in forest area with many tall trees.
Action: Chop down a tree
Observation: Tree falls over and breaks into 4 wood logs
Thought: Now I have wood logs. Next I'll build the floor of the house.
Action: Place wood blocks in 5x5 square on ground.
Observation: Wood floor created.
Title: Make a Sandwich
Thought: I'm hungry and want to make a sandwich. I'll need to get bread, meat, cheese, and veggies from the kitchen.
Action: Walk to kitchen
Observation: Entered kitchen area, sees fridge, counter, breadbox, and cabinets.
Action: Open fridge
Observation: Fridge contains milk, juice, cheese, lettuce, tomato, ham, and turkey.
Thought: I found ingredients in fridge. Next I need to get the bread and assemble the sandwich.
Action: Take bread from breadbox
Observation: Got 2 slices of bread
Action: Take ham, cheese, lettuce, and tomato from fridge
Observation: Retrieved ingredients
Action: Place ham and cheese between bread slices. Put lettuce and tomato on top.
Observation: Sandwich assembled.
Please use the same approach and work on the Task.
The LLM will respond:
Title: Book a Flight
Thought: I want to book a flight from New York to Los Angeles next Friday under $300. I'll need to search for available flights and compare prices.
Action: Search for flights
Observation: Found information about flights from New York to Los Angeles on 09/08/2023. There are also cheap flights from New York to Los Angeles International with prices starting at $108 for one-way flights and $211 for round trips.
Thought: I found some flights that meet my criteria. Next, I’ll need to book the flight.
Action: Book the flight
Observation: Flight booked successfully.
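In practice, the full prompt sent to the model is just the task, the few-shot examples, and the closing instruction concatenated together. A rough sketch (the example trajectories below are abbreviated for brevity):

```python
# Sketch of assembling the few-shot ReAct prompt shown above.
# The example trajectories are abbreviated, not the full versions.
EXAMPLES = """Title: Build a House
Thought: I need to find some wood logs to build a house.
Action: Walk to the forest
Observation: Arrived in forest area with many tall trees.

Title: Make a Sandwich
Thought: I'm hungry and want to make a sandwich.
Action: Walk to kitchen
Observation: Entered kitchen area.
"""

def build_react_prompt(task: str, examples: str = EXAMPLES) -> str:
    # Task first, then examples, then the instruction -- mirroring
    # the prompt layout used in this article.
    return (
        f"Task: {task}\n\n"
        f"EXAMPLES:\n{examples}\n"
        "Please use the same approach and work on the Task."
    )

prompt = build_react_prompt(
    "Book a flight from New York to Los Angeles next Friday under $300."
)
print(prompt)
```

The resulting string is what you would pass to your LLM client of choice; the completion should then follow the same Thought/Action/Observation format as the examples.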
Below is another example from the ReAct paper and its related performance:
[Figure from the ReAct paper: an example trajectory and the related performance comparison.]
CoT vs ReAct
Chain of Thought (CoT) and ReAct are both advanced prompt engineering techniques. CoT prompts large language models (LLMs) to generate explicit reasoning traces. ReAct, on the other hand, is a framework that combines reasoning and action generation in a single output. This lets the model synchronize its thoughts with its actions and interact with external tools to retrieve additional information, leading to more reliable and factual responses.
Combining CoT and ReAct improves LLM performance by letting the model generate reasoning traces and task-specific actions in an interleaved manner, so it can induce, track, and update action plans, and even handle exceptions. The action step lets the model interface with and gather information from external sources such as knowledge bases or environments, which leads to more reliable and factual responses. The authors report that ReAct can outperform several state-of-the-art baselines on language and decision-making tasks, and that it also improves the human interpretability and trustworthiness of LLMs. Overall, they found the best approach combines ReAct with CoT, allowing the model to use both internal knowledge and externally obtained information during reasoning.
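To make the contrast concrete, here are two illustrative prompt templates. Both are sketches of the general shape of each technique, not the exact formats used in the paper:

```python
# CoT: the model produces only a reasoning trace, drawing on its
# internal knowledge alone.
COT_TEMPLATE = (
    "Question: {question}\n"
    "Let's think step by step."
)

# ReAct: reasoning is interleaved with actions that query the
# environment, and the environment's observations feed back in.
REACT_TEMPLATE = (
    "Question: {question}\n"
    "Thought: <reason about what to do next>\n"
    "Action: <query an external tool>\n"
    "Observation: <result returned by the environment>\n"
    "... (repeat until the answer is found)"
)

print(COT_TEMPLATE.format(question="What is the capital of France?"))
```

The difference is structural: CoT ends after the reasoning trace, while ReAct keeps looping through Thought, Action, and Observation until the task is done.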
ReAct and the Real World
In the context of ReAct, an “environment” refers to the external world or system with which the Large Language Model (LLM) interacts. This could be a database, a web search engine, a physical sensor, or any other source of information that is not part of the LLM’s internal knowledge. Observations are pieces of information or data that the LLM receives from this environment.
For example, in a scenario where the LLM is interacting with a web search engine, the search results that it receives would be considered observations. The key point here is that these observations are not generated by the LLM itself. Instead, they are produced by the environment and then passed to the LLM. This allows the LLM to access and utilize information that was not included in its training data, thereby greatly expanding its knowledge and capabilities.
In summary, when we say “observations are not generated by the LLM but by the environment”, we mean that the LLM is receiving information from an external source rather than producing it internally. This is a crucial aspect of how ReAct enhances the performance and versatility of LLMs.
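On the implementation side, the "environment" often amounts to parsing the model's Action line and dispatching it to an external tool. A sketch with stubbed tools (the tool names and functions here are hypothetical; a real system would call an actual search API or database):

```python
# Stubbed external tools -- placeholders for real integrations.
def web_search(query: str) -> str:
    return f"Top results for '{query}' (stubbed)."

def lookup_database(key: str) -> str:
    return f"Database record for '{key}' (stubbed)."

TOOLS = {"Search": web_search, "Lookup": lookup_database}

def run_action(action_line: str) -> str:
    # Expect model output like: "Action: Search[cheap flights NYC to LA]"
    name, _, arg = action_line.partition("[")
    name = name.replace("Action:", "").strip()
    arg = arg.rstrip("]")
    tool = TOOLS.get(name)
    if tool is None:
        return f"Unknown tool: {name}"
    # The tool's return value becomes the Observation fed back to the LLM.
    return f"Observation: {tool(arg)}"

print(run_action("Action: Search[cheap flights NYC to LA]"))
```

Note that the Observation string is produced entirely by the tool, not by the model, which is exactly the distinction the paragraph above describes.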
Conclusion
In conclusion, ReAct Prompting stands as a promising stride in the realm of prompt engineering, offering a compelling solution to the challenges posed by chain-of-thought reasoning. While it may not be flawless, primarily due to its occasional reasoning errors, ReAct successfully addresses the critical issue of hallucination, fostering a more reliable and interpretable interaction with language models.