Meet Mr. Prompty! How to Make Your AI Think and Act Like a Human - ReAct Prompt Engineering

In this article, we delve into another exciting Prompt Engineering technique called ReAct, which builds on the scenarios discussed in my previous articles.

Large language models (LLMs) have demonstrated immense capabilities in natural language processing, but they also suffer from critical limitations. Two of the most concerning are their tendency to hallucinate facts and to make logical errors during reasoning. Researchers have proposed methods to mitigate these weaknesses, and a recently published approach called ReAct offers a novel solution by treating LLMs as intelligent agents that can act as well as think.

LLMs like GPT-3 and PaLM learn statistical patterns from massive text datasets during training. This allows them to generate remarkably human-like text through prompt engineering techniques. However, their predictions are based solely on recognizing similarities to the training data, not on any true understanding or reasoning. As a result, LLMs will confidently generate false information and make obvious mistakes in logic.

Prior methods attempted to improve reasoning by providing examples of step-by-step deduction for the LLM to follow. But without any way to verify facts, hallucination still corrupted the process. ReAct expands on this by situating the LLM in a simulated environment it can interact with through textual actions. Taking actions produces real observations, grounding the agent in reality.

ReAct and the Human Behavior

ReAct interleaves actions with verbal reasoning through few-shot prompting: humans write example trajectories of thoughts, actions, and observations for a given task, and the LLM learns in context to emulate this behavior, interleaving steps of deduction with information-gathering actions. An environment module executes the actions and returns observations. Studies show that ReAct reduces hallucination compared to reasoning alone, and that it outperforms imitation and reinforcement learning baselines in decision-making tasks. Some logical errors remain, but hybrid approaches that alternate between ReAct and pure reasoning mitigate this.
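To make that loop concrete, here is a minimal sketch in Python of how a ReAct-style agent could be wired together. This is an illustrative sketch, not the paper's implementation; `call_llm` and `execute_action` are hypothetical stand-ins for your model API and environment module.

```python
# Minimal ReAct loop sketch (illustrative, not the paper's code).
# `call_llm` and `execute_action` are hypothetical placeholders for
# a real model API and a real environment module.

def call_llm(prompt: str, stop: list[str]) -> str:
    """Hypothetical wrapper around whatever LLM API you use."""
    raise NotImplementedError

def execute_action(action: str) -> str:
    """Hypothetical environment module: runs an action, returns an observation."""
    raise NotImplementedError

def react_loop(task: str, few_shot_examples: str, max_steps: int = 8) -> str:
    transcript = f"{few_shot_examples}\n\nTask: {task}\n"
    for _ in range(max_steps):
        # Ask the model for the next Thought and Action, stopping before
        # it can invent its own Observation.
        step = call_llm(transcript, stop=["Observation:"])
        transcript += step
        if "Action: Finish" in step:  # the agent signals completion
            break
        action = step.split("Action:")[-1].strip()
        # The observation comes from the environment, not the LLM.
        transcript += f"\nObservation: {execute_action(action)}\n"
    return transcript
```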

ReAct connects to other research on reasoning, acting, and agent modeling with LLMs. Chain-of-thought prompting (Wei et al., 2022) showed that LLMs can verbalize their reasoning, but the resulting chains suffer from ungrounded facts. WebGPT (Nakano et al., 2021) and SayCan (Ahn et al., 2022) use LLMs to act in environments, but without explicit reasoning modeling. ReAct combines reasoning modeling with environment interaction for a more human-like approach.

Applying ReAct

Imagine you're playing a video game where you have a character that can explore a virtual world, like Minecraft. Your character can walk around, pick up objects, place objects, and do other actions. Now also imagine that your character can think out loud, saying what their plan is step-by-step to accomplish some goal. For example, if the goal is to build a house, your character might think:

First I need to get some wood logs. There are probably trees in the forest I can chop down. Let me walk over there and try chopping down a tree to get wood.

Then your character would actually walk to the forest, chop down a tree, and get wood. The next thought might be:

OK now I have wood logs. Next I need to build the walls and floor of the house. I'll place the wood blocks in a square shape for each.

And your character would start placing the wood blocks on the ground in a square shape for the floor.

This combination of thinking out the plan while also taking actions in the game world is the core idea behind ReAct. The "thoughts" allow the character to strategize and track progress, while the actions let them actually accomplish goals. You could write out an example prompt with some thoughts, actions, and observations like:

Thought: I need to find some wood logs to build a house. There are probably trees in the forest.
Action: Walk to the forest
Observation: Arrived in forest area with many tall trees.
Action: Chop down a tree
Observation: Tree falls over and breaks into 4 wood logs.
Thought: Now I have wood logs. Next I'll build the floor of the house.
Action: Place wood blocks in 5x5 square on ground.
Observation: Wood floor created.

And so on, interleaving thoughts, actions, and observations from the environment. The AI agent would learn from these examples to think and act in this coordinated way.
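In practice, "learning from these examples" is just few-shot prompting: the example traces are concatenated ahead of the new task, and the model continues in the same format. A minimal sketch (the function name and layout are illustrative, mirroring the prompt shown below):

```python
# Few-shot ReAct prompting: example traces are simply concatenated
# ahead of the new task, and the model continues in the same format.

def build_react_prompt(task: str, examples: list[str]) -> str:
    shots = "\n\n".join(examples)
    return (
        f"Task: {task}\n\n"
        f"EXAMPLES:\n{shots}\n\n"
        "Please use the same approach and work on the Task."
    )
```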

Let's see an example:

Task: You want to book a flight from New York to Los Angeles next Friday. Find a flight under $300 and book it.

EXAMPLES:
Title: Build a House

Thought: I need to find some wood logs to build a house. There are probably trees in the forest. 

Action: Walk to the forest  
Observation: Arrived in forest area with many tall trees. 

Action: Chop down a tree
Observation: Tree falls over and breaks into 4 wood logs 

Thought: Now I have wood logs. Next I'll build the floor of the house.
Action: Place wood blocks in 5x5 square on ground. 
Observation: Wood floor created.

Title: Make a Sandwich  

Thought: I'm hungry and want to make a sandwich. I'll need to get bread, meat, cheese, and veggies from the kitchen. 

Action: Walk to kitchen 
Observation: Entered kitchen area; fridge, counter, breadbox, and cabinets are visible.
Action: Open fridge 
Observation: Fridge contains milk, juice, cheese, lettuce, tomato, ham, and turkey. 

Thought: I found ingredients in fridge. Next I need to get the bread and assemble the sandwich. 

Action: Take bread from breadbox 
Observation: Got 2 slices of bread 

Action: Take ham, cheese, lettuce, and tomato from fridge 
Observation: Retrieved ingredients 

Action: Place ham and cheese between bread slices. Put lettuce and tomato on top. 
Observation: Sandwich assembled.


Please use the same approach and work on the Task.

The LLM will respond:

Title: Book a Flight

Thought: I want to book a flight from New York to Los Angeles next Friday under $300. I'll need to search for available flights and compare prices.

Action: Search for flights 
Observation: Found information about flights from New York to Los Angeles on 09/08/2023. There are also cheap flights from New York to Los Angeles International with prices starting at $108 for one-way flights and $211 for round trips.

Thought: I found some flights that meet my criteria. Next, I’ll need to book the flight. 
Action: Book the flight 
Observation: Flight booked successfully.        
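One caveat: in this single-shot demo the model produced the whole trace at once, inventing the observations itself. In a real ReAct system you would stop generation at each `Observation:` line and execute the action for real. As a first step, the trace can be parsed into structured steps; a small sketch (the regex simply follows the Thought/Action/Observation format above, nothing here comes from the paper's code):

```python
import re

# Parse a ReAct trace into (kind, text) steps so that Action lines
# can be routed to real tools instead of being imagined by the model.
STEP_RE = re.compile(r"^(Thought|Action|Observation):\s*(.+)$", re.MULTILINE)

def parse_trace(trace: str) -> list[tuple[str, str]]:
    return [(m.group(1), m.group(2).strip()) for m in STEP_RE.finditer(trace)]

trace = """Thought: I want to book a flight under $300.
Action: Search for flights
Observation: Found flights starting at $108 one-way."""

for kind, text in parse_trace(trace):
    print(f"{kind:<12}{text}")
```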

Below is another example from the paper:


Figure: Comparison of (a) Act-only and (b) ReAct prompting to solve an AlfWorld (Shridhar et al., 2020b) game. In both domains, in-context examples are omitted from the prompt; only the task-solving trajectories generated by the model (Act, Thought) and the environment (Obs) are shown.

and the related performance results:

Table 1: PaLM-540B prompting results on HotpotQA and Fever. Figure (right): PaLM-540B prompting results as a function of the number of CoT-SC samples used.


CoT vs ReAct

Chain of Thought (CoT) and ReAct are both advanced prompt engineering techniques. CoT is a prompting technique that allows large language models (LLMs) to generate reasoning traces. On the other hand, ReAct is a framework that combines both reasoning and action generation into one output. This allows the model to better synchronize thoughts with actions, and to interact with external tools to retrieve additional information that leads to more reliable and factual responses. The best approach overall is a combination of ReAct and CoT that allows for the use of both internal knowledge and externally obtained information during reasoning.

Combining Chain of Thought (CoT) and ReAct improves the performance of LLMs by allowing them to generate both reasoning traces and task-specific actions in an interleaved manner. This lets the model induce, track, and update action plans, and even handle exceptions, while the action step lets it interface with and gather information from external sources such as knowledge bases or environments. The result is more reliable and factual responses. The authors show that ReAct can outperform several state-of-the-art baselines on language and decision-making tasks, and that it also improves the human interpretability and trustworthiness of LLMs.
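One way to realize this combination, sketched after the paper's idea of backing off from one method to the other: run ReAct first, and if it fails to finish within its step budget, fall back to chain-of-thought with self-consistency (majority voting over sampled answers). `run_react` and `run_cot` are hypothetical placeholders here.

```python
# Hybrid ReAct -> CoT-SC fallback, sketched after the paper's idea:
# try grounded ReAct first; if it cannot finish within its step budget,
# fall back to chain-of-thought with self-consistency (majority vote).
from collections import Counter

def run_react(question: str, max_steps: int = 7) -> str | None:
    """Hypothetical: returns an answer, or None if no Finish action occurs."""
    raise NotImplementedError

def run_cot(question: str) -> str:
    """Hypothetical: one sampled chain-of-thought answer."""
    raise NotImplementedError

def answer(question: str, n_samples: int = 21) -> str:
    result = run_react(question)
    if result is not None:
        return result
    # CoT-SC: sample several reasoning chains and majority-vote the answer.
    votes = Counter(run_cot(question) for _ in range(n_samples))
    return votes.most_common(1)[0][0]
```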


ReAct and the Real World

In the context of ReAct, an “environment” refers to the external world or system with which the Large Language Model (LLM) interacts. This could be a database, a web search engine, a physical sensor, or any other source of information that is not part of the LLM’s internal knowledge. Observations are pieces of information or data that the LLM receives from this environment.


For example, in a scenario where the LLM is interacting with a web search engine, the search results that it receives would be considered observations. The key point here is that these observations are not generated by the LLM itself. Instead, they are produced by the environment and then passed to the LLM. This allows the LLM to access and utilize information that was not included in its training data, thereby greatly expanding its knowledge and capabilities.

In summary, when we say “observations are not generated by the LLM but by the environment”, we mean that the LLM is receiving information from an external source rather than producing it internally. This is a crucial aspect of how ReAct enhances the performance and versatility of LLMs.
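In code, the environment is simply whatever answers the agent's actions. Below is a toy sketch of a lookup-table "knowledge base" environment; it is purely illustrative (a real environment would wrap a search API, database, or sensor, and the `Search[...]` action syntax is just one common convention).

```python
# Toy environment: observations come from an external lookup table,
# not from the LLM itself. A real environment would wrap a search
# API, a database, or a sensor instead.

class LookupEnvironment:
    def __init__(self, facts: dict[str, str]):
        self.facts = facts

    def step(self, action: str) -> str:
        """Execute a Search[query] action and return an observation."""
        if action.startswith("Search[") and action.endswith("]"):
            query = action[len("Search["):-1]
            return self.facts.get(query, "No results found.")
        return "Unknown action."

env = LookupEnvironment({"capital of France": "Paris is the capital of France."})
print(env.step("Search[capital of France]"))  # -> Paris is the capital of France.
```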

Conclusion

In conclusion, ReAct Prompting stands as a promising stride in the realm of prompt engineering, offering a compelling solution to the challenges posed by chain-of-thought reasoning. While it may not be flawless, primarily due to its occasional reasoning errors, ReAct successfully addresses the critical issue of hallucination, fostering a more reliable and interpretable interaction with language models.
