Comprehensive Guide to ReAct Prompting and ReAct based Agentic Systems

Pranav

What is ReAct prompting?

The ReAct prompting technique combines the “reasoning” and “acting” capabilities of an LLM to help with tasks like action planning, verbal reasoning, decision-making, and knowledge integration. It does so by having the model reason about the situation, act, and then observe the result before acting again. This helps the model analyze the context so far and then take the actions necessary to move forward. The technique has shown substantial improvements over earlier prompting techniques like Chain of Thought (CoT) and zero-shot prompting.

ReAct prompting has also proved very effective for tasks like function calling and tool integration, planning ahead, and agentic behavior, and it is highly customizable to the specific task at hand. This makes ReAct one of the most widely used and prominent prompting techniques out there.

How to Implement ReAct Prompting

Implementing ReAct prompting is a straightforward process. Most of the time, a simple prompt with clear instructions suffices. The same goes for customizing the prompt: minor changes and a few examples can easily align the prompt with your specific task.

Zero-Shot

Zero-shot prompting is when the prompt does not contain examples of how to do the task at hand. Instead of examples, zero-shot prompting provides instructions on how to do the task. This is useful when the task is complicated, there is no specific pattern to follow, or it is not possible to provide enough examples to explain the task properly to the model.

Zero-shot prompting works because language models like ChatGPT are “instruction tuned”, meaning they don’t just follow patterns in prompts but can follow abstract instructions and produce outputs based on them. This removes the need for examples, as the instructions give the model directions on what to do and how to perform the task at hand.

To implement ReAct with zero-shot prompting, you simply provide a detailed set of instructions describing the ReAct output scheme, and the LLM will follow it, with no examples at all.

Here is the prompt; you can use it with any model you like:

I want you to solve problems using the ReAct (Reasoning and Acting) approach. For each step, follow the format:

Thought: Reason step-by-step about the current situation and what to do next.
Action: [The specific action to take]
Observation: [The result of the action]

Continue this Thought/Action/Observation cycle until you solve the problem. Then provide your Final Answer.

Always output in the given format.
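
To show how little wiring this needs, here is a minimal sketch of sending that prompt to a chat model using the OpenAI Python client. The model name and the example question are illustrative; any instruction-tuned chat model works:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# The ReAct instructions from above, condensed into a system prompt.
REACT_PROMPT = (
    "I want you to solve problems using the ReAct (Reasoning and Acting) approach. "
    "For each step, follow the format:\n\n"
    "Thought: Reason step-by-step about the current situation and what to do next.\n"
    "Action: [The specific action to take]\n"
    "Observation: [The result of the action]\n\n"
    "Continue this Thought/Action/Observation cycle until you solve the problem. "
    "Then provide your Final Answer.\n\n"
    "Always output in the given format."
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; swap in your preferred model
    messages=[
        {"role": "system", "content": REACT_PROMPT},
        {"role": "user", "content": "What's the population difference between New York City and Los Angeles?"},
    ],
)
print(response.choices[0].message.content)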

Few Shot

Few-shot prompting, on the other hand, provides multiple examples of the task to the model. This technique works really well for models that have not been fine-tuned to follow instructions. Models like GPT-3 were shown to be “few-shot learners” in the original GPT-3 paper. The provided pattern allows the model to understand the problem and learn from the input-output pairs provided in the context. This high-level pattern matching allows models to solve many sophisticated problems without any fine-tuning.

Pure few-shot prompting is now rather rare because of instruction-tuned models, but few-shot is often combined with zero-shot: we provide detailed instructions for the task the LLM is supposed to perform, and alongside them we also provide several examples in the prompt to demonstrate how to do the task properly. This eliminates guesswork and covers edge cases where instructions alone might not be enough. Few-shot can also help in scenarios where the model can learn more from examples than from instructions, and where writing a proper set of instructions would be more complicated and introduce more edge cases.

To implement ReAct with few-shot prompting, you just need to add some examples from your use case, walking the LLM through them if necessary, alongside the instructions we used in the zero-shot setting.

Here’s the prompt:

I want you to solve problems using the ReAct (Reasoning and Acting) approach.
For each step, follow the format:

Thought: Reason step-by-step about the current situation and what to do next.
Action: [The specific action to take]
Observation: [The result of the action]

Continue this Thought/Action/Observation cycle until you solve the problem.
Then provide your Final Answer.

Example 1:
User: What's the population difference between New York City and Los Angeles?

Thought: I need to find the populations of both New York City and Los Angeles, then calculate the difference.
Action: Search for population of New York City
Observation: New York City has a population of approximately 8.8 million people.

Thought: Now I need the population of Los Angeles.
Action: Search for population of Los Angeles
Observation: Los Angeles has a population of approximately 3.9 million people.

Thought: Now I can calculate the difference between the two populations.
Action: Calculate 8.8 million - 3.9 million
Observation: The difference is 4.9 million.

Final Answer: The population difference between New York City and Los Angeles is approximately 4.9 million people, with New York City having the larger population.

Example 2:
User: What's the capital of France and what's its population?

Thought: I need to find the capital of France and then its population.
Action: Identify the capital of France
Observation: The capital of France is Paris.

Thought: Now I need to find the population of Paris.
Action: Search for population of Paris
Observation: Paris has a population of approximately 2.1 million people.

Final Answer: The capital of France is Paris, and it has a population of approximately 2.1 million people.


Always output in this format.

Parameters and Points to keep in mind

When working with ReAct, there are several variations and parameters to keep in mind. Classical LLM parameters like temperature and top-p are not very relevant here. Since the authors of the ReAct paper do not mention temperature or any specific LLM configuration, we can assume they worked mostly with temperature set to 0.

The authors do mention experimenting with Chain of Thought prompting with Self-Consistency (CoT-SC), where they sampled 21 chains of thought from the LLM with temperature set to 0.7 and used the majority answer. This technique boosted performance over classical Chain of Thought prompting but still fell short of ReAct’s accuracy on most tasks. However, both Chain of Thought and CoT-SC performed better than ReAct on the HotpotQA benchmark.

Keeping this in mind, ReAct performs well with temperature 0 on most tasks, except those similar to HotpotQA, which is mainly fact-based question answering. ReAct combined with CoT-SC gave the paper’s best results, so the right choice is not always obvious. It is best to test different variations and pick the one that works for your task.

Another variation to test is what the authors call “act-only” prompting, where you take away the reasoning part, leaving only Action and Observation. Similar approaches have been used before to make LLMs interact with tools and external information, but ReAct improves on them almost all the time. Still, act-only prompting is worth testing if your task is basic enough and doesn’t require much reasoning; it can help you save on token costs and reduce latency a little.
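
An act-only prompt simply drops the Thought line from the format, for example (illustrative):

Action: Search for population of Paris
Observation: Paris has a population of approximately 2.1 million people.
Final Answer: Paris has a population of approximately 2.1 million people.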

Function Calling with ReAct

Because ReAct prompting has the idea of “acting” built into it, it is very easy to build function calling or tool calling flows with it. You simply define various “actions” for your model and permit it to call them, and it will do so in the Action section of its outputs.

Then you need to register triggers based on certain actions of the model. To do that, you can use a simple substring check or a regex to detect the presence of an action. Once an action is detected, pause the ReAct flow, execute the action the model requested, and then provide the response back. You can do this by pasting the result in the Action section after the function call, like this:

Action: GetCurrentWeather() -> {"temp": "72F", "wind": "8mph", "cond": "Partly Cloudy"}

You simply append the output of the function after an -> and resume execution from there. The model will then output an “Observation”, unless some other step is specified.
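
Here is a minimal sketch of that loop in Python. The tool registry, the regex, and the run_llm helper are all illustrative assumptions; run_llm would wrap your model API, ideally with a stop sequence so generation pauses right after an action line:

import re

# Hypothetical tool registry; swap in your own functions.
TOOLS = {
    "GetCurrentWeather": lambda: {"temp": "72F", "wind": "8mph", "cond": "Partly Cloudy"},
}

# Matches zero-argument action lines like: Action: GetCurrentWeather()
ACTION_RE = re.compile(r"Action:\s*(\w+)\(\)\s*$", re.MULTILINE)

def react_loop(run_llm, transcript, max_steps=10):
    """Generate, detect an action, execute it, feed the result back, repeat."""
    for _ in range(max_steps):
        # run_llm is your model call (assumed); it should stop generating
        # right after the action line so we can execute the tool.
        output = run_llm(transcript)
        transcript += output
        if "Final Answer:" in output:
            break
        match = ACTION_RE.search(output)
        if match and match.group(1) in TOOLS:
            result = TOOLS[match.group(1)]()
            # Paste the result after "->" on the action line and resume;
            # the model writes its own Observation on the next turn.
            transcript += f" -> {result}\n"
    return transcript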

This is a very basic and straightforward flow, but it is massively extensible, and you can change the set of actions and how the model reasons as the flow moves ahead. Maybe the model has already deleted a file once: you can simply remove the DELETE action from the model’s action set, or add new actions, improving the model’s performance as the flow progresses.

Function Calling with Arguments

Another very important aspect of calling functions using LLMs is calling them with arguments. Not all functions are stateless; some require specific information to act upon, or simply to work with, in order to provide an output to the user. A classic example: if the user asks for the weather at a specific place like Seattle or New York, then the function needs to accept a location as an argument to execute properly.

This is also very simple to do with ReAct, as it is mostly about changing how the functions are defined and how the parser handles them afterwards. You simply tell the model to output actions in a specific format when calling a function with arguments, for example (the function name and parameter below are illustrative):
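
Action: GetCurrentWeather(location="Seattle")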

Once the format is established, you can update the parser to parse the function calls accordingly and pass them on to the execution pipeline.
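
A minimal sketch of such a parser in Python, assuming the illustrative format above with keyword arguments in double quotes:

import re

# Matches action lines like: Action: GetCurrentWeather(location="Seattle")
ACTION_WITH_ARGS_RE = re.compile(r'Action:\s*(\w+)\((.*)\)')

def parse_action(line):
    """Return (function_name, kwargs) from an action line, or None if no match."""
    match = ACTION_WITH_ARGS_RE.search(line)
    if not match:
        return None
    name, arg_string = match.group(1), match.group(2)
    # Pull out key="value" pairs from the argument string.
    kwargs = dict(re.findall(r'(\w+)\s*=\s*"([^"]*)"', arg_string))
    return name, kwargs

# parse_action('Action: GetCurrentWeather(location="Seattle")')
# -> ("GetCurrentWeather", {"location": "Seattle"})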

One very important point to note here is that as you introduce more and more functions and tools, the model starts making more mistakes, and hallucinations can become more common. This mostly happens when you have a lot of functions with lots of parameters, but it can and does happen, especially in enterprise applications, so it is important to handle these issues properly. One common approach is grammar-constrained generation, which ensures the model doesn’t make syntactical errors. The Outlines library implements this pretty seamlessly, HuggingFace has a good guide for TGI Structured Outputs, and vLLM also supports multiple libraries for the same task.

Integrate with External Knowledge Bases

Because of these function calling capabilities, it is much easier to integrate with external sources using ReAct prompting. You simply need to add actions that call APIs, perform RAG operations, and so on, and the model will take care of the rest.

Here’s our guide on how you can integrate tools and functions with LLMs.

Comprehensive Guide to Integrating Tools and APIs with Language Models

We do not use ReAct in that guide, but ReAct can easily be extended to use the same principles we outlined there.

ReAct vs Chain of Thought Prompting

Both ReAct and Chain of Thought (CoT) are very similar prompting styles in that they force the model to reason before answering, and that’s where most of the performance gains come from.

Chain of Thought is better when extensive, multi-step reasoning is required. For example, when solving a math problem, the problem needs to be simplified and then solved. CoT is much better at this kind of linear reasoning toward a result.

Example:

Problem: What is 17 × 24?

CoT: I'll multiply step by step.

  17 × 20 = 340

  17 × 4 = 68

  340 + 68 = 408

Answer: 408

ReAct is more dynamic than Chain of Thought. Along with reasoning, it introduces a cycle of observations and actions, which lets you use it under dynamic conditions and for “acting”, not just for reasoning through problems. This makes it a little worse at solving linear problems, but much better and easier to work with when tools, dynamic conditions, and other challenges are involved.

Task: Find population data for New York and calculate its growth rate

Thought: I need population data for New York for different years.

Action: Search for "New York population data 2010 and 2020"

Observation: 2010 population: 19,378,102; 2020 population: 20,201,249

Thought: Now I can calculate the growth rate.

Action: Calculate (20,201,249 - 19,378,102) / 19,378,102 × 100

Observation: 4.25% growth

Final Answer: The population growth rate of New York from 2010 to 2020 was approximately 4.25%.

Advantages of ReAct Prompting over Normal Prompting

There are a few reasons why you would pick the ReAct prompting structure over normal prompting structures. It usually gives you more control and more structured outputs, and lets you direct the model exactly how to proceed in most scenarios. Let’s discuss these points in more detail.

Gives a window into the LLM’s thinking process

Using ReAct gives you a better understanding of how the LLM is working to solve the problem, as it deconstructs the problem and solution steps into a well-structured format. This means that you, as the developer or prompter, can sit down and nitpick exactly what you don’t like in the model’s thinking process.

You can tell the model to reason in a specific way, and define exactly how to act and what to observe in certain scenarios. Along with this control, you can group or cluster the LLM’s responses and figure out where the model makes the most mistakes; maybe most of the mistakes happen when a certain action is executed. This gives you very deep insight into how the model is behaving and how you can go about steering it to produce better responses.

Gives stricter control over the LLM’s outputs

Better observability leads to better control. Once you see all the thoughts of the LLM, you know exactly where to interject. For example, if the model is working on mathematical problems, you can prompt it to “spend some time breaking the equation down into its most basic parts and then call the Calculator tool”, or when the model is in self-correction mode, you can tell it to analyze all of its past reasoning steps and actions in detail and find the errors.

This is just from a prompting perspective; you can do much more complex things, like asking the model to reason twice, each time from a different perspective, and then combine both in a third reasoning step before acting. Such setups are more complex but are needed to solve trickier problems. For example, if you are building a bot that discusses philosophy with users, you really need it to think about and evaluate its arguments from multiple perspectives before providing an answer. Using ReAct prompting in such scenarios is a very straightforward way to build your solution.

ReAct can help design Programmatic Flows and Multistep Solutions

This is an application of ReAct prompting that has not been explored much, but it is possible and leads to great results. Because you can control ReAct’s outputs and reasoning traces very strictly, you can build flows on top of them. Often multiple steps are needed to solve a problem rather than a single step. For example, when generating SEO articles, doing it in one step can produce good results, but a much better flow is to generate the article and THEN optimize it with an SEO-specific prompt.


You can write sophisticated prompts to do all of this in one shot, yes, but that makes it much harder to control and change parts of the flow and the output. Using ReAct in a multistep flow gives you the freedom to tune every single step individually.

A straightforward way to do this is to hook up triggers to certain actions or observations of the model. For example, if the model’s action is to edit a specific part of the article, you route that to another prompt and model tuned for editing articles rather than writing them; and if the model observes a factual inconsistency, you can use another model to fix it rather than fixing it in the same ReAct flow. This method lets you branch off from the core flow of the model’s output and do multiple tasks in parallel. In such a setup, the ReAct flow sits as the main stream of the model’s actions and emits triggers, and various other actions are performed based on those triggers.
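
A minimal sketch of such a trigger table in Python; the action names and handler functions are hypothetical placeholders for your own specialized prompts and models:

def run_editing_model(payload):
    """Placeholder: a prompt/model pair tuned for editing articles."""
    ...

def run_factcheck_model(payload):
    """Placeholder: a separate model used to fix factual inconsistencies."""
    ...

# Route specific actions from the main ReAct stream to specialized handlers.
TRIGGERS = {
    "EditSection": run_editing_model,
    "FixFactualError": run_factcheck_model,
}

def dispatch(action_name, payload):
    handler = TRIGGERS.get(action_name)
    if handler is not None:
        return handler(payload)  # branch off the main flow
    return None  # unrecognized actions stay in the core ReAct loop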

ReAct Prompting with LangChain

LangChain is one of the most popular frameworks for building AI agents and simple workflows. Here’s a small snippet showing how to build a basic ReAct agent using LangGraph’s prebuilt helper: you simply import create_react_agent, configure it with your model and tools, and you are good to go.

One caveat: LangChain is pretty good, but it sometimes abstracts away too much. Avoid it if you want more customization in your flows.


from langgraph.prebuilt import create_react_agent

# `model` is any LangChain chat model and `tools` is a list of
# LangChain tools you have defined for your task.
langgraph_agent_executor = create_react_agent(model, tools)

messages = langgraph_agent_executor.invoke({"messages": [("human", query)]})
result = {
    "input": query,
    "output": messages["messages"][-1].content,
}


Some tips when working with ReAct prompting

We at Mercity have been working with LLMs for years now. Here are some points we have learned from extensive experience with ReAct prompting that you can use too.

ReAct is very extensible

We mentioned at the start that ReAct works by breaking the output down into reasoning, action, and then observation. This is not actually a strict rule, just a suggested general pattern. You can completely remove the action step if you do not need it, and you can add multiple reasoning steps, or multiple action steps, if you want.

This comes in handy when you are working with trickier, more sophisticated applications. Maybe the task requires thinking from multiple perspectives before answering a question; that’s where you can have multiple reasoning steps. Or maybe you want a single very long and extensive reasoning step and then multiple observations and actions based on it, as sketched below; that is possible too, and it will save you a lot of time, as reasoning is where the model spends most of its time.
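
For example, a variant with one long reasoning step feeding several actions (illustrative) could look like:

Thought: [one long, extensive reasoning step covering the whole plan]
Action: Search for population of New York City
Observation: [result]
Action: Search for population of Los Angeles
Observation: [result]
Final Answer: [answer]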

This extensibility allows you to reshape the technique completely for your own use case without hurting performance, and often improving it over the baseline ReAct prompting method.

The ReAct prompting format does not matter

Along with extensibility, the format is very flexible too. You can output in JSON if you want, or XML, or even YAML; none of these affect performance in any major way.

We like to use XML internally, simply because it is easier to manage and work with when things get complex and there are many nested clauses. But most people like to use JSON, as it is the easiest way to get the model working quickly, and parsing it is very easy too.

The idea is that you are not bound to any specific format; as long as you are following the general ReAct output pattern, you should get similar performance!
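
For instance, a single ReAct step in an XML-flavored format (illustrative) might look like this:

<thought>I need the population of Paris.</thought>
<action>Search(query="population of Paris")</action>
<observation>Paris has a population of approximately 2.1 million people.</observation>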

Utilize direct Function Calling

Many providers like OpenAI and Anthropic now offer their own function calling features and have internal systems set up to ensure high accuracy when dealing with functions. Here’s OpenAI’s guide on function calling.

ReAct is indeed a good way to induce function calling from LLMs, but when working with these providers it’s usually better to use the features they provide for it. They set up a lot of things underneath, so make sure you use all that.
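
For illustration, here is a minimal sketch using the OpenAI Python client’s native tool calling; the tool schema and model name are illustrative:

from openai import OpenAI

client = OpenAI()

# Illustrative tool schema; the provider handles the "act" step natively.
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[{"role": "user", "content": "What's the weather in Seattle?"}],
    tools=tools,
)
# The model returns structured tool calls instead of free-text actions.
print(response.choices[0].message.tool_calls)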

Another very important thing to note is that even open-source models like Llama and Mistral now have special tokens for managing function lists and function calling; here is a guide on how Llama does it.

Want help with your ReAct prompting workflows?

We at Mercity have been working with LLMs for years and have written all sorts of prompts, including many variants of ReAct.

Contact us if you need any help with your prompting workflows!
