Justin Lin | Portfolio

Paper Notes: ReAct

Apr 19, 2026

TL;DR

Prior prompting approaches treat reasoning (chain-of-thought, CoT) and acting (tool use) as separate capabilities rather than combining them.
ReAct interleaves reasoning steps and actions in a loop.
Key takeaway: reasoning + environment interaction improves accuracy and reduces hallucination.

Bibliographic Snapshot

Field	Detail
Citation	`Yao et al., ICLR 2023`
Keywords	agents, reasoning, tool use
Dataset / Benchmarks	HotpotQA, FEVER, ALFWorld
Code / Repo	https://react-lm.github.io

Problem Statement

Chain-of-thought reasoning is static and hallucination-prone, while action-based systems lack reasoning. There is no unified framework combining reasoning and interaction with external environments.

Core Idea

ReAct introduces a loop: Thought → Action → Observation
- Thought: a reasoning trace
- Action: an interaction with the environment (search, lookup)
- Observation: feedback from the environment
The loop iterates until the model produces an answer.
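
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: `scripted_policy` is a hypothetical stand-in for the LLM, `FACTS` and `search` are a made-up toy environment, and the `max_steps` budget is an assumption added so the loop always terminates.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One completed step: (thought, action string, observation).
Step = Tuple[str, str, str]

# Hypothetical toy environment: a dict standing in for a search tool.
FACTS: Dict[str, str] = {
    "ReAct": "ReAct was published at ICLR 2023.",
}


def search(query: str) -> str:
    """Return the stored entry, or a not-found message, as the observation."""
    return FACTS.get(query, f"No entry found for {query!r}.")


def scripted_policy(history: List[Step]) -> Tuple[str, str, str]:
    """Stand-in for the LLM: returns (thought, action name, action argument)."""
    if not history:
        return ("I should look up ReAct first.", "search", "ReAct")
    last_observation = history[-1][2]
    return ("The observation answers the question.", "finish", last_observation)


def react_loop(
    policy: Callable[[List[Step]], Tuple[str, str, str]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[Step]]:
    """Run Thought -> Action -> Observation until the policy finishes."""
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, argument = policy(history)
        if action == "finish":
            return argument, history
        observation = tools[action](argument)  # act, then observe
        history.append((thought, f"{action}[{argument}]", observation))
    return None, history  # step budget exhausted without an answer


answer, trace = react_loop(scripted_policy, {"search": search})
print(answer)
for thought, action, observation in trace:
    print(f"Thought: {thought}\nAction: {action}\nObservation: {observation}")
```

In a real setup the policy would be a call to a language model that sees the whole history, so each thought can react to the latest observation.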

Visual / Diagram Notes

Figure 1 (page 2):
- Comparison:
  - CoT (reason only)
  - Act (action only)
  - ReAct (combined)
Shows reasoning guiding search, and search grounding reasoning

Key Results

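Improves fact verification (FEVER) over CoT; on QA (HotpotQA), ReAct alone is roughly on par with CoT, and the best results come from combining ReAct with CoT self-consistency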
Reduces hallucination via external grounding
Strong gains on interactive tasks (ALFWorld: +34% absolute success rate over imitation and reinforcement learning baselines)
Limitation: requires careful prompt design

Personal Analysis

What worked:

Very intuitive “agent loop”
Strong improvement in interpretability
Flexible across tasks

What puzzled me:

Prompt-based → brittle
Scaling to long tasks is unclear

Connections & Related Work

Precursor to modern LLM agents (LangGraph, AutoGPT)
Complementary to RAG (retrieval as action)
Related to SELF-RAG (self-reflection vs action loop)

Implementation Sketch

Prompt template with three repeating fields: Thought, Action, Observation
Use external tools:
- Search API
- DB queries
Loop until the model emits the Finish[] action
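
A rough sketch of the prompt-and-parse side of this, under stated assumptions: the `name[argument]` action syntax is inferred from the `Finish[]` notation above, the `Question:` line and the helper names (`render_prompt`, `parse_action`, `run_action`) are hypothetical, and the search tool is a placeholder lambda rather than a real Search API.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Assumed action syntax: a tool name followed by a bracketed argument,
# e.g. "Search[Apple Remote]" or "Finish[answer]".
ACTION_PATTERN = re.compile(r"^(\w+)\[(.*)\]$")


def parse_action(line: str) -> Optional[Tuple[str, str]]:
    """Split 'Name[argument]' into (lowercased name, argument), or None."""
    match = ACTION_PATTERN.match(line.strip())
    if match is None:
        return None
    return match.group(1).lower(), match.group(2)


def render_prompt(question: str, steps: List[Tuple[str, str, str]]) -> str:
    """Lay out past steps in the Thought / Action / Observation template."""
    lines = [f"Question: {question}"]
    for thought, action, observation in steps:
        lines.append(f"Thought: {thought}")
        lines.append(f"Action: {action}")
        lines.append(f"Observation: {observation}")
    lines.append("Thought:")  # cue for the model to write the next thought
    return "\n".join(lines)


def run_action(
    line: str, tools: Dict[str, Callable[[str], str]]
) -> Tuple[bool, str]:
    """Dispatch one action line; returns (is_finished, answer or observation)."""
    parsed = parse_action(line)
    if parsed is None:
        return False, f"Could not parse action: {line}"
    name, argument = parsed
    if name == "finish":
        return True, argument
    if name not in tools:
        return False, f"Unknown tool: {name}"
    return False, tools[name](argument)


# Placeholder tool; a real one would call a search API or run a DB query.
tools = {"search": lambda query: f"(results for {query})"}
print(render_prompt("Who wrote ReAct?", []))
print(run_action("Search[ReAct]", tools))
print(run_action("Finish[Yao et al.]", tools))
```

Returning parse failures and unknown tools as observations, instead of raising, is one possible design choice: it lets the model see the error and try a different action on the next step.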

Open Questions / Next Actions

How to train instead of prompt?
Combine with memory (MemGPT)?
Optimize action selection?

Glossary

Thought: reasoning step
Action: tool/environment call
Observation: returned info