Building a Code Review Agent with LangGraph: A Practical Example

Imagine an AI that can review your code, catch potential bugs, suggest improvements, and keep refining its feedback until your code shines—like having a tireless, expert pair of eyes on your project. LangGraph, a dynamic library from the LangChain team, makes this possible with its stateful, graph-based workflows. In this beginner-friendly guide, we’ll walk you through building a code review agent using LangGraph that analyzes code snippets, provides feedback, and iterates until the review is satisfactory. With clear code examples, a conversational tone, and practical steps, you’ll create an AI-powered code reviewer, even if you’re new to coding!


What is a Code Review Agent in LangGraph?

A code review agent in LangGraph is an AI application designed to:

  • Accept a code snippet as input (e.g., a Python function).
  • Analyze the code using a language model (like those from OpenAI) to identify issues or suggest improvements.
  • Provide feedback and store the review history.
  • Check if the feedback is clear and actionable, looping back to refine it if needed.
  • End when the review meets quality criteria or a maximum number of attempts is reached.

LangGraph’s nodes (tasks), edges (connections), and state (shared data) enable a flexible workflow that adapts to the code’s complexity, making it ideal for iterative review processes.
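
If you're new to LangGraph, here's a minimal sketch of how those three pieces fit together before we build the full agent (the names, like greet, are illustrative only):

from typing import TypedDict
from langgraph.graph import StateGraph, END

class HelloState(TypedDict):
    message: str

def greet(state: HelloState) -> HelloState:
    # A node is just a function that reads and updates the shared state
    state["message"] = f"Reviewed: {state['message']}"
    return state

graph = StateGraph(HelloState)       # state: shared data
graph.add_node("greet", greet)       # node: a task
graph.set_entry_point("greet")
graph.add_edge("greet", END)         # edge: a connection
app = graph.compile()

print(app.invoke({"message": "hello"}))  # {'message': 'Reviewed: hello'}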

This example showcases LangGraph’s ability to handle dynamic, multi-step tasks and can be extended with tools or agent logic. To get started with LangGraph, see Introduction to LangGraph.


What You’ll Build

Our code review agent will:

1. Take a user-submitted code snippet (e.g., a Python function).
2. Store the code and review history in the state.
3. Analyze the code using an AI model to provide feedback, considering past reviews for context.
4. Check if the feedback is clear and actionable (simulated for this example).
5. Loop back to refine the feedback if it’s unclear, up to three attempts.
6. End when the feedback is satisfactory or attempts are exhausted.

We’ll use LangGraph for the workflow and LangChain for memory, AI, and prompt management.


Prerequisites

Before starting, ensure you have:

  • Python 3.8+: Installed and verified with python --version.
  • LangGraph and LangChain: Installed via pip.
  • OpenAI API Key: For the language model (or use a free model from Hugging Face).
  • Virtual Environment: To manage dependencies.

Install the required packages:

pip install langgraph langchain langchain-openai python-dotenv

Set up your OpenAI API key in a .env file:

echo "OPENAI_API_KEY=your-api-key-here" > .env

For setup details, see Install and Setup and Security and API Keys.


Building the Code Review Agent

Let’s create a LangGraph workflow for the code review agent. We’ll define the state, nodes, edges, and graph, then run it to review a code snippet.

Step 1: Define the State

The state holds the code snippet, feedback, review quality, history, and attempt count to manage iterations.

from typing import TypedDict
from langchain_core.messages import HumanMessage, AIMessage

class State(TypedDict):
    code_snippet: str           # User-submitted code
    feedback: str               # AI-generated feedback
    is_clear: bool              # True if feedback is clear and actionable
    review_history: list        # List of HumanMessage and AIMessage
    attempt_count: int          # Number of review attempts

The review_history ensures context-aware feedback, and attempt_count prevents infinite loops. Learn more at State Management.
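
As a side note, LangGraph can also manage the history for you if you attach a reducer to the field, so nodes return only new messages instead of mutating the list. We’ll stick with the plain list in this guide, but here’s a hedged sketch of that alternative (ReducedState is just an illustrative name):

from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages

class ReducedState(TypedDict):
    code_snippet: str
    feedback: str
    is_clear: bool
    review_history: Annotated[list, add_messages]  # appended automatically by the reducer
    attempt_count: int

# A node would then return only the new message, e.g.:
# return {"review_history": [HumanMessage(content=state["code_snippet"])]}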

Step 2: Create Nodes

We’ll use four nodes:

  • process_code: Stores the code snippet and initializes the history.
  • review_code: Generates feedback using an AI model, considering the history.
  • check_clarity: Evaluates if the feedback is clear and actionable (simulated).
  • decide_next: Decides whether to end or retry.

Here’s the code for all four nodes:

from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
import logging

# Setup logging for debugging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Node 1: Process code snippet
def process_code(state: State) -> State:
    logger.info(f"Processing code: {state['code_snippet'][:50]}...")
    if not state["code_snippet"]:
        logger.error("Empty code snippet")
        raise ValueError("Code snippet is required")
    state["review_history"].append(HumanMessage(content=state["code_snippet"]))
    state["attempt_count"] = 0
    logger.debug(f"Updated history: {state['review_history']}")
    return state

# Node 2: Review code
def review_code(state: State) -> State:
    logger.info("Generating feedback")
    try:
        llm = ChatOpenAI(model="gpt-3.5-turbo")
        template = PromptTemplate(
            input_variables=["code_snippet", "history"],
            template="Code: {code_snippet}\nReview history: {history}\nProvide clear, actionable feedback in one or two sentences."
        )
        history_str = "\n".join([f"{msg.type}: {msg.content}" for msg in state["review_history"]])
        chain = template | llm
        feedback = chain.invoke({"code_snippet": state["code_snippet"], "history": history_str}).content
        state["feedback"] = feedback
        state["review_history"].append(AIMessage(content=feedback))
        state["attempt_count"] += 1
        logger.debug(f"Feedback: {feedback}")
    except Exception as e:
        logger.error(f"Review error: {str(e)}")
        state["feedback"] = f"Error: {str(e)}"
    return state

# Node 3: Check feedback clarity (simulated)
def check_clarity(state: State) -> State:
    logger.info("Checking feedback clarity")
    # Simulate clarity: assume clear if feedback is >50 characters and contains a period
    state["is_clear"] = len(state["feedback"]) > 50 and "." in state["feedback"]
    logger.debug(f"Clear: {state['is_clear']}")
    return state

# Node 4: Decide next step
def decide_next(state: State) -> str:
    if state["is_clear"] or state["attempt_count"] >= 3:
        logger.info("Ending workflow: clear or max attempts reached")
        return "end"
    logger.info("Looping back to refine feedback")
    return "review_code"

Here’s what each node does:

  • process_code: Validates the code snippet, adds it to history, and initializes attempt_count.
  • review_code: Uses the AI to generate feedback, considering history, and updates the state.
  • check_clarity: Simulates checking if the feedback is clear by evaluating length and punctuation.
  • decide_next: Decides to end or retry based on clarity or attempts.

For AI integration, see OpenAI Integration.
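
The clarity check above is intentionally simulated to keep the example self-contained. If you want the model to judge its own feedback instead, a hedged sketch might look like this (the prompt wording and the name llm_check_clarity are assumptions, not part of the code above):

def llm_check_clarity(state: State) -> State:
    """Ask the model whether the feedback is clear and actionable."""
    llm = ChatOpenAI(model="gpt-3.5-turbo")
    question = (
        "Answer YES or NO only. Is the following code review feedback "
        f"clear and actionable?\n\n{state['feedback']}"
    )
    answer = llm.invoke(question).content.strip().upper()
    state["is_clear"] = answer.startswith("YES")
    return state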

Step 3: Define Edges

The workflow flows as follows:

  • Direct Edges: From process_code to review_code, then to check_clarity.
  • Conditional Edge: From check_clarity, either end or loop back to review_code.

Step 4: Build the Workflow

The graph connects nodes and edges:

from langgraph.graph import StateGraph, END

# Build the graph
graph = StateGraph(State)
graph.add_node("process_code", process_code)
graph.add_node("review_code", review_code)
graph.add_node("check_clarity", check_clarity)
graph.add_edge("process_code", "review_code")
graph.add_edge("review_code", "check_clarity")
graph.add_conditional_edges("check_clarity", decide_next, {
    "end": END,
    "review_code": "review_code"
})
graph.set_entry_point("process_code")

# Compile the graph
app = graph.compile()
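
If you’d like to double-check the structure before running it, recent LangGraph versions can emit the compiled graph as Mermaid text (treat this as an optional convenience):

# Print the node/edge structure as a Mermaid diagram for a quick visual check
print(app.get_graph().draw_mermaid())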

Step 5: Run the Code Review Agent

Test the agent with a sample code snippet:

from dotenv import load_dotenv
import os

load_dotenv()

# Sample code snippet
code_snippet = """
def add_numbers(a, b):
    result = a + b
    print(result)
"""

# Run the workflow
try:
    result = app.invoke({
        "code_snippet": code_snippet,
        "feedback": "",
        "is_clear": False,
        "review_history": [],
        "attempt_count": 0
    })
    print("Feedback:", result["feedback"])
    print("Review History:", [msg.content for msg in result["review_history"]])
except Exception as e:
    logger.error(f"Workflow error: {str(e)}")

Example Output:

Feedback: The function lacks a return statement, so it only prints the result; consider adding `return result` for reusability.
Review History: [
    "def add_numbers(a, b):\n    result = a + b\n    print(result)",
    "The function lacks a return statement, so it only prints the result; consider adding `return result` for reusability."
]
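
If you’d rather watch each node fire instead of waiting for the final result, app.stream() yields updates as the graph runs. A brief sketch, assuming the same initial state as above:

initial_state = {
    "code_snippet": code_snippet,
    "feedback": "",
    "is_clear": False,
    "review_history": [],
    "attempt_count": 0
}

# With stream_mode="updates", each chunk maps the node that just ran to the values it returned
for chunk in app.stream(initial_state, stream_mode="updates"):
    for node_name, update in chunk.items():
        print(f"{node_name} ran; attempts so far: {update.get('attempt_count')}")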

Step 6: Simulate an Interactive Review

To make the agent interactive, create a loop for multiple code submissions:

# Initialize state
state = {
    "code_snippet": "",
    "feedback": "",
    "is_clear": False,
    "review_history": [],
    "attempt_count": 0
}

# Interactive loop
print("Welcome to the Code Review Agent! Submit your code or type 'exit' to quit.")
while True:
    user_input = input("Your code (or 'exit'): ")
    if user_input.lower() in ["exit", "quit"]:
        break
    state["code_snippet"] = user_input
    result = app.invoke(state)
    print("Feedback:", result["feedback"])
    if result["is_clear"]:
        print("Feedback is clear! Submit new code or exit.")
    elif result["attempt_count"] >= 3:
        print("Max attempts reached. Please revise and resubmit.")
    state = result  # Update state with new history

Example Interaction:

Welcome to the Code Review Agent! Submit your code or type 'exit' to quit.
Your code (or 'exit'): def add_numbers(a, b): result = a + b; print(result)
Feedback: The function should include a return statement like `return result` to make it reusable, and consider adding input validation for `a` and `b`.
Feedback is clear! Submit new code or exit.
Your code (or 'exit'): exit

What’s Happening?

  • The state persists the review_history, enabling context-aware feedback.
  • Nodes process the code, generate feedback, check clarity, and decide next steps.
  • Edges create a flow that loops back if feedback is unclear, up to three attempts.
  • The workflow is robust, with logging and error handling for reliability.

For more on dynamic flows, see Looping and Branching.


Debugging Common Issues

If the agent encounters issues, try these debugging tips (a snippet for enabling debug logs follows the list):

  • No Feedback: Verify the OPENAI_API_KEY is set. See Security and API Keys.
  • Infinite Loop: Check attempt_count in decide_next to ensure the loop limit is enforced. See Graph Debugging.
  • Missing History: Log review_history in process_code to confirm messages are added.
  • Unclear Feedback: Refine the prompt in review_code or adjust clarity criteria in check_clarity.
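
Since the nodes already log at INFO and DEBUG levels, one quick way to trace a misbehaving run is to turn on DEBUG output before invoking the graph:

import logging

# force=True (Python 3.8+) overrides the earlier INFO-level configuration
logging.basicConfig(level=logging.DEBUG, force=True)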

Enhancing the Code Review Agent

Extend the agent with LangChain features.

For example, add a node to search for coding best practices with Web Research Chain.
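
As a rough sketch of what that might look like, here’s a node that enriches the feedback with best-practice pointers. fetch_best_practices is a hypothetical placeholder you would replace with a real research chain or search tool:

def fetch_best_practices(code_snippet: str) -> str:
    """Hypothetical placeholder: swap in a real web research chain or search tool."""
    return "Follow PEP 8 naming, add type hints, and return values instead of printing."

def add_best_practices(state: State) -> State:
    # Append general best-practice pointers to the existing feedback
    tips = fetch_best_practices(state["code_snippet"])
    state["feedback"] = f"{state['feedback']}\nBest practices to consider: {tips}"
    return state

# Wire it in between review and clarity checking, replacing the direct edge:
# graph.add_node("add_best_practices", add_best_practices)
# graph.add_edge("review_code", "add_best_practices")
# graph.add_edge("add_best_practices", "check_clarity")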

To deploy the agent as an API, see Deploying Graphs.


Best Practices for Code Review Agents

  • Focused Nodes: Each node should handle one task (e.g., input, review, clarity check). See Workflow Design.
  • Robust State: Validate state data to avoid errors (see the sketch after this list). Check State Management.
  • Clear Logging: Use logging to trace issues. See Graph Debugging.
  • Limit Retries: Cap attempts to prevent endless loops. Check Looping and Branching.
  • Test Scenarios: Try various code snippets to ensure robust feedback. See Best Practices.
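
For the “Robust State” point, here’s a small, hedged sketch of validating the initial state before invoking the graph (the helper name is illustrative):

REQUIRED_KEYS = {"code_snippet", "feedback", "is_clear", "review_history", "attempt_count"}

def validate_initial_state(state: dict) -> None:
    """Raise early if the state dict is missing keys or the snippet is empty."""
    missing = REQUIRED_KEYS - state.keys()
    if missing:
        raise ValueError(f"State is missing keys: {sorted(missing)}")
    if not state["code_snippet"].strip():
        raise ValueError("code_snippet must not be empty")

# Usage: validate_initial_state(initial_state) before calling app.invoke(initial_state)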

Conclusion

Building a code review agent with LangGraph is a fantastic way to leverage stateful, graph-based workflows for practical AI applications. By structuring the review process with nodes, edges, and a persistent state, you’ve created an AI that analyzes code, provides actionable feedback, and refines its suggestions intelligently. This example is a foundation for more advanced agents with tools, dynamic decisions, or cloud deployment.

To begin, follow Install and Setup and try this code review agent. For more, explore Core Concepts or simpler projects like Simple Chatbot Example. For inspiration, check real-world applications at Best LangGraph Uses. With LangGraph, your code review agent is ready to polish code and impress developers!

External Resources: