Callbacks in LangChain: Taking Control of Your AI Workflows

When you’re building an AI app with LangChain—say, a chatbot pulling answers from a PDF or a system summarizing web data—it’s easy to get caught up in the excitement of seeing it work. But what if you need to know what’s happening behind the scenes? How long is each step taking? Are errors sneaking in? Could you tweak the output or log key details for later? Manually sifting through code or logs is a hassle. That’s where callbacks in LangChain come in, acting like a backstage pass to monitor, debug, and customize your app’s workflow in real-time.

This guide, part of the LangChain Fundamentals series, explains what callbacks are, how they function, and the different types you can use to make your AI projects smarter. We’ll also walk through a practical example to show callbacks in action, with clear, human-friendly insights for beginners and developers. Expect a conversational dive into how callbacks can level up your chatbots, document search engines, or customer support bots. Let’s get rolling!

Why Callbacks Are Your New Best Friend

Callbacks are like having a super-smart assistant who keeps an eye on your LangChain app and chimes in at just the right moments. They let you run custom code during specific events in your workflow—when a chain kicks off, an LLM finishes, or something goes wrong. This means you can track performance, catch errors, or even change outputs without messing up your main code.

They’re one of LangChain’s core components, working seamlessly with prompts, chains, agents, memory, tools, and document loaders. Here’s what makes them awesome:

  • Performance Tracking: Measure how long a RetrievalQA chain takes to fetch data or count LLM tokens.
  • Easy Debugging: Spot errors or odd outputs instantly, no log-diving required.
  • Output Tweaking: Modify responses on the fly, like adding extra formatting.
  • External Connections: Send logs to LangSmith for analysis or ping Slack with updates.

Whether you’re fine-tuning a chatbot or debugging a SQL query generator, callbacks give you the control to make your app shine. Curious about the bigger picture? Check the architecture overview or Getting Started.

How Callbacks Fit Into Your Workflow

Callbacks are event-driven hooks that spring into action at key points in your LangChain app’s workflow. They plug into LCEL (LangChain Expression Language), which ties together chains, agents, and other components, and they support both synchronous and asynchronous execution for scalable apps, as explained in performance tuning.

Here’s how they work:

  • Create a Handler: You write a handler—a chunk of code that defines what happens when events like “chain starts” or “LLM errors” occur.
  • Attach It: Plug the handler into your chain, agent, or LLM, so LangChain knows when to trigger it.
  • Catch Events: As your app runs, LangChain calls your handler, passing details like inputs, outputs, or errors.
  • Take Action: Your handler logs data, modifies outputs, or fires off notifications (e.g., to Slack).
  • Move On: The workflow continues, incorporating any changes from your callback.

For example, in a RetrievalQA Chain pulling data from a vector store, a callback can log the retrieval time or LLM token usage.

Callbacks keep your code clean while giving you deep insights and control.
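
To make that pattern concrete, here’s a minimal sketch (assuming an OpenAI key is configured; the handler name and model are just examples): a handler that times each chain run, keyed on the run ID LangChain passes to every event.

from datetime import datetime

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI


class TimingHandler(BaseCallbackHandler):
    """Logs how long each chain run takes."""

    def __init__(self):
        self.starts = {}

    def on_chain_start(self, serialized, inputs, *, run_id, **kwargs):
        # Record when this particular run (identified by run_id) began
        self.starts[run_id] = datetime.now()

    def on_chain_end(self, outputs, *, run_id, **kwargs):
        started = self.starts.pop(run_id, None)
        if started:
            print(f"Chain run took {(datetime.now() - started).total_seconds():.2f}s")


chain = PromptTemplate.from_template("Answer: {query}") | ChatOpenAI(model="gpt-4o-mini")
result = chain.invoke({"query": "What is AI?"}, config={"callbacks": [TimingHandler()]})

Because nested steps (like the prompt) also fire chain events, keying the start times on run_id keeps the timings from overwriting each other.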

The Callback Toolkit: Pick Your Flavor

LangChain offers several callback handlers, each tailored to specific tasks like logging, debugging, or production monitoring. Let’s explore the main ones, their uses, and how to set them up, with examples to make it real.

StdOutCallbackHandler: Your Debugging Buddy

The StdOutCallbackHandler is perfect for quick debugging. It prints event details—like what’s sent to the LLM or what comes back—right to your console, making it a go-to for development. Here’s the rundown:

  • What It Does: Logs events like chain start/end, LLM calls, tool usage, or errors to your terminal, giving you a live view of your app’s activity.
  • When to Use It: Debugging a chatbot, checking inputs in a document QA chain, or monitoring document loaders pulling PDFs.
  • How It Works: Prints messages like “Started chain with input: {query}” as events fire, so you see your app’s activity in real time.
  • Setup: Import and add it to your chain or agent. Here’s a quick example:
from langchain.callbacks import StdOutCallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate

# Simple chain
prompt = PromptTemplate(input_variables=["query"], template="Answer: {query}")
llm = ChatOpenAI(model="gpt-4o-mini")
chain = prompt | llm

# Run with callback
result = chain.invoke({"query": "What is AI?"}, config={"callbacks": [StdOutCallbackHandler()]})
print(result.content)

Sample Console Output (illustrative—the exact wording varies by LangChain version):

[chain/start] Entering Chain with input: {'query': 'What is AI?'}
[llm/start] Entering LLM with input: Answer: What is AI?
[llm/end] LLM completed with output: AI is the development of systems...
[chain/end] Chain completed with output: AI is the development of systems...
AI is the development of systems...
  • Example: You’re building a chatbot and notice the LLM’s output is off. The console logs show the exact input it received, helping you fix a typo in the prompt.

This handler is a no-fuss way to see what’s going on while you’re coding.
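
One practical note: you don’t have to pass the handler on every call. A rough sketch (same toy chain as above) of binding callbacks to a runnable with with_config, so every invocation is logged:

from langchain_core.callbacks import StdOutCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

chain = PromptTemplate.from_template("Answer: {query}") | ChatOpenAI(model="gpt-4o-mini")

# Per-call: the handler only applies to this invocation
chain.invoke({"query": "What is AI?"}, config={"callbacks": [StdOutCallbackHandler()]})

# Bound: every invocation of verbose_chain is logged automatically
verbose_chain = chain.with_config(callbacks=[StdOutCallbackHandler()])
verbose_chain.invoke({"query": "What is AI?"})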

FileCallbackHandler: Logs That Last

For logs you can revisit later—say, to audit a production app or debug an issue after it happens—the FileCallbackHandler saves event details to a file. It’s like a journal for your app’s activities. Here’s how it works:

  • What It Does: Writes logs for chain, LLM, tool, and error events to a file, including timestamps and details.
  • When to Use It: Auditing user interactions in a customer support bot, tracking SQL query generation, or logging web research for compliance.
  • How It Works: Appends entries like “2025-05-14 12:23: Chain ended, output: {result}” to a file, creating a permanent record.
  • Setup: Specify a file path and attach it to your chain. Here’s an example:
from langchain.callbacks import FileCallbackHandler
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate

# FileCallbackHandler appends event logs directly to chain.log
handler = FileCallbackHandler("chain.log")

# Simple chain
prompt = PromptTemplate(input_variables=["query"], template="Answer: {query}")
llm = ChatOpenAI(model="gpt-4o-mini")
chain = prompt | llm

# Run with callback
result = chain.invoke({"query": "What is AI?"}, config={"callbacks": [handler]})
print(result.content)

Sample Log File (chain.log, illustrative):

2025-05-14 12:23:45,123 - INFO - [chain/start] Entering Chain with input: {'query': 'What is AI?'}
2025-05-14 12:23:45,456 - INFO - [llm/start] Entering LLM with input: Answer: What is AI?
2025-05-14 12:23:46,789 - INFO - [llm/end] LLM completed with output: AI is the development of systems...
2025-05-14 12:23:46,790 - INFO - [chain/end] Chain completed with output: AI is the development of systems...
  • Example: Your CRM bot logs every user query and response to a file, so you can review interactions if a customer reports a problem.

This handler is great for keeping a record you can check later, especially in production.
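
Handlers also compose: pass several at once and each receives every event, so you can print to the console while writing the same run to a file. A quick sketch under the same assumptions as above:

from langchain.callbacks import FileCallbackHandler, StdOutCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

chain = PromptTemplate.from_template("Answer: {query}") | ChatOpenAI(model="gpt-4o-mini")

# Both handlers receive every event from this run
result = chain.invoke(
    {"query": "What is AI?"},
    config={"callbacks": [StdOutCallbackHandler(), FileCallbackHandler("chain.log")]},
)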

LangSmith Tracing: Next-Level Insights

For professional-grade monitoring, LangSmith tracing connects your app to LangSmith, a platform that visualizes traces, token usage, and performance metrics. LangChain ships a tracer callback (LangChainTracer) for this, and you can also switch tracing on globally with environment variables. It’s like a control room for your AI. Here’s the breakdown:

  • What It Does: Sends detailed event data—chain paths, token counts, latency, errors—to LangSmith for analysis and visualization.
  • When to Use It: Optimizing RAG apps, tracing agent workflows, or evaluating multi-PDF QA systems in production.
  • How It Works: Captures events like retrieval times or LLM token usage and uploads them to LangSmith, where you can view traces and metrics in a user-friendly dashboard.
  • Setup: Set your LangSmith credentials, then either enable tracing globally via environment variables or attach a LangChainTracer to a specific run. Here’s a snippet:
import os

from langchain_core.prompts import PromptTemplate
from langchain_core.tracers import LangChainTracer
from langchain_openai import ChatOpenAI

# LangSmith credentials (get an API key from smith.langchain.com)
os.environ["LANGCHAIN_API_KEY"] = "your-langsmith-api-key"
# Optional: trace every run automatically, no explicit callback needed
os.environ["LANGCHAIN_TRACING_V2"] = "true"

# Simple chain
prompt = PromptTemplate(input_variables=["query"], template="Answer: {query}")
llm = ChatOpenAI(model="gpt-4o-mini")
chain = prompt | llm

# Run with an explicit LangSmith tracer, sending the trace to a named project
handler = LangChainTracer(project_name="callbacks-demo")
result = chain.invoke({"query": "What is AI?"}, config={"callbacks": [handler]})
print(result.content)

Sample LangSmith Output: In the LangSmith dashboard, you’d see a trace with a timeline of the chain’s execution, showing “LLM call took 0.5s, used 100 prompt tokens, 20 completion tokens” and highlighting any bottlenecks.

LangSmith tracing is a powerhouse for production apps that need deep insights.
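
Because LangSmith indexes tags and metadata, a common pattern is to stamp runs with them so traces are easy to filter in the dashboard. A small sketch reusing the chain from the snippet above (the tag and metadata values here are made up for illustration):

# Tags and metadata are standard RunnableConfig fields and show up on each trace
traced_chain = chain.with_config(
    tags=["qa-bot", "production"],
    metadata={"user_id": "user-123"},
)
traced_chain.invoke({"query": "What is AI?"})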

Custom Callback Handler: Build Your Own Magic

When you need something totally unique—like logging to a database, tweaking outputs, or sending alerts to Slack—a Custom Callback Handler lets you write your own rules. Here’s how it goes:

  • What It Does: Runs your custom code for any LangChain event, from chain starts to LLM errors, giving you full control.
  • When to Use It: Custom logging for data cleaning agents, modifying outputs in chatbots, or integrating with systems like MongoDB Atlas.
  • How It Works: You create a class inheriting from BaseCallbackHandler, define methods for events you care about (e.g., on_chain_start, on_llm_error), and LangChain calls them with event data.
  • Setup: Write your handler and attach it to your chain. Here’s an example that logs timing and errors to a file and sends error alerts to Slack:
from langchain_core.callbacks import BaseCallbackHandler
from datetime import datetime
import logging
import requests

logging.basicConfig(filename="custom.log", level=logging.INFO)

class CustomCallbackHandler(BaseCallbackHandler):
    def __init__(self, slack_webhook=None):
        self.slack_webhook = slack_webhook
        self.start_time = None

    def on_chain_start(self, serialized, inputs, **kwargs):
        self.start_time = datetime.now()
        logging.info(f"{self.start_time}: Chain started with input: {inputs}")

    def on_chain_end(self, outputs, **kwargs):
        end_time = datetime.now()
        duration = (end_time - self.start_time).total_seconds()
        logging.info(f"{end_time}: Chain ended with output: {outputs}, took {duration}s")

    def on_llm_end(self, response, **kwargs):
        # llm_output can be None for some models, so guard before reading token usage
        tokens = (response.llm_output or {}).get("token_usage", {})
        logging.info(f"{datetime.now()}: LLM completed with tokens: {tokens}")

    def on_llm_error(self, error, **kwargs):
        error_msg = f"{datetime.now()}: LLM error: {str(error)}"
        logging.error(error_msg)
        if self.slack_webhook:
            requests.post(self.slack_webhook, json={"text": error_msg}, timeout=10)

# Example usage
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate

# Simple chain
prompt = PromptTemplate(input_variables=["query"], template="Answer: {query}")
llm = ChatOpenAI(model="gpt-4o-mini")
chain = prompt | llm

# Run with custom callback
handler = CustomCallbackHandler(slack_webhook="https://hooks.slack.com/services/your/webhook")
result = chain.invoke({"query": "What is AI?"}, config={"callbacks": [handler]})
print(result.content)

Sample Log File (custom.log):

2025-05-14 12:23:45,123 - INFO - 2025-05-14 12:23:45.123456: Chain started with input: {'query': 'What is AI?'}
2025-05-14 12:23:45,456 - INFO - 2025-05-14 12:23:45.456789: LLM completed with tokens: {'prompt_tokens': 100, 'completion_tokens': 20}
2025-05-14 12:23:46,789 - INFO - 2025-05-14 12:23:46.789012: Chain ended with output: AI is the development..., took 1.665556s

Sample Slack Notification (if error occurs):

2025-05-14 12:23:47: LLM error: API timeout
  • Example: Your code review agent logs execution times to a file and sends error alerts to Slack, so your team knows instantly if something breaks.

Custom handlers let you tailor callbacks to your app’s specific needs, offering endless possibilities.
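
Another popular custom-handler pattern is streaming: on_llm_new_token fires for every token the model emits, so you can surface partial output as it arrives. A minimal sketch, assuming a streaming-capable chat model:

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI


class StreamPrinter(BaseCallbackHandler):
    def on_llm_new_token(self, token, **kwargs):
        # Called once per generated token when streaming is enabled
        print(token, end="", flush=True)


llm = ChatOpenAI(model="gpt-4o-mini", streaming=True)
chain = PromptTemplate.from_template("Answer: {query}") | llm
chain.invoke({"query": "What is AI?"}, config={"callbacks": [StreamPrinter()]})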

Let’s Build It: A Document QA System with Callbacks

To see callbacks in action, let’s create a question-answering system that loads a PDF, uses a RetrievalQA Chain to answer questions, and logs events with a custom callback handler. The handler will track timing, token usage, document retrieval, and errors, saving logs to a file and sending error alerts to a Slack webhook.

Step 1: Get Your Environment Ready

Set up your system as outlined in Environment Setup. Install the required packages:

pip install langchain langchain-community langchain-openai faiss-cpu pypdf requests

Securely set your OpenAI API key, following security and API key management. For this example, assume you have a PDF named “policy.pdf” (e.g., a company handbook).

Step 2: Create the Custom Callback Handler

Here’s our custom handler, logging chain start/end times, retriever document counts, LLM token usage, and errors to a file, with error notifications to a Slack webhook:

from langchain_core.callbacks import BaseCallbackHandler
from datetime import datetime
import logging
import requests

logging.basicConfig(filename="qa_chain.log", level=logging.INFO)

class CustomQACallbackHandler(BaseCallbackHandler):
    def __init__(self, slack_webhook=None):
        self.slack_webhook = slack_webhook
        self.start_time = None

    def on_chain_start(self, serialized, inputs, **kwargs):
        self.start_time = datetime.now()
        logging.info(f"{self.start_time}: QA Chain started with input: {inputs}")

    def on_chain_end(self, outputs, **kwargs):
        end_time = datetime.now()
        duration = (end_time - self.start_time).total_seconds()
        logging.info(f"{end_time}: QA Chain ended with output: {outputs}, took {duration}s")

    def on_retriever_end(self, documents, **kwargs):
        logging.info(f"{datetime.now()}: Retriever returned {len(documents)} documents")

    def on_llm_end(self, response, **kwargs):
        # llm_output can be None for some models, so guard before reading token usage
        tokens = (response.llm_output or {}).get("token_usage", {})
        logging.info(f"{datetime.now()}: LLM completed with tokens: {tokens}")

    def on_llm_error(self, error, **kwargs):
        error_msg = f"{datetime.now()}: LLM error: {str(error)}"
        logging.error(error_msg)
        if self.slack_webhook:
            requests.post(self.slack_webhook, json={"text": error_msg}, timeout=10)

This handler logs key events to “qa_chain.log” and sends errors to Slack.

Step 3: Load the PDF Document

Use PyPDFLoader to load the PDF:

from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("policy.pdf")
documents = loader.load()

This creates Document objects with page_content (text) and metadata (e.g., {"source": "policy.pdf", "page": 0}—page numbers are 0-indexed).
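
A quick sanity check after loading never hurts—for example, peeking at the first page’s metadata and text:

# Inspect the loaded pages before indexing them
print(len(documents), "pages loaded")
print(documents[0].metadata)            # e.g. {'source': 'policy.pdf', 'page': 0}
print(documents[0].page_content[:200])  # first 200 characters of the first page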

Step 4: Set Up a Vector Store

Store the documents in a FAISS vector store:

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OpenAIEmbeddings()
vector_store = FAISS.from_documents(documents, embeddings)
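
Before wiring everything into a chain, it helps to confirm retrieval works on its own. Retrievers are runnables, so a rough check looks like this (k=3 is an arbitrary choice for the sketch):

# Sanity-check retrieval before building the chain
retriever = vector_store.as_retriever(search_kwargs={"k": 3})
for doc in retriever.invoke("vacation policy"):
    print(doc.metadata.get("page"), doc.page_content[:80])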

Step 5: Define a Prompt Template

Create a Prompt Template to guide the LLM (we’ll refine it with the parser’s format instructions in Step 7):

from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate(
    template="Based on this context: {context}\nAnswer: {question}\nProvide a concise response in JSON format.",
    input_variables=["context", "question"]
)

Step 6: Set Up an Output Parser

Use an Output Parser for structured JSON:

from langchain_core.output_parsers import StructuredOutputParser, ResponseSchema

schemas = [
    ResponseSchema(name="answer", description="The response to the question", type="string")
]
parser = StructuredOutputParser.from_response_schemas(schemas)
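
It’s worth printing what the parser will inject into the prompt, since that text is what actually nudges the LLM toward JSON, and checking that parsing works on a representative reply (the sample string below is made up):

# The instructions substituted into {format_instructions} in Step 7
print(parser.get_format_instructions())

# Parsing a typical LLM reply (models usually wrap the JSON in a ```json code block)
sample_reply = '```json\n{"answer": "Employees receive 15 vacation days annually."}\n```'
print(parser.parse(sample_reply))  # {'answer': 'Employees receive 15 vacation days annually.'}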

Step 7: Build the RetrievalQA Chain

Combine the components into a RetrievalQA Chain with the custom callback attached; the output parser is applied to the chain’s result in Step 8:

from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

prompt = PromptTemplate(
    template="Based on this context: {context}\nAnswer: {question}\n{format_instructions}",
    input_variables=["context", "question"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o-mini"),
    chain_type="stuff",
    retriever=vector_store.as_retriever(),
    chain_type_kwargs={"prompt": prompt},
    callbacks=[CustomQACallbackHandler(slack_webhook="https://hooks.slack.com/services/your/webhook")]
)

Replace the Slack webhook URL with your own (or remove it for testing without Slack).

Step 8: Test the System

Run a question to test the chain and callbacks, then parse the LLM’s reply with the output parser:

result = chain.invoke({"query": "What is the company’s vacation policy?"})
parsed = parser.parse(result["result"])
print(parsed)

Sample Output:

{'answer': 'Employees receive 15 vacation days annually.'}

Check the “qa_chain.log” file for logs:

2025-05-14 12:23:45,123 - INFO - 2025-05-14 12:23:45.123456: QA Chain started with input: {'query': 'What is the company’s vacation policy?'}
2025-05-14 12:23:45,456 - INFO - 2025-05-14 12:23:45.456789: Retriever returned 3 documents
2025-05-14 12:23:46,789 - INFO - 2025-05-14 12:23:46.789012: LLM completed with tokens: {'prompt_tokens': 150, 'completion_tokens': 20}
2025-05-14 12:23:47,012 - INFO - 2025-05-14 12:23:47.012345: QA Chain ended with output: {'answer': 'Employees receive 15 vacation days annually.'}, took 1.888889s

If an error occurs (e.g., LLM timeout), you’d see a log entry and a Slack notification:

2025-05-14 12:23:48,456 - ERROR - 2025-05-14 12:23:48.456789: LLM error: API timeout

Step 9: Debug and Enhance

If logs show issues—like slow retrieval or high token usage—use LangSmith for deeper analysis with prompt debugging or visualizing evaluations. Refine the prompt with few-shot prompting:

# Literal braces in the example must be doubled ({{ }}) so the template doesn't treat them as variables
prompt = PromptTemplate(
    template="Based on this context: {context}\nAnswer: {question}\nExamples:\nQuestion: What is the dress code? -> {{'answer': 'Business casual'}}\nProvide a concise response in JSON format.\n{format_instructions}",
    input_variables=["context", "question"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

For issues, check troubleshooting. Enhance with memory for conversational flows or deploy as a Flask API.
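
If you take the Flask route mentioned above, the chain and its callbacks drop straight into a request handler. A minimal sketch, assuming the chain and parser from Steps 6–7 are already built (the /ask endpoint and port are illustrative):

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/ask", methods=["POST"])
def ask():
    # Expects a JSON body like {"query": "What is the company's vacation policy?"}
    question = request.get_json().get("query", "")
    result = chain.invoke({"query": question})  # callbacks from Step 7 still fire
    return jsonify(parser.parse(result["result"]))

if __name__ == "__main__":
    app.run(port=5000)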

Tips to Make Callbacks Work for You

  • Choose Wisely: Use StdOutCallbackHandler for dev testing, FileCallbackHandler for persistent logs, LangSmith tracing for production, or a custom handler for unique tasks.
  • Log Smart: Focus on key events (e.g., chain start, LLM tokens) to keep logs manageable, aiding dataflow visualization.
  • Debug Early: Pair callbacks with LangSmith for testing prompts to catch issues fast.
  • Stay Fast: Use asynchronous execution to keep callbacks from slowing your app (see the async sketch after these tips).
  • Secure Logs: Protect sensitive data in logs, following security and API key management.

These tips support enterprise-ready applications and workflow design patterns.
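
On the “stay fast” tip: for async apps, subclass AsyncCallbackHandler and call the chain with ainvoke so logging or webhook calls don’t block the event loop. A minimal sketch:

import asyncio

from langchain_core.callbacks import AsyncCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI


class AsyncLogger(AsyncCallbackHandler):
    async def on_chain_start(self, serialized, inputs, **kwargs):
        print(f"Chain started with input: {inputs}")

    async def on_chain_end(self, outputs, **kwargs):
        print("Chain finished")


async def main():
    chain = PromptTemplate.from_template("Answer: {query}") | ChatOpenAI(model="gpt-4o-mini")
    result = await chain.ainvoke(
        {"query": "What is AI?"}, config={"callbacks": [AsyncLogger()]}
    )
    print(result.content)


asyncio.run(main())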

Keep Exploring with Callbacks

To take your callback skills further, experiment with combining handlers, turn on LangSmith tracing for a real project, and work through the tutorials linked in the wrap-up below.

Wrap-Up: Callbacks Unlock Your AI’s Potential

Callbacks in LangChain—whether it’s the quick StdOutCallbackHandler, the durable FileCallbackHandler, pro-grade LangSmith tracing, or a custom creation—are your key to smarter, more reliable AI apps. They let you monitor, debug, and customize workflows with ease, making your chains, agents, and document loaders work like a well-oiled machine. From logging a chatbot’s every step to optimizing a RAG app, callbacks put you in the driver’s seat.

Start with the document QA example, play with tutorials like Build a Chatbot or Create RAG App, and share your projects with the AI Developer Community or on X with #LangChainTutorial. For more, check the LangChain Documentation and keep building awesome AI!