Human-in-the-Loop AI: Interrupt Patterns with LangGraph

How to pause an agent mid-graph, hand control to a human, and resume cleanly — without losing the thread.

Fully autonomous agents are the goal. They are not always the right tool.

Some actions are irreversible. Sending an email, placing an order, deleting a record, deploying to production — these are steps where a wrong decision has real consequences. Adding a human checkpoint before the action is not a failure of the agentic system. It is responsible system design.

LangGraph has first-class support for human-in-the-loop (HITL) through its interrupt mechanism. The graph pauses, control passes to a human, and the graph resumes with their input — without losing any state.

The interrupt model

  Agent drafts email
          │
          ▼
  ┌───────────────────┐
  │  PAUSE FOR REVIEW │  ◄── state checkpointed
  └────────┬──────────┘
           │
     ┌─────┴──────┐
     │            │
     ▼            ▼
  APPROVE       REJECT / EDIT
     │            │
     ▼            ▼
  Send email   Revise draft
                  │
                  ▼
           PAUSE FOR REVIEW  (loop)

The key property: the graph state is fully preserved during the pause. When the human responds, the graph picks up at the paused node with that state intact. No state is lost, and nodes that already completed are not re-run.

Setting up the interrupt

First, you need a checkpointer. Without one, the graph has nothing to resume from.

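import sqlite3
from typing import Literal, TypedDict

from langgraph.checkpoint.sqlite import SqliteSaver

class EmailAgentState(TypedDict):
    task: str
    draft: str
    human_decision: Literal["approve", "reject", "edit"] | None
    human_feedback: str | None
    final_email: str | None
    sent: bool

# In recent langgraph-checkpoint-sqlite releases, SqliteSaver.from_conn_string()
# returns a context manager, so build the saver from a connection instead.
# check_same_thread=False lets a web server's worker threads share the connection.
conn = sqlite3.connect("./hitl_agent.db", check_same_thread=False)
checkpointer = SqliteSaver(conn)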

Writing the interrupt node

from langgraph.types import interrupt
 
def human_review_node(state: EmailAgentState) -> EmailAgentState:
    draft = state["draft"]
 
    # This call pauses the graph and surfaces the draft to the caller
    human_input = interrupt({
        "message": "Please review this email draft before it is sent.",
        "draft": draft,
        "options": ["approve", "reject", "edit"],
    })
 
    # On resume, the node re-runs from its first line and interrupt()
    # returns the human's response instead of pausing again
    return {
        "human_decision": human_input["decision"],
        "human_feedback": human_input.get("feedback"),
    }

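interrupt() does not return None and carry on. The first time it is called, it suspends the node (internally, by raising a special exception that LangGraph catches), and the checkpointer keeps the graph state saved up to that point. When you call graph.invoke() again with the same thread_id and a Command(resume=...), LangGraph re-runs the interrupted node from its first line, and this time interrupt() returns the resume value. Anything the node does before the interrupt() call therefore runs twice, so keep side effects after the call or make them idempotent.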
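The re-run behaviour is easy to miss, so here is a toy model of it in plain Python. This is a sketch for building intuition, not LangGraph's implementation: `Pause`, `toy_interrupt`, `review_node`, and `run` are all hypothetical names, and the real library keeps the resume value in the checkpoint rather than in a module-level variable.

```python
# Toy model of pause-and-resume. NOT LangGraph's implementation.

class Pause(Exception):
    """Raised by toy_interrupt() to unwind the node and surface a payload."""

    def __init__(self, payload):
        super().__init__("paused")
        self.payload = payload


_NO_ANSWER = object()          # sentinel: the human has not answered yet
_resume_value = _NO_ANSWER     # set by run() when resuming
side_effects = []              # records work done before the pause point


def toy_interrupt(payload):
    # First pass: no answer yet, so unwind the node.
    if _resume_value is _NO_ANSWER:
        raise Pause(payload)
    # Second pass: hand the human's answer back as the return value.
    return _resume_value


def review_node(state):
    side_effects.append("logged draft")   # runs on EVERY pass through the node
    answer = toy_interrupt({"draft": state["draft"]})
    return {"human_decision": answer["decision"]}


def run(state, resume=_NO_ANSWER):
    """Run the node once; return either a paused or a completed state."""
    global _resume_value
    _resume_value = resume
    try:
        return {**state, **review_node(state)}
    except Pause as pause:
        return {**state, "__interrupt__": pause.payload}


state = {"draft": "Hi Sam, following up on the proposal."}
paused = run(state)                                   # pauses at toy_interrupt()
done = run(state, resume={"decision": "approve"})     # node re-runs from the top
```

After both calls, `side_effects` holds two entries: the line before the pause point ran once per pass. In a real node, that line could be a log write or an API call, which is why such work belongs after interrupt() or in a separate node.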

The full agent graph

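from langgraph.graph import StateGraph, END
from langgraph.types import Command

# llm: a LangChain chat model instance, assumed to be configured elsewhere
# in the application before these nodes run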
 
def draft_email_node(state: EmailAgentState) -> EmailAgentState:
    prompt = f"Draft a professional email for this task: {state['task']}"
    response = llm.invoke(prompt)
    return {"draft": response.content}
 
def route_after_review(state: EmailAgentState) -> str:
    decision = state.get("human_decision")
    if decision == "approve":
        return "send"
    if decision == "edit":
        return "revise"
    # "reject" (or any unrecognised value) ends the run without sending
    return "cancel"
 
def revise_draft_node(state: EmailAgentState) -> EmailAgentState:
    # The key exists in state but may hold None, so a .get() default is not enough
    feedback = state.get("human_feedback") or ""
    prompt = f"Revise this email based on feedback.\n\nOriginal:\n{state['draft']}\n\nFeedback: {feedback}"
    response = llm.invoke(prompt)
    return {"draft": response.content, "human_decision": None}
 
def send_email_node(state: EmailAgentState) -> EmailAgentState:
    # email_service.send(state["draft"])
    return {"final_email": state["draft"], "sent": True}
 
def cancel_node(state: EmailAgentState) -> EmailAgentState:
    return {"sent": False}
 
builder = StateGraph(EmailAgentState)
builder.add_node("draft", draft_email_node)
builder.add_node("human_review", human_review_node)
builder.add_node("revise", revise_draft_node)
builder.add_node("send", send_email_node)
builder.add_node("cancel", cancel_node)
 
builder.set_entry_point("draft")
builder.add_edge("draft", "human_review")
 
builder.add_conditional_edges("human_review", route_after_review, {
    "send": "send",
    "revise": "revise",
    "cancel": "cancel",
})
 
builder.add_edge("revise", "human_review")  # loop back for re-review
builder.add_edge("send", END)
builder.add_edge("cancel", END)
 
graph = builder.compile(checkpointer=checkpointer)

Running the graph with interrupts

The first invocation runs until the interrupt, then stops.

config = {"configurable": {"thread_id": "email-task-001"}}
 
# Phase 1: run until the interrupt
initial_state = {
    "task": "Follow up with the client about the Q2 proposal",
    "draft": "",
    "human_decision": None,
    "human_feedback": None,
    "final_email": None,
    "sent": False,
}
 
result = graph.invoke(initial_state, config=config)
# result["__interrupt__"] contains the interrupt payload
print(result["__interrupt__"][0].value["draft"])

After the human reviews:

from langgraph.types import Command
 
# Phase 2: resume with human decision
resume_command = Command(resume={
    "decision": "edit",
    "feedback": "The tone is too formal. Make it more conversational and mention the specific proposal date (March 15).",
})
 
result = graph.invoke(resume_command, config=config)
# Graph continues: revise → human_review (interrupt again)
 
print(result["__interrupt__"][0].value["draft"])
 
# Phase 3: approve the revised draft
result = graph.invoke(Command(resume={"decision": "approve"}), config=config)
print(result["sent"])  # True

Building the human-facing interface

In a web application, the interrupt payload becomes an API response. The frontend polls or uses websockets to detect the pause state.

# FastAPI endpoint example
from fastapi import FastAPI
from langgraph.types import Command

app = FastAPI()

# Plain `def` handlers: graph.invoke() is blocking, and FastAPI runs sync
# handlers in a threadpool, so the event loop stays free.
@app.post("/agent/start")
def start_task(task: str, thread_id: str):
    config = {"configurable": {"thread_id": thread_id}}
    # Same initial state as in the earlier run, with the task filled in
    initial_state = {
        "task": task,
        "draft": "",
        "human_decision": None,
        "human_feedback": None,
        "final_email": None,
        "sent": False,
    }
    result = graph.invoke(initial_state, config=config)

    if "__interrupt__" in result:
        return {
            "status": "awaiting_review",
            "thread_id": thread_id,
            "draft": result["__interrupt__"][0].value["draft"],
        }
    return {"status": "complete", "sent": result.get("sent")}

@app.post("/agent/respond")
def human_respond(thread_id: str, decision: str, feedback: str = ""):
    config = {"configurable": {"thread_id": thread_id}}
    resume = Command(resume={"decision": decision, "feedback": feedback})
    result = graph.invoke(resume, config=config)

    if "__interrupt__" in result:
        return {
            "status": "awaiting_review",
            "thread_id": thread_id,
            "draft": result["__interrupt__"][0].value["draft"],
        }
    return {"status": "complete", "sent": result.get("sent")}
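Both endpoints turn a graph result into the same two response shapes, and both accept any string as a decision. A small pair of helpers can centralise that. The sketch below is an assumption about how you might structure it: `to_api_response`, `parse_decision`, and `ALLOWED_DECISIONS` are hypothetical names, and the `SimpleNamespace` objects only stand in for the interrupt objects a real graph result would carry.

```python
from types import SimpleNamespace
from typing import Any

ALLOWED_DECISIONS = ("approve", "reject", "edit")


def parse_decision(raw: str) -> str:
    """Normalise a reviewer's decision and reject anything unexpected."""
    decision = raw.strip().lower()
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"decision must be one of {ALLOWED_DECISIONS}, got {raw!r}")
    return decision


def to_api_response(result: dict[str, Any], thread_id: str) -> dict[str, Any]:
    """Map a graph result to the JSON body returned to the frontend."""
    interrupts = result.get("__interrupt__")
    if interrupts:
        payload = interrupts[0].value
        return {
            "status": "awaiting_review",
            "thread_id": thread_id,
            "draft": payload["draft"],
        }
    return {
        "status": "complete",
        "thread_id": thread_id,
        "sent": bool(result.get("sent")),
    }


# Stand-ins for what graph.invoke() returns in each situation
paused_result = {"__interrupt__": [SimpleNamespace(value={"draft": "Hi Sam"})]}
finished_result = {"sent": True, "final_email": "Hi Sam"}

paused_body = to_api_response(paused_result, "email-task-001")
finished_body = to_api_response(finished_result, "email-task-001")
```

Validating the decision at the API boundary matters here because the router treats every unrecognised value as a cancel. A typo such as "aprove" would otherwise silently end the run.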

When HITL hurts: the latency cost

Every interrupt adds latency equal to the human response time. For an approval workflow where a human checks a dashboard every 30 minutes, that is up to 30 minutes of blocked state per review step.

Design HITL into your system only where the action is:

  1. Irreversible — sending, deleting, deploying, publishing
  2. High-stakes — financial transactions, customer-facing communications, infrastructure changes
  3. Low-frequency — a step that happens once per workflow, not in a loop

Do not add HITL to tool calls that are read-only, easily reversible, or happening hundreds of times per minute. The human becomes the bottleneck and the agent loses all its speed advantages.
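One way to keep these criteria from drifting is to encode them as an explicit gate that every tool passes through. The sketch below is one possible reading of the list above, assuming review is warranted when an action is low-frequency and either irreversible or high-stakes. `ActionProfile`, `needs_human_review`, and `worst_case_wait_minutes` are hypothetical names, and the threshold is a placeholder to tune.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionProfile:
    name: str
    irreversible: bool
    high_stakes: bool
    calls_per_workflow: int   # how often the action fires in one run


def needs_human_review(action: ActionProfile, max_calls_per_workflow: int = 1) -> bool:
    """Gate an action behind review only when a human can keep up with it."""
    if action.calls_per_workflow > max_calls_per_workflow:
        # Too frequent for a person to review call by call. If it is also
        # risky, batch it into a single reviewed step instead.
        return False
    return action.irreversible or action.high_stakes


def worst_case_wait_minutes(review_steps: int, check_interval_minutes: float) -> float:
    """Upper bound on added latency if the reviewer polls at a fixed interval."""
    return review_steps * check_interval_minutes


send_email = ActionProfile("send_email", irreversible=True, high_stakes=True, calls_per_workflow=1)
search_docs = ActionProfile("search_docs", irreversible=False, high_stakes=False, calls_per_workflow=40)
delete_row = ActionProfile("delete_row", irreversible=True, high_stakes=False, calls_per_workflow=500)

gated = [a.name for a in (send_email, search_docs, delete_row) if needs_human_review(a)]
```

With these profiles, only `send_email` is gated. A workflow with three review steps and a reviewer who checks every 30 minutes could wait up to `worst_case_wait_minutes(3, 30)` minutes in total.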

Timeout handling for abandoned reviews

What if the human never responds? Add a timeout that routes the graph to a sensible default.

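import asyncio

from langgraph.types import Command

# asyncio.timeout() requires Python 3.11+. graph.ainvoke() also needs an
# async-capable checkpointer (for SQLite, AsyncSqliteSaver rather than SqliteSaver).
async def run_with_timeout(thread_id: str, timeout_seconds: int = 3600):
    config = {"configurable": {"thread_id": thread_id}}

    try:
        # Only the wait is time-limited. Resuming happens outside the timeout,
        # so a slow resume cannot be cancelled and then resumed a second time.
        async with asyncio.timeout(timeout_seconds):
            # wait_for_human_response is your own helper (websocket or polling)
            decision = await wait_for_human_response(thread_id)
        resume = Command(resume={"decision": decision})
    except asyncio.TimeoutError:
        # Auto-reject after timeout
        resume = Command(resume={"decision": "reject", "feedback": "Auto-rejected due to timeout"})

    return await graph.ainvoke(resume, config=config)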
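The `wait_for_human_response` helper is left to the application. As a rough sketch of what it could look like inside a single process, the class below parks one `asyncio.Future` per thread and lets an API handler resolve it. `ReviewInbox` and its methods are hypothetical, it folds the auto-reject default into the wait itself, and a multi-process deployment would need a shared store or queue instead of an in-memory dict.

```python
import asyncio


class ReviewInbox:
    """In-memory mailbox: one pending Future per thread_id (single process only)."""

    def __init__(self):
        self._pending: dict[str, asyncio.Future] = {}

    def _future_for(self, thread_id: str) -> asyncio.Future:
        # Create the Future lazily so submit() may arrive before or after the wait.
        fut = self._pending.get(thread_id)
        if fut is None:
            fut = asyncio.get_running_loop().create_future()
            self._pending[thread_id] = fut
        return fut

    async def wait_for_human_response(self, thread_id: str, timeout_seconds: float) -> dict:
        fut = self._future_for(thread_id)
        try:
            return await asyncio.wait_for(fut, timeout_seconds)
        except asyncio.TimeoutError:
            # Sensible default for an abandoned review
            return {"decision": "reject", "feedback": "Auto-rejected due to timeout"}
        finally:
            self._pending.pop(thread_id, None)

    def submit(self, thread_id: str, decision: str, feedback: str = "") -> None:
        """Called from the API handler when the reviewer answers."""
        fut = self._future_for(thread_id)
        if not fut.done():
            fut.set_result({"decision": decision, "feedback": feedback})


async def demo():
    inbox = ReviewInbox()
    loop = asyncio.get_running_loop()
    # Reviewer answers thread t-1 shortly after the wait begins
    loop.call_later(0.01, inbox.submit, "t-1", "approve")
    answered = await inbox.wait_for_human_response("t-1", timeout_seconds=1.0)
    # Nobody answers thread t-2, so the default applies
    abandoned = await inbox.wait_for_human_response("t-2", timeout_seconds=0.05)
    return answered, abandoned


answered, abandoned = asyncio.run(demo())
```

The returned dict already has the shape the review node expects, so it can be passed straight to `Command(resume=...)`.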

The pattern that matters

HITL is not about making agents safer by making them slower. It is about making irreversible actions explicit. The agent does all the work — research, drafting, decision-making — and the human provides the one thing the agent cannot: authorisation.

The interrupt is not a failure of the agentic system. It is a feature. It is the line in your architecture that says: the machine prepares, the human decides, the machine executes.
