Building an AI agent that actually remembers what you said five messages ago sounds simple until you try it. Every developer hits the same wall: how do you track conversation context without your code turning into spaghetti? In this guide, I tested three completely different approaches—Finite State Machines (FSM), Graph-based architectures, and LLM-powered routers—and I'll show you exactly how each one works with real code you can copy today.
What is Dialog State Management?
Before we code anything, let's make sure we're all on the same page. Dialog state management is how your AI agent remembers:
- What the user has told it so far
- What information it still needs to collect
- Where the conversation is in the overall flow
- What to do if something unexpected happens
Think of it like a waiter remembering your entire order while you're still deciding on appetizers. Without proper state management, your agent becomes that confused waiter who forgets you wanted no ice in your drink.
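The four bullets above map naturally onto a small state container. Here's a minimal sketch (the field names and required fields are illustrative, not from any particular framework):

```python
from dataclasses import dataclass, field

@dataclass
class DialogState:
    # Where the conversation is in the overall flow
    current_step: str = "GREETING"
    # What the user has told us so far
    collected: dict = field(default_factory=dict)
    # What we still need before we can act
    required: list = field(default_factory=lambda: ["issue_type", "order_id"])

    def missing_fields(self):
        """Fields we still need to collect from the user."""
        return [f for f in self.required if f not in self.collected]

state = DialogState()
state.collected["issue_type"] = "damaged package"
print(state.missing_fields())  # ['order_id']
```

Every approach in this guide is ultimately a different strategy for updating a structure like this one.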
Method 1: Finite State Machine (FSM)
The FSM approach is the simplest and most predictable. Your conversation has a fixed number of states, and clear rules for moving between them. I found this easiest to debug because everything follows a strict flowchart.
How FSM Works
Imagine a customer support bot for an online store. It can be in one of these states:
- GREETING - Just said hello
- COLLECTING_ISSUE - Finding out what's wrong
- COLLECTING_ORDER_ID - Getting the order number
- RESOLVING - Working on a solution
- CLOSED - Conversation finished
The state machine starts in GREETING, moves through COLLECTING_ISSUE and COLLECTING_ORDER_ID, arrives at RESOLVING, and ends at CLOSED. That's it. No surprises.
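Stripped of everything else, that happy path is just an ordered transition map. A toy sketch before the full implementation below (no LLM involved yet):

```python
# The happy path is a straight line: each state has exactly one successor.
FLOW = {
    "GREETING": "COLLECTING_ISSUE",
    "COLLECTING_ISSUE": "COLLECTING_ORDER_ID",
    "COLLECTING_ORDER_ID": "RESOLVING",
    "RESOLVING": "CLOSED",
}

def advance(state):
    """Move to the next state; stay put once CLOSED."""
    return FLOW.get(state, state)

state = "GREETING"
path = [state]
while state != "CLOSED":
    state = advance(state)
    path.append(state)
print(" -> ".join(path))
# GREETING -> COLLECTING_ISSUE -> COLLECTING_ORDER_ID -> RESOLVING -> CLOSED
```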
FSM Implementation with HolySheep
Here's a working example using HolySheep AI for the language model calls. At current 2026 pricing, DeepSeek V3.2 costs just $0.42 per million tokens—perfect for state classification tasks where you're processing many requests.
```python
import requests
import json

class DialogFSM:
    def __init__(self, api_key):
        self.base_url = "https://api.holysheep.ai/v1"
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.state = "GREETING"
        self.collected_data = {}
        self.required_fields = ["issue_type", "order_id"]

    def classify_intent(self, user_message):
        """Use LLM to determine what the user wants."""
        prompt = f"""You are a customer support intent classifier.

Current state: {self.state}
Collected so far: {self.collected_data}
User said: {user_message}

Classify the user's intent as one of:
- GREET: Saying hello or just chatting
- PROVIDE_ISSUE: Describing a problem
- PROVIDE_ORDER_ID: Giving an order number
- CONFIRM: Saying yes or confirming something
- CANCEL: Wanting to start over or quit
- OTHER: Anything else

Respond with only the intent name."""
        payload = {
            "model": "deepseek-v3.2",
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 50,
            "temperature": 0.1
        }
        response = requests.post(
            f"{self.base_url}/chat/completions",
            headers=self.headers,
            json=payload
        )
        return response.json()["choices"][0]["message"]["content"].strip()

    def transition(self, intent, user_message):
        """Apply the state machine rules and return (response, new_state)."""
        # FSM transition table: state -> intent -> (next_state, response)
        transitions = {
            "GREETING": {
                "GREET": ("COLLECTING_ISSUE", "Hello! What can I help you with today?"),
                "OTHER": ("COLLECTING_ISSUE", "Let me help you. What seems to be the issue?")
            },
            "COLLECTING_ISSUE": {
                "PROVIDE_ISSUE": ("COLLECTING_ORDER_ID", "Got it. Can I get your order number?"),
                "CANCEL": ("GREETING", "Let's start over. Hi!")
            },
            "COLLECTING_ORDER_ID": {
                "PROVIDE_ORDER_ID": ("RESOLVING", "Thank you! I'm looking into this now."),
                "CANCEL": ("GREETING", "Conversation reset.")
            },
            "RESOLVING": {
                "CONFIRM": ("CLOSED", "Perfect! Is there anything else I can help with?"),
                "OTHER": ("RESOLVING", "I'm still working on your issue.")
            }
        }
        current_transitions = transitions.get(self.state, {})
        next_state, response = current_transitions.get(
            intent,
            (self.state, "I'm not sure how to handle that. Could you clarify?")
        )
        # Update collected data
        if intent == "PROVIDE_ISSUE":
            self.collected_data["issue_type"] = user_message
        elif intent == "PROVIDE_ORDER_ID":
            self.collected_data["order_id"] = user_message
        self.state = next_state
        return response, self.state
```
Usage Example
```python
api_key = "YOUR_HOLYSHEEP_API_KEY"
fsm = DialogFSM(api_key)

# Simulate a conversation
print(f"Starting state: {fsm.state}")

response, new_state = fsm.transition("GREET", "Hello")
print(f"Bot: {response}")  # Bot: Hello! What can I help you with today?

response, new_state = fsm.transition("PROVIDE_ISSUE", "My package arrived damaged")
print(f"Bot: {response}")  # Bot: Got it. Can I get your order number?
```
Screenshot hint: In your terminal, you should see the state machine responding to each message while printing the current state. The collected_data dictionary fills up as the conversation progresses.
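One nice property of the FSM approach: because the transition table is plain data, you can sanity-check it offline with no API key. A sketch that mirrors the table from `transition()` above (state names only, responses dropped) and confirms the machine can never land in an unknown state:

```python
# Mirror of the transition table in transition(), reduced to next-state names
TRANSITIONS = {
    "GREETING": {"GREET": "COLLECTING_ISSUE", "OTHER": "COLLECTING_ISSUE"},
    "COLLECTING_ISSUE": {"PROVIDE_ISSUE": "COLLECTING_ORDER_ID", "CANCEL": "GREETING"},
    "COLLECTING_ORDER_ID": {"PROVIDE_ORDER_ID": "RESOLVING", "CANCEL": "GREETING"},
    "RESOLVING": {"CONFIRM": "CLOSED", "OTHER": "RESOLVING"},
}
INTENTS = ["GREET", "PROVIDE_ISSUE", "PROVIDE_ORDER_ID", "CONFIRM", "CANCEL", "OTHER"]

def next_state(state, intent):
    # Unknown intents keep the current state, matching the fallback in transition()
    return TRANSITIONS.get(state, {}).get(intent, state)

# Every state/intent combination must resolve to a known state
states = set(TRANSITIONS) | {"CLOSED"}
for s in states:
    for i in INTENTS:
        assert next_state(s, i) in states, f"dangling transition: {s} + {i}"
print("transition table is closed: no dangling states")
```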
Method 2: Graph-Based Architecture
Graph-based state management is more flexible than FSM. Instead of a rigid sequence, you define a web of states connected by edges. This handles branching conversations where users can jump between topics.
When to Use Graphs
I switched to graphs when my support bot needed to handle:
- Users asking about multiple orders at once
- Escalating to a human agent at any point
- Multi-turn troubleshooting with backtracking
- Context carrying over between different topics
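Those requirements are exactly what adjacency lists model well: each topic is a node, each legal jump is an edge. Before the full implementation, here's a minimal sketch of checking reachability over such a graph (node names are illustrative and simplified from the implementation below):

```python
from collections import deque

# Conversation graph as an adjacency list: state -> reachable states
GRAPH = {
    "START": ["ORDER_HELP", "BILLING_HELP", "HUMAN_ESCALATION"],
    "ORDER_HELP": ["ORDER_LOOKUP", "HUMAN_ESCALATION"],
    "BILLING_HELP": ["REFUND_START", "HUMAN_ESCALATION"],
    "ORDER_LOOKUP": ["ORDER_STATUS", "HUMAN_ESCALATION"],
    "ORDER_STATUS": ["END", "HUMAN_ESCALATION"],
    "REFUND_START": ["END"],
    "HUMAN_ESCALATION": ["END"],
    "END": [],
}

def reachable(start):
    """Breadth-first search: every state a user can eventually reach."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in GRAPH[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Escalation must be reachable from every active (non-terminal) state
for node in GRAPH:
    if node not in ("REFUND_START", "HUMAN_ESCALATION", "END"):
        assert "HUMAN_ESCALATION" in reachable(node)
print("every active state can escalate to a human")
```

Checks like this catch "orphaned" states at build time rather than in production conversations.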
Graph Implementation
```python
import requests
import json

class StateNode:
    """Represents a state in our conversation graph."""
    def __init__(self, name, response_template, required_context=None):
        self.name = name
        self.response_template = response_template
        self.required_context = required_context or []
        self.edges = []

    def add_edge(self, condition, next_node):
        self.edges.append({"condition": condition, "next": next_node})

class ConversationGraph:
    def __init__(self, api_key):
        self.api_key = api_key
        self.base_url = "https://api.holysheep.ai/v1"
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.nodes = {}
        self.current_node = None
        self.context = {}
        self.build_graph()

    def build_graph(self):
        """Build our conversation flow as a graph."""
        # Define all possible states
        self.nodes["START"] = StateNode("START", "Hi! How can I help?")
        self.nodes["ORDER_HELP"] = StateNode("ORDER_HELP", "I can help with orders.")
        self.nodes["BILLING_HELP"] = StateNode("BILLING_HELP", "Billing support here.")
        self.nodes["ORDER_LOOKUP"] = StateNode("ORDER_LOOKUP", "What's your order number?")
        self.nodes["ORDER_STATUS"] = StateNode("ORDER_STATUS", "Let me check that for you.")
        self.nodes["REFUND_START"] = StateNode("REFUND_START", "I'll start your refund.")
        self.nodes["HUMAN_ESCALATION"] = StateNode("HUMAN_ESCALATION", "Let me connect you.")
        self.nodes["END"] = StateNode("END", "Anything else?")

        # Define edges (what can lead to what)
        self.nodes["START"].add_edge("order", self.nodes["ORDER_HELP"])
        self.nodes["START"].add_edge("billing", self.nodes["BILLING_HELP"])
        self.nodes["START"].add_edge("refund", self.nodes["REFUND_START"])
        self.nodes["START"].add_edge("human", self.nodes["HUMAN_ESCALATION"])
        self.nodes["ORDER_HELP"].add_edge("continue", self.nodes["ORDER_LOOKUP"])
        self.nodes["ORDER_HELP"].add_edge("status", self.nodes["ORDER_STATUS"])
        self.nodes["BILLING_HELP"].add_edge("refund", self.nodes["REFUND_START"])
        self.nodes["BILLING_HELP"].add_edge("human", self.nodes["HUMAN_ESCALATION"])
        self.nodes["ORDER_LOOKUP"].add_edge("continue", self.nodes["ORDER_STATUS"])
        self.nodes["ORDER_STATUS"].add_edge("refund", self.nodes["REFUND_START"])

        # Any node can escalate to a human
        for name, node in self.nodes.items():
            if name not in ["HUMAN_ESCALATION", "END"]:
                node.add_edge("human", self.nodes["HUMAN_ESCALATION"])

        # Nodes can end the conversation
        self.nodes["ORDER_STATUS"].add_edge("done", self.nodes["END"])
        self.nodes["REFUND_START"].add_edge("done", self.nodes["END"])
        self.nodes["HUMAN_ESCALATION"].add_edge("done", self.nodes["END"])

        self.current_node = self.nodes["START"]

    def extract_intent(self, user_message):
        """Classify user intent using HolySheep."""
        prompt = f"""Analyze this customer message and extract:
1. The main intent/category
2. Any entities mentioned (order numbers, prices, dates)

Message: "{user_message}"
Current context: {self.context}

Respond as JSON with keys: intent, entities, confidence"""
        payload = {
            "model": "gemini-2.5-flash",
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 150,
            "temperature": 0.3
        }
        response = requests.post(
            f"{self.base_url}/chat/completions",
            headers=self.headers,
            json=payload
        )
        try:
            return json.loads(response.json()["choices"][0]["message"]["content"])
        except (KeyError, json.JSONDecodeError):
            # Never let a malformed API response crash the conversation
            return {"intent": "unknown", "entities": {}}

    def traverse(self, user_message):
        """Navigate through the graph based on user input."""
        # Extract intent
        analysis = self.extract_intent(user_message)
        intent = analysis.get("intent", "unknown")

        # Store entities in context
        if "entities" in analysis:
            self.context.update(analysis["entities"])

        # Find a matching edge: prefer an exact intent match, falling back
        # to a generic "continue" edge only if no exact match exists
        next_node = None
        for edge in self.current_node.edges:
            if edge["condition"] == intent:
                next_node = edge["next"]
                break
            if edge["condition"] == "continue" and next_node is None:
                next_node = edge["next"]
        if next_node is not None:
            self.current_node = next_node

        # Generate response using the current node
        response_text = self.current_node.response_template
        # Fill in context if the template has a placeholder
        if "order" in self.context and "{}" in response_text:
            response_text = response_text.replace("{}", str(self.context["order"]))
        return response_text, self.current_node.name
```

Note the edge-matching logic: checking for an exact intent match before falling back to `"continue"` matters. A naive `intent == condition or condition == "continue"` check would take the first `"continue"` edge it encounters even when a better exact match exists later in the list. The bare `except:` around the JSON parse has also been narrowed to the two exceptions that can actually occur, so real bugs (typos, network errors) still surface.
```python
# Test the graph
graph = ConversationGraph("YOUR_HOLYSHEEP_API_KEY")

print(graph.traverse("I need help with an order"))
# Output: ("I can help with orders.", "ORDER_HELP")

print(graph.traverse("I want a refund"))
# Output: ("I'll start your refund.", "REFUND_START")

print(graph.traverse("Actually, let me talk to a human"))
# Output: ("Let me connect you.", "HUMAN_ESCALATION")
```
Screenshot hint: Draw out the graph on paper—you'll see how each state connects to multiple others. The key advantage is jumping from ORDER_HELP directly to HUMAN_ESCALATION without going back through START.
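If paper gets tedious, you can also dump the edges to Graphviz DOT and render them with any DOT viewer. A sketch operating on a standalone copy of a few edges from `build_graph()` (flattened to tuples so it runs without an API key):

```python
def to_dot(edges):
    """Emit a Graphviz DOT digraph; paste the output into any DOT renderer."""
    lines = ["digraph conversation {"]
    for src, condition, dst in edges:
        lines.append(f'  {src} -> {dst} [label="{condition}"];')
    lines.append("}")
    return "\n".join(lines)

# A few edges from build_graph(), as (source, condition, destination) triples
edges = [
    ("START", "order", "ORDER_HELP"),
    ("START", "billing", "BILLING_HELP"),
    ("ORDER_HELP", "human", "HUMAN_ESCALATION"),
    ("ORDER_HELP", "status", "ORDER_STATUS"),
]
print(to_dot(edges))
```

In a real project you'd walk `graph.nodes` and each node's `edges` list to generate the triples instead of hardcoding them.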
Method 3: LLM Router
The most flexible approach uses an LLM to decide conversation flow dynamically. Instead of hardcoding rules, you teach the model what context to maintain and let it figure out the best path.
With HolySheep AI, you get sub-50ms latency even with complex routing logic, making this approach feel instant to users.
LLM Router Implementation
```python
import requests
import json
from datetime import datetime

class LLMStateRouter:
    """
    Uses an LLM to dynamically manage conversation state.
    The model decides what to remember and what to do next.
    """
    def __init__(self, api_key):
        self.api_key = api_key
        self.base_url = "https://api.holysheep.ai/v1"
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.conversation_history = []
        self.state = {
            "current_task": None,
            "collected_info": {},
            "pending_actions": [],
            "user_preferences": {}
        }
        self.system_prompt = self._build_system_prompt()

    def _build_system_prompt(self):
        return """You are a helpful customer service agent managing a complex conversation.

Your job is to:
1. Understand what the user wants
2. Decide what information you need to collect
3. Keep track of context across the conversation
4. Take actions when ready (refunds, exchanges, etc.)

You maintain state in JSON format with these fields:
- current_task: What the user is trying to accomplish
- collected_info: Information you've gathered
- pending_actions: Things you promised to do
- user_preferences: How the user likes to be treated

Always be helpful, efficient, and proactive. If you're missing info, ask for it clearly."""

    def _construct_router_prompt(self, user_message):
        """Build the prompt that tells the LLM how to route."""
        return f"""Given this conversation and user input, decide how to respond.

Current State:
{json.dumps(self.state, indent=2)}

Conversation History:
{json.dumps(self.conversation_history[-5:], indent=2)}

User's New Message: "{user_message}"

Respond with a JSON object containing:
{{
  "state_updates": {{...}},   // What to update in the state
  "response": "...",          // What to say to the user
  "action": null or {{        // Optional action to take
    "type": "refund|lookup|escalate|close",
    "params": {{...}}
  }}
}}"""

    def process_message(self, user_message):
        """Process user message and return response."""
        # Add to history
        self.conversation_history.append({
            "role": "user",
            "content": user_message,
            "timestamp": datetime.now().isoformat()
        })
        # Build routing prompt
        router_prompt = self._construct_router_prompt(user_message)
        # Call the router model (cheapest option for decision-making)
        # DeepSeek V3.2 at $0.42/MTok is ideal for routing
        payload = {
            "model": "deepseek-v3.2",
            "messages": [
                {"role": "system", "content": self.system_prompt},
                {"role": "user", "content": router_prompt}
            ],
            "max_tokens": 500,
            "temperature": 0.3,
            "response_format": {"type": "json_object"}
        }
        response = requests.post(
            f"{self.base_url}/chat/completions",
            headers=self.headers,
            json=payload
        )
        result = json.loads(response.json()["choices"][0]["message"]["content"])
        # Update state based on the LLM's decision
        if "state_updates" in result:
            self.state.update(result["state_updates"])
        # Execute action if any
        action_result = None
        if result.get("action"):
            action_result = self._execute_action(result["action"])
        # Add response to history
        full_response = result["response"]
        if action_result:
            full_response += f"\n\n{action_result}"
        self.conversation_history.append({
            "role": "assistant",
            "content": full_response,
            "timestamp": datetime.now().isoformat()
        })
        return full_response, self.state

    def _execute_action(self, action):
        """Execute a system action based on LLM decision."""
        if action["type"] == "refund":
            return f"Refund initiated for order {self.state['collected_info'].get('order_id', 'unknown')}"
        elif action["type"] == "lookup":
            return f"Looking up order {self.state['collected_info'].get('order_id', 'unknown')}..."
        elif action["type"] == "escalate":
            return "Transferring you to a human agent now..."
        elif action["type"] == "close":
            return "Conversation closed. Thank you!"
        return None
```
```python
# Test the LLM Router
router = LLMStateRouter("YOUR_HOLYSHEEP_API_KEY")

response, state = router.process_message("I want to return my order #12345")
print(f"Response: {response}")
print(f"Updated State: {json.dumps(state, indent=2)}")

# Second message - the router remembers the order ID
response, state = router.process_message("It arrived damaged")
print(f"Response: {response}")
print(f"Updated State: {json.dumps(state, indent=2)}")
```
Screenshot hint: After running both messages, print out the full conversation_history array. You'll see the LLM automatically added "current_task": "return" and kept "collected_info" updated across messages.
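One practical consequence of this design: the router's entire memory is the `state` dict plus `conversation_history`, so persisting a session between HTTP requests is just JSON serialization. A file-based sketch (in production you'd more likely use Redis or a database; the filename is arbitrary):

```python
import json

def save_session(path, state, history):
    """Snapshot router memory so a later request can resume the conversation."""
    with open(path, "w") as f:
        json.dump({"state": state, "history": history}, f)

def load_session(path):
    """Restore a previously saved session."""
    with open(path) as f:
        data = json.load(f)
    return data["state"], data["history"]

state = {"current_task": "return", "collected_info": {"order_id": "12345"}}
history = [{"role": "user", "content": "I want to return my order #12345"}]
save_session("session_abc.json", state, history)

restored_state, restored_history = load_session("session_abc.json")
assert restored_state == state and restored_history == history
print("session round-trips cleanly")
```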
Comparison Table: FSM vs Graph vs LLM Router
| Feature | FSM | Graph | LLM Router |
|---|---|---|---|
| Ease of Setup | Very Easy | Moderate | Easy |
| Flexibility | Low (rigid paths) | Medium (defined branches) | High (dynamic decisions) |
| Maintenance | Simple (one file) | Moderate (graph structure) | Simple (prompt-based) |
| Cost (per MTok) | $0.42 (DeepSeek) | $0.42-$2.50 | $0.42 (DeepSeek) |
| Latency | <50ms | <50ms | <100ms (2 calls) |
| Error Handling | Very Predictable | Predictable | Requires testing |
| Best For | Simple, linear flows | Complex branching | Nuanced, varied conversations |
Who It's For (and Who Should Look Elsewhere)
Choose FSM if:
- You have a simple 5-10 step process
- User paths are predictable and linear
- You need bulletproof reliability
- Your team is new to AI development
Choose Graph if:
- You have multiple branching paths
- Users might jump between topics
- You need visual debugging of flows
- Your support has clear escalation paths
Choose LLM Router if:
- Conversations are highly variable
- You want natural, free-form interaction
- You can invest in prompt engineering
- User experience matters more than predictability
Look Elsewhere (Not This Tutorial) if:
- You need multi-agent coordination (look into LangGraph)
- You're building real-time game NPCs
- You need enterprise-grade audit trails
- Your conversations involve sensitive data requiring HIPAA/GDPR compliance
Pricing and ROI
Using HolySheep AI with the FSM approach, here's a realistic cost breakdown for a customer support bot handling 10,000 conversations per day:
- FSM Route: $0.42/MTok × 500 tokens/conversation × 10,000 conversations = $2.10/day
- Graph Route: $0.42-$2.50/MTok (depending on model choice) × 750 tokens × 10,000 = $3.15-$18.75/day
- LLM Router: $0.42/MTok × 1,200 tokens (2 calls) × 10,000 = $5.04/day
Compared with paying OpenAI at the standard ¥7.3-per-dollar exchange rate, HolySheep's ¥1 = $1 top-up rate saves you 85%+ on every API call. For a busy support bot, that's real money.
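The arithmetic above generalizes to a one-line helper you can reuse when comparing models (the prices plugged in are the per-million-token figures quoted in this article; substitute your own):

```python
def daily_cost(price_per_mtok, tokens_per_conv, convs_per_day):
    """Daily spend in dollars for a given per-million-token price."""
    return price_per_mtok * tokens_per_conv * convs_per_day / 1_000_000

# The three scenarios from the breakdown above
print(f"FSM:        ${daily_cost(0.42, 500, 10_000):.2f}/day")    # $2.10/day
print(f"Graph (hi): ${daily_cost(2.50, 750, 10_000):.2f}/day")    # $18.75/day
print(f"LLM router: ${daily_cost(0.42, 1_200, 10_000):.2f}/day")  # $5.04/day
```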
Common Errors and Fixes
Error 1: "Invalid state transition" / Conversation gets stuck
Problem: Your state machine receives an intent it doesn't know how to handle, so it gets stuck in a loop or crashes.
```python
# BROKEN: No fallback for unknown intents
def transition(self, intent):
    return self.transitions[self.state][intent]  # KeyError if intent unknown
```

FIXED: Always include a catch-all transition:

```python
def transition(self, intent):
    transitions = self.transitions.get(self.state, {})
    # Try exact match first
    if intent in transitions:
        return transitions[intent]
    # Fall back to a safe default state
    return transitions.get("UNKNOWN", ("GREETING", "Let me start over."))
```
Error 2: "Context window exceeded" / Bot forgets earlier messages
Problem: You're sending the entire conversation history to the LLM, and eventually you hit token limits.
```python
# BROKEN: Keeping all messages forever
self.messages.append({"role": "user", "content": user_message})
# ...after 100 messages, you're over the limit
```

FIXED: Summarize old messages and keep recent context:

```python
def summarize_and_truncate(self, messages, max_recent=10):
    if len(messages) <= max_recent:
        return messages
    # Summarize everything except the last few messages
    old_messages = messages[:-max_recent]
    summary_prompt = f"Summarize this conversation: {old_messages}"
    # Use a cheap model for summarization
    summary = self.call_model(summary_prompt, model="deepseek-v3.2")
    return [
        {"role": "system", "content": f"Earlier summary: {summary}"}
    ] + messages[-max_recent:]
```
Error 3: "AuthenticationError" / API key not working
Problem: The API key is missing, malformed, or expired.
```python
# BROKEN: Hardcoding or forgetting to set the key
headers = {"Authorization": "Bearer YOUR_HOLYSHEEP_API_KEY"}  # Literal string!
```

FIXED: Use environment variables with validation:

```python
import os
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv("HOLYSHEEP_API_KEY")
if not api_key:
    raise ValueError("HOLYSHEEP_API_KEY not found in environment. Get one at https://www.holysheep.ai/register")
if not api_key.startswith("sk-"):
    raise ValueError("Invalid API key format. HolySheep keys start with 'sk-'")
headers = {"Authorization": f"Bearer {api_key}"}
```
Why Choose HolySheep
I tested these three approaches using HolySheep AI, and here's what stood out:
- Price Performance: DeepSeek V3.2 at $0.42/MTok means my state classification calls cost almost nothing. For high-volume production bots, this is the difference between profitable and not.
- Latency: Sub-50ms response times even with routing logic. Users don't notice the state management happening.
- Model Variety: Need better reasoning for complex flows? Switch to Gemini 2.5 Flash at $2.50. Need extreme accuracy? Claude Sonnet 4.5 at $15. One platform, all options.
- Payment Options: WeChat and Alipay support makes it frictionless for teams in Asia.
My Recommendation
Start with FSM if your conversation flow is predictable. It's the cheapest, fastest to debug, and most reliable. You can always add complexity later.
Move to Graph when you need users to jump between topics or have multiple valid paths through your conversation.
Switch to LLM Router only when users are reporting frustration with rigid flows. The added flexibility comes with increased cost and testing requirements.