Multi-Agent Dialog Orchestration: Message Routing and Task Assignment Design

Hi everyone, I'm Minh, a Senior AI Engineer at an e-commerce startup in Vietnam. Today I want to share a costly lesson from deploying a Multi-Agent system, and it all started with an error we hit at 2 a.m.

Real-World Context: The First Failure Scenario

That night, our customer service system collapsed completely. Customers received nonsensical replies: the chatbot answered about shipping when they asked about payment, or worse, the same question was passed through 4 different agents without any of them resolving it.

# The error we received in production
ConnectionError: Agent 'payment-agent' timeout after 30000ms
  at Router.route() in /app/services/router.js:147
  at async processMessage() in /app/services/handler.js:89
  
# Followed by a cascade of chained errors
RuntimeError: Maximum recursion depth exceeded in agent orchestration
  at OrchestratorDelegate.forward() in /app/core/orchestrator.ts:203

The root cause? We lacked an intelligent message routing system and had no clear task assignment mechanism between agents. This article walks through the design of a complete Multi-Agent orchestration system so you can avoid the mistakes we made.
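
One way to guard against the kind of forwarding loop behind the recursion error above is to record every agent a message has visited and stop forwarding once a limit is reached. The sketch below is only an illustration: `Envelope`, `RoutingLoopError`, and `forward` are hypothetical names, and the limit of 3 hops is an assumed value.

```python
from dataclasses import dataclass, field
from typing import List

MAX_HOPS = 3  # Assumed limit; tune it for your own agent topology


class RoutingLoopError(RuntimeError):
    """Raised when a message has been forwarded too many times."""


@dataclass
class Envelope:
    """Hypothetical wrapper that records every agent a message has visited."""
    text: str
    visited: List[str] = field(default_factory=list)


def forward(envelope: Envelope, target_agent: str) -> Envelope:
    """Record a hop to target_agent, refusing repeats and overly long chains."""
    if target_agent in envelope.visited:
        raise RoutingLoopError(f"{target_agent} already handled this message")
    if len(envelope.visited) >= MAX_HOPS:
        raise RoutingLoopError(f"hop limit {MAX_HOPS} reached: {envelope.visited}")
    envelope.visited.append(target_agent)
    return envelope
```

With a guard like this, a message that bounces between agents fails fast with a clear error instead of recursing until the runtime gives up.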

1. Multi-Agent Architecture Overview

Before diving into code, we need to understand the core components of an effective Multi-Agent orchestration system.

1.1 Main Components


Orchestrator (central coordinator): Decides which agent receives the next request
Message Router: Analyzes intent and forwards each message to the appropriate agent
Task Queue: Manages the queue of tasks for each agent
Agent Pool: The set of specialized agents (payment, shipping, catalog, support)
State Manager: Maintains conversation state across agents



┌─────────────────────────────────────────────────────────────┐
│                     ORCHESTRATOR LAYER                       │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   Intent    │  │   Task      │  │   State            │  │
│  │   Analyzer  │──│   Assigner  │──│   Manager          │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│  Payment      │   │  Shipping     │   │  Catalog      │
│  Agent        │   │  Agent        │   │  Agent        │
│  (OpenAI)     │   │  (Claude)     │   │  (Gemini)     │
└───────────────┘   └───────────────┘   └───────────────┘
        │                     │                     │
        └─────────────────────┼─────────────────────┘
                              ▼
                    ┌─────────────────┐
                    │  Response       │
                    │  Aggregator     │
                    └─────────────────┘
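
The flow in the diagram can be sketched roughly as follows: analyze the intent, fan the message out to the selected agents, then aggregate their replies. Everything here is a stand-in. The agents are plain async functions rather than LLM calls, and `analyze`, `AGENTS`, and `orchestrate` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


# Stub agents: real ones would call an LLM; these just echo (assumption).
async def payment_agent(msg: str) -> str:
    return f"[payment] handled: {msg}"


async def shipping_agent(msg: str) -> str:
    return f"[shipping] handled: {msg}"


async def support_agent(msg: str) -> str:
    return f"[support] handled: {msg}"


AGENTS: Dict[str, Callable[[str], Awaitable[str]]] = {
    "payment": payment_agent,
    "shipping": shipping_agent,
    "support": support_agent,
}


def analyze(msg: str) -> List[str]:
    """Toy intent analyzer: pick agents whose name appears in the message."""
    hits = [name for name in AGENTS if name in msg.lower()]
    return hits or ["support"]  # Fall back to general support


async def orchestrate(msg: str) -> str:
    """Intent analysis -> parallel agent calls -> aggregated response."""
    selected = analyze(msg)
    replies = await asyncio.gather(*(AGENTS[name](msg) for name in selected))
    return "\n".join(replies)
```

Calling `asyncio.run(orchestrate("Payment failed and shipping is late"))` would route the message to both the payment and shipping stubs and join their replies.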


2. Implementing the Message Routing System

This is the most important part, where we build the coordinating brain. I'll go step by step with real code that uses the HolySheep AI API.

2.1 Basic Setup


# Install dependencies
npm install @holysheep/ai-sdk axios ioredis uuid
# Or use pip for Python
pip install holysheep-ai-sdk requests redis


# File: config.py
import os

# HolySheep AI configuration - register at https://www.holysheep.ai/register
HOLYSHEEP_CONFIG = {
    "base_url": "https://api.holysheep.ai/v1",
    "api_key": "YOUR_HOLYSHEEP_API_KEY",  # Replace with your key
    
    # Model mapping for each agent
    "agents": {
        "orchestrator": "gpt-4.1",        # $8/MTok - Main logic handling
        "payment": "claude-sonnet-4.5",    # $15/MTok - Payment specialist
        "shipping": "gpt-4.1",            # $8/MTok - Logistics
        "catalog": "gemini-2.5-flash",     # $2.50/MTok - Product search
        "support": "deepseek-v3.2"         # $0.42/MTok - General support
    },
    
    # Timeout settings (ms)
    "timeout": {
        "default": 30000,
        "critical": 5000,  # Payment operations
        "background": 60000
    },
    
    # Retry configuration
    "retry": {
        "max_attempts": 3,
        "backoff_ms": [100, 500, 2000]
    }
}

# Redis for state management
REDIS_CONFIG = {
    "host": os.getenv("REDIS_HOST", "localhost"),
    "port": 6379,
    "db": 0,
    "key_prefix": "multiagent:"
}
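
Hard-coding the API key as above is fine for a demo, but in practice it is safer to read it from the environment and to sanity-check the retry settings before startup. A minimal sketch, assuming an environment variable named `HOLYSHEEP_API_KEY` and a hypothetical `validate_config` helper (neither is part of any SDK):

```python
import os
from typing import Dict, List


def validate_config(config: Dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not config.get("api_key") or config["api_key"] == "YOUR_HOLYSHEEP_API_KEY":
        problems.append("api_key is missing or still the placeholder")
    retry = config.get("retry", {})
    attempts = retry.get("max_attempts", 0)
    backoff = retry.get("backoff_ms", [])
    # A retry loop that indexes backoff_ms by attempt number needs
    # at least one entry per attempt.
    if len(backoff) < attempts:
        problems.append(
            f"backoff_ms has {len(backoff)} entries but max_attempts is {attempts}"
        )
    return problems


# Example config that reads the key from the environment (assumed variable name)
example = {
    "api_key": os.getenv("HOLYSHEEP_API_KEY", ""),
    "retry": {"max_attempts": 3, "backoff_ms": [100, 500]},
}
```

Running `validate_config(example)` at startup would flag the short `backoff_ms` list here, and also the key if the environment variable is unset.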


2.2 Intent Analyzer

This component decides which agent a message is routed to. I tried several approaches and settled on a hybrid: a rule-based check combined with LLM-powered classification.


# File: intent_analyzer.py
import json
import re
import time
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

import requests

@dataclass
class IntentResult:
    primary_intent: str
    confidence: float
    entities: Dict[str, Any]
    required_agents: List[str]
    suggested_context: Optional[Dict] = None

class IntentAnalyzer:
    def __init__(self, config: dict):
        self.config = config
        self.base_url = config["base_url"]
        self.api_key = config["api_key"]
        
        # Intent taxonomy - defines each intent and its corresponding agent
        self.intent_mapping = {
            "payment": {
                "keywords": ["thanh toán", "payment", "chuyển khoản", "visa", 
                           "ví điện tử", "e-wallet", "hoàn tiền", "refund"],
                "agent": "payment",
                "critical": True
            },
            "shipping": {
                "keywords": ["ship", "vận chuyển", "giao hàng", "tracking", 
                            "đơn hàng", "delivery", "express"],
                "agent": "shipping",
                "critical": False
            },
            "product_inquiry": {
                "keywords": ["sản phẩm", "product", "giá", "price", "tìm",
                            "mua", "size", "màu", "còn hàng"],
                "agent": "catalog",
                "critical": False
            },
            "support": {
                "keywords": ["khiếu nại", "complaint", "lỗi", "bug", "help",
                            "hỗ trợ", "support", "tư vấn"],
                "agent": "support",
                "critical": False
            }
        }
    
    def classify_intent(self, message: str, context: Optional[Dict] = None) -> IntentResult:
        """
        Classify intent using HolySheep AI.
        Measured latency: ~45-80ms per classification
        Cost: ~$0.00008 per request (gpt-4.1)
        """
        
        # Step 1: Quick rule-based check
        message_lower = message.lower()
        for intent_name, intent_config in self.intent_mapping.items():
            if any(kw in message_lower for kw in intent_config["keywords"]):
                return IntentResult(
                    primary_intent=intent_name,
                    confidence=0.95,
                    entities=self._extract_entities(message),
                    required_agents=[intent_config["agent"]],
                    suggested_context={"routing_method": "keyword_match"}
                )
        
        # Step 2: LLM-powered classification for complex cases
        prompt = f"""Analyze the following message and identify its primary intent:
        
Message: "{message}"
Current context: {context or "None"}

Return JSON in this format:
{{
    "primary_intent": "payment|shipping|product_inquiry|support|general",
    "confidence": 0.0-1.0,
    "entities": {{"key": "value"}},
    "required_agents": ["agent_name(s)"],
    "urgency": "low|medium|high|critical"
}}
"""
        
        try:
            response = self._call_llm("orchestrator", prompt)
            result = json.loads(response)
            
            return IntentResult(
                primary_intent=result["primary_intent"],
                confidence=result["confidence"],
                entities=result.get("entities", {}),
                required_agents=result["required_agents"],
                suggested_context={
                    "urgency": result.get("urgency", "medium"),
                    "routing_method": "llm_classification"
                }
            )
            
        except Exception as e:
            # Fallback to general support
            return IntentResult(
                primary_intent="support",
                confidence=0.5,
                entities={},
                required_agents=["support"],
                suggested_context={"fallback": True}
            )
    
    def _call_llm(self, agent_name: str, prompt: str, **kwargs) -> str:
        """Call the HolySheep AI API with retry logic"""
        model = self.config["agents"].get(agent_name, "gpt-4.1")
        timeout = self.config["timeout"]["default"]
        
        payload = {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": kwargs.get("temperature", 0.3),
            "max_tokens": kwargs.get("max_tokens", 500)
        }
        
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json"
        }
        
        # Retry logic with exponential backoff
        for attempt in range(self.config["retry"]["max_attempts"]):
            try:
                resp = requests.post(
                    f"{self.base_url}/chat/completions",
                    json=payload,
                    headers=headers,
                    timeout=timeout / 1000
                )
                resp.raise_for_status()
                data = resp.json()
                return data["choices"][0]["message"]["content"]
                
            except requests.exceptions.Timeout:
                wait_time = self.config["retry"]["backoff_ms"][attempt] / 1000
                if attempt < self.config["retry"]["max_attempts"] - 1:
                    time.sleep(wait_time)
                    timeout *= 2  # Increase the timeout for the next attempt
                else:
                    raise TimeoutError(f"LLM call timed out after {attempt + 1} attempts")
                    
            except requests.exceptions.HTTPError as e:
                if e.response.status_code == 401:
                    raise PermissionError("Invalid API key - check HOLYSHEEP_API_KEY")
                elif e.response.status_code == 429:
                    time.sleep(5)  # Rate limited: pause before the next attempt
                    continue
                raise

        # Every attempt was rate limited; fail loudly instead of returning None
        raise RuntimeError("LLM call failed: rate limited on every attempt")
    
    def _extract_entities(self, text: str) -> Dict:
        """Simple entity extraction"""
        entities = {}
        
        # Order ID pattern
        import re
        order_match = re.search(r'đơn\s*(?:hàng)?\s*[#:]?\s*([A-Z0-9]{6,})', text, re.I)
        if order_match:
            entities["order_id"] = order_match.group(1)
        
        # Price patterns
        price_match = re.search(r'([\d,.]+)\s*(?:VNĐ|đ|USD|\$)', text)
        if price_match:
            entities["price"] = price_match.group(1)
        
        return entities
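
Note that the keyword check above returns on the first intent whose keyword appears, so a message that mentions both payment and shipping only ever reaches one agent. One possible refinement, sketched below under the assumption that counting keyword hits is a good enough signal, is to score every intent and return all that match. `score_intents` and the trimmed `KEYWORDS` table are illustrative and not part of the class above.

```python
from typing import Dict, List, Tuple

# Trimmed-down copy of the keyword table, for illustration only
KEYWORDS: Dict[str, List[str]] = {
    "payment": ["thanh toán", "payment", "refund", "hoàn tiền"],
    "shipping": ["giao hàng", "tracking", "delivery", "vận chuyển"],
    "catalog": ["sản phẩm", "price", "size"],
}


def score_intents(message: str) -> List[Tuple[str, int]]:
    """Count keyword hits per agent and return matches, best first."""
    text = message.lower()
    scores = {
        agent: sum(1 for kw in kws if kw in text)
        for agent, kws in KEYWORDS.items()
    }
    matched = [(agent, n) for agent, n in scores.items() if n > 0]
    # Sort by hit count descending; ties keep table order (sort is stable)
    return sorted(matched, key=lambda pair: pair[1], reverse=True)
```

An empty result would then fall through to the LLM classification step, while multiple matches could be passed on as `required_agents`.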

3. Task Assignment and Agent Pool Management

Once we know which intent a message belongs to, we need to assign the task to the appropriate agent. This is the part where I ran into the most problems.
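
One way to make task priorities actually affect ordering is `asyncio.PriorityQueue`, which always pops the smallest item first. The sketch below is an assumption about how that could look, not the queue used in the implementation that follows. The `(priority, sequence, name)` tuple layout is my own choice, with the sequence number breaking ties so that equal-priority tasks stay first-in, first-out.

```python
import asyncio
import itertools
from typing import List, Tuple

CRITICAL, HIGH, NORMAL, LOW = 1, 2, 3, 4  # Lower number = served first


async def drain_in_priority_order(items: List[Tuple[int, str]]) -> List[str]:
    """Enqueue (priority, name) pairs and return names in service order."""
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    counter = itertools.count()  # Tie-breaker keeps FIFO order within a priority
    for priority, name in items:
        await queue.put((priority, next(counter), name))

    served = []
    while not queue.empty():
        _, _, name = await queue.get()
        served.append(name)
    return served
```

For example, enqueuing a normal FAQ question before a critical refund request would still serve the refund first.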

3.1 Task Assigner - Smart Task Allocation


# File: task_assigner.py
import asyncio
import json
import time
from typing import Dict, List, Optional, Any
from dataclasses import dataclass, field
from enum import Enum
from collections import defaultdict
import redis.asyncio as redis

class TaskPriority(Enum):
    CRITICAL = 1   # Payment, Security
    HIGH = 2       # Shipping status, Order issues
    NORMAL = 3     # General inquiries
    LOW = 4        # Suggestions, Feedback

@dataclass
class Task:
    task_id: str
    agent_name: str
    payload: Dict[str, Any]
    priority: TaskPriority
    timeout_ms: int
    retry_count: int = 0
    created_at: float = field(default_factory=time.time)
    metadata: Dict = field(default_factory=dict)

@dataclass
class AgentStatus:
    name: str
    model: str
    is_available: bool
    current_load: int
    max_concurrent: int
    avg_response_time_ms: float
    success_rate: float
    last_health_check: float

class TaskAssigner:
    def __init__(self, config: dict, redis_client: redis.Redis):
        self.config = config
        self.redis = redis_client
        self.agents: Dict[str, AgentStatus] = {}
        self.task_queue: Dict[str, asyncio.Queue] = defaultdict(asyncio.Queue)
        
        # Initialize agent pool
        self._init_agent_pool()
    
    def _init_agent_pool(self):
        """Initialize the agent pool from the config"""
        agent_configs = {
            "payment": {
                "max_concurrent": 3,
                "priority": TaskPriority.CRITICAL,
                "timeout": 5000  # 5s for payment
            },
            "shipping": {
                "max_concurrent": 10,
                "priority": TaskPriority.HIGH,
                "timeout": 15000
            },
            "catalog": {
                "max_concurrent": 20,
                "priority": TaskPriority.NORMAL,
                "timeout": 30000
            },
            "support": {
                "max_concurrent": 15,
                "priority": TaskPriority.NORMAL,
                "timeout": 30000
            }
        }
        
        for agent_name, agent_cfg in agent_configs.items():
            self.agents[agent_name] = AgentStatus(
                name=agent_name,
                model=self.config["agents"][agent_name],
                is_available=True,
                current_load=0,
                max_concurrent=agent_cfg["max_concurrent"],
                avg_response_time_ms=100,
                success_rate=0.95,
                last_health_check=time.time()
            )
    
    async def assign_task(
        self, 
        agent_name: str, 
        payload: Dict,
        priority: TaskPriority = TaskPriority.NORMAL,
        context: Dict = None
    ) -> Task:
        """
        Assign a task to a specific agent with smart routing.
        
        Estimated cost (HolySheep 2026 pricing; the figures imply ~50 tokens per task):
        - gpt-4.1: $8/MTok → ~$0.0004/task
        - claude-sonnet-4.5: $15/MTok → ~$0.00075/task
        - deepseek-v3.2: $0.42/MTok → ~$0.00002/task (about 95% cheaper than gpt-4.1)
        """
        
        task_id = f"task_{int(time.time() * 1000)}"
        
        # Get agent config
        agent = self.agents.get(agent_name)
        if not agent:
            raise ValueError(f"Unknown agent: {agent_name}")
        
        # Check agent availability
        if not agent.is_available:
            # Fallback to another agent or queue
            fallback_agent = self._find_fallback_agent(agent_name, priority)
            if fallback_agent:
                agent = self.agents[fallback_agent]
                agent_name = fallback_agent
        
        # Create task
        task = Task(
            task_id=task_id,
            agent_name=agent_name,
            payload=payload,
            priority=priority,
            timeout_ms=self._get_timeout(priority, agent_name),
            metadata={
                "context": context or {},
                "assigned_at": time.time(),
                "model": agent.model
            }
        )
        
        # Update agent load
        agent.current_load += 1
        
        # Queue task
        await self.task_queue[agent_name].put(task)
        
        # Store task in Redis for tracking
        await self.redis.hset(
            f"{self.config.get('key_prefix', 'multiagent:')}tasks",
            task_id,
            json.dumps({
                "status": "queued",
                "agent": agent_name,
                "priority": priority.value,
                "created_at": task.created_at
            })
        )
        
        return task
    
    async def assign_multi_agent(
        self,
        required_agents: List[str],
        task_data: Dict,
        coordination_mode: str = "sequential"
    ) -> List[Task]:
        """
        Assign a task to multiple agents.
        
        Args:
            required_agents: List of agents required
            task_data: Shared task data
            coordination_mode: 
                - "sequential": Each agent finishes before the next one starts
                - "parallel": All agents run concurrently
                - "fan-out": One agent broadcasts, multiple agents process
        """
        
        tasks = []
        
        if coordination_mode == "parallel":
            # All agents run in parallel
            coros = [
                self.assign_task(
                    agent_name=agent,
                    payload={
                        **task_data,
                        "coordination": {
                            "mode": "parallel",
                            "sibling_agents": [a for a in required_agents if a != agent]
                        }
                    },
                    priority=TaskPriority.HIGH
                )
                for agent in required_agents
            ]
            tasks = await asyncio.gather(*coros, return_exceptions=True)
            
        elif coordination_mode == "sequential":
            # Sequential: process one agent at a time
            accumulated_context = {}
            for agent in required_agents:
                task = await self.assign_task(
                    agent_name=agent,
                    payload={
                        **task_data,
                        "context_from_previous": accumulated_context
                    },
                    priority=TaskPriority.HIGH
                )
                tasks.append(task)
                
                # Wait for this task to complete before next
                result = await self._wait_for_task(task)
                if result and result.get("output"):
                    accumulated_context[agent] = result["output"]
        
        elif coordination_mode == "fan-out":
            # Fan-out pattern: orchestrator → multiple workers
            orchestrator_result = await self._run_orch