Multi-Agent Systems: Orchestration Patterns That Work
A single agent can book a flight. Can it also check weather, compare prices, find hotels, and make a complete travel plan? Eventually, you hit a wall.
The wall is cognitive load. An agent trying to do too much gets confused. It forgets the main goal. It makes contradictory decisions. It's like asking one person to simultaneously be a travel agent, meteorologist, and financial advisor.
That's where multi-agent systems come in. Instead of one powerful agent, you use many specialized agents and an orchestrator to coordinate them.
Building systems at Amazon, I learned that this approach is powerful but also easy to get wrong. This article covers the patterns that work.
Why Single Agents Hit Limits
Let's say you want an agent that:
- Searches flights
- Checks weather
- Searches hotels
- Compares prices
- Checks visa requirements
- Books flights and hotels
- Generates an itinerary
A single agent trying to do all this:
Agent prompt:
"You are a travel planning assistant.
You can search flights, check weather, search hotels, etc.
Plan a complete trip to [destination]."
Agent's challenge:
- 7 different domains of expertise
- 7 different error modes
- 7 tools competing for attention
- Huge prompt (all tool descriptions)
- High context usage
- Easy to lose track of the goal
Real problem: When an agent has 20 tools, it often forgets why it's calling them. It searches flights without remembering the original budget. It books hotels without checking if the flight time makes sense.
The Orchestrator-Worker Pattern
The solution: one orchestrator agent that delegates to specialized worker agents.
flowchart TD
U([User]) --> O[Orchestrator Agent]
O --> FW[Flight Worker]
O --> HW[Hotel Worker]
O --> WW[Weather Worker]
O --> IW[Itinerary Worker]
FW -->|flight results| O
HW -->|hotel results| O
WW -->|weather data| O
IW -->|final plan| U
style U fill:#f5f5f4,stroke:#e7e5e4
style O fill:#fafaf9,stroke:#a8a29e
User input: "Plan a trip to Denver for 3 days, budget $2000"
Orchestrator Agent:
"I need to:
1. Find flights (delegate to FlightWorker)
2. Find hotels (delegate to HotelWorker)
3. Check weather (delegate to WeatherWorker)
4. Create itinerary (aggregate results)
Let me start..."
Orchestrator calls FlightWorker:
"Find flights to Denver for 3 days, under $600 total
(leaving $1400 for hotel and activities)"
FlightWorker searches and returns:
"Best option: United 7:30 AM, $280 round-trip"
Orchestrator calls HotelWorker:
"Find hotels in Denver, 2 nights, budget $600"
HotelWorker searches and returns:
"Best option: Marriott downtown, $300/night, $600 total"
Orchestrator calls WeatherWorker:
"What's the weather in Denver April 5-7?"
WeatherWorker returns:
"Sunny, 65°F"
Orchestrator synthesizes:
"Here's your trip:
- Flight: United 7:30 AM ($280)
- Hotel: Marriott downtown ($600)
- Weather: Sunny, 65°F
- Total: $880 (under budget!)
- Remaining: $1120 for activities"
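The arithmetic in the orchestrator's summary is easy to check in code. The snippet below is a minimal sketch; `summarize_costs` is a hypothetical helper introduced only for illustration, not part of any system described here.

```python
def summarize_costs(budget, costs):
    """Total the line items and report what is left of the budget."""
    total = sum(costs.values())
    return {
        "total": total,
        "remaining": budget - total,
        "within_budget": total <= budget,
    }

# Figures taken from the Denver trace: $280 flight, $600 hotel, $2000 budget.
summary = summarize_costs(2000, {"flight": 280, "hotel": 600})
print(summary)
# {'total': 880, 'remaining': 1120, 'within_budget': True}
```

Having the orchestrator run a check like this before presenting a plan is one cheap way to catch budget overruns.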
How It Works
Orchestrator structure:
class Orchestrator:
    def __init__(self):
        self.workers = {
            "flights": FlightWorker(),
            "hotels": HotelWorker(),
            "weather": WeatherWorker(),
            "itinerary": ItineraryWorker()
        }
    def plan_trip(self, destination, duration, budget):
        # Step 1: Decompose goal
        plan = {
            "destination": destination,
            "duration": duration,
            "budget": budget,
            "flight_budget": budget * 0.3,  # 30% for flights
            "hotel_budget": budget * 0.5,  # 50% for hotels
            "activities_budget": budget * 0.2  # 20% for activities
        }
        # Step 2: Delegate to workers
        flights = self.workers["flights"].search(
            destination,
            budget=plan["flight_budget"]
        )
        hotels = self.workers["hotels"].search(
            destination,
            duration,
            budget=plan["hotel_budget"]
        )
        weather = self.workers["weather"].check(destination)
        # Step 3: Aggregate results
        itinerary = self.workers["itinerary"].create(
            flights=flights,
            hotels=hotels,
            weather=weather,
            budget=plan
        )
        return itinerary
Each worker is a simple, focused agent:
# `api` stands in for whatever travel API client you use
class FlightWorker:
    def search(self, destination, budget):
        # Only does one thing: find flights
        # Small prompt, single purpose, easy to test
        return api.search_flights(destination, budget)
class HotelWorker:
    def search(self, destination, duration, budget):
        # Only does one thing: find hotels
        return api.search_hotels(destination, duration, budget)
Key insight: Workers don't need to be LLM-based. They can be simple functions that call APIs. Only the orchestrator needs reasoning.
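To make that concrete, here is a minimal sketch of a worker that is nothing more than a plain function. The `Worker` type alias, `run_workers`, and the in-memory `_FAKE_FLIGHTS` data are assumptions made for this example; a real worker would call an actual API.

```python
from typing import Callable, Dict, List

# A worker is any callable that maps a request dict to a list of results.
Worker = Callable[[dict], List[dict]]

# Hypothetical in-memory data standing in for a real flight API.
_FAKE_FLIGHTS = [
    {"airline": "United", "price": 280},
    {"airline": "Delta", "price": 650},
]

def flight_worker(request: dict) -> List[dict]:
    """Plain function, no LLM: filter flights by budget, cheapest first."""
    matches = [f for f in _FAKE_FLIGHTS if f["price"] <= request["budget"]]
    return sorted(matches, key=lambda f: f["price"])

def run_workers(workers: Dict[str, Worker], request: dict) -> Dict[str, List[dict]]:
    """The orchestrator only relies on each worker honoring the call signature."""
    return {name: worker(request) for name, worker in workers.items()}

results = run_workers(
    {"flights": flight_worker},
    {"destination": "Denver", "budget": 600},
)
print(results)
# {'flights': [{'airline': 'United', 'price': 280}]}
```

Because the orchestrator depends only on the call signature, a function-based worker can later be swapped for an LLM-backed one without touching the orchestration code.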
The Critic Pattern
Sometimes you want multiple agents to solve the same problem independently, then critique each other's solutions.
User: "How should I structure my React component?"
Expert 1 (Performance Agent):
"Use React.memo to prevent re-renders.
Split into smaller components.
Use useMemo for expensive calculations."
Expert 2 (Maintainability Agent):
"Create a custom hook to isolate logic.
Use descriptive component names.
Keep components under 100 lines."
Expert 3 (Accessibility Agent):
"Add ARIA labels.
Ensure keyboard navigation.
Use semantic HTML."
Critic Agent reviews all three:
"Great suggestions. I recommend:
1. Use React.memo (Expert 1)
2. Create custom hook (Expert 2)
3. Add ARIA labels (Expert 3)
4. Keep components under 100 lines (Expert 2)
Note: Performance optimization via React.memo should
come after ensuring accessibility (Expert 3 priority)."
Implementation
import json
# `llm` stands in for your LLM client
class ExpertPanel:
    def __init__(self):
        self.experts = [
            PerformanceExpert(),
            MaintainabilityExpert(),
            AccessibilityExpert()
        ]
        self.critic = CriticAgent()
    def evaluate(self, problem):
        # Get independent opinions
        opinions = []
        for expert in self.experts:
            opinion = expert.analyze(problem)
            opinions.append({
                "expert": expert.name,
                "opinion": opinion
            })
        # Have critic synthesize
        synthesis = self.critic.synthesize(problem, opinions)
        return synthesis
class CriticAgent:
    def synthesize(self, problem, opinions):
        # Prompt: evaluate all opinions
        prompt = f"""
        Problem: {problem}
        Opinions:
        {json.dumps(opinions, indent=2)}
        Synthesize these opinions into a single recommendation.
        Explain tradeoffs. Prioritize when opinions conflict.
        """
        return llm.generate(prompt)
When to use: Complex problems with multiple perspectives. Technical decisions with tradeoffs.
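The wiring of the pattern can be tested without any LLM calls. The sketch below swaps the LLM critic for a deterministic stand-in that orders recommendations by a fixed expert priority; `Opinion`, `collect_opinions`, `rule_based_critic`, and the priority list are all assumptions made for this example, and a real critic would reason about tradeoffs rather than apply a fixed order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Opinion:
    expert: str
    recommendations: List[str]

def collect_opinions(
    problem: str, experts: Dict[str, Callable[[str], List[str]]]
) -> List[Opinion]:
    """Ask each expert independently; no expert sees another's answer."""
    return [Opinion(name, analyze(problem)) for name, analyze in experts.items()]

def rule_based_critic(opinions: List[Opinion], priority: List[str]) -> List[str]:
    """Stand-in for the LLM critic: merge recommendations in priority order,
    dropping duplicates."""
    rank = {name: i for i, name in enumerate(priority)}
    ordered = sorted(opinions, key=lambda o: rank.get(o.expert, len(priority)))
    merged: List[str] = []
    for opinion in ordered:
        for rec in opinion.recommendations:
            if rec not in merged:
                merged.append(rec)
    return merged

# Stub experts returning canned advice from the React example.
experts = {
    "performance": lambda p: ["Use React.memo", "Split into smaller components"],
    "accessibility": lambda p: ["Add ARIA labels", "Use semantic HTML"],
}
opinions = collect_opinions("How should I structure my React component?", experts)
plan = rule_based_critic(opinions, priority=["accessibility", "performance"])
print(plan)
# ['Add ARIA labels', 'Use semantic HTML', 'Use React.memo', 'Split into smaller components']
```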
Execution Patterns: Sequential vs. Parallel
Sequential Execution
One worker finishes, next worker starts:
FlightWorker (starts)
↓
(searches, takes 2s)
↓
FlightWorker (finishes)
HotelWorker (starts)
↓
(searches, takes 2s)
↓
HotelWorker (finishes)
Total time: 4s
Pros: Simple, easy to debug, one thing at a time
Cons: Slow, orchestrator waits on each worker
Parallel Execution
Multiple workers run simultaneously:
FlightWorker (starts) ─┐
│
HotelWorker (starts) ──┼─ Run in parallel
│
WeatherWorker (starts) ─┘
Wait for all to finish
Total time: 2s (instead of 6s if the three workers ran one after another)
Pros: Fast, utilizes resources
Cons: Harder to debug, harder to handle failures
Implementation
from concurrent.futures import ThreadPoolExecutor
# Assumes self.workers is set up as in Orchestrator above and that
# aggregate() combines the three results into a plan
class OrchestratorParallel(Orchestrator):
    def plan_trip(self, destination, duration, budget):
        with ThreadPoolExecutor(max_workers=3) as executor:
            # Submit tasks
            flight_task = executor.submit(
                self.workers["flights"].search,
                destination,
                budget * 0.3
            )
            hotel_task = executor.submit(
                self.workers["hotels"].search,
                destination,
                duration,
                budget * 0.5
            )
            weather_task = executor.submit(
                self.workers["weather"].check,
                destination
            )
            # Wait for all to complete
            flights = flight_task.result()
            hotels = hotel_task.result()
            weather = weather_task.result()
        # Now aggregate
        return self.aggregate(flights, hotels, weather)
Real choice: Use parallel when workers are independent (flights and hotels). Use sequential when one depends on another (check weather, then choose activity).
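The timing difference is easy to reproduce. The following sketch uses `time.sleep` to simulate I/O-bound workers; `slow_worker` and the 0.2-second delay are assumptions chosen to keep the demo short.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_worker(name: str, delay: float = 0.2) -> str:
    """Simulates an I/O-bound worker such as an API call."""
    time.sleep(delay)
    return f"{name} done"

names = ["flights", "hotels", "weather"]

# Sequential: total time is roughly the sum of the delays.
start = time.perf_counter()
sequential_results = [slow_worker(n) for n in names]
sequential_time = time.perf_counter() - start

# Parallel: total time is roughly the longest single delay.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(names)) as executor:
    # executor.map returns results in the same order as the inputs.
    parallel_results = list(executor.map(slow_worker, names))
parallel_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

Threads help here because the workers spend their time waiting on I/O. CPU-bound workers would need processes or a different approach to see a similar speedup.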
Agent-to-Agent Communication: A2A Protocol
How do agents talk to each other? There needs to be a standard format.
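Google published A2A (Agent2Agent), an open protocol for agents to call each other. The real protocol is built on JSON-RPC 2.0 over HTTP; the exchange below is a simplified illustration of the idea, not the actual A2A wire format: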
Agent A calls Agent B:
Request:
{
  "agent_id": "agent_a",
  "target_agent": "agent_b",
  "task": "search_hotels",
  "parameters": {
    "city": "Denver",
    "nights": 2,
    "budget": 600
  },
  "context": {
    "user_preferences": {"luxury": false},
    "constraints": {"must_have_wifi": true}
  }
}
Agent B responds:
Response:
{
  "status": "success",
  "result": {
    "hotels": [
      {
        "name": "Marriott",
        "price": 300,
        "rating": 4.5,
        "has_wifi": true
      }
    ],
    "search_metadata": {
      "results_count": 1,
      "search_time_ms": 245
    }
  }
}
Benefits:
- Standard format (all agents use same protocol)
- Includes context (Agent B knows user preferences)
- Includes metadata (helps orchestrator debug)
- Easy to extend (can add custom fields)
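A shared format pays off most when every agent validates it on the way in. The sketch below models the request envelope shown above as a dataclass and rejects messages with missing fields. `AgentRequest`, `REQUIRED_FIELDS`, and `parse_request` are hypothetical names for this illustration; they mirror the simplified message in this article, not the schema of any published protocol.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict

@dataclass
class AgentRequest:
    agent_id: str
    target_agent: str
    task: str
    parameters: Dict[str, Any]
    context: Dict[str, Any] = field(default_factory=dict)

REQUIRED_FIELDS = ("agent_id", "target_agent", "task", "parameters")

def parse_request(raw: str) -> AgentRequest:
    """Parse a JSON request and fail loudly on missing required fields."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return AgentRequest(
        agent_id=data["agent_id"],
        target_agent=data["target_agent"],
        task=data["task"],
        parameters=data["parameters"],
        context=data.get("context", {}),  # context is optional
    )

request = AgentRequest(
    agent_id="agent_a",
    target_agent="agent_b",
    task="search_hotels",
    parameters={"city": "Denver", "nights": 2, "budget": 600},
)
# Serialize and parse again to confirm the envelope survives a round trip.
round_tripped = parse_request(json.dumps(asdict(request)))
print(round_tripped == request)
# True
```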
Failure Modes of Multi-Agent Systems
Just because you split agents doesn't mean it works. Here are common failure modes:
1. Lost Context
Orchestrator gives each worker a piece of the goal. Worker forgets the original goal.
Original goal: "Plan a trip under $2000"
Orchestrator tells FlightWorker:
"Find flights under $600"
FlightWorker searches:
→ Finds United flight for $500
→ Doesn't check if this leaves enough budget for hotel
→ Recommends it anyway
Orchestrator tells HotelWorker:
"Find hotel under $1000"
HotelWorker searches:
→ Books 5-star hotel for $900/night
→ Total trip cost: $500 + $900*2 = $2300
→ Over budget!
Mitigation: Pass full context to workers, not just partial requests:
# Instead of this:
flights = flight_worker.search(
    destination="Denver",
    budget=600
)
# Do this:
flights = flight_worker.search(
    destination="Denver",
    budget=600,
    context={
        "total_budget": 2000,
        "other_costs": 1100,
        "reason": "User planning a 3-day trip"
    }
)
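Passing context only helps if the worker uses it. One possible way for a worker to do that is to cap its own spending at whatever the total budget still allows, as in this minimal sketch. `effective_flight_cap`, `pick_flight`, and the sample numbers are assumptions for illustration.

```python
def effective_flight_cap(budget, context):
    """Never spend more on flights than the total budget leaves room for."""
    headroom = context["total_budget"] - context["other_costs"]
    return min(budget, headroom)

def pick_flight(flights, budget, context):
    """Return the cheapest flight under the effective cap, or None."""
    cap = effective_flight_cap(budget, context)
    affordable = [f for f in flights if f["price"] <= cap]
    return min(affordable, key=lambda f: f["price"]) if affordable else None

flights = [
    {"airline": "United", "price": 500},
    {"airline": "Delta", "price": 580},
]
# Other costs are estimated at 1450, so only 550 of headroom remains
# even though the nominal flight budget is 600.
choice = pick_flight(
    flights,
    budget=600,
    context={"total_budget": 2000, "other_costs": 1450},
)
print(choice)
# {'airline': 'United', 'price': 500}
```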
2. Orchestrator Bottleneck
The orchestrator becomes the bottleneck. Every decision goes through it.
If orchestrator has to:
- Check worker results
- Validate outputs
- Handle worker failures
- Retry workers
- Aggregate results
Then orchestrator becomes complex and slow.
Mitigation: Let workers be autonomous. Only orchestrate high-level goals:
# Heavy orchestration (bad)
class OrchestratorBad:
    def book_trip(self, destination):
        flights = self.search_flights(destination)
        # Check flights
        if not flights:
            return {"error": "No flights"}
        # Check weather
        weather = self.check_weather(destination)
        # Book only if good weather
        if weather.rain_probability < 0.3:
            booking = self.book_flight(flights[0])
            return booking
        else:
            return {"error": "Bad weather"}
# Light orchestration (good)
class OrchestratorGood:
    def book_trip(self, destination):
        # Delegate everything to workers
        plan = self.trip_planner_agent.plan(destination)
        return plan
# TripPlannerAgent handles all logic internally
3. Worker Conflicts
Two workers make conflicting decisions.
FlightWorker finds flight at 11 PM (cheap!)
HotelWorker finds hotel 30 miles from airport
ItineraryWorker realizes: can't get from airport to hotel at midnight
Result: Invalid plan
Mitigation: Have workers share constraints:
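constraints = {
    "flight_arrival_time": "between 6 AM and 10 PM",
    "hotel_distance_from_airport": "< 20 miles",
    "total_travel_time": "< 2 hours"
}
# Pass the same constraints to every worker, by keyword so they are
# not mistaken for a budget argument
flights = flight_worker.search(destination, constraints=constraints)
hotels = hotel_worker.search(destination, constraints=constraints)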
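Shared constraints are easier to enforce when they are machine-checkable rather than free-text strings. Here is a minimal sketch of a validator the orchestrator could run on the combined plan; `validate_plan` and the dictionary keys `arrival` and `miles_from_airport` are assumptions for this example.

```python
import datetime as dt

def validate_plan(
    flight,
    hotel,
    earliest=dt.time(6, 0),
    latest=dt.time(22, 0),
    max_miles=20.0,
):
    """Return a list of constraint violations; empty means the plan is consistent."""
    problems = []
    if not (earliest <= flight["arrival"] <= latest):
        problems.append("flight arrival outside allowed window")
    if hotel["miles_from_airport"] >= max_miles:
        problems.append("hotel too far from airport")
    return problems

# The conflicting plan from the example: an 11 PM arrival, a hotel 30 miles out.
late_flight = {"arrival": dt.time(23, 0)}
far_hotel = {"miles_from_airport": 30}
violations = validate_plan(late_flight, far_hotel)
print(violations)
# ['flight arrival outside allowed window', 'hotel too far from airport']
```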
4. Cascading Failures
One worker fails. Orchestrator doesn't know how to recover.
FlightWorker fails (API down)
Orchestrator stops
Entire trip planning fails
Should have fallen back to:
- Show cached flights
- Show flights from yesterday
- Ask user to try again
Mitigation: Build explicit fallback logic:
# flight_api, cache, db, APIDownError, and UserError are placeholders
def search_flights_with_fallback(destination, budget):
    try:
        return flight_api.search(destination, budget)
    except APIDownError:
        # Fallback 1: Use cached results from last hour
        cached = cache.get_flights(destination)
        if cached:
            return cached
        # Fallback 2: Use historical data
        historical = db.get_avg_flights(destination)
        if historical:
            return historical
        # Fallback 3: Inform user
        raise UserError(
            "Can't search flights right now. "
            "Try again in a few minutes."
        )
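Fallbacks inside a worker cover one failure at a time. At the orchestrator level, it also helps to isolate failures so that one broken worker does not discard the others' results. The following is a minimal sketch under that assumption; `run_all` and `failing_flights` are hypothetical names, and the 5-second timeout is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor

def run_all(tasks):
    """Run every task in parallel; return (results, errors) keyed by task name."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as executor:
        futures = {name: executor.submit(fn) for name, fn in tasks.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=5)
            except Exception as exc:
                # Record the failure instead of aborting the whole plan.
                errors[name] = str(exc)
    return results, errors

def failing_flights():
    raise ConnectionError("flight API down")

results, errors = run_all({
    "flights": failing_flights,
    "hotels": lambda: ["Marriott downtown"],
})
print(results, errors)
# {'hotels': ['Marriott downtown']} {'flights': 'flight API down'}
```

With the errors collected separately, the orchestrator can still present hotel results and tell the user that flight search is temporarily unavailable.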
When NOT to Use Multi-Agent Systems
Multi-agent sounds powerful. But it adds complexity. Sometimes a single agent is better.
Use single agent if:
- Problem is simple (1-3 tools)
- Tools are tightly coupled (can't separate)
- Low latency is critical (orchestration adds overhead)
- Domain is narrow (expert focus possible)
Use multi-agent if:
- Problem is complex (5+ domains)
- Tools are independent
- Accuracy matters more than latency
- Different tools have different requirements
Real rule: Don't split agents until you have to.
Practical Example: Customer Support System
Here's a real multi-agent system:
from concurrent.futures import ThreadPoolExecutor
class CustomerSupportOrchestrator:
    def __init__(self):
        self.agents = {
            "intent": IntentClassifier(),  # What does user want?
            "faq": FAQAgent(),  # Is this a common question?
            "account": AccountAgent(),  # Access account data?
            "billing": BillingAgent(),  # Billing issue?
            "technical": TechnicalAgent(),  # Technical issue?
            "resolver": ResolverAgent()  # Final resolution
        }
    def handle_support_request(self, user_input, user_id):
        # Step 1: Classify intent
        intent = self.agents["intent"].classify(user_input)
        # Returns: "billing", "technical", "refund", etc.
        # Step 2: Try FAQ
        faq_result = self.agents["faq"].search(user_input)
        if faq_result["confidence"] > 0.9:
            return {"type": "faq", "answer": faq_result["answer"]}
        # Step 3: Gather context (parallel)
        # Submit every needed lookup first, then collect the results,
        # so the lookups actually overlap
        billing_task = None
        account_task = None
        with ThreadPoolExecutor(max_workers=2) as executor:
            if intent in ["billing", "refund"]:
                billing_task = executor.submit(
                    self.agents["billing"].get_history,
                    user_id
                )
            if "account" in user_input.lower():
                account_task = executor.submit(
                    self.agents["account"].get_details,
                    user_id
                )
            billing_context = billing_task.result() if billing_task else None
            account_context = account_task.result() if account_task else None
        # Step 4: Resolve
        resolution = self.agents["resolver"].resolve(
            user_input,
            intent=intent,
            billing_context=billing_context,
            account_context=account_context
        )
        return resolution
This system:
- Classifies the problem once
- Gathers only needed context
- Delegates resolution to specialized agent
- Scales easily (add more agents)
- Maintains focus (each agent has one job)
Key Takeaways
Multi-agent systems work when:
- Single orchestrator coordinates specialized workers
- Workers are autonomous and focus on one domain
- Communication is standard (via A2A or similar)
- Execution is efficient (parallel when possible)
- Failures are handled (explicit fallbacks)
Common patterns:
- Orchestrator-Worker: One coordinator, many specialists
- Critic Pattern: Multiple experts, one critic
- Parallel Execution: Run independent workers simultaneously
- Sequential with Feedback: Later workers learn from earlier results
Don't split agents until you need to. But when problems get complex, multi-agent systems are often the only way forward.
Start simple. Add agents as needed.