4.1 — Multi-Agent Architecture

Why Multiple Agents?

Single agents hit limits: the context window fills up, they lose focus on long tasks, and they can’t be experts in everything. Multi-agent systems split work across specialized agents, each with a clear scope and appropriate tools.

Architecture Patterns

Pattern	Structure	Example
Parallel workers	Multiple agents do independent tasks simultaneously; results are combined	5 agents each research a different competitor, results merged into one report
Pipeline / Assembly line	Agent A’s output feeds into Agent B, which feeds into Agent C	Agent 1 writes code → Agent 2 reviews it → Agent 3 writes tests
Supervisor + workers	One agent delegates tasks to specialized workers and synthesizes results	A “project manager” agent assigns coding, testing, and documentation to specialized agents
Swarm	Agents communicate peer-to-peer without central coordination	Market analysis swarm: scanner, news analyst, risk manager each contribute to a decision
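
The first two patterns in the table can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the functions research_agent, write_code, review_code, and write_tests are hypothetical stand-ins for real agents (which would normally call a model), and a thread pool is assumed as the way to run workers side by side.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-in for an agent; a real one would call a model or API.
def research_agent(competitor: str) -> str:
    return f"notes on {competitor}"


def parallel_workers(competitors: list[str]) -> str:
    """Parallel workers: fan out independent tasks, then fan in the results."""
    with ThreadPoolExecutor(max_workers=5) as pool:
        # pool.map returns results in the same order as the inputs.
        notes = list(pool.map(research_agent, competitors))
    return "\n".join(notes)  # merge into one report


# Hypothetical pipeline stages: each consumes the previous stage's output.
def write_code(task: str) -> str:
    return f"code for {task}"


def review_code(code: str) -> str:
    return f"reviewed({code})"


def write_tests(reviewed: str) -> str:
    return f"tests for {reviewed}"


def pipeline(task: str) -> str:
    """Pipeline: Agent A's output feeds Agent B, which feeds Agent C."""
    result = task
    for stage in (write_code, review_code, write_tests):
        result = stage(result)
    return result
```

In this sketch the parallel workers share nothing, so they can run in any order, while the pipeline stages must run strictly in sequence because each depends on the one before it.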

Key Design Decisions

Before building a multi-agent system, answer these five questions:

1. What does each agent do? Assign clear, non-overlapping responsibilities. Overlapping responsibilities cause confusion and conflicts.
2. How do they communicate? Shared files, message passing, or a shared database: choose one model and use it consistently.
3. What can each agent access? Apply the principle of least privilege: each agent gets only the tools and permissions it needs, nothing more.
4. Where does a human intervene? Decide which decisions require human judgment, and define the handoff points before you build, not after something goes wrong.
5. What happens when one fails? Design for graceful degradation, not cascade failure. One broken agent should not take down the whole system.
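
Two of these decisions, access control and failure handling, translate fairly directly into code. The sketch below is one possible shape, under stated assumptions: the Agent class, its allowed_tools field, and the run_all helper are hypothetical names introduced for illustration, and a failed agent is simply recorded as None so the rest of the system keeps working.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Agent:
    """Hypothetical agent record: a name, a task function, and a tool allow-list."""
    name: str
    run: Callable[[str], str]
    allowed_tools: frozenset[str] = frozenset()  # least privilege: empty by default

    def use_tool(self, tool: str) -> None:
        # Refuse any tool that was not explicitly granted to this agent.
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} may not use {tool}")


def run_all(agents: list[Agent], task: str) -> dict[str, Optional[str]]:
    """Graceful degradation: one failing agent does not stop the others."""
    results: dict[str, Optional[str]] = {}
    for agent in agents:
        try:
            results[agent.name] = agent.run(task)
        except Exception:
            # Record the failure and continue with reduced capability.
            results[agent.name] = None
    return results
```

A caller can then inspect the results for None entries and decide whether to retry, substitute a fallback, or hand off to a human.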

Vocabulary

Term	Definition
Subagent	An agent spawned by another agent to handle a specific subtask
Delegation	Assigning a task from one agent to another
Fan-out / Fan-in	Splitting work across multiple agents (fan-out), then combining results (fan-in)
Graceful degradation	System continues working (possibly with reduced capability) when a component fails, rather than crashing entirely
Principle of least privilege	Each component gets only the minimum access it needs to do its job — nothing more
Idempotent	An operation that produces the same result whether you run it once or ten times. Safe to retry.
Race condition	When two processes try to change the same thing at the same time, producing unpredictable results
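
The last two terms are easiest to see together. The following sketch assumes a hypothetical TaskBoard that several agents report to: marking a task done is idempotent because repeats are ignored, and a lock avoids the race condition that could occur if two agents updated the board at the same moment.

```python
import threading


class TaskBoard:
    """Hypothetical shared state that several agents update concurrently."""

    def __init__(self) -> None:
        self.completed = 0
        self._done: set[str] = set()
        self._lock = threading.Lock()

    def mark_done(self, task_id: str) -> bool:
        """Idempotent: marking the same task twice changes nothing the second time."""
        # The lock makes the check-and-update atomic, so two agents cannot
        # both see the task as unfinished and both increment the counter.
        with self._lock:
            if task_id in self._done:
                return False  # already recorded; safe to retry
            self._done.add(task_id)
            self.completed += 1
            return True
```

Because a retry of mark_done with the same task_id leaves the board unchanged, an agent that is unsure whether its first report arrived can simply send it again.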
