Pattern Composition Playbook
Patterns become useful when they compose into a system that someone can operate. The goal is not to use as many patterns as possible. The goal is to put each responsibility in the right place: workflow owns flow, policy owns permission, tools own side effects, retrieval owns evidence, the loop owns bounded uncertainty, evals own proof, and observability owns learning from failure.
The easiest way to damage an agentic system is to compose patterns by vibe. Add memory because the agent forgot something. Add a second agent because the task feels large. Add reflection because the answer was weak. Add tools because the model needs data. Each decision may sound reasonable, but the result can become a system with no owner, no stop condition, no evidence boundary, and no way to replay failure.
Composition should start with ownership.
The Composition Question
Before adding a pattern, ask what problem it owns.
| Pressure | Pattern To Consider | What Must Own The Boundary |
|---|---|---|
| The task has known steps. | Prompt chain or deterministic workflow. | Workflow code. |
| The next step depends on observations. | Agent loop. | Loop controller and stop rules. |
| The answer needs evidence. | RAG, semantic recall, or knowledge-bound agent. | Retrieval and source policy. |
| The model must call tools. | Tool use or MCP-first tool use. | Tool manifest, schema, permission, and audit. |
| The action is risky. | Policy enforcement and human approval gate. | Runtime policy and durable workflow. |
| Quality is easier to judge than to generate. | Evaluator-optimizer or reflection. | Rubric, evidence checks, and revision budget. |
| Work needs separate context or permissions. | Supervisor-worker or parallel agents. | Coordinator, worker contracts, and merge policy. |
| Failure must be replayed. | Durable workflow, observability, and eval feedback loop. | Runtime, trace store, and eval suite. |
If no pattern owns a concrete problem, do not add it.
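One way to make the table enforceable is to encode it as data, so that a pattern can enter a design only through a named pressure. The sketch below is illustrative, not a prescribed API: the `WorkloadPressure` keys, `compositionTable`, and `selectPatterns` are hypothetical names, and the entries simply restate the table rows.

```typescript
// Hypothetical encoding of the table above; all identifiers are illustrative.
type WorkloadPressure =
  | "known_steps"
  | "observation_dependent"
  | "needs_evidence"
  | "calls_tools"
  | "risky_action"
  | "judgeable_quality"
  | "separate_context"
  | "replayable_failure";

interface PatternChoice {
  patterns: string[];
  boundaryOwner: string;
}

const compositionTable: Record<WorkloadPressure, PatternChoice> = {
  known_steps: { patterns: ["prompt chain", "deterministic workflow"], boundaryOwner: "workflow code" },
  observation_dependent: { patterns: ["agent loop"], boundaryOwner: "loop controller and stop rules" },
  needs_evidence: { patterns: ["RAG", "semantic recall", "knowledge-bound agent"], boundaryOwner: "retrieval and source policy" },
  calls_tools: { patterns: ["tool use", "MCP-first tool use"], boundaryOwner: "tool manifest, schema, permission, and audit" },
  risky_action: { patterns: ["policy enforcement", "human approval gate"], boundaryOwner: "runtime policy and durable workflow" },
  judgeable_quality: { patterns: ["evaluator-optimizer", "reflection"], boundaryOwner: "rubric, evidence checks, and revision budget" },
  separate_context: { patterns: ["supervisor-worker", "parallel agents"], boundaryOwner: "coordinator, worker contracts, and merge policy" },
  replayable_failure: { patterns: ["durable workflow", "observability", "eval feedback loop"], boundaryOwner: "runtime, trace store, and eval suite" },
};

// Return pattern choices only for pressures the team has named.
// An empty input yields an empty composition: no pressure, no pattern.
function selectPatterns(pressures: WorkloadPressure[]): PatternChoice[] {
  const unique = pressures.filter((p, i) => pressures.indexOf(p) === i);
  return unique.map((p) => compositionTable[p]);
}
```

Calling `selectPatterns([])` returns an empty list, which is the code-level version of the rule: a pattern with no named pressure never enters the design.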
For concrete end-to-end examples, read Vertical Slice Examples after this chapter. The slices show the same composition rule applied to support, coding, and research workflows.
A Default Composition
For many production systems, the default composition is:
- route the request by task type, risk, and capability;
- load durable state, policy context, and caller identity;
- assemble a small working set with approved evidence;
- run a bounded agent loop only where uncertainty exists;
- execute tools through typed schemas and permission checks;
- enforce policy before side effects or memory writes;
- pause for approval when risk requires it;
- evaluate trajectory, evidence, output, and policy behavior;
- record traces, costs, decisions, tool calls, and stop reasons;
- convert production failures into regression evals.
That sequence is not a framework. It is a responsibility map. A simple system may skip several steps. A high-risk system may need all of them.
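The "bounded agent loop" and "stop reasons" items in the list above can be sketched as a small controller. This is a minimal sketch under stated assumptions: `runBoundedLoop`, `LoopBudget`, and the `StopReason` values are hypothetical names, the `step` callback stands in for one model-plus-tools iteration, and a wall-clock timeout is left out to keep the example deterministic.

```typescript
// Hypothetical loop controller; the step callback stands in for a model call.
type StopReason = "goal_reached" | "max_steps" | "max_tool_calls";

interface LoopBudget {
  maxSteps: number;
  maxToolCalls: number;
}

interface StepResult {
  done: boolean;         // the step reports that the goal is reached
  toolCallsUsed: number; // tool calls made during this step
}

interface LoopOutcome {
  stopReason: StopReason;
  steps: number;
  toolCalls: number;
}

function runBoundedLoop(
  step: (stepIndex: number) => StepResult,
  budget: LoopBudget
): LoopOutcome {
  let steps = 0;
  let toolCalls = 0;
  while (true) {
    // Budgets are checked before each step, so the controller, not the
    // model, decides whether another step may run. A single step can still
    // overshoot the tool-call limit; a real system would also enforce the
    // limit inside the tool executor.
    if (steps >= budget.maxSteps) return { stopReason: "max_steps", steps, toolCalls };
    if (toolCalls >= budget.maxToolCalls) return { stopReason: "max_tool_calls", steps, toolCalls };
    const result = step(steps);
    steps += 1;
    toolCalls += result.toolCallsUsed;
    if (result.done) return { stopReason: "goal_reached", steps, toolCalls };
  }
}
```

Every exit path returns a stop reason, so the trace always records why the loop ended, whether by success or by budget.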
Composition Scorecard
Before accepting a composition, score each added pattern against the responsibility it claims to own.
| Check | Pass Condition |
|---|---|
| Job | The pattern solves a named workload pressure, not a vague desire for flexibility. |
| Owner | One component or team owns the pattern’s behavior and failure response. |
| Boundary | The pattern has clear inputs, outputs, authority, and non-responsibilities. |
| Risk | The pattern’s new failure modes are named. |
| Control | There is a software-owned limit, policy, gate, or fallback for each major risk. |
| Eval | At least one blocking eval proves the boundary works. |
| Trace | A production trace can show the pattern’s decision and effect. |
| Removal | The team knows what would break if the pattern were removed. |
If a pattern fails the Job, Owner, or Boundary check, remove it from the composition. If it fails the Eval or Trace check, keep it out of production until the proof exists.
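The scorecard rule can be expressed as a small decision function. The sketch below assumes hypothetical names (`ScorecardCheck`, `scorePattern`, and the verdict labels). The text does not say what happens when only the Risk, Control, or Removal check fails, so the `needs_work` verdict is an assumption added for completeness.

```typescript
// Hypothetical scorecard evaluation; verdict names are illustrative.
type ScorecardCheck =
  | "job" | "owner" | "boundary" | "risk"
  | "control" | "eval" | "trace" | "removal";

type ScorecardVerdict = "remove" | "hold_from_production" | "needs_work" | "accept";

function scorePattern(passed: Record<ScorecardCheck, boolean>): ScorecardVerdict {
  // Failing job, owner, or boundary means the pattern has no place in the design.
  if (!passed.job || !passed.owner || !passed.boundary) return "remove";
  // Failing eval or trace means the pattern may stay in the design
  // but is not ready for production.
  if (!passed.eval || !passed.trace) return "hold_from_production";
  // Assumption: the remaining checks block acceptance without forcing removal.
  if (!passed.risk || !passed.control || !passed.removal) return "needs_work";
  return "accept";
}
```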
Composition 1: Support Refund Investigation
Use this when the agent investigates a refund but must not directly issue money.
| Responsibility | Pattern |
|---|---|
| Intake and routing | Routing and handoffs. |
| Evidence | Semantic recall and typed business tools. |
| Investigation | Bounded agent loop. |
| Tool execution | Tool use with narrow read tools and draft-only write tools. |
| Safety | Policy enforcement before refund actions. |
| Human control | Approval gate for high-value or exception refunds. |
| Quality | Evaluator checks evidence, policy, and recommendation. |
| Operations | Trace, replay, and incident-to-eval feedback. |
The important boundary is financial authority. The agent may investigate, cite policy, and draft a refund request. It should not issue the refund. The refund action belongs to a policy-backed workflow, usually with approval.
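async function handleRefundCase(input: RefundCase) {
  // Workflow-owned: routing and evidence collection run before the agent does.
  const route = routeSupportCase(input);
  const evidence = await collectRefundEvidence(input.orderId, route.region);
  // The only agentic step: a bounded investigation with an explicit budget.
  const investigation = await refundAgent.investigate({
    caseId: input.caseId,
    evidenceRefs: evidence.refs,
    budget: { maxSteps: 6, maxToolCalls: 8, timeoutMs: 45000 }
  });
  // Policy is enforced in code, outside the model.
  const policy = enforcePolicy({
    actorRole: input.agentRole,
    capability: 'refund',
    riskLevel: investigation.riskLevel
  });
  if (policy.decision === 'require_approval') {
    // The approval request carries the proposed action and its supporting refs.
    return approvals.request({
      proposedAction: investigation.proposedRefund,
      evidenceRefs: investigation.evidenceRefs,
      policyRefs: investigation.policyRefs
    });
  }
  // Any other non-allow decision is returned to the caller unchanged.
  if (policy.decision !== 'allow') return policy;
  // Allowed: the workflow drafts a refund request. It does not issue the refund.
  return refunds.draftRefundRequest(investigation);
}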
This is agentic, but only in the investigation. The workflow owns routing, policy, approval, and side effects.
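The handler calls `enforcePolicy` without showing it. One possible shape is sketched below, purely as an assumption: the role names, the three risk levels, and the rule that only low-risk refunds pass without approval are hypothetical, chosen to show a default-deny policy that lives in code.

```typescript
// Hypothetical policy check; roles, risk levels, and rules are assumptions.
type PolicyDecision = "allow" | "require_approval" | "deny";

interface PolicyRequest {
  actorRole: string;
  capability: string;
  riskLevel: "low" | "medium" | "high";
}

interface PolicyResult {
  decision: PolicyDecision;
  reason: string;
}

// Illustrative allow-list of roles that may request refunds at all.
const refundCapableRoles = ["support_agent", "support_lead"];

function enforcePolicy(request: PolicyRequest): PolicyResult {
  // Default deny: anything not explicitly recognized is rejected.
  if (request.capability !== "refund") {
    return { decision: "deny", reason: "unknown capability" };
  }
  if (refundCapableRoles.indexOf(request.actorRole) === -1) {
    return { decision: "deny", reason: "role may not request refunds" };
  }
  if (request.riskLevel === "low") {
    return { decision: "allow", reason: "low-risk refund within role authority" };
  }
  // Medium and high risk always go to a human.
  return { decision: "require_approval", reason: "risk level requires approval" };
}
```

The decision is a plain return value, so the workflow can branch on it and a trace can record both the decision and the reason.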
Composition 2: Knowledge-Bound Policy Assistant
Use this when the assistant answers from approved policy, documentation, or compliance sources.
| Responsibility | Pattern |
|---|---|
| Source eligibility | Policy enforcement and knowledge-bound agent. |
| Evidence retrieval | Semantic recall and RAG. |
| Context control | Context budgets and working sets. |
| Answer shape | Structured output. |
| Unsupported questions | Refusal or human escalation. |
| Quality | Citation coverage and missing-evidence evals. |
| Operations | Source freshness monitoring and trace review. |
The critical boundary is evidence. The assistant should not answer because the model “knows” something. It should answer because the system retrieved eligible sources and the answer cites them.
{
  "answer_status": "answered",
  "answer": "The request can be approved only if damage evidence is attached.",
  "citations": ["refund_policy.v3#damaged-items"],
  "evidence_refs": ["src_refund_policy_2026_04"],
  "missing_evidence": []
}
When evidence is missing, the correct output is not a weaker answer. It is an explicit status: missing_evidence, conflicting_evidence, refused, or needs_human.
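The evidence boundary can be checked in code before an answer is released. The sketch below is a minimal validator under stated assumptions: it treats the four non-answer outcomes as values of `answer_status`, which the example output suggests but does not confirm, and `validatePolicyAnswer` and `eligibleSources` are hypothetical names.

```typescript
// Assumption: the non-answer outcomes are values of answer_status.
type AnswerStatus =
  | "answered"
  | "missing_evidence"
  | "conflicting_evidence"
  | "refused"
  | "needs_human";

interface PolicyAnswer {
  answer_status: AnswerStatus;
  answer: string;
  citations: string[];
  evidence_refs: string[];
  missing_evidence: string[];
}

// Returns a list of problems; an empty list means the answer may be released.
function validatePolicyAnswer(candidate: PolicyAnswer, eligibleSources: string[]): string[] {
  const problems: string[] = [];
  // Only "answered" makes an evidence claim, so only it is checked here.
  if (candidate.answer_status !== "answered") return problems;
  if (candidate.citations.length === 0) problems.push("answered without citations");
  if (candidate.evidence_refs.length === 0) problems.push("answered without evidence refs");
  if (candidate.missing_evidence.length > 0) problems.push("answered while evidence is still missing");
  for (const ref of candidate.evidence_refs) {
    if (eligibleSources.indexOf(ref) === -1) problems.push("ineligible source: " + ref);
  }
  return problems;
}
```

A check like this belongs in the workflow rather than the prompt, so an uncited "answered" response is rejected even when the model is confident.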
Composition 3: Multi-Agent Research And Review
Use this when a task benefits from separated workstreams and independent review.
| Responsibility | Pattern |
|---|---|
| Decomposition | Supervisor-worker. |
| Worker isolation | Scoped context and tool permissions. |
| Parallel work | Parallel agents when work is independent. |
| Review | Evaluator-optimizer or dedicated reviewer worker. |
| Merge | Supervisor merge policy. |
| Accountability | One final owner and stop reason. |
| Operations | Per-worker traces and merge-decision replay. |
The critical boundary is final ownership. Multiple agents can produce evidence, drafts, or critiques. One component must own final synthesis.
type ResearchAssignment = {
  workerRole: 'source_finder' | 'technical_reviewer' | 'risk_reviewer';
  objective: string;
  scopedContextRefs: string[];
  allowedTools: string[];
  expectedOutputSchema: string;
  acceptanceCriteria: string[];
};
If every worker sees the same context and tools, you probably do not have a useful multi-agent system. You have duplicated model calls.
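The duplicated-worker smell can be detected mechanically. The sketch below groups workers whose context and tool scopes are identical. `WorkerScope` and `findDuplicateScopes` are hypothetical names, and the scope fields are a self-contained subset of the assignment type shown above.

```typescript
// Hypothetical duplicate-scope check over worker assignments.
interface WorkerScope {
  workerRole: string;
  scopedContextRefs: string[];
  allowedTools: string[];
}

// Order-insensitive key: two workers with the same refs and tools collide.
function scopeKey(worker: WorkerScope): string {
  return JSON.stringify([
    worker.scopedContextRefs.slice().sort(),
    worker.allowedTools.slice().sort(),
  ]);
}

// Returns groups of roles that share exactly the same context and tools.
function findDuplicateScopes(assignments: WorkerScope[]): string[][] {
  const rolesByScope: Record<string, string[]> = {};
  for (const assignment of assignments) {
    const key = scopeKey(assignment);
    const existing = rolesByScope[key];
    if (existing) {
      existing.push(assignment.workerRole);
    } else {
      rolesByScope[key] = [assignment.workerRole];
    }
  }
  const groups: string[][] = [];
  for (const key of Object.keys(rolesByScope)) {
    const roles = rolesByScope[key];
    if (roles && roles.length > 1) groups.push(roles);
  }
  return groups;
}
```

A non-empty result is a prompt for review rather than an automatic failure: two reviewers may legitimately share a scope if their objectives and acceptance criteria differ.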
What Not To Compose
Some combinations are risky unless there is a strong boundary:
| Risky Composition | Why It Fails | Control |
|---|---|---|
| Agent loop plus broad tools. | The loop can amplify a bad tool decision. | Narrow tools, policy, approval, stop rules. |
| RAG plus memory writes. | Retrieved errors become durable facts. | Memory write review and source metadata. |
| Evaluator without evidence. | The evaluator scores confidence without proof. | Citation and trajectory checks. |
| Multi-agent plus shared tool surface. | Every worker can cause the same damage. | Per-worker tool scopes. |
| Human approval plus vague request. | Humans approve without knowing the action. | Typed approval request and exact-action binding. |
| Policy only in prompts. | The model can ignore or reinterpret policy. | Runtime enforcement before execution. |
Composition is not about more parts. It is about better boundaries.
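As one example of a boundary from the table, "exact-action binding" for approvals can be sketched as follows. This is an illustration under assumptions: `RefundAction`, `ApprovalRecord`, and the helper functions are hypothetical, and a production system would likely store a cryptographic hash of the canonical action rather than the raw string.

```typescript
// Hypothetical exact-action binding for human approvals.
interface RefundAction {
  kind: "refund";
  orderId: string;
  amountCents: number;
  currency: string;
}

interface ApprovalRecord {
  approvalId: string;
  approvedBy: string;
  boundAction: string; // canonical form of the exact action that was approved
}

// Fixed field order, so equal actions always serialize identically.
function canonicalAction(action: RefundAction): string {
  return JSON.stringify([action.kind, action.orderId, action.amountCents, action.currency]);
}

function recordApproval(action: RefundAction, approvedBy: string, approvalId: string): ApprovalRecord {
  return { approvalId, approvedBy, boundAction: canonicalAction(action) };
}

// The executor re-derives the canonical form and refuses anything that
// differs from what the human actually saw and approved.
function mayExecute(action: RefundAction, approval: ApprovalRecord): boolean {
  return canonicalAction(action) === approval.boundAction;
}
```

With this binding, an approval for a 50.00 refund cannot be reused to execute a 500.00 refund on the same order.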
Design Review
Before approving a composed system, ask:
- What owns the goal?
- What owns state?
- What owns evidence?
- What owns tool permissions?
- What owns policy?
- What owns approval?
- What owns evaluation?
- What owns the final answer or action?
- What owns replay and rollback?
- What production failure becomes a new eval?
If the answer is “the prompt” for any high-risk responsibility, the architecture is not ready.
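The ownership questions in this review can be turned into a simple check. The sketch below is hypothetical throughout (`Responsibility`, `OwnershipMap`, `reviewOwnership`). It covers the nine ownership questions and leaves out the last question, about which failures become evals, because that one asks for a process and not an owner.

```typescript
// Hypothetical design-review check over an ownership map.
type Responsibility =
  | "goal" | "state" | "evidence" | "tool_permissions" | "policy"
  | "approval" | "evaluation" | "final_output" | "replay_rollback";

const allResponsibilities: Responsibility[] = [
  "goal", "state", "evidence", "tool_permissions", "policy",
  "approval", "evaluation", "final_output", "replay_rollback",
];

interface OwnerEntry {
  owner: string;     // component or team name, for example "workflow"
  highRisk: boolean; // whether a failure here can cause real damage
}

type OwnershipMap = Partial<Record<Responsibility, OwnerEntry>>;

// Returns findings; an empty list means every responsibility has an
// acceptable owner.
function reviewOwnership(map: OwnershipMap): string[] {
  const findings: string[] = [];
  for (const responsibility of allResponsibilities) {
    const entry = map[responsibility];
    if (!entry || entry.owner.trim() === "") {
      findings.push(responsibility + ": no owner");
      continue;
    }
    if (entry.highRisk && entry.owner.trim().toLowerCase() === "prompt") {
      findings.push(responsibility + ": high-risk responsibility owned by the prompt");
    }
  }
  return findings;
}
```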
Design Rule
Compose patterns only when each added pattern has a job, a boundary, an owner, and an eval that can fail.