Pattern Composition Playbook

Patterns become useful when they compose into a system that someone can operate. The goal is not to use as many patterns as possible. The goal is to put each responsibility in the right place: workflow owns flow, policy owns permission, tools own side effects, retrieval owns evidence, the loop owns bounded uncertainty, evals own proof, and observability owns learning from failure.

The easiest way to damage an agentic system is to compose patterns by vibe. Add memory because the agent forgot something. Add a second agent because the task feels large. Add reflection because the answer was weak. Add tools because the model needs data. Each decision may sound reasonable, but the result can become a system with no owner, no stop condition, no evidence boundary, and no way to replay failure.

Composition should start with ownership.

The Composition Question

Before adding a pattern, ask what problem it owns.

Pressure	Pattern To Consider	What Must Own The Boundary
The task has known steps.	Prompt chain or deterministic workflow.	Workflow code.
The next step depends on observations.	Agent loop.	Loop controller and stop rules.
The answer needs evidence.	RAG, semantic recall, or knowledge-bound agent.	Retrieval and source policy.
The model must call tools.	Tool use or MCP-first tool use.	Tool manifest, schema, permission, and audit.
The action is risky.	Policy enforcement and human approval gate.	Runtime policy and durable workflow.
Quality can be judged better than generated.	Evaluator-optimizer or reflection.	Rubric, evidence checks, and revision budget.
Work needs separate context or permissions.	Supervisor-worker or parallel agents.	Coordinator, worker contracts, and merge policy.
Failure must be replayed.	Durable workflow, observability, and eval feedback loop.	Runtime, trace store, and eval suite.

If a pattern does not own a concrete problem, do not add it.

For concrete end-to-end examples, read Vertical Slice Examples after this chapter. The slices show the same composition rule applied to support, coding, and research workflows.

A Default Composition

For many production systems, the default composition is:

route the request by task type, risk, and capability;
load durable state, policy context, and caller identity;
assemble a small working set with approved evidence;
run a bounded agent loop only where uncertainty exists;
execute tools through typed schemas and permission checks;
enforce policy before side effects or memory writes;
pause for approval when risk requires it;
evaluate trajectory, evidence, output, and policy behavior;
record traces, costs, decisions, tool calls, and stop reasons;
convert production failures into regression evals.

That sequence is not a framework. It is a responsibility map. A simple system may skip several steps. A high-risk system may need all of them.
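The responsibility map above can be sketched as an ordered list of optional stages. This is a minimal illustration, not a framework: StageName, RunContext, Stage, and runPipeline are hypothetical names, and the stage order simply mirrors the list above.

```typescript
// Hypothetical sketch: the default composition as an ordered responsibility map.
// Stage names mirror the list above; none of this is a real framework API.
type StageName =
  | 'route' | 'load_state' | 'assemble_context' | 'agent_loop' | 'execute_tools'
  | 'enforce_policy' | 'approval' | 'evaluate' | 'record_trace' | 'regression_eval';

const DEFAULT_ORDER: StageName[] = [
  'route', 'load_state', 'assemble_context', 'agent_loop', 'execute_tools',
  'enforce_policy', 'approval', 'evaluate', 'record_trace', 'regression_eval',
];

type RunContext = { requestId: string; visited: StageName[] };
type Stage = (ctx: RunContext) => RunContext;

// Run only the stages this system implements, always in the default order.
function runPipeline(
  stages: Partial<Record<StageName, Stage>>,
  ctx: RunContext,
): RunContext {
  let current = ctx;
  for (const name of DEFAULT_ORDER) {
    const stage = stages[name];
    if (!stage) continue; // a simple system may skip this responsibility
    current = stage(current);
    current = { ...current, visited: current.visited.concat(name) };
  }
  return current;
}

// A low-risk system that only routes, enforces policy, and records a trace.
const passThrough: Stage = (ctx) => ctx;
const simpleRun = runPipeline(
  { record_trace: passThrough, route: passThrough, enforce_policy: passThrough },
  { requestId: 'req_1', visited: [] },
);
```

In this sketch the stages run in map order regardless of how they were registered, so simpleRun.visited lists route, enforce_policy, and record_trace. A high-risk system would register all ten stages.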

Composition Scorecard

Before accepting a composition, score each added pattern against the responsibility it claims to own.

Check	Pass Condition
Job	The pattern solves a named workload pressure, not a vague desire for flexibility.
Owner	One component or team owns the pattern’s behavior and failure response.
Boundary	The pattern has clear inputs, outputs, authority, and non-responsibilities.
Risk	The pattern’s new failure modes are named.
Control	There is a software-owned limit, policy, gate, or fallback for each major risk.
Eval	At least one blocking eval proves the boundary works.
Trace	A production trace can show the pattern’s decision and effect.
Removal	The team knows what would break if the pattern were removed.

If a pattern fails the Job, Owner, or Boundary check, remove it from the composition. If it fails the Eval or Trace check, keep it out of production until the proof exists.
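The scorecard rule can be expressed as a small decision function. This is a sketch under assumptions: the Scorecard shape and the verdict names are hypothetical, and the needs_review verdict for a failed Risk, Control, or Removal check is an assumption added here, since the text only specifies outcomes for the other five checks.

```typescript
// Hypothetical sketch of the scorecard rule. Field names follow the table's checks.
type Check =
  | 'job' | 'owner' | 'boundary' | 'risk'
  | 'control' | 'eval' | 'trace' | 'removal';
type Scorecard = Record<Check, boolean>;
type Verdict = 'remove' | 'hold_from_production' | 'needs_review' | 'accept';

function scorePattern(card: Scorecard): Verdict {
  // Failing job, owner, or boundary means the pattern has no place in the composition.
  if (!card.job || !card.owner || !card.boundary) return 'remove';
  // Failing eval or trace means the proof does not exist yet.
  if (!card.eval || !card.trace) return 'hold_from_production';
  // Assumption: the text does not define an outcome for these checks.
  if (!card.risk || !card.control || !card.removal) return 'needs_review';
  return 'accept';
}

const reflectionPattern: Scorecard = {
  job: true, owner: true, boundary: true, risk: true,
  control: true, eval: false, trace: true, removal: true,
};
const verdict = scorePattern(reflectionPattern);
```

Here the pattern has a job, owner, and boundary but no blocking eval, so the sketch holds it from production rather than removing it.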

Composition 1: Support Refund Investigation

Use this when the agent investigates a refund but must not directly issue money.

Responsibility	Pattern
Intake and routing	Routing and handoffs.
Evidence	Semantic recall and typed business tools.
Investigation	Bounded agent loop.
Tool execution	Tool use with narrow read tools and draft-only write tools.
Safety	Policy enforcement before refund actions.
Human control	Approval gate for high-value or exception refunds.
Quality	Evaluator checks evidence, policy, and recommendation.
Operations	Trace, replay, and incident-to-eval feedback.

The important boundary is financial authority. The agent may investigate, cite policy, and draft a refund request. It should not issue the refund. The refund action belongs to a policy-backed workflow, usually with approval.

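async function handleRefundCase(input: RefundCase) {
  // Workflow-owned: route the case and gather evidence before the agent runs.
  const route = routeSupportCase(input);
  const evidence = await collectRefundEvidence(input.orderId, route.region);

  // Agent-owned: a bounded investigation with explicit step, tool, and time budgets.
  const investigation = await refundAgent.investigate({
    caseId: input.caseId,
    evidenceRefs: evidence.refs,
    budget: { maxSteps: 6, maxToolCalls: 8, timeoutMs: 45000 }
  });

  // Runtime policy decides what may happen next; the agent only proposes.
  const policy = enforcePolicy({
    actorRole: input.agentRole,
    capability: 'refund',
    riskLevel: investigation.riskLevel
  });

  // Risky refunds pause for a human, with the exact proposal and its evidence attached.
  if (policy.decision === 'require_approval') {
    return approvals.request({
      proposedAction: investigation.proposedRefund,
      evidenceRefs: investigation.evidenceRefs,
      policyRefs: investigation.policyRefs
    });
  }

  // Any other non-allow decision is returned to the caller unchanged.
  if (policy.decision !== 'allow') return policy;
  // Allowed cases still produce only a draft request, never a direct refund.
  return refunds.draftRefundRequest(investigation);
}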

This is agentic, but only in the investigation. The workflow owns routing, policy, approval, and side effects.
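The workflow above calls enforcePolicy without showing it. One possible shape is sketched below, purely as an illustration: the role names, the rule table, the 'deny' decision, and the rule that any risk above low requires approval are all assumptions, not the author's policy.

```typescript
// Hypothetical sketch of a runtime policy check. The decisions 'allow' and
// 'require_approval' appear in the workflow above; 'deny' is an assumption.
type PolicyDecision = 'allow' | 'require_approval' | 'deny';
type RiskLevel = 'low' | 'medium' | 'high';
type PolicyInput = { actorRole: string; capability: string; riskLevel: RiskLevel };
type PolicyResult = { decision: PolicyDecision; reason: string };

// Assumed rule table: which roles may use which capability.
const ALLOWED_ROLES = new Map<string, string[]>([
  ['refund', ['support_agent', 'support_lead']],
]);

function enforcePolicy(input: PolicyInput): PolicyResult {
  const roles = ALLOWED_ROLES.get(input.capability) ?? [];
  if (roles.indexOf(input.actorRole) === -1) {
    return { decision: 'deny', reason: 'role_not_permitted' };
  }
  // Assumed threshold: anything above low risk pauses for a human.
  if (input.riskLevel !== 'low') {
    return { decision: 'require_approval', reason: 'risk_above_low' };
  }
  return { decision: 'allow', reason: 'within_policy' };
}

const highRisk = enforcePolicy({
  actorRole: 'support_agent',
  capability: 'refund',
  riskLevel: 'high',
});
```

The point of the sketch is that the decision is computed in code from typed inputs. Nothing in it depends on what the model was told in a prompt.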

Composition 2: Knowledge-Bound Policy Assistant

Use this when the assistant answers from approved policy, documentation, or compliance sources.

Responsibility	Pattern
Source eligibility	Policy enforcement and knowledge-bound agent.
Evidence retrieval	Semantic recall and RAG.
Context control	Context budgets and working sets.
Answer shape	Structured output.
Unsupported questions	Refusal or human escalation.
Quality	Citation coverage and missing-evidence evals.
Operations	Source freshness monitoring and trace review.

The critical boundary is evidence. The assistant should not answer because the model “knows” something. It should answer because the system retrieved eligible sources and the answer cites them.

{
  "answer_status": "answered",
  "answer": "The request can be approved only if damage evidence is attached.",
  "citations": ["refund_policy.v3#damaged-items"],
  "evidence_refs": ["src_refund_policy_2026_04"],
  "missing_evidence": []
}

When evidence is missing, the correct output is not a weaker answer. It is an explicit status such as missing_evidence, conflicting_evidence, refused, or needs_human.
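The evidence boundary can be checked in code before an answer is returned. The sketch below assumes the four non-answer outcomes are values of answer_status alongside answered; validateAnswer and the eligible-source set are hypothetical.

```typescript
// Hypothetical sketch: validate the structured output against the evidence boundary.
// Assumption: the non-answer outcomes are values of answer_status.
type AnswerStatus =
  | 'answered' | 'missing_evidence' | 'conflicting_evidence' | 'refused' | 'needs_human';

type PolicyAnswer = {
  answer_status: AnswerStatus;
  answer: string;
  citations: string[];
  evidence_refs: string[];
  missing_evidence: string[];
};

// Returns a list of problems; an empty list means the output respects the boundary.
function validateAnswer(output: PolicyAnswer, eligibleSources: Set<string>): string[] {
  const problems: string[] = [];
  if (output.answer_status === 'answered') {
    if (output.citations.length === 0) problems.push('answered_without_citations');
    if (output.evidence_refs.length === 0) problems.push('answered_without_evidence_refs');
    if (output.missing_evidence.length > 0) problems.push('answered_with_missing_evidence');
    for (const ref of output.evidence_refs) {
      if (!eligibleSources.has(ref)) problems.push('ineligible_source:' + ref);
    }
  }
  if (output.answer_status === 'missing_evidence' && output.missing_evidence.length === 0) {
    problems.push('missing_evidence_not_listed');
  }
  return problems;
}

const example: PolicyAnswer = {
  answer_status: 'answered',
  answer: 'The request can be approved only if damage evidence is attached.',
  citations: ['refund_policy.v3#damaged-items'],
  evidence_refs: ['src_refund_policy_2026_04'],
  missing_evidence: [],
};
const problems = validateAnswer(example, new Set(['src_refund_policy_2026_04']));
```

A check like this could run as a blocking eval: an answered status with no citations, or with a source outside the eligible set, fails before the answer reaches the caller.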

Composition 3: Multi-Agent Research And Review

Use this when a task benefits from separated workstreams and independent review.

Responsibility	Pattern
Decomposition	Supervisor-worker.
Worker isolation	Scoped context and tool permissions.
Parallel work	Parallel agents when work is independent.
Review	Evaluator-optimizer or dedicated reviewer worker.
Merge	Supervisor merge policy.
Accountability	One final owner and stop reason.
Operations	Per-worker traces and merge-decision replay.

The critical boundary is final ownership. Multiple agents can produce evidence, drafts, or critiques. One component must own final synthesis.

type ResearchAssignment = {
  workerRole: 'source_finder' | 'technical_reviewer' | 'risk_reviewer';
  objective: string;
  scopedContextRefs: string[];
  allowedTools: string[];
  expectedOutputSchema: string;
  acceptanceCriteria: string[];
};

If every worker sees the same context and tools, you probably do not have a useful multi-agent system. You have duplicated model calls.
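That smell can be detected mechanically before any worker runs. The sketch below reuses the ResearchAssignment type from above; findDuplicateScopes is a hypothetical helper that groups workers whose context and tool scopes are identical.

```typescript
// Hypothetical sketch: flag workers that share exactly the same context and tools.
type ResearchAssignment = {
  workerRole: 'source_finder' | 'technical_reviewer' | 'risk_reviewer';
  objective: string;
  scopedContextRefs: string[];
  allowedTools: string[];
  expectedOutputSchema: string;
  acceptanceCriteria: string[];
};

// Order-insensitive key for a worker's scope.
function scopeKey(a: ResearchAssignment): string {
  return JSON.stringify([a.scopedContextRefs.slice().sort(), a.allowedTools.slice().sort()]);
}

// Returns groups of worker roles whose scopes are indistinguishable.
function findDuplicateScopes(assignments: ResearchAssignment[]): string[][] {
  const byScope = new Map<string, string[]>();
  for (const a of assignments) {
    const key = scopeKey(a);
    const roles = byScope.get(key) ?? [];
    roles.push(a.workerRole);
    byScope.set(key, roles);
  }
  return Array.from(byScope.values()).filter((roles) => roles.length > 1);
}

const base = { objective: 'review', expectedOutputSchema: 'Finding', acceptanceCriteria: [] };
const duplicates = findDuplicateScopes([
  { ...base, workerRole: 'source_finder', scopedContextRefs: ['doc_a'], allowedTools: ['search'] },
  { ...base, workerRole: 'technical_reviewer', scopedContextRefs: ['doc_a'], allowedTools: ['search'] },
  { ...base, workerRole: 'risk_reviewer', scopedContextRefs: ['doc_b'], allowedTools: ['risk_db'] },
]);
```

In this example the source finder and technical reviewer share one scope, so the check reports them as a duplicate pair while the risk reviewer passes.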

What Not To Compose

Some combinations are risky unless there is a strong boundary:

Risky Composition	Why It Fails	Control
Agent loop plus broad tools.	The loop can amplify a bad tool decision.	Narrow tools, policy, approval, stop rules.
RAG plus memory writes.	Retrieved errors become durable facts.	Memory write review and source metadata.
Evaluator plus no evidence.	The evaluator scores confidence without proof.	Citation and trajectory checks.
Multi-agent plus shared tool surface.	Every worker can cause the same damage.	Per-worker tool scopes.
Human approval plus vague request.	Humans approve without knowing the action.	Typed approval request and exact-action binding.
Policy only in prompts.	The model can ignore or reinterpret policy.	Runtime enforcement before execution.

Composition is not about more parts. It is about better boundaries.
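As one example of such a boundary, the per-worker tool scope control from the table can be enforced at the point of execution. This is a minimal sketch with hypothetical names (createScopedRunner, ToolHandler, the worker and tool names); a real system would add schema validation and audit logging.

```typescript
// Hypothetical sketch: tool calls pass through a runner that checks the worker's scope.
type ToolHandler = (args: Record<string, unknown>) => unknown;
type ToolResult = { ok: true; value: unknown } | { ok: false; reason: string };

function createScopedRunner(
  scopes: Map<string, Set<string>>,
  handlers: Map<string, ToolHandler>,
) {
  return function execute(
    worker: string,
    tool: string,
    args: Record<string, unknown>,
  ): ToolResult {
    // The scope check runs in software, before the handler can cause a side effect.
    const allowed = scopes.get(worker);
    if (!allowed || !allowed.has(tool)) {
      return { ok: false, reason: 'tool_not_in_worker_scope' };
    }
    const handler = handlers.get(tool);
    if (!handler) return { ok: false, reason: 'unknown_tool' };
    return { ok: true, value: handler(args) };
  };
}

const execute = createScopedRunner(
  new Map([['source_finder', new Set(['search_docs'])]]),
  new Map<string, ToolHandler>([
    ['search_docs', (args) => 'results for ' + String(args.query)],
    ['send_email', () => 'sent'],
  ]),
);

const allowedCall = execute('source_finder', 'search_docs', { query: 'refund policy' });
const blockedCall = execute('source_finder', 'send_email', {});
```

Because send_email is registered but not in the source finder's scope, the second call is rejected without the handler ever running.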

Design Review

Before approving a composed system, ask:

What owns the goal?
What owns state?
What owns evidence?
What owns tool permissions?
What owns policy?
What owns approval?
What owns evaluation?
What owns the final answer or action?
What owns replay and rollback?
What production failure becomes a new eval?

If the answer is “the prompt” for any high-risk responsibility, the architecture is not ready.
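The review questions can be captured as an ownership map that a reviewer, or a CI check, can inspect. The sketch below is illustrative only: the responsibility keys follow the questions above, while the owner kinds and the reviewOwnership helper are assumptions.

```typescript
// Hypothetical sketch: record who owns each responsibility and flag weak answers.
type Responsibility =
  | 'goal' | 'state' | 'evidence' | 'tool_permissions' | 'policy'
  | 'approval' | 'evaluation' | 'final_action' | 'replay_rollback';
type OwnerKind =
  | 'workflow' | 'runtime_policy' | 'tool_layer' | 'retrieval'
  | 'human' | 'eval_suite' | 'prompt';
type OwnershipEntry = { owner: OwnerKind; highRisk: boolean };
type OwnershipMap = Partial<Record<Responsibility, OwnershipEntry>>;

const ALL_RESPONSIBILITIES: Responsibility[] = [
  'goal', 'state', 'evidence', 'tool_permissions', 'policy',
  'approval', 'evaluation', 'final_action', 'replay_rollback',
];

// Returns findings; an empty list means every responsibility has an acceptable owner.
function reviewOwnership(map: OwnershipMap): string[] {
  const findings: string[] = [];
  for (const responsibility of ALL_RESPONSIBILITIES) {
    const entry = map[responsibility];
    if (!entry) {
      findings.push(responsibility + ': no owner');
    } else if (entry.owner === 'prompt' && entry.highRisk) {
      findings.push(responsibility + ': high-risk responsibility owned by the prompt');
    }
  }
  return findings;
}

const findings = reviewOwnership({
  goal: { owner: 'workflow', highRisk: false },
  policy: { owner: 'prompt', highRisk: true },
});
```

In this example the review reports policy as prompt-owned and lists the seven responsibilities that have no owner at all.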

Design Rule

Compose patterns only when each added pattern has a job, a boundary, an owner, and an eval that can fail.