Structured Output

Structured output constrains model responses to typed data that software can validate and consume.

Intent

The Structured Output Pattern constrains model responses to typed data that software can validate, route, store, and test. It is the boundary between natural language reasoning and deterministic application logic.

Use When

Model output controls a tool call, workflow branch, policy decision, or database write.
Downstream code needs stable fields rather than prose.
You need regression tests for model-assisted behavior.

Avoid When

The output is purely creative prose for human reading.
A deterministic parser already handles the input safely.
The schema is so broad that it no longer constrains behavior.

Architecture

Use this diagram to read Structured Output as a system boundary, not only a code shape. The key ownership question is who holds task state. Here, the caller or a small application service owns it until a runtime pattern is introduced.

[Figure: Structured output validation architecture]

System Shape

Pattern boundary: a narrow agent function, class, or service boundary accepts input plus context and returns a typed answer, action, or decision.
State owner: the caller or a small application service owns task state until a runtime pattern is introduced.
Primary artifact: structured-output-pattern/ contains the runnable reference implementation and examples.
Operational promise: Structured output constrains model responses to typed data that software can validate and consume.

Core Protocol

Accept a bounded input, goal, or task request.
Assemble the minimum useful instructions, context, state, and tool descriptions.
Run the model or deterministic helper behind a typed boundary.
Validate the result before returning it to users, tools, or durable state.
Record enough evidence to explain the output later.
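
The five steps above can be sketched as a single function behind a typed boundary. This is a minimal illustration under stated assumptions, not the reference implementation: `TicketRoute`, `routeTicket`, `validateRoute`, and the injected `callModel` function are hypothetical names, and the model call is synchronous only to keep the sketch short (a real client would usually be asynchronous).

```typescript
// Hypothetical types for illustration; these names are not taken from the reference bundle.
type TicketRoute = { queue: "billing" | "technical" | "general"; reason: string };
type Evidence = { prompt: string; rawOutput: string; errors: string[] };
type BoundaryResult =
  | { ok: true; value: TicketRoute; evidence: Evidence }
  | { ok: false; evidence: Evidence };

const QUEUES = ["billing", "technical", "general"] as const;

// Step 4: validate the parsed value without coercing missing or invalid fields.
function validateRoute(value: unknown): { route?: TicketRoute; errors: string[] } {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return { errors: ["output is not a JSON object"] };
  }
  const errors: string[] = [];
  const record = value as Record<string, unknown>;
  const queue = record.queue;
  const reason = record.reason;
  if (typeof queue !== "string" || !QUEUES.some((q) => q === queue)) {
    errors.push("queue must be one of: " + QUEUES.join(", "));
  }
  if (typeof reason !== "string" || reason.trim() === "") {
    errors.push("reason must be a non-empty string");
  }
  if (errors.length > 0) return { errors };
  return { route: { queue: queue as TicketRoute["queue"], reason: reason as string }, errors };
}

function routeTicket(ticketText: string, callModel: (prompt: string) => string): BoundaryResult {
  // Steps 1-2: accept a bounded input and assemble the minimum instructions.
  const prompt =
    `Return JSON with fields "queue" (${QUEUES.join(" | ")}) and "reason".\n` +
    `Ticket: ${ticketText.slice(0, 2000)}`;
  // Step 3: run the model behind the typed boundary.
  const rawOutput = callModel(prompt);
  // Step 5: keep the prompt, raw output, and errors so the result can be explained later.
  const evidence: Evidence = { prompt, rawOutput, errors: [] };
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawOutput);
  } catch {
    evidence.errors.push("output is not valid JSON");
    return { ok: false, evidence };
  }
  const { route, errors } = validateRoute(parsed);
  evidence.errors.push(...errors);
  return route ? { ok: true, value: route, evidence } : { ok: false, evidence };
}
```

Callers receive either a validated `TicketRoute` or a failure that carries the evidence. The raw model text never leaves the boundary as a usable value.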

Implementation Notes

Define schemas close to the code that consumes them.
Validate every model response before use, even when the provider offers structured output support.
Prefer enums for routing decisions and discriminated unions for multi-action outputs.
Log validation failures and repair attempts as first-class evaluation data.
Keep the validated output close to the next runtime action. A valid object should still pass policy, approval, and state checks before it triggers side effects.
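
One way to combine these notes is a discriminated union for multi-action outputs, a parser that rejects instead of coercing, and a separate policy check that runs after validation. The sketch below is an assumption-laden example: `AgentAction`, `parseAction`, `authorize`, and the refund limit are hypothetical and do not come from the reference bundle.

```typescript
// Hypothetical action schema: the "kind" field discriminates between actions.
type AgentAction =
  | { kind: "reply"; message: string }
  | { kind: "refund"; orderId: string; amountCents: number }
  | { kind: "escalate"; reason: string };

type ParseResult = { ok: true; action: AgentAction } | { ok: false; error: string };

function isNonEmptyString(v: unknown): v is string {
  return typeof v === "string" && v.length > 0;
}

function isPositiveInteger(v: unknown): v is number {
  return typeof v === "number" && isFinite(v) && v > 0 && Math.floor(v) === v;
}

// Rebuilds the action from known fields only: extra fields are dropped,
// and wrong types are rejected rather than coerced (for example "2500" is not 2500).
function parseAction(value: unknown): ParseResult {
  if (typeof value !== "object" || value === null) {
    return { ok: false, error: "expected an object" };
  }
  const { kind, message, orderId, amountCents, reason } = value as Record<string, unknown>;
  switch (kind) {
    case "reply":
      return isNonEmptyString(message)
        ? { ok: true, action: { kind: "reply", message } }
        : { ok: false, error: "reply.message must be a non-empty string" };
    case "refund":
      return isNonEmptyString(orderId) && isPositiveInteger(amountCents)
        ? { ok: true, action: { kind: "refund", orderId, amountCents } }
        : { ok: false, error: "refund needs orderId and a positive integer amountCents" };
    case "escalate":
      return isNonEmptyString(reason)
        ? { ok: true, action: { kind: "escalate", reason } }
        : { ok: false, error: "escalate.reason must be a non-empty string" };
    default:
      // Unknown kinds are rejected, never mapped to a fallback action.
      return { ok: false, error: "unknown action kind" };
  }
}

type Decision = "execute" | "needs_approval";

// Hypothetical policy gate: a schema-valid refund above the limit still needs approval.
function authorize(action: AgentAction, refundLimitCents: number): Decision {
  switch (action.kind) {
    case "refund":
      return action.amountCents <= refundLimitCents ? "execute" : "needs_approval";
    case "reply":
    case "escalate":
      return "execute";
  }
}
```

Keeping `parseAction` and `authorize` separate means validation failures and policy blocks can be logged and counted as different events.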

Failure Modes

Schemas that mirror prose and provide little safety.
Silent coercion of missing or invalid fields.
Prompt-only formatting rules with no validator.
Overly strict schemas that cause brittle failures on harmless variation.
Valid objects that carry unsupported values violating domain or policy rules.
Repair loops without a stop condition that hide repeated model failures and increase cost.
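
The repair-loop failure mode can be avoided with an explicit attempt budget and a recorded history of every attempt. The following is a minimal sketch, assuming a hypothetical `generate` callback that receives the previous validation errors and a hypothetical `validate` callback. The status names mirror the `finalStatus` values used in the evaluation section.

```typescript
type Attempt = { raw: string; errors: string[] };

type RepairOutcome<T> =
  | { status: "accepted" | "repaired"; value: T; attempts: Attempt[] }
  | { status: "needs_human"; attempts: Attempt[] };

// Hypothetical helper: one initial attempt plus at most maxRepairs repair attempts.
function generateWithRepair<T>(
  generate: (previousErrors: string[]) => string,
  validate: (raw: string) => { value?: T; errors: string[] },
  maxRepairs: number
): RepairOutcome<T> {
  const attempts: Attempt[] = [];
  let previousErrors: string[] = [];
  for (let i = 0; i <= maxRepairs; i++) {
    const raw = generate(previousErrors);
    const result = validate(raw);
    // Every attempt is kept, so repeated failures stay visible in logs and evals.
    attempts.push({ raw, errors: result.errors });
    if (result.value !== undefined && result.errors.length === 0) {
      return { status: i === 0 ? "accepted" : "repaired", value: result.value, attempts };
    }
    previousErrors = result.errors;
  }
  // Stop condition reached: escalate instead of retrying indefinitely.
  return { status: "needs_human", attempts };
}
```

Because the outcome distinguishes `accepted` from `repaired`, first-pass validity and repair cost can be measured from the same data.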

Evaluation Strategy

Evaluate syntax, semantics, and downstream safety separately. Schema validity proves that the output can be parsed. It does not prove that the values are correct or safe to execute.

Test missing required fields, extra fields, invalid enums, wrong types, and malformed nested objects.
Test values that pass schema validation but violate domain constraints, such as a refund above the order total.
Test unsupported evidence references and contradictory fields.
Test one repair attempt, repeated repair failure, and the final refusal or escalation path.
Test schema-version changes against stored fixtures and downstream consumers.
Assert that invalid output never reaches tools, policy decisions, or durable state.

Use deterministic validators for structure and domain invariants. Use human or model review only for fields that require judgment.
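
As an illustration of a deterministic domain validator, the sketch below checks three of the cases listed above: a refund above the order total, contradictory fields, and unsupported evidence references. The `RefundDecision` and `Order` shapes and the `checkDomain` function are hypothetical and exist only for this example.

```typescript
// Hypothetical shapes: a schema-valid decision and the order it refers to.
type RefundDecision = {
  approve: boolean;
  refundCents: number;
  evidenceIds: string[];
};

type Order = { totalCents: number; knownEvidenceIds: string[] };

// Returns every violated invariant; an empty array means the decision is domain-valid.
function checkDomain(decision: RefundDecision, order: Order): string[] {
  const violations: string[] = [];
  if (decision.refundCents > order.totalCents) {
    violations.push("refund exceeds order total");
  }
  if (!decision.approve && decision.refundCents > 0) {
    violations.push("contradiction: denied decision with a non-zero refund");
  }
  for (const id of decision.evidenceIds) {
    if (order.knownEvidenceIds.indexOf(id) === -1) {
      violations.push(`unsupported evidence reference: ${id}`);
    }
  }
  return violations;
}
```

Returning all violations rather than the first makes test assertions precise and gives a repair prompt a complete list to work from.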

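// One evaluation case: a raw model output plus the outcome the pipeline is expected to produce.
type StructuredOutputEvalCase = {
  caseId: string;
  // Typed as unknown so the pipeline under test must validate it before use.
  modelOutput: unknown;
  expected: {
    schemaValid: boolean; // syntax: the output parses and matches the schema
    domainValid: boolean; // semantics: the values satisfy domain invariants
    actionAllowed: boolean; // downstream safety: the action may be executed
    repairAttempts: number;
    finalStatus: "accepted" | "repaired" | "refused" | "needs_human";
  };
};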

Measure first-pass schema validity, domain-validity rate, repair success rate, repair attempts per accepted output, unsafe acceptance rate, false rejection rate, and schema-version compatibility.
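
Several of these metrics can be computed from per-case results with a small aggregation function. The sketch below is one possible reading, not a definition from the source: the `EvalResult` shape, the `shouldAccept` ground-truth label, and the choice of denominators are all assumptions, and it covers only four of the listed metrics.

```typescript
type FinalStatus = "accepted" | "repaired" | "refused" | "needs_human";

// Hypothetical per-case result recorded by an eval run.
type EvalResult = {
  caseId: string;
  firstPassSchemaValid: boolean;
  repairAttempts: number;
  finalStatus: FinalStatus;
  shouldAccept: boolean; // assumed ground-truth label for the case
};

// Returns null when the denominator is zero so an undefined rate is not reported as 0.
function ratio(numerator: number, denominator: number): number | null {
  return denominator === 0 ? null : numerator / denominator;
}

function count(results: EvalResult[], predicate: (r: EvalResult) => boolean): number {
  return results.filter(predicate).length;
}

function summarize(results: EvalResult[]) {
  const wasAccepted = (r: EvalResult) =>
    r.finalStatus === "accepted" || r.finalStatus === "repaired";
  return {
    // Share of cases whose first model output passed schema validation.
    firstPassSchemaValidity: ratio(count(results, (r) => r.firstPassSchemaValid), results.length),
    // Assumed definition: repaired cases divided by cases that needed any repair.
    repairSuccessRate: ratio(
      count(results, (r) => r.finalStatus === "repaired"),
      count(results, (r) => r.repairAttempts > 0)
    ),
    // Assumed definition: accepted cases that should have been rejected.
    unsafeAcceptanceRate: ratio(
      count(results, (r) => wasAccepted(r) && !r.shouldAccept),
      count(results, (r) => !r.shouldAccept)
    ),
    // Assumed definition: rejected or escalated cases that should have been accepted.
    falseRejectionRate: ratio(
      count(results, (r) => !wasAccepted(r) && r.shouldAccept),
      count(results, (r) => r.shouldAccept)
    ),
  };
}
```

Whatever definitions a team settles on, fixing the denominators in code keeps the numbers comparable across releases.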

For the shared eval case contract and release-gate method, see Evaluation-Driven Agent Development.

Production Checklist

Define the input, context, output, and error contract.
Keep prompts, schemas, and tool descriptions versioned.
Add deterministic tests for the smallest useful behavior.
Log model decisions without leaking secrets or private user data.
Define human escalation for ambiguous, high-risk, or policy-blocked work.
Keep the source bundle, generated chapter, tests, and deployment artifact in the same release.

Code Walkthrough

Read the excerpt as the smallest executable expression of the pattern. The surrounding chapter explains the design constraints; the code shows where those constraints become concrete interfaces, state, validation, or control flow.

Source Code

These excerpts show the implementation shape. The complete code is available in the download bundle and repository source.