Lab 02 separates planning from execution so the planner proposes steps while bounded code runs operations and records results.

Section
Hands-On Labs
Type
Lab
Level
Hands-on
Read
5 min
Effort
45-90 min lab
BuilderStudent

Lab 02 - Build an Agent Loop with Planning

Download the Lab 02 planning loop guided exercise worksheet, lab completion worksheet, and lab production readiness worksheet before you start.

Objective

Separate planning from execution. The planner decides the steps; the executor runs bounded operations and records results. This gives you the control structure behind many agent loops without making every step autonomous.

What You Will Use

  • Language: TypeScript, with a Python mirror
  • Framework/runtime: framework-neutral planner and executor
  • Framework-agnostic lesson: planning and execution are separate responsibilities even when a framework packages them together.
  • Pattern chapters: Agent Loop, Planning and Execution
  • Source folder: planning-pattern/
  • Download: planning-and-execution.zip
  • Main files:
    • planning-pattern/typescript/src/planner.ts
    • planning-pattern/typescript/src/executor.ts
    • planning-pattern/typescript/src/run.ts

Exercise Time Budget

These estimates assume dependencies are already installed.

Exercise Time Output
Setup and baseline test 5-8 min Passing planner/executor test output.
Run the baseline plan trace 10-12 min Plan steps, result map, and stop signal.
Change input and inspect failures 10-15 min Evidence for changed input plus unsupported or missing-input behavior.
Write the stop-condition rule 10-15 min A caller-facing stop reason and production handling note.
Complete the review gate 5 min Worksheet notes for execution boundary and production gap.

Setup

From the repository root:

npm install

Run It

Run the deterministic test:

npm run plan:test

Run the CLI path:

npm run plan:run -- "Compute average of [1,2,3,4]"

Run the Python mirror:

npm run plan:py

Inspect The Code

Open planning-pattern/typescript/src/planner.ts and inspect how a task becomes a list of steps. Then open planning-pattern/typescript/src/executor.ts and inspect how execution turns those steps into named results.

Look for these boundaries:

  • the plan object
  • step IDs
  • deterministic executor functions
  • result map
  • failure surface

Change One Thing

Change the task text:

npm run plan:run -- "Compute average of [10,20,30]"

Then inspect whether the deterministic fallback still produces the expected plan shape.

Expected Result

The test should print:

Planning test OK

The CLI should print a plan, progress events, and a computed result:

Plan: {
  steps: [
    { id: 's1', description: 'Load numbers [1,2,3,4]' },
    { id: 's2', description: 'Compute average' }
  ],
  rationale: 'synthetic'
}
Progress 0 s1
Progress 50 s2
Progress 100 done
Results: { s1: [ 1, 2, 3, 4 ], s2: 2.5 }

After changing the task to [10,20,30], the result should be:

Results: { s1: [ 10, 20, 30 ], s2: 20 }

If you extend the planner, keep the executor deterministic and testable.

Use this flow as the acceptance model for the lab. The planner may choose the steps, but execution remains bounded, traceable, and replayable.

sequenceDiagram participant User participant Planner participant Executor participant State participant Trace User->>Planner: Submit task Planner-->>Executor: Plan with step IDs loop For each bounded step Executor->>Executor: Run deterministic operation Executor->>State: Store result or error Executor->>Trace: Record progress and checkpoint end alt All steps complete Executor-->>User: Final result and stop reason else Step fails Executor->>State: Store structured failure Executor-->>User: Failure result with replay state end

Guided Exercises

Use these exercises to turn the happy path into a reviewable loop trace.

Exercise Time What To Do Evidence To Save
Baseline plan trace 10 min Run npm run plan:test and the CLI command for [1,2,3,4]. Plan steps, progress events, result map, and stop signal.
Input-change trace 8 min Run the CLI command for [10,20,30]. The changed s1 input and changed s2 average.
Unsupported-step trace 8 min Inspect the negative cases in planning-pattern/typescript/test/planning.spec.ts. A structured unsupported_step failure instead of null.
Missing-input trace 8 min Inspect the test case that runs Compute average without loading numbers first. A structured missing_numbers failure.
Production stop rule 10 min Write the stop reason you would expose to a caller. success, unsupported_step, missing_numbers, or budget_exhausted.
flowchart TD A["Task request"] --> B["Planner returns step IDs"] B --> C{"Every step supported?"} C -->|"yes"| D["Executor writes result map"] C -->|"no"| E["Structured failure result"] D --> F{"Stop reason known?"} E --> G["Replay failure from plan and trace"] F -->|"yes"| H["Lab pass"] F -->|"no"| I["Add stop reason before production"]

Stop-Condition Exercise

The executor now returns a failure object for unsupported or malformed work. Use this as the model for production stop conditions:

await executePlan([
  { id: "s9", description: "Send refund directly" }
]);

Expected evidence:

{
  "s9": {
    "status": "failed",
    "error_type": "unsupported_step",
    "step_id": "s9",
    "description": "Send refund directly"
  }
}

This failure is valuable because it is replayable. A production loop should never make the reviewer infer whether a plan stopped because it finished, failed validation, exhausted budget, or hit policy.

Lab Review Gate

Before moving on, verify the control loop:

Check Evidence
Planning is separate from execution planner.ts produces steps; executor.ts runs bounded operations.
Steps are identifiable Each step can be named, traced, and connected to a result.
The executor stays deterministic The same supported plan produces the same result.
Failure has a surface Unsupported or malformed tasks can return a structured failure path.
The stop condition is explicit The lab can explain why the run ended.

Record the plan shape, result map, and one failure path in the lab completion worksheet.

Production Extension

Add loop controls before using this pattern in production:

  • maximum steps
  • maximum retries
  • stop reason
  • checkpoint after each step
  • structured error result
  • human review for high-risk steps

Planning is useful only when execution is bounded and inspectable.

Production Bridge

Use this table when adapting the lab to a product workflow:

Lab Concept Production Version
Plan object Versioned plan schema with owner and risk class.
Step IDs Workflow step IDs with trace spans and retry policy.
Result map Durable state with checkpoint, status, and error class.
Deterministic executor Tool or workflow node with timeout, idempotency, and policy gate.
CLI run Request envelope with actor, tenant, budget, trace ID, and stop reason.

The first production milestone is a replayable plan. If a failure cannot be replayed from state and trace, the loop is still a demo.

Cross-Framework Mapping

  • In LangGraph, planning and execution can be separate nodes connected through shared graph state.
  • In Mastra AI, the same split can appear as a workflow that coordinates agent and tool steps.
  • In AutoGen-style systems, a manager agent may propose a plan while executor functions perform bounded work.
  • In CrewAI, a flow can own the sequence while crews or agents handle delegated tasks.