Lab 01 builds the smallest useful tool boundary: validate a proposed call, execute through code, label trust, and return structured results.

Section
Hands-On Labs
Type
Lab
Level
Hands-on
Read
4 min
Effort
45-90 min lab
BuilderStudent

Lab 01 - Build a Tool-Using Agent

Download the lab completion worksheet and lab production readiness worksheet before you start.

Objective

Build the smallest useful tool boundary: a runtime receives a proposed tool call, validates the route and arguments, executes the tool behind code, marks trusted and untrusted outputs, and returns a structured result.

What You Will Use

  • Language: TypeScript
  • Framework/runtime: minimal custom runtime with an AutoGen-style example path
  • Framework-agnostic lesson: the model may propose tool use, but software owns validation, execution, and error handling.
  • Pattern chapter: Tool Use
  • Source folder: tool-using-agent-pattern/
  • Download: tool-use.zip
  • Main file: tool-using-agent-pattern/typescript/src/tool_runtime.ts
  • Demo file: tool-using-agent-pattern/typescript/src/run_demo.ts

Exercise Time Budget

These estimates assume dependencies are already installed.

Exercise Time Output
Setup and baseline run 5 min Passing tool-runtime command output.
Inspect the tool boundary 8 min Notes on route validation, trusted data, and untrusted policy text.
Change one route or tool case 8-10 min A visible success, denial, or controlled error.
Review the production gap 5-7 min One missing control to carry into the readiness worksheet.

Setup

From the repository root:

npm install

This lab runs without a model key. It exercises the tool runtime directly so you can inspect the software boundary before adding model proposals.

Run It

npm run tool-using-agent

Expected output:

{
  "order": {
    "status": "ok",
    "tool": "read_order",
    "trust": "trusted_system"
  },
  "policy": {
    "status": "ok",
    "tool": "search_refund_policy",
    "trust": "untrusted_content"
  }
}

The actual output also includes tool data and evidenceRef values. Record those fields in the lab completion worksheet.

Inspect The Code

Open tool-using-agent-pattern/typescript/src/tool_runtime.ts and find:

  • ToolName: the exposed tool names.
  • ToolContext: the runtime context for route, approvals, timeouts, and attempts.
  • disclosedTools(route): the route-based tool disclosure boundary.
  • validateProposal(proposal, context): the validation boundary.
  • execute(proposal, context): the execution boundary.

Then open tool-using-agent-pattern/typescript/src/run_demo.ts and find the two demo proposals: read_order and search_refund_policy.

The important design point is that a model may propose a tool call, but code owns disclosure, validation, execution, trust marking, and evidence references.

Change One Thing

Change the route in run_demo.ts from:

route: "refund_investigation",

to:

route: "order_status",

Run the lab again.

Expected Result

The read_order call should still succeed because order_status may use that tool. The search_refund_policy call should be denied because that route exposes only read_order.

Record the denial as the intentional failure path. It proves that tool availability depends on the route, not on what the model asks for.

Use this flow as the acceptance model for the lab. The model proposes a call, but the runtime owns disclosure, validation, execution, trust marking, and trace evidence.

sequenceDiagram participant Model participant Runtime participant Disclosure as Tool disclosure participant Policy as Validation policy participant Tool participant Trace Model->>Runtime: Propose tool name and arguments Runtime->>Disclosure: Resolve tools allowed for route Disclosure-->>Runtime: Allowed tool list Runtime->>Policy: Validate route, tool, args, attempts alt Tool call allowed Runtime->>Tool: Execute behind code boundary Tool-->>Runtime: Data plus trust label Runtime->>Trace: Record evidenceRef and result Runtime-->>Model: Structured tool result else Tool call denied Runtime->>Trace: Record structured denial Runtime-->>Model: Structured policy error end

Lab Review Gate

Before moving on, verify the boundary instead of only checking the happy path:

Check Evidence
Tool disclosure is narrow order_status exposes read_order; refund_investigation exposes refund-investigation tools.
Tool execution is code-owned ToolRuntime executes read_order and search_refund_policy; model text does not.
Invalid route/tool use is controlled A hidden or disallowed tool returns a structured denial.
Trust is explicit Order data is trusted_system; policy text is untrusted_content.
Evidence is traceable Successful tool results include evidenceRef.
The production gap is visible The lab names missing authorization integration, trace export, dashboards, deployment, and incident handling.

Record the command, output, and failure behavior in the lab completion worksheet.

Production Extension

Replace prefix routing with a typed tool request:

  • tool name
  • JSON schema for arguments
  • authorization check
  • timeout
  • structured success or error result
  • trace ID for the run

Do not give the model broad access to arbitrary functions. Expose narrow tools with explicit permissions.

Production Bridge

Use this table when adapting the lab to a real product tool:

Lab Concept Production Version
ToolName union Versioned tool manifest with owner, permission, timeout, and side-effect class.
disclosedTools(route) Route, actor, tenant, and risk-based tool disclosure.
validateProposal(...) Schema, policy, approval, budget, and state validation.
trust field Data classification that separates trusted system data from untrusted content.
evidenceRef Trace-linked evidence reference for audit, replay, and evals.
Demo console output Structured response plus trace event, eval record, and dashboard signal.

The first production milestone is not adding more tools. It is proving that one tool can be called, denied, traced, and tested safely.

Cross-Framework Mapping

  • In LangGraph, this boundary usually appears as a tool node or callable bound to graph state.
  • In Mastra AI, it maps to a typed tool exposed through an agent or workflow.
  • In AutoGen-style systems, it maps to a function/tool call proposed by an assistant and executed by the runtime.
  • In CrewAI, it maps to tools assigned to an agent role, with the flow or crew configuration limiting access.