Section guide

Evaluation, Security, and Trust

This section covers evaluation, threat modeling, sandboxing, policy trust, and UX controls for agent systems that affect users.

Use when behavior quality, risk, safety, human trust, or release gates matter.

Start with Evaluation-Driven Agent Development
Finish with Agent UX and Human Trust

Reader outcome

Leave with an eval set, a threat model, a sandbox boundary, and trust controls that can block a release.

Reusable artifact

An eval set, threat model, sandbox boundary, and release-blocking trust controls.

Reading order

4 chapters
  1. Evaluation-Driven Agent Development
    Guide · Advanced · 10-20 min read · Reviewer · Security

    Evaluation-Driven Agent Development shows how to make evals part of design, implementation, release, and incident learning.

  2. Agent Threat Model
    Guide · Advanced · 10-20 min read · Reviewer · Security

    The threat model chapter identifies attack paths specific to agents: prompt injection, unsafe tools, memory poisoning, data leaks, and overbroad authority.

  3. Agent Security and Sandboxing
    Guide · Advanced · 10-20 min read · Reviewer · Security

    Security and Sandboxing explains how to restrict execution, isolate tools, protect data, and keep model proposals behind policy gates.

  4. Agent UX and Human Trust
    Guide · Advanced · 10-20 min read · Reviewer · Security

    Agent UX and Human Trust shows how to present capability, uncertainty, approvals, reversibility, and evidence so users can judge the system.