Section guide
Evaluation, Security, and Trust
Evaluation, threat modeling, sandboxing, policy trust, and UX controls for systems that affect users.
Use when behavior quality, risk, safety, human trust, or release gates matter.
Leave with an eval set, a threat model, a sandbox boundary, and trust controls that can block release.
Reading order
4 chapters
- 01 Evaluation-Driven Agent Development Guide · Advanced 10-20 min read Reviewer · Security
Evaluation-Driven Agent Development shows how to make evals part of design, implementation, release, and incident learning.
- 02 Agent Threat Model Guide · Advanced 10-20 min read Reviewer · Security
The threat model chapter identifies attack paths specific to agents: prompt injection, unsafe tools, memory poisoning, data leaks, and overbroad authority.
- 03 Agent Security and Sandboxing Guide · Advanced 10-20 min read Reviewer · Security
Security and Sandboxing explains how to restrict execution, isolate tools, protect data, and keep model proposals behind policy gates.
- 04 Agent UX and Human Trust Guide · Advanced 10-20 min read Reviewer · Security
Agent UX and Human Trust shows how to present capability, uncertainty, approvals, reversibility, and evidence so users can judge the system.