Capstone - Research RAG Agent
Build a research agent that answers from approved sources, cites evidence, refuses unsupported claims, and records enough trace data to debug retrieval failures.
This capstone is evidence heavy. The main risk is not that the model cannot write an answer. The main risk is that the answer looks convincing while using stale, forbidden, or missing evidence.
Problem
Product, support, and engineering teams often need answers from internal documents. A research agent can reduce search time, but it must respect source access, freshness, citations, and memory rules.
Non-Goals
- Do not answer from unapproved sources.
- Do not treat retrieved text as trusted instructions.
- Do not store private facts in memory without retention and correction rules.
- Do not cite documents the agent did not actually use.
Pattern Composition
| Concern | Pattern |
|---|---|
| context packet | Context Engineering |
| retrieval | Semantic Recall and RAG |
| evidence boundary | Knowledge-Bound Agents |
| memory | Memory-Augmented Agent |
| policy | Policy Enforcement |
| quality | Observability and Evals |
Architecture
Read this diagram as an evidence boundary. Retrieval can find candidates, but only source filtering, context assembly, and grounding evals decide what the answer may cite.
Runnable Assets
Run the deterministic capstone implementation:
npm run capstones:demo
npm run capstones:test
Inspect:
capstone-projects-runtime/typescript/src/capstones.tscapstone-projects-runtime/typescript/test/capstones.spec.ts
Downloadable evidence:
- Sample trace JSON
- Sample eval report
- Capstone review scorecard
- Framework selection ADR template
- Production readiness worksheet
Expected runtime signal:
research-rag-agent: pass
stop: answered_with_citation
trace events: 6
The test suite treats these as release evidence:
| Evidence | Runtime Check |
|---|---|
| The context packet includes the current approved source | current_source_used |
| The context packet excludes the stale source | stale_source_rejected |
| The context packet excludes the forbidden source | forbidden_source_omitted |
| The answer cites the source it used | citation_faithfulness |
Native example:
native-framework-examples/langgraph-research-rag/(download)
Context Packet
| Field | Required Rule |
|---|---|
question |
Store normalized question and original user wording separately. |
actor |
Include tenant, role, and source access scope. |
sources |
Include source ID, title, freshness, ACL result, and citation label. |
evidence |
Include only approved passages needed for the answer. |
memory |
Include user or project memory only when policy allows it. |
instructions |
Separate system instructions from retrieved content. |
omissions |
Record sources omitted because of access, freshness, or relevance. |
Capstone Review Gate
Before treating this capstone as production-grade, verify the evidence boundary:
| Check | Evidence |
|---|---|
| Source access runs before context assembly | Forbidden sources are omitted before answer synthesis. |
| Freshness is enforced | Stale sources cannot override current approved sources. |
| Citations are faithful | Every cited claim maps to evidence in the context packet. |
| Missing evidence fails closed | The agent refuses or escalates instead of guessing. |
| Memory writes are governed | Durable memory requires source IDs, confidence, retention, and correction path. |
Record the result in the capstone review scorecard and production readiness worksheet.
Production Bridge
Use this table when turning the capstone into a service:
| Capstone Artifact | Production Version |
|---|---|
| Context packet | Versioned evidence contract with ACLs, freshness, omitted-source notes, and token budget. |
| Source filter | Access-control service with audit trail and policy version. |
| Citation eval | Blocking release gate for unsupported, stale, forbidden, or missing evidence. |
| Memory rule | Governed write path with retention, deletion, correction, consent, and tenant scope. |
| Runbook | Fallback that disables synthesis and returns approved source lists only. |
The first production milestone is an answer path that can explain what it cited, what it omitted, and why.
Native Framework Mapping
| Framework | Best Mapping |
|---|---|
| LangGraph | Graph nodes for classify question, retrieve, filter, answer, cite, evaluate, and escalate. Stores handle long-term memory. |
| Mastra | Agent handles answer synthesis; workflow owns retrieval, source filtering, evals, memory policy, and trace export. |
| AutoGen | Researcher and reviewer agents can collaborate, but source access and citation checks stay in software. |
| CrewAI | Flow owns evidence packet and acceptance; crew can split research and review tasks. |
| Mini-runtime | Direct context builder plus retrieval client, source policy function, answer validator, and trace events. |
Trace Example
{
"trace_id": "tr_research_2077",
"question": "Can the support refund agent issue money?",
"events": [
{ "span": "policy", "decision": "allow", "scope": "support_docs" },
{ "span": "retrieval", "query": "support refund agent issue money", "top_k": 5 },
{ "span": "source_filter", "allowed": 1, "stale": 1, "forbidden": 1 },
{ "span": "context_packet", "evidence_refs": ["refund-policy-v4"] },
{ "span": "model", "prompt": "research-answer-v1", "status": "succeeded" },
{ "span": "eval", "case_id": "research_rag_release_gate", "status": "pass" }
]
}
Eval Report Example
| Eval | What It Checks | Blocking Rule |
|---|---|---|
current_source_used |
refund-policy-v4 reaches the context packet |
no answer without current approved evidence |
stale_source_rejected |
refund-policy-v2 stays out of the context packet |
no stale policy answer |
forbidden_source_omitted |
finance-private-notes stays out of the context packet |
no forbidden source in context |
citation_faithfulness |
cited sources support the answer | no unsupported citation |
| missing evidence refusal | agent refuses when approved evidence is absent | no fabricated answer |
| memory write | memory writes follow retention rules | no sensitive memory write |
Example fixture:
{
"case_id": "stale_refund_policy_rejected",
"question": "Can the agent issue refunds directly?",
"retrieved_sources": ["refund-policy-v2", "refund-policy-v4"],
"expected": {
"must_cite": ["refund-policy-v4"],
"must_not_cite": ["refund-policy-v2"],
"answer_contains": "may draft but not issue refunds"
}
}
ADR Example
# ADR-022: Research agent answers only from approved current sources
## Status
Accepted
## Decision
The research agent may answer only from sources that pass access control, freshness, and citation checks. If approved evidence is missing, stale, or conflicting, the agent escalates instead of guessing.
## Rollback
Disable answer synthesis and keep retrieval-only search results available while the source filter or citation evaluator is repaired.
Runbook Example
service: research-rag-agent
owner: knowledge-platform
kill switch: disable answer synthesis
fallback: return ranked source list only
trace dashboard: knowledge/research-agent/traces
eval suite: evals/research-rag
incident trigger: unsupported answer, forbidden source exposure, stale source citation
post-incident action: add retrieval fixture and citation eval before re-enable
Release Checklist
- Source ACLs run before context assembly.
- Retrieved content is separated from instructions.
- Answers cite only evidence in the context packet.
- Missing evidence produces refusal or escalation.
- Memory writes have retention, deletion, and correction rules.
- Evals cover stale, forbidden, missing, and conflicting sources.
Native Slice Completion Standard
The native LangGraph slice is complete when it proves these outcomes:
| Requirement | Evidence |
|---|---|
| source access runs before retrieval output enters context | check_access node and policy trace |
| stale and forbidden sources stay out of the evidence packet | filter_sources node and omitted source list |
| answer cites only approved current evidence | answer_with_citations output and citation eval |
| missing approved evidence escalates | conditional edge to escalate |
| release gate blocks bad grounding | evaluate_answer failures for stale or forbidden citations |
The slice should remain small. Its job is to prove source policy, context assembly, citation faithfulness, and escalation before adding real vector stores or model calls.