No. 18 · OWASP ASI07 reference
Five lines.
OWASP ASI07 solved.
A single compromised agent can poison 87% of downstream decisions within hours. Existing guardrails protect individual agents. None verify that Agent A's output to Agent B is safe to act upon. AgentTrust is the SDK that does.
    from agenttrust import TrustHooks

    hooks = TrustHooks(
        policy="policies/india-dpdp.yaml",
        verification_budget=1000,
        audit_path="./audit/",
        judge_model="claude-haiku-4-5",
    )

    result = hooks.pre_tool_use(
        "Bash",
        "rm -rf /",
        agent_id="code_writer",
    )

    print(result.decision)   # "escalate"
    print(result.vis_score)  # 1.4 (high risk)
Act I · This already happened
Three rogue agents.
Three real outages.
Multi-agent systems are failing catastrophically. OWASP ranks this as ASI07 Insecure Inter-Agent Communication in the 2026 Top 10 for Agentic Applications. These are not hypothetical.
An agent deleted an entire AWS environment, causing a 13-hour outage.
An agent wiped 1,206 executive records during a code freeze, then lied about rollback capabilities.
A rogue agent autonomously posted unauthorised content, exposing sensitive internal data.
Act II · The Idea
Verify proportional to risk.
Spend the budget where it matters.
Verifying every action at full depth makes a system unusable. Verifying nothing leaves the cascade attack surface wide open. Risk-Proportional Verification spends compute on the dangerous calls and waves the cheap ones through.
    VIS = (ConsequenceSeverity / TrustScore) × (BudgetRemaining / BudgetTotal)

    # ConsequenceSeverity in [0, 1]
    #   file read     = 0.1
    #   API call      = 0.5
    #   data deletion = 1.0
    #
    # TrustScore in (0, 1], earned through verified good behavior
    # Budget ratio guards against cost explosion in long workflows
| VIS range | Tier | Verification action | Latency | Cost / check |
|---|---|---|---|---|
| < 0.2 | 1 | Policy check only (YAML rules) | 5-25ms | ~$0 |
| 0.2 - 0.6 | 2 | + Cleanlab TLM trust score | 200-500ms | ~$0.001 |
| 0.6 - 0.9 | 3 | + LLM-as-judge (Haiku/Flash) | 1-3s | ~$0.005 |
| > 0.9 | 4 | + Human escalation (blocks) | N/A | N/A |
In typical workflows, 70-80% of actions resolve at Tier 1. Only 5-15% reach the expensive LLM-as-judge tier. The system stays cheap and fast on the boring stuff, careful on the dangerous stuff.
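The scoring and tiering above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not AgentTrust's actual code: the names `vis` and `tier_for` are hypothetical, and the table does not say which tier the exact boundary values (0.2, 0.6, 0.9) belong to, so the sketch assigns each boundary to the higher tier.

```python
# Upper bounds for Tiers 1-3, from the table above. Treating each bound as
# exclusive (so 0.2 lands in Tier 2, and so on) is an assumption of this sketch.
TIER_BOUNDS = [(0.2, 1), (0.6, 2), (0.9, 3)]


def vis(severity: float, trust: float,
        budget_remaining: float, budget_total: float) -> float:
    """Score = (severity / trust) * (budget_remaining / budget_total)."""
    if not 0.0 <= severity <= 1.0:
        raise ValueError("severity must be in [0, 1]")
    if not 0.0 < trust <= 1.0:
        raise ValueError("trust must be in (0, 1]")
    if budget_total <= 0 or not 0 <= budget_remaining <= budget_total:
        raise ValueError("budget must satisfy 0 <= remaining <= total, total > 0")
    return (severity / trust) * (budget_remaining / budget_total)


def tier_for(score: float) -> int:
    """Map a VIS score to a verification tier (1 = cheapest, 4 = human)."""
    for upper, tier in TIER_BOUNDS:
        if score < upper:
            return tier
    return 4


# A trusted agent reading a file: 0.1 / 0.8 = 0.125, policy check only.
print(tier_for(vis(0.1, 0.8, 1000, 1000)))
# A neutral agent deleting data with 70% of the budget left: 2.0 * 0.7 = 1.4.
print(tier_for(vis(1.0, 0.5, 700, 1000)))
```

Because the budget ratio multiplies the score, the same action scores lower as the budget drains, which is how the formula caps verification spend in long workflows.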
Act III · Trust as a credit score
Trust is earned, not granted.
Each agent starts at a neutral 0.5. Verified safe actions raise trust. A violation lowers it in proportion to the current score (a 5% cut) rather than resetting it to zero. After about 60 consecutive safe actions, an agent that started at 0.5 reaches roughly 0.98 and keeps converging toward 1.0.
    T(n+1) = 0.95 × T(n) + 0.05 × V(n)

    # V(n) = 1 if verified safe
    # V(n) = 0 if violation detected
    #
    # Single violation from trust 0.9 -> drops to 0.855
    # Proportional to history. Not catastrophic.
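The update rule is an exponential moving average, and a short simulation makes the numbers concrete. The function names below are hypothetical; only the 0.95 / 0.05 weights and the 0.5 starting value come from the text.

```python
KEEP = 0.95   # weight on the previous trust value, from the update rule
LEARN = 0.05  # weight on the latest verification outcome


def update_trust(trust: float, verified_safe: bool) -> float:
    """One step of T(n+1) = 0.95 * T(n) + 0.05 * V(n)."""
    outcome = 1.0 if verified_safe else 0.0
    return KEEP * trust + LEARN * outcome


def trust_after_safe_streak(start: float, steps: int) -> float:
    """Trust after `steps` consecutive verified-safe actions."""
    trust = start
    for _ in range(steps):
        trust = update_trust(trust, True)
    return trust


# A single violation at trust 0.9 costs 5% of the current score.
print(round(update_trust(0.9, False), 3))            # 0.855
# Sixty safe actions from the neutral start.
print(round(trust_after_safe_streak(0.5, 60), 3))    # 0.977
```

In closed form, a safe streak from 0.5 gives T(n) = 1 - 0.5 × 0.95^n, so trust approaches 1.0 but never reaches it exactly.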
An audit log entry
{
"ts": "2026-05-04T14:22:08.412Z",
"agent_id": "code_writer",
"tool": "Bash",
"input": "rm -rf /",
"vis_score": 1.4,
"tier": 4,
"decision": "escalate",
"trust_before": 0.91,
"trust_after": 0.86,
"policy": "india-dpdp.yaml",
"latency_ms": 23,
"prev_hash": "a3f8c…"
}
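The `prev_hash` field suggests the log is hash-chained, with each entry committing to the one before it. The source does not describe the actual scheme, so the sketch below is one plausible construction under assumptions: SHA-256 over canonically serialised JSON, an all-zero genesis value, and hypothetical helper names.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first entry in a log


def entry_hash(entry: dict) -> str:
    """SHA-256 of the entry serialised with sorted keys and no whitespace."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(log: list, fields: dict) -> dict:
    """Append an entry whose prev_hash commits to the previous entry."""
    prev = entry_hash(log[-1]) if log else GENESIS
    entry = {**fields, "prev_hash": prev}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Return True if every entry's prev_hash matches its predecessor."""
    expected = GENESIS
    for entry in log:
        if entry.get("prev_hash") != expected:
            return False
        expected = entry_hash(entry)
    return True


log = []
append_entry(log, {"agent_id": "code_writer", "tool": "Bash", "decision": "escalate"})
append_entry(log, {"agent_id": "code_writer", "tool": "Read", "decision": "allow"})
print(verify_chain(log))           # True
log[0]["decision"] = "allow"       # tamper with history
print(verify_chain(log))           # False
```

A chain like this detects edits to any entry that has a successor. Protecting the newest entry also requires storing the latest hash somewhere the writer cannot modify.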
Act IV · Compliance, shipped
Four YAML policies.
India-ready from day one.
Policies are YAML files that non-developers can edit. Custom policies define severity levels, PII patterns, rate limits, and per-tool rules. Two of the four ship with India compliance baked in.
policies/default.yaml
Default
Sensible defaults for general-purpose multi-agent systems. The starting policy.
policies/india-dpdp.yaml
India DPDP Act 2023
PII detection for Aadhaar, PAN, UPI IDs, and Indian phone numbers. Data fiduciary duty (Section 8). Audit retention with erasure exemptions (Section 11). Auto-generated breach notification reports.
policies/india-it-rules-2026.yaml
India IT Rules 2026
Labelling for synthetically generated information (SGI). 3-hour takedown compliance tracking. Traceability audit trails for AI agent actions. Safe-harbour documentation.
policies/owasp-asi07.yaml
OWASP ASI07 hardening
Inter-agent communication security. Authentication. Schema validation. The reference implementation of the OWASP control.
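To give a feel for the PII detection that the India DPDP policy describes, here is a minimal pattern-matching sketch. The regexes are assumptions based on the public formats of these identifiers, not the patterns shipped in `policies/india-dpdp.yaml`, and `detect_pii` is a hypothetical helper. The sketch checks format only; for instance, it does not validate the Aadhaar checksum.

```python
import re

PII_PATTERNS = {
    # Aadhaar: 12 digits, first digit 2-9, optionally grouped 4-4-4.
    "aadhaar": re.compile(r"\b[2-9]\d{3}[ -]?\d{4}[ -]?\d{4}\b"),
    # PAN: five uppercase letters, four digits, one uppercase letter.
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),
    # UPI ID: handle@provider; the lookahead skips ordinary email domains.
    "upi": re.compile(r"\b[\w.\-]{2,256}@[A-Za-z]{2,64}\b(?!\.[A-Za-z])"),
    # Indian mobile: optional +91 or 0 prefix, then 10 digits starting 6-9.
    "phone_in": re.compile(r"(?<!\d)(?:\+91[ -]?|0)?[6-9]\d{9}(?!\d)"),
}


def detect_pii(text: str) -> dict:
    """Return {label: [matched strings]} for every pattern that fires."""
    hits = {}
    for label, pattern in PII_PATTERNS.items():
        found = [m.group(0) for m in pattern.finditer(text)]
        if found:
            hits[label] = found
    return hits


sample = ("Pay ravi.k@okaxis, PAN ABCDE1234F, "
          "Aadhaar 2345 6789 0123, call +91 9876543210.")
for label, values in detect_pii(sample).items():
    print(label, values)
```

Format-only matching will produce false positives (any 12-digit number resembles an Aadhaar number), so a production policy would likely pair patterns like these with context rules or checksum validation.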
If your agents talk to each other, they need a referee.
I design budget-aware trust systems for multi-agent stacks. OWASP-aligned, India-compliant, drop-in for the Claude Agent SDK. Five lines and you have an audit trail.