Work / AI · Agents / AgentTrust SDK

No. 18 · OWASP ASI07 reference

Five lines.
OWASP ASI07
solved.

A single compromised agent can poison 87% of downstream decisions within hours. Existing guardrails protect individual agents. None verify that Agent A's output to Agent B is safe to act upon. AgentTrust is the SDK that does.

main.py
from agenttrust import TrustHooks

hooks = TrustHooks(
    policy="policies/india-dpdp.yaml",
    verification_budget=1000,
    audit_path="./audit/",
    judge_model="claude-haiku-4-5",
)

result = hooks.pre_tool_use(
    "Bash", "rm -rf /",
    agent_id="code_writer",
)

print(result.decision)    # "escalate"
print(result.vis_score)   # 1.4 (high risk)
VIS > 0.9 · Tier 4 · Human escalation
5 · Lines of Python to integrate
<50 ms · Median verification overhead
70-80% · Actions resolved at policy-only tier
87% · Reduction in harmful actions (TrustBench)
4 · Policy templates included

Act I · This already happened

Three rogue agents.
Three real outages.

Multi-agent systems are failing catastrophically. OWASP ranks this failure mode as ASI07, Insecure Inter-Agent Communication, in the 2026 Top 10 for Agentic Applications. These incidents are not hypothetical.

Amazon Kiro

An agent deleted an entire AWS environment, causing a 13-hour outage.

Replit

An agent wiped 1,206 executive records during a code freeze, then lied about rollback capabilities.

Meta · Sev-1

A rogue agent autonomously posted unauthorised content, exposing sensitive internal data.

Act II · The Idea

Verify proportional to risk.
Spend the budget where it matters.

Verifying every action at full depth makes a system unusable. Verifying nothing leaves the cascade attack surface wide open. Risk-Proportional Verification spends compute on the dangerous calls and waves the cheap ones through.

verification investment score · vis(action, agent)
VIS = (ConsequenceSeverity / TrustScore)
      × (BudgetRemaining / BudgetTotal)

# ConsequenceSeverity in [0, 1]
#   file read       = 0.1
#   API call        = 0.5
#   data deletion   = 1.0
#
# TrustScore in (0, 1]   earned through verified good behavior
# Budget ratio guards against cost explosion in long workflows
VIS range   Tier   Verification action               Latency      Cost / check
< 0.2       1      Policy check only (YAML rules)    5-25 ms      ~$0
0.2 - 0.6   2      + Cleanlab TLM trust score        200-500 ms   ~$0.001
0.6 - 0.9   3      + LLM-as-judge (Haiku/Flash)      1-3 s        ~$0.005
> 0.9       4      Human escalation (blocks)         N/A          N/A

In typical workflows, 70-80% of actions resolve at Tier 1. Only 5-15% reach the expensive LLM-as-judge tier. The system stays cheap and fast on the boring stuff, careful on the dangerous stuff.
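As a sketch, the VIS formula and the tier cut-offs above map directly to a few lines of Python. The function and variable names here are illustrative, not the AgentTrust API:

```python
# Risk-Proportional Verification sketch, assuming the VIS formula and
# tier cut-offs shown above. Illustrative names, not the AgentTrust API.

def vis(severity: float, trust: float,
        budget_remaining: float, budget_total: float) -> float:
    """VIS = (ConsequenceSeverity / TrustScore) * (BudgetRemaining / BudgetTotal)."""
    return (severity / trust) * (budget_remaining / budget_total)

def tier(score: float) -> int:
    """Map a VIS score to a verification tier per the table above."""
    if score < 0.2:
        return 1   # policy check only
    if score < 0.6:
        return 2   # + TLM trust score
    if score <= 0.9:
        return 3   # + LLM-as-judge
    return 4       # human escalation

# A low-risk file read by a trusted agent stays cheap...
print(tier(vis(0.1, 0.9, 1000, 1000)))   # -> 1
# ...while data deletion by a neutral-trust agent escalates to a human.
print(tier(vis(1.0, 0.5, 1000, 1000)))   # -> 4
```

Note how the budget ratio damps VIS as a long workflow spends down its verification budget, pulling borderline actions into cheaper tiers rather than letting cost explode.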

Act III · Trust as a credit score

Trust is earned, not granted.

Each agent starts at a neutral 0.5. Verified safe actions raise trust. A violation drops it in proportion to history, not catastrophically. After about 60 consecutive safe actions, trust converges toward 1.0.

trust update rule · per action
T(n+1) = 0.95 × T(n) + 0.05 × V(n)

# V(n) = 1 if verified safe
# V(n) = 0 if violation detected
#
# Single violation from trust 0.9 -> drops to 0.855
# Proportional to history. Not catastrophic.
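The update rule above is a plain exponential moving average, so both claims (a single violation is a nudge, and ~60 safe actions converge toward 1.0) can be checked in a few lines. Illustrative sketch, not the AgentTrust API:

```python
# Sketch of the exponential-moving-average trust update above.
# Illustrative only, not the AgentTrust API.

def update_trust(trust: float, verified_safe: bool) -> float:
    """T(n+1) = 0.95 * T(n) + 0.05 * V(n)."""
    return 0.95 * trust + 0.05 * (1.0 if verified_safe else 0.0)

# A single violation from 0.91 is a nudge, not a cliff:
print(round(update_trust(0.91, False), 4))   # -> 0.8645

# From the neutral start of 0.5, sixty verified-safe actions
# converge trust close to 1.0:
t = 0.5
for _ in range(60):
    t = update_trust(t, True)
print(round(t, 3))   # -> 0.977
```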

An audit log entry

audit/2026-05-04T14-22.jsonl · append-only · sha-256 chained
{
  "ts": "2026-05-04T14:22:08.412Z",
  "agent_id": "code_writer",
  "tool": "Bash",
  "input": "rm -rf /",
  "vis_score": 1.4,
  "tier": 4,
  "decision": "escalate",
  "trust_before": 0.91,
  "trust_after": 0.86,
  "policy": "india-dpdp.yaml",
  "latency_ms": 23,
  "prev_hash": "a3f8c…"
}
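The `prev_hash` field is what makes the log tamper-evident: each entry commits to the hash of the one before it, so editing history breaks the chain. A minimal sketch of that mechanism, assuming hypothetical helper names (not the AgentTrust on-disk format):

```python
# Minimal sketch of an append-only, SHA-256-chained audit log like the
# JSONL entry above. Helper names are hypothetical, not the AgentTrust API.
import hashlib
import json

GENESIS = "0" * 64

def append_entry(log: list, entry: dict) -> dict:
    """Chain a new entry to the previous one via its SHA-256 hash."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {**entry, "prev_hash": prev}
    payload = json.dumps(entry, sort_keys=True)       # hash excludes itself
    entry["hash"] = hashlib.sha256(payload.encode()).hexdigest()
    log.append(entry)
    return entry

def verify_chain(log: list) -> bool:
    """Recompute every hash; any edit to history breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev:
            return False
        body = {k: v for k, v in entry.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True)
        if hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, {"agent_id": "code_writer", "decision": "escalate"})
append_entry(log, {"agent_id": "code_writer", "decision": "allow"})
print(verify_chain(log))        # -> True
log[0]["decision"] = "allow"    # tamper with history...
print(verify_chain(log))        # -> False
```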

Act IV · Compliance, shipped

Four YAML policies.
India-ready from day one.

Policies are YAML files that non-developers can edit. Custom policies define consequence severity, PII patterns, rate limits, and per-tool rules. Two ship with India compliance baked in.
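To make this concrete, here is a sketch of the shape a parsed policy might take once loaded (e.g. via `yaml.safe_load`) and how it could drive ConsequenceSeverity. The schema and field names are hypothetical, not the AgentTrust policy format:

```python
# Hypothetical shape of a parsed policy YAML. Schema and field names
# are illustrative, not the AgentTrust policy format.
POLICY = {
    "severity": {            # ConsequenceSeverity per tool, in [0, 1]
        "Read": 0.1,
        "WebFetch": 0.5,
        "Bash": 1.0,
    },
    "pii_patterns": [r"\b\d{4}\s\d{4}\s\d{4}\b"],  # e.g. an Aadhaar-style pattern
    "rate_limits": {"Bash": 10},                    # calls per minute
}

def severity_for(tool: str, default: float = 0.5) -> float:
    """Look up ConsequenceSeverity for a tool, falling back to a default."""
    return POLICY["severity"].get(tool, default)

print(severity_for("Bash"))        # -> 1.0
print(severity_for("SomeTool"))    # -> 0.5
```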

policies/default.yaml

Default

Sensible defaults for general-purpose multi-agent systems. The starting policy.

policies/india-dpdp.yaml

India DPDP Act 2023

Aadhaar, PAN, UPI, Indian phone PII detection. Data fiduciary duty (Section 8). Audit retention with erasure exemptions (Section 11). Auto-generated breach notification reports.

policies/india-it-rules-2026.yaml

India IT Rules 2026

SGI synthetic content labelling. 3-hour takedown compliance tracking. Traceability audit trails for AI agent actions. Safe-harbour documentation.

policies/owasp-asi07.yaml

OWASP ASI07 hardening

Inter-agent communication security. Authentication. Schema validation. The reference implementation of the OWASP control.

If your agents talk to each other, they need a referee.

I design budget-aware trust systems for multi-agent stacks. OWASP-aligned, India-compliant, drop-in for the Claude Agent SDK. Five lines and you have an audit trail.