Content-inspection engine

The content-inspection engine is the gateway component that reads the payload of each agent action and assesses its risk. It is payload-aware: rather than matching on the tool name alone, it inspects the arguments and content flowing through the action.

Signal families

Each detector contributes one or more signals. Signals are grouped into families:

Secrets — API keys, tokens, private keys, and other credentials that should never leave a trusted boundary.
PII — emails, phone numbers, government IDs, and similar personal data.
Destructive — operations that delete, drop, overwrite, or otherwise cause irreversible change.
Injection — prompt-injection and instruction-override attempts embedded in content the agent is processing.
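
To make the families concrete, here is a minimal sketch of how detectors might emit signals tagged with a family. The `Signal` type, the `DETECTORS` table, the `inspect` function, and the regular expressions are all illustrative assumptions, not the engine's actual detectors or API; a real detector set would be much broader than one pattern per family.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    family: str  # "secrets", "pii", "destructive", or "injection"
    name: str    # hypothetical detector name
    start: int   # start offset of the matched span
    end: int     # end offset of the matched span (exclusive)


# Hypothetical detectors: (family, detector name, pattern).
DETECTORS = [
    ("secrets", "aws_access_key_id", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("pii", "email", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("destructive", "sql_drop", re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE)),
    ("injection", "instruction_override",
     re.compile(r"ignore (?:all )?previous instructions", re.IGNORECASE)),
]


def inspect(text: str) -> list[Signal]:
    """Run every detector over the text and collect the signals it raises."""
    signals = []
    for family, name, pattern in DETECTORS:
        for match in pattern.finditer(text):
            signals.append(Signal(family, name, match.start(), match.end()))
    return signals


if __name__ == "__main__":
    for signal in inspect("email alice@example.com then DROP TABLE users"):
        print(signal.family, signal.name)
```

Each signal records the span it matched, which is what lets a later step rewrite only the sensitive part of the payload.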

From signals to a verdict

Detected signals roll up into a risk score. The score, combined with the policy that applies to the agent and tool involved, produces one of three verdicts:

Verdict	Meaning
`allow`	No material risk detected.
`redact`	Sensitive spans are rewritten before the action proceeds.
`block`	The action is denied and recorded with its signals.
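
The roll-up from signals to a verdict could be sketched as follows. The source does not specify how the score is computed or how policies are expressed, so the family weights, the "highest weight wins" rule, the `Policy` fields, and the thresholds below are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical per-family weights; the real scoring model is not documented here.
FAMILY_WEIGHTS = {"secrets": 0.9, "pii": 0.5, "destructive": 0.8, "injection": 0.7}


@dataclass(frozen=True)
class Policy:
    redact_at: float  # score at or above which redaction is considered
    block_at: float   # score at or above which the action is denied
    # Families whose spans can be rewritten safely (assumption).
    redactable: frozenset = frozenset({"secrets", "pii"})


def risk_score(families: list[str]) -> float:
    """Roll signals up by taking the highest family weight (one possible rule)."""
    return max((FAMILY_WEIGHTS.get(f, 0.0) for f in families), default=0.0)


def verdict(families: list[str], policy: Policy) -> str:
    score = risk_score(families)
    if score >= policy.block_at:
        return "block"
    if score >= policy.redact_at:
        # Redaction only helps if every signal marks a rewritable span;
        # a destructive or injection signal cannot be redacted away.
        if all(f in policy.redactable for f in families):
            return "redact"
        return "block"
    return "allow"


if __name__ == "__main__":
    policy = Policy(redact_at=0.4, block_at=0.85)
    for families in ([], ["pii"], ["secrets"], ["destructive"]):
        print(families, verdict(families, policy))
```

Keeping the thresholds in the policy rather than in the scoring function means the same signals can yield different verdicts for different agents and tools, which matches the description above.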

Write-time redaction

When a payload contains sensitive content but the action is otherwise legitimate, the engine can redact at write time — replacing secrets or PII with placeholders so the action can proceed without leaking the underlying data. The original spans never reach the downstream tool.
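
As a rough illustration of write-time redaction, the sketch below walks a tool-call payload and replaces matched spans with placeholders, returning a new payload so the original strings are never forwarded. The pattern table, the `[REDACTED:...]` placeholder format, and the function names are hypothetical; the engine's actual placeholder scheme is not described in this page.

```python
import re

# Hypothetical patterns, one per placeholder label.
PATTERNS = {
    "SECRET": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact_text(text: str) -> str:
    """Replace every sensitive span in a string with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text


def redact_payload(value):
    """Return a redacted copy of a JSON-like payload; the input is not modified."""
    if isinstance(value, str):
        return redact_text(value)
    if isinstance(value, dict):
        return {key: redact_payload(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact_payload(item) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


if __name__ == "__main__":
    args = {"to": "ops", "body": "key AKIAIOSFODNN7EXAMPLE, cc bob@example.com"}
    print(redact_payload(args))
```

Only the redacted copy would be handed to the downstream tool, so the action proceeds with the same shape of arguments but without the underlying data.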

Auditing

Every decision is stored with the signals that produced it. When you need to answer "why was this blocked?" weeks later, the trail is already there.
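
One way to picture a stored decision is a JSON record per action, as in this sketch. The field names, the JSON-lines layout, and both helper functions are assumptions for illustration; the actual storage schema is not specified here. The record keeps signal families and detector names rather than the matched text, so the audit trail does not itself become a copy of the sensitive data.

```python
import json
from datetime import datetime, timezone


def audit_record(agent, tool, verdict, signals, now=None):
    """Serialize one decision, with the signals that produced it, as a JSON line."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps(
        {
            "timestamp": timestamp,
            "agent": agent,
            "tool": tool,
            "verdict": verdict,
            # Store (family, detector name) only, never the matched content.
            "signals": [{"family": f, "name": n} for f, n in signals],
        },
        sort_keys=True,
    )


def decisions_with_verdict(lines, verdict):
    """Answer "why was this blocked?" by filtering stored records by verdict."""
    records = (json.loads(line) for line in lines)
    return [record for record in records if record["verdict"] == verdict]


if __name__ == "__main__":
    log = [
        audit_record("agent-7", "send_email", "block",
                     [("secrets", "aws_access_key_id")]),
        audit_record("agent-7", "search", "allow", []),
    ]
    for record in decisions_with_verdict(log, "block"):
        print(record["tool"], record["signals"])
```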
