Point Phoenix Audit at any production AI agent. It runs an adversarial test battery drawn from HarmBench, OWASP LLM Top 10, MITRE ATLAS and CARES, clusters the failures into root causes, generates a hardening recipe — and delivers a cryptographically signed audit report. In roughly 90 seconds.
Surface-level failures rarely have surface-level causes. Phoenix Audit watches your agent's internal execution while each adversarial test lands, then collapses independent failures into the single underlying defect — and ships the fix as a real merge request, regression tests included.
Any AI agent reachable over the internet — Google ADK via A2A, LangChain, CrewAI, OpenAI Agents SDK, or any HTTP endpoint. Pick a regulatory framework: EU AI Act, NIST AI RMF, HIPAA, or SOC 2 + AI.
Six adversarial tests, each citing its industry-standard source — HarmBench, OWASP LLM01, MITRE ATLAS, CARES — so a regulator sees provenance, not invention. Audit-mode headers keep side effects dry-run.
Phoenix traces the agent's internal execution per test. The Judge clusters independent failures into root cause clusters — each with the exact trace spans that prove it.
A hardening recipe lands as a markdown patch and an optional GitLab merge request with regression tests. The audit report is signed against your Cloud KMS key and filed in your audit registry.
Voice agents, support copilots, prior-authorization agents, web-automation agents — if it answers over the internet, it can be audited.
| Framework | Support | Transport | Clustering |
|---|---|---|---|
| Google ADK | Tier 1 · native | A2A protocol | Full root-cause clustering |
| LangChain / LangGraph | Tier 2 | HTTP + OpenInference | Full root-cause clustering |
| CrewAI | Tier 2 | HTTP + OpenInference | Full root-cause clustering |
| OpenAI Agents SDK | Tier 2 | HTTP + OpenInference | Full root-cause clustering |
| Custom HTTP agent | Tier 3 · black-box | HTTP endpoint | Per-test findings, no clustering |
Your first signed audit report is 90 seconds away. No payment during the judging window.