PROJECT AIR™ · EVIDENCE-GRADE INFRASTRUCTURE

Evidence-grade infrastructure
for accountable AI agents.

Cryptographic chain-of-custody. Court-supportable records. Rekor-anchored proof. Every action your agents take, bound to a workload identity, anchored on a public transparency log, independently verifiable by anyone.

air trace
$ air handoff verify ea_chain.jsonl coach_chain.jsonl

See it run

Recorded from the real CLI. Press play.

pip install projectair, then air demo. Signed chain generated, verified, 10 of 10 OWASP Agentic detectors firing, EU AI Act Article 72 report emitted. No cloud. No account. Offline.

captured from projectair 0.7 · Layer 4 Wave 1 · Fulcio-anchored · Rekor index 1466351923

Why Now

Prevention is crowded. Evidence-grade infrastructure for AI agents does not exist yet.

16,200

AI security incidents in 2025 (+49% YoY)

Pillar Security · 2025

73%

of production AI deployments have prompt injection vulnerabilities

OWASP / Lakera · 2025 GenAI Security Readiness Report

14%

of orgs ship AI agents with full security approval

PwC · 2025 AI Agent Survey

Aug 2, 2026

EU AI Act enforcement begins. Articles 12 and 72 require audit trails and post-market monitoring.

EU AI Act · Article 113

Prevention tools exist. Lakera catches prompt injection. NeMo Guardrails filters outputs. Bedrock Guardrails wraps model calls. But prevention is probabilistic, and autonomous agents still go off-script in production.

Project AIR records every action your agents take, signs it, anchors it on a public transparency log, and produces court-supportable evidence that security, legal, compliance, and insurance teams can use directly.

Incidents

Real breaches. Real patterns. What AIR would have caught.

Every incident below has a public post-mortem, and each maps to an OWASP Top 10 for Agentic Applications signature. Project AIR ships detectors for all 10 agentic categories (ASI01 through ASI10), plus 3 OWASP LLM Top 10 categories (LLM01, LLM04, LLM06) and 1 AIR-native forensic chain-integrity check.

ForcedLeak (Salesforce Agentforce)
2025
ASI01

What broke

Indirect prompt injection via trusted CRM records steered the agent to exfiltrate sensitive lead data.

What AIR would have detected

Goal hijack signature on the step that ingested the external instruction, with the offending input preserved in signed evidence.

Drift (Salesloft breach)
2025
ASI04

What broke

Third-party OAuth tokens harvested from a connected integration, used to pivot into downstream SaaS systems.

What AIR would have detected

Credential misuse signature on tool invocations that used a session outside the agent's baseline identity.

GitHub Copilot YOLO mode
2025
ASI02

What broke

Auto-approved tool calls amplified an injected instruction into destructive shell execution.

What AIR would have detected

Tool misuse signature on baseline deviation the first time the agent invoked a destructive shell verb.

ServiceNow Now Assist
2025
ASI05

What broke

Prompt injection via user-supplied ticket fields escalated read scope and leaked records.

What AIR would have detected

Privilege escalation as a data-scope violation at the step that accessed out-of-scope records.

litellm proxy auth bypass
2024
ASI09

What broke

Auth bypass let unauthorized callers issue LLM requests that silently skipped policy and audit layers.

What AIR would have detected

Audit-trail tampering: replayed events fail signature checks, isolating the unsigned and missing hops.

Claude Mythos jailbreak
2025
ASI01

What broke

Narrative role-framing prompt pushed the model outside its safety stance, producing disallowed content.

What AIR would have detected

Goal hijack as a baseline response-pattern deviation, with the jailbreak prompt preserved in evidence.

Incident analysis based on public reporting. ASI mappings reflect AIR's detection signatures against the OWASP Top 10 for Agentic Applications 2026.

How It Works

Three product surfaces. One mission.

CLI, SDK, and Cloud are distinct tools for distinct workflows. They share one evidence chain.

Surface 01 MIT · OSS

air

The CLI. Ingests any agent trace. Detects all 10 OWASP Top 10 for Agentic Applications categories, plus 3 OWASP LLM Top 10 categories and 1 AIR-native chain-integrity check. Outputs forensic timelines with signed evidence hashes.

$ pip install projectair
$ air trace my-app.log
Surface 02 MIT · OSS

airsdk

The Python SDK. Drop-in instrumentation for LangChain, OpenAI, Anthropic, LlamaIndex, Gemini, and Google ADK. Every agent action written as an AgDR record with BLAKE3 hash and Ed25519 signature (with opt-in ML-DSA-65 post-quantum signing), ready to anchor on Sigstore Rekor.

from airsdk import AIRCallbackHandler
from langchain.agents import AgentExecutor
handler = AIRCallbackHandler(key="...")
agent = AgentExecutor(callbacks=[handler])
Surface 03 Coming Soon

AIR Cloud

Hosted chain-of-custody. Multi-tenant dashboards. Court-supportable evidence packs. Where security, legal, compliance, and insurance teams actually work.

  • Real-time agent dashboard + incident workflows
  • Datadog, Splunk, Vanta integrations
  • EU AI Act and California SB 53 exports
  • Insurance-ready forensic evidence packs

Structural Verification

They check messages. We check missions.

Intent Capsules are the signed promise. Structural Verification is the proof the promise was kept. A deterministic symbolic floor that cannot be prompt-injected.

The problem

Per-call guardrails check individual messages. Content classifiers check individual outputs. But nobody checks whether the trajectory of an entire agent session served its declared intent. Reading ~/.ssh/id_rsa is not inherently malicious. Posting to an external URL is not inherently malicious. Doing both in a "refactor the auth module" session is exfiltration.

The solution

Five deterministic checks over the causal graph: SV-SECRET (undeclared secret access), SV-NET (undeclared network egress), SV-SCOPE (filesystem scope violations), SV-ENTITY (unauthorized entity access), SV-EXFIL (causal exfiltration path). The symbolic floor is the guarantee. No LLM in the verification path.

air verify-intent
$ air verify-intent chain.jsonl
Intent: "Refactor the auth module"
Source: INTENT_DECLARATION
Checking 14 steps...
SV-SECRET step 5: ~/.ssh/id_rsa
secret_access not declared
SV-NET step 7: POST attacker.com
not in allowed_network
SV-EXFIL #5 → #7: causal path
secret read → network egress
FAILED BY AIR (2 critical, 1 high)
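The five checks reduce to deterministic rules walked over the session's steps, with no model in the loop. A minimal sketch of that idea in Python, covering SV-SECRET, SV-NET, and the SV-EXFIL causal link; the Step schema, field names, and rule logic are illustrative, not the projectair implementation:

```python
# Illustrative sketch of deterministic trajectory checks (SV-SECRET,
# SV-NET, SV-EXFIL). Hypothetical step schema, not the real projectair code.
from dataclasses import dataclass

SECRET_PATHS = ("~/.ssh/", "/etc/shadow")

@dataclass
class Step:
    index: int
    action: str      # e.g. "file_read", "net_post"
    target: str      # path or URL
    declared: bool   # was this capability declared in the intent capsule?

def verify_intent(steps):
    findings, secret_reads = [], []
    for s in steps:
        if s.action == "file_read" and s.target.startswith(SECRET_PATHS):
            secret_reads.append(s)
            if not s.declared:
                findings.append(("SV-SECRET", s.index, s.target))
        if s.action == "net_post" and not s.declared:
            findings.append(("SV-NET", s.index, s.target))
            # causal exfiltration path: undeclared egress after a secret read
            for r in secret_reads:
                findings.append(
                    ("SV-EXFIL", (r.index, s.index), f"{r.target} -> {s.target}")
                )
    return findings

steps = [
    Step(5, "file_read", "~/.ssh/id_rsa", declared=False),
    Step(7, "net_post", "https://attacker.com", declared=False),
]
for rule, loc, detail in verify_intent(steps):
    print(rule, loc, detail)
```

Because the rules are pure functions of the recorded steps, no prompt content can talk the verifier out of a finding: the same chain always yields the same verdict.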

Dogfooded

We run Project AIR on our own production infrastructure.

Every API request to this site is recorded as a signed AgDR chain using the same airsdk library you install from PyPI. Each chain is anchored to public Sigstore Rekor every 60 seconds and published as redacted JSONL. The trust contract is identical: signed in-process at the moment of action, not reconstructed from logs.

Trust model

Records signed in-process by the same Lambda that handles your request. Not tailed from logs. Not reconstructed by a batch job.

Default-deny redaction

Published JSONL replaces every non-whitelisted payload field with a BLAKE3 hash. Method, path, status code pass through. Everything else is hashed.
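A minimal sketch of what default-deny redaction can look like, assuming an illustrative whitelist and record shape; hashlib.blake2b stands in for BLAKE3 here, since blake3 is a third-party package:

```python
# Default-deny redaction sketch: any field NOT on the whitelist is replaced
# by a content hash before publication. Whitelist and record shape are
# illustrative; hashlib.blake2b stands in for BLAKE3.
import hashlib
import json

WHITELIST = {"method", "path", "status"}

def redact(record: dict) -> dict:
    out = {}
    for key, value in record.items():
        if key in WHITELIST:
            out[key] = value  # structural fields pass through untouched
        else:
            digest = hashlib.blake2b(
                json.dumps(value, sort_keys=True).encode()
            ).hexdigest()
            out[key] = {"redacted": digest}  # payload replaced by its hash
    return out

print(redact({"method": "GET", "path": "/api/evidence", "status": 200,
              "body": {"user": "alice"}}))
```

The hash still commits to the original payload, so anyone holding the unredacted record can prove it matches the published chain without the payload ever being public.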

Verify it yourself

curl the manifest. Look up the Rekor log index on search.sigstore.dev. Zero Vindicara infrastructure in the verification path.

Complementary, Not Competitive

What Project AIR is not.

Project AIR is the evidence-grade infrastructure layer. It does not replace the tools below. It feeds them with cryptographically signed, court-supportable records of what your agents actually did.

Not a guardrail

That is Lakera.

Not a red-teaming tool

That is Garak.

Not a governance platform

That is Credo AI.

Not compliance SaaS

That is Vanta.

Not observability

That is Arize.

AIR is

The evidence-grade infrastructure layer that feeds all of the above.

Built on Vindicara

The engine underneath.

AIR does not replay traces in isolation. It runs on top of Vindicara's existing runtime security engine, which turns detections into actionable evidence and containment.

If you have read the Vindicara spec, these components are familiar. They are no longer the headline. They are the substrate AIR sits on.

Policy engine

Detects violations in real time and feeds them into AIR's forensic chain as signed evidence events.

MCP scanner

Finds vulnerable tool configurations before incidents. Post-incident, provides the risk baseline AIR replays against.

Agent IAM

Enforces containment when AIR triggers an incident. Scopes, suspends, or revokes an agent in one API call.

Compliance engine

Produces audit-ready evidence inputs from the forensic log. EU AI Act Article 72 templates populate in one command; counsel and compliance teams complete the filing.

Powered by Vindicara's runtime engine

Try the engine. Live API. No signup.


Standards Alignment

AIR speaks the frameworks your auditor already does.

OWASP

Top 10 for Agentic Applications 2026

All 10 Agentic ASIs (ASI01 through ASI10) shipped in projectair 0.3.0. Additional detectors cover OWASP LLM01, LLM04, LLM06, and an AIR-native chain-integrity check.

AgDR

AI Decision Records

BLAKE3 content hashing, Ed25519 signatures, opt-in ML-DSA-65 (FIPS 204) post-quantum signing, forward-chained hash integrity, UUIDv7 for monotonic ordering.
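Forward-chained hash integrity can be sketched in a few lines. This illustrates the general technique, not the AgDR wire format: hashlib.blake2b stands in for BLAKE3, field names are hypothetical, and the Ed25519 signing step is omitted because it needs a third-party crypto library:

```python
# Forward-chained hash integrity sketch: each record's hash covers its
# content plus the previous record's hash, so deleting, reordering, or
# editing any record breaks every later link. blake2b stands in for BLAKE3;
# signing is omitted. Field names are illustrative, not the AgDR format.
import hashlib
import json

GENESIS = "0" * 128  # blake2b hexdigest is 128 chars

def chain(records):
    prev, out = GENESIS, []
    for rec in records:
        payload = json.dumps(rec, sort_keys=True)
        link = hashlib.blake2b((prev + payload).encode()).hexdigest()
        out.append({"record": rec, "prev": prev, "hash": link})
        prev = link
    return out

def verify(chained):
    prev = GENESIS
    for entry in chained:
        payload = json.dumps(entry["record"], sort_keys=True)
        if entry["prev"] != prev:
            return False  # a link was dropped or reordered
        if hashlib.blake2b((prev + payload).encode()).hexdigest() != entry["hash"]:
            return False  # a record was edited after signing
        prev = entry["hash"]
    return True

log = chain([{"step": 1, "action": "file_read"},
             {"step": 2, "action": "net_post"}])
assert verify(log)
log[0]["record"]["action"] = "noop"  # tamper with an early record
assert not verify(log)
```

Anchoring only the latest link on a transparency log is then enough to commit to the whole history.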

EU AI Act

Articles 12 & 72

Audit trail retention and post-market monitoring evidence. Exportable as conformity artifacts.

HIPAA

Security Rule (2026 NPRM)

Cryptographic evidence for 45 CFR 164.312 audit controls, integrity controls, and person authentication. Auth0-verified clinician identity in the chain.

California

SB 53

Frontier model transparency and critical incident disclosure, with forensic evidence attached.

NIST

AI RMF

Map, Measure, Manage, and Govern functions backed by runtime evidence rather than policy PDFs.

Your next incident is already on its way.
Make sure you can prove what happened.

$ pip install projectair