FREE & OPEN SOURCE

Claude Agent Auditor
Your agent has no guardrails.

Most Claude Code setups run with no deny rules, no session logging, and rules that cover only 2 of 4 LLM failure modes. One scan surfaces every gap. Free, zero dependencies, read-only.

pip install claude-agent-auditor

Then run claude-agent-auditor --open in your project directory

8 checks run
3 report pages
4 failure modes tested
$0 cost

See a live sample report

Generated from a real Claude Code workspace. Three pages, each with a different view of your agent architecture.

What the auditor finds

CRITICAL

Unconstrained Autonomy

bypassPermissions or dontAsk mode with zero deny or ask rules. Claude can delete files, push code, and send messages without any confirmation. This is the most common P0 issue.
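
To make the check concrete, here is a minimal sketch of how such a condition could be detected. It is not the auditor's actual code. The settings layout (a "permissions" object with "defaultMode", "deny", and "ask" keys) and the sample deny rule are assumptions made for illustration.

```python
import json

# Modes named in the description above. The key names used below are assumed.
RISKY_MODES = {"bypassPermissions", "dontAsk"}


def has_unconstrained_autonomy(settings: dict) -> bool:
    """Return True when a risky mode is set and no deny or ask rules exist."""
    permissions = settings.get("permissions", {})
    mode = permissions.get("defaultMode")
    deny = permissions.get("deny", [])
    ask = permissions.get("ask", [])
    return mode in RISKY_MODES and not deny and not ask


# Two hand-written settings documents (illustrative only).
risky = json.loads('{"permissions": {"defaultMode": "bypassPermissions"}}')
guarded = json.loads(
    '{"permissions": {"defaultMode": "bypassPermissions", "deny": ["Bash(rm:*)"]}}'
)

print(has_unconstrained_autonomy(risky))    # True
print(has_unconstrained_autonomy(guarded))  # False
```

A single deny or ask rule is enough to clear this sketch's flag, which is deliberately lenient; a stricter check might require rules for specific tools.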

CRITICAL

No Agent Tracing

The Task tool runs multi-step sub-agents, but no PostToolUse hook records them. When a sub-task fails, there is no trace to inspect. You can't debug what you can't see.

WARNING

Missing Session Logging

No Stop hook configured. Every architectural decision, plan, and conclusion from the session is lost when it ends. Critical context vanishes between conversations.
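
As a rough idea of what a Stop hook script could do, the sketch below appends one JSON line per finished session. It assumes the hook receives its event as a JSON string, and the field names "session_id" and "transcript_path" are assumptions. It records only where the session lives, not the decisions themselves. The function name and log location are hypothetical.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_session_end(event_json: str, log_path: Path) -> dict:
    """Append one JSON line describing the finished session to log_path."""
    event = json.loads(event_json) if event_json.strip() else {}
    record = {
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "session_id": event.get("session_id"),            # assumed field name
        "transcript_path": event.get("transcript_path"),  # assumed field name
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


# Demo with a hand-written payload written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "sessions.jsonl"
    saved = log_session_end('{"session_id": "demo-123"}', log_file)
    lines = log_file.read_text(encoding="utf-8").splitlines()

print(saved["session_id"], len(lines))  # demo-123 1
```

In a real hook script the payload would presumably come from sys.stdin.read() rather than a literal string.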

WARNING

Rule Coverage Gaps

Rules that cover hallucination prevention but say nothing about context window limits, or domain knowledge rules with no control constraints. Failure modes that no rule covers fail silently.

INFO

Overlapping Rules

Two rules with 70%+ keyword overlap loaded every session. They dilute each other and waste context budget. The auditor identifies every redundant pair.
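
One plausible way to measure keyword overlap is Jaccard similarity over each rule's keyword set. The page does not say which metric the auditor uses, so treat this as an illustrative sketch: the stopword list, the file names, and the helper functions are all hypothetical.

```python
import re
from itertools import combinations

# Tiny illustrative stopword list; a real tool would likely use a longer one.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "for", "on", "with"}


def keywords(text: str) -> set[str]:
    """Lowercase words longer than two characters, minus stopwords."""
    words = re.findall(r"[a-z0-9_]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}


def overlap(a: str, b: str) -> float:
    """Jaccard similarity: shared keywords divided by all distinct keywords."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def redundant_pairs(rules: dict[str, str], threshold: float = 0.7):
    """Return (name_a, name_b, score) for every pair at or above threshold."""
    pairs = []
    for x, y in combinations(sorted(rules), 2):
        score = overlap(rules[x], rules[y])
        if score >= threshold:
            pairs.append((x, y, round(score, 2)))
    return pairs


rules = {
    "verify.md": "Always verify file paths before editing source files",
    "check.md": "Always verify file paths before editing any source files",
    "style.md": "Prefer short functions and descriptive variable names",
}
print(redundant_pairs(rules))  # [('check.md', 'verify.md', 0.89)]
```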

INFO

No Memory Preservation

PreCompact hook absent. Claude compacts long sessions without extracting key decisions first. Context that took hours to build gets summarized away.

WARNING

MCP Servers Without Deny Rules

MCP tools expand Claude's action surface. When servers are configured with no deny rules, every MCP tool runs unchecked. The auditor flags this combination.

CRITICAL

Secrets Exposure

API keys, tokens, or passwords accidentally dropped into rules files or settings JSON that could end up in version control. The auditor scans for common patterns.
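
A pattern scan of this kind can be sketched with a few regular expressions. The three patterns below are illustrative assumptions, not the auditor's actual list, and a scan like this will produce both false positives and misses.

```python
import re

# Hypothetical pattern set for illustration only.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "assigned_secret": re.compile(
        r"\b(api[_-]?key|token|password|secret)\b[\"']?\s*[:=]\s*[\"']?"
        r"([A-Za-z0-9_\-]{16,})",
        re.IGNORECASE,
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# Synthetic sample; none of these values are real credentials.
sample = "\n".join([
    '{"model": "example",',
    ' "api_key": "abcdefghijklmnop1234",',
    ' "aws": "' + "AKIA" + "X" * 16 + '"}',
])
print(scan_text(sample))  # [(2, 'assigned_secret'), (3, 'aws_access_key_id')]
```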

Requirements

  • Python 3.10+ (check with python --version)
  • Zero dependencies — pure Python stdlib
  • Does NOT need to be installed inside your Claude Code workspace
  • Read-only analysis — never modifies your files

How it works

1. Install (from anywhere)

pip install claude-agent-auditor

2. Point it at your project

claude-agent-auditor /path/to/your/project --open

3. Review the three-page report

Page 1, Current State: score, issues, and what's misconfigured.
Page 2, Recommendations: prioritized fixes with copy-paste hook JSON.
Page 3, Projected Results: your score after all fixes are applied.

4. Ask Claude to fix it

Feed the report to your Claude Code instance: "Read agent-audit/recommendations.html and implement the P0 recommendations." Review the changes before accepting.

The four LLM failure modes

Based on the Stanford CS230 study guide on building with LLMs. Every technique — RAG, fine-tuning, agentic workflows — exists to solve one of these four problems. Your rules should cover all four.

Domain Knowledge Gaps

Base models lack proprietary data, recent events, and internal docs. Solved by RAG, memory systems, and domain-specific rules.

Fix: Memory system, RAG, domain rules

Context Window Limits

Can't hold arbitrarily long history. Requires explicit architectural choices: handoffs, compaction, summarization.

Fix: PreCompact hook, handoff rules, MEMORY.md

Hallucinations

Generates plausible-sounding but incorrect output with confidence. Needs explicit verification rules before asserting facts.

Fix: Verification rules, claim-checking, grounding

Difficulty of Control

Hard to get consistent, scoped behavior. Requires deny rules, ask rules, and explicit scope constraints.

Fix: Deny rules, ask rules, scope limits
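
To illustrate what covering all four modes could mean in practice, here is a rough keyword-based sketch that reports which modes a set of rules never mentions. The keyword lists and mode labels are invented for this example; the auditor's real classification may work quite differently.

```python
import re

# Hypothetical keyword lists, one per failure mode described above.
FAILURE_MODE_KEYWORDS = {
    "domain_knowledge": {"rag", "memory", "docs", "domain"},
    "context_window": {"compact", "handoff", "summarize", "context"},
    "hallucination": {"verify", "grounding", "cite", "claim"},
    "control": {"deny", "ask", "scope", "never"},
}


def covered_modes(rule_texts: list[str]) -> set[str]:
    """Return the modes for which at least one rule contains a keyword."""
    covered = set()
    for text in rule_texts:
        words = set(re.findall(r"[a-z]+", text.lower()))
        for mode, mode_keywords in FAILURE_MODE_KEYWORDS.items():
            if words & mode_keywords:
                covered.add(mode)
    return covered


def missing_modes(rule_texts: list[str]) -> list[str]:
    """Return the modes that no rule touches, sorted by name."""
    return sorted(set(FAILURE_MODE_KEYWORDS) - covered_modes(rule_texts))


rules = [
    "Verify every claim against the source file before stating it.",
    "Never push to main; ask before deleting files.",
]
print(missing_modes(rules))  # ['context_window', 'domain_knowledge']
```

This sample rule set covers hallucinations and control only, the 2-of-4 situation the auditor is designed to flag.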

Six observability hooks, three priority tiers

If you don't have traces, you can't debug your agent system. The auditor checks for all six hooks and flags missing ones with copy-paste implementation snippets.

Priority    Hook                     What it captures
CRITICAL    PostToolUse: Task        Every agent sub-task, the backbone of multi-agent tracing
CRITICAL    Stop                     Session decisions before they're lost on conversation end
IMPORTANT   PreCompact               Critical context preserved before compaction runs
IMPORTANT   SessionStart             Session initialization and context restore
USEFUL      PostToolUse: Write|Edit  Every file change with path and timestamp
USEFUL      PostToolUse: Bash        Every command executed, as an audit trail for automation
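
A hook coverage check along these lines could be sketched as follows. This assumes hooks live under a "hooks" key in settings, grouped by event name, with an optional "matcher" string per entry. For simplicity the matcher is treated as a pipe-separated list of tool names rather than a full pattern. The script names in the sample are hypothetical.

```python
# The six hooks from the table above: (priority, event, tool or None).
REQUIRED_HOOKS = [
    ("CRITICAL", "PostToolUse", "Task"),
    ("CRITICAL", "Stop", None),
    ("IMPORTANT", "PreCompact", None),
    ("IMPORTANT", "SessionStart", None),
    ("USEFUL", "PostToolUse", "Write|Edit"),
    ("USEFUL", "PostToolUse", "Bash"),
]


def is_configured(hooks: dict, event: str, tool) -> bool:
    """True if some entry for the event covers the given tool (or any entry if tool is None)."""
    entries = hooks.get(event, [])
    if tool is None:
        return bool(entries)
    wanted = set(tool.split("|"))
    for entry in entries:
        matcher = entry.get("matcher", "")
        # An empty or "*" matcher is assumed to match every tool.
        if matcher in ("", "*") or wanted <= set(matcher.split("|")):
            return True
    return False


def missing_hooks(settings: dict) -> list[tuple]:
    """Return the required hooks that the settings do not configure."""
    hooks = settings.get("hooks", {})
    return [(p, e, t) for p, e, t in REQUIRED_HOOKS if not is_configured(hooks, e, t)]


settings = {
    "hooks": {
        "Stop": [{"hooks": [{"type": "command", "command": "python3 log_session.py"}]}],
        "PostToolUse": [
            {"matcher": "Bash", "hooks": [{"type": "command", "command": "python3 log_bash.py"}]}
        ],
    }
}
for priority, event, tool in missing_hooks(settings):
    print(priority, event, tool or "")
```

With the sample settings above, the sketch reports four missing hooks: the Task tracer, PreCompact, SessionStart, and the Write|Edit logger.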

Also run the Claude Workspace Optimizer

The optimizer checks context efficiency: MEMORY.md visibility, rule bloat, token budget. The agent auditor checks safety and architecture. They cover different ground — run both.

pip install claude-workspace-optimizer

Frequently asked questions

Does this send my data anywhere?
No. The tool runs entirely on your machine. No API calls, no data collection, no telemetry. Your code never leaves your computer.
Does it work with any Claude Code project?
Yes. It scans .claude/settings.json, rules/, hooks, and skills. It also reads ~/.claude/ for global settings. Works with any project that uses Claude Code.
What's the architecture score based on?
The score (0-100) weighs autonomy risk, observability hook coverage, rule architecture coverage of the four failure modes, and agent setup quality. A score under 60 means there are P0 issues to fix before running agents autonomously.
Can it automatically fix the issues?
No. The auditor only generates a three-page report and never modifies your files. The Recommendations page includes copy-paste JSON for every missing hook. You can also feed the report to your Claude Code instance and ask it to implement the fixes.
How is this different from the Claude Workspace Optimizer?
Different tools, different concerns. The Workspace Optimizer checks context efficiency (MEMORY.md visibility, context bloat, rule tiering). The Agent Auditor checks safety and architecture (autonomy risk, observability hooks, rule coverage, agent patterns). Run both.

Share with your team

Know someone running Claude Code agents without deny rules or session logging? This tool is free. Share it.

pip install claude-agent-auditor

MIT license. Open source. View on GitHub

Disclaimer: This tool is provided as-is with no warranty. Oaken AI and its contributors accept zero responsibility for any changes made to your workspace based on this tool's output. The report contains recommendations, not instructions. Always review changes before applying them. Back up your workspace before making modifications.


BUILT BY OAKEN AI

Need more than an architecture scan?

Oaken AI builds production multi-agent systems for businesses. Architecture, hooks, rules, RAG memory, and infrastructure — everything in this report, built for your stack.