AI Skill Safety Auditor
When employees install AI tools on personal machines, there is no review process, no chain of accountability, and usually no one watching. I built Skill Safety Auditor to audit Claude Code skills for security risks before they run on your machine.
Why Claude Code Skills Can Be a Security Risk
Most digital tools come with a paper trail: a license agreement, a review process, some chain of accountability. Claude Code skills have none of that. They're Markdown files, and if you are not careful, they can declare shell access, read your environment variables, and send data off-device before you've read a single line.
The danger isn't the technology. It's assuming someone already checked.
Security conversations about AI focus on platforms and models. The individual is the point of vulnerability: their machine, their credentials, their judgment call at the moment of install. That judgment call is what I mean by personal AI governance, the decisions each person makes about which tools to trust with access to their machine. When employees work from home on personal machines, organizational controls don't reach that far. Nobody is making those decisions for them.
How Skill Safety Auditor Works
Skill Safety Auditor is a free tool that runs 14 security checks against any Claude Code skill file, covering four areas: shell and tool access, credential exposure, prompt injection, and source provenance. It returns a severity-graded report with plain-language remedies. Run it before installing a skill, or afterwards to audit skills already on disk.
Structured reports
Reports use three severity levels: red for critical, amber for warnings, green for passing checks. A critical finding means don't install. Warnings come with step-by-step remedies written in plain language, delivered one step at a time, with a check-in after each so no one is handed a wall of instructions to work through alone.
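As a rough illustration of the three-level grading described above, here is a minimal Python sketch. It is not the auditor's own code: the `Severity`, `Finding`, `verdict`, and `render` names are hypothetical, and the only rules it encodes are the ones stated here (a critical finding means don't install, and findings are grouped by severity).

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    RED = "critical"
    AMBER = "warning"
    GREEN = "pass"


@dataclass
class Finding:
    check: str          # name of the check that produced this finding
    severity: Severity
    remedy: str = ""    # plain-language next step; empty for passing checks


def verdict(findings: list[Finding]) -> str:
    """Any critical finding blocks the install; warnings ask for a review."""
    severities = {f.severity for f in findings}
    if Severity.RED in severities:
        return "do not install"
    if Severity.AMBER in severities:
        return "review warnings before installing"
    return "no issues found"


def render(findings: list[Finding]) -> str:
    """List findings grouped by severity, most serious first."""
    lines = []
    for sev in (Severity.RED, Severity.AMBER, Severity.GREEN):
        for f in findings:
            if f.severity is sev:
                lines.append(f"[{sev.value.upper()}] {f.check}")
    return "\n".join(lines)
```

Keeping the remedy on each finding, rather than in a separate document, is what would let a walkthrough present one step at a time.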
Three modes
It runs in three modes: remotely, to check a skill before you download it; against a local file, to audit after downloading but before installing; and against your live system, to check skills already on disk. No account required. Results in under 60 seconds.
Accountability by design
The live site was built around the same problem the tool solves. Putting my name, photo, and professional background on the page was a deliberate product decision. Security tooling from an anonymous source asks users to trust the tool without giving them a reason to trust the person behind it. A named author is accountable in a way a tool alone cannot be.
What a Skill Safety Audit Looks Like
This is what a real audit looks like, run against a deliberately constructed test skill. Findings are grouped by severity. The full report on GitHub includes step-by-step remedies for every finding, written in the auditor's exact output format.
14 Checks Across Four Categories
Built from these first principles: what can a skill actually do, what would a malicious actor exploit, and what can be verified without executing anything?
The configuration section of a skill file declares what tools and system access Claude is allowed to use. These checks flag access that goes beyond what the skill needs.
- Shell / Bash tool access
- File write access
- Credential-adjacent tools
- MCP server access
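A configuration check of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the auditor's implementation: it assumes the skill file opens with a `---` delimited frontmatter block, that tools are declared on a single comma-separated `allowed-tools:` line, and that MCP tools carry an `mcp__` prefix. The `RISKY_TOOLS` mapping and both function names are hypothetical.

```python
import re

# Hypothetical risk mapping; a real auditor would define its own.
RISKY_TOOLS = {"Bash": "red", "Write": "amber", "Edit": "amber"}


def declared_tools(skill_text: str) -> list[str]:
    """Return tool names from an `allowed-tools:` frontmatter line, if any."""
    match = re.match(r"^---\n(.*?)\n---", skill_text, re.DOTALL)
    if not match:
        return []
    for line in match.group(1).splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "allowed-tools":
            # Assumes no commas inside any parenthesised tool arguments.
            return [t.strip() for t in value.split(",") if t.strip()]
    return []


def flag_tools(skill_text: str) -> dict[str, str]:
    """Map each risky declared tool to a severity colour."""
    flags = {}
    for tool in declared_tools(skill_text):
        base = tool.split("(")[0]  # drop any argument pattern after the name
        if base.startswith("mcp__"):
            flags[tool] = "amber"  # MCP server access
        elif base in RISKY_TOOLS:
            flags[tool] = RISKY_TOOLS[base]
    return flags
```

The sketch only reads the declaration. It does not judge whether the access goes beyond what the skill needs, which is the harder part of these checks.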
Some skills include executable code files. These checks scan those files for dangerous behaviour: reading your credentials, making network connections, or sending data off your machine.
- Executable file presence (.py, .sh, .js)
- Outbound network calls
- Environment variable access
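The three checks above can be approximated with static pattern matching, which keeps to the principle of verifying without executing anything. The following Python sketch is illustrative only: the patterns are a small, assumed sample rather than the auditor's actual rule set, and `scan_skill_dir` is a hypothetical name.

```python
import re
from pathlib import Path

EXECUTABLE_SUFFIXES = {".py", ".sh", ".js"}

# Illustrative patterns only; a real scanner would need a far longer list.
PATTERNS = {
    "outbound network call": re.compile(
        r"\b(curl|wget)\b|requests\.(get|post)|urllib\.request|fetch\("
    ),
    "environment variable access": re.compile(
        r"os\.environ|os\.getenv|process\.env|\bprintenv\b"
    ),
}


def scan_skill_dir(skill_dir: str) -> list[tuple[str, str]]:
    """Return (file name, finding) pairs; files are read, never executed."""
    findings = []
    for path in sorted(Path(skill_dir).rglob("*")):
        if not path.is_file() or path.suffix not in EXECUTABLE_SUFFIXES:
            continue
        findings.append((path.name, "executable file present"))
        text = path.read_text(errors="replace")
        for label, pattern in PATTERNS.items():
            if pattern.search(text):
                findings.append((path.name, label))
    return findings
```

Pattern matching like this produces false positives and misses obfuscated code, so a hit is best treated as a prompt to read the file, not as proof of malice.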
These checks look for hidden instructions in the skill's text that could quietly change how Claude behaves, claim permissions it doesn't have, or do something other than what the skill says it does.
- Hidden or encoded instruction blocks
- Permission escalation attempts
- Conditional context triggers
- Instructions unrelated to stated purpose
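The first check in this group, hidden or encoded instruction blocks, lends itself to a short example. The Python sketch below looks for three signals chosen for illustration: zero-width characters, HTML comments (which most Markdown renderers do not display), and long base64 runs that decode to readable text. These heuristics and the `hidden_content_findings` name are assumptions, not the auditor's actual rules.

```python
import base64
import binascii
import re

ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")


def hidden_content_findings(skill_text: str) -> list[str]:
    """Flag text a reader would not see when skimming the rendered file."""
    findings = []
    if ZERO_WIDTH.search(skill_text):
        findings.append("zero-width characters")
    if HTML_COMMENT.search(skill_text):
        findings.append("HTML comment block")
    for run in BASE64_RUN.findall(skill_text):
        try:
            # validate=True rejects characters outside the base64 alphabet
            base64.b64decode(run, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not readable text
        findings.append("base64 text that decodes to readable UTF-8")
        break  # one report is enough to warrant a manual look
    return findings
```

The other three checks in this group depend on what the instructions mean rather than how they are encoded, so they are not a good fit for pattern matching.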
These checks verify that the skill links back to a real, active, public repository with an identifiable author, basic signals that someone is accountable for what they published.
- Public repository verification
- Repository activity and maintenance status
- Author identity signals
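Parts of a provenance check can be done offline before any network lookup. The Python sketch below parses a skill's source URL and applies a staleness rule to a last-commit date obtained elsewhere. It is a hedged illustration: the `PUBLIC_HOSTS` allowlist, the one-year `STALE_AFTER_DAYS` threshold, and both function names are hypothetical, and confirming that the repository exists and is public would still require a request to the hosting service.

```python
from datetime import date
from urllib.parse import urlparse

PUBLIC_HOSTS = {"github.com", "gitlab.com"}  # assumption: illustrative allowlist
STALE_AFTER_DAYS = 365                       # assumption: hypothetical threshold


def parse_repo(url: str):
    """Return (host, owner, repo) for an https URL on a known host, else None."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    if (
        parsed.scheme != "https"
        or parsed.hostname not in PUBLIC_HOSTS
        or len(parts) < 2
    ):
        return None
    return parsed.hostname, parts[0], parts[1].removesuffix(".git")


def is_stale(last_commit: date, today: date) -> bool:
    """True when the most recent commit is older than the threshold."""
    return (today - last_commit).days > STALE_AFTER_DAYS
```

Author identity signals are left out because they call for human judgement about profiles and history, not a parsing rule.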
How Skill Safety Auditor Was Built
Identified the gap
I wanted to learn how to build a Claude Code skill, so I built one I'd actually use. Neither the official Claude Code plugin directory nor Claude's built-in skill library included an audit tool. I decided those two facts mattered beyond my own machine.
Designed the taxonomy
I built the audit framework from scratch: four check categories, three severity tiers, 14 individual checks, each with plain-language remedies written for someone who wants to install a useful tool safely, not someone who knows what a threat model is.
Built and iterated
I built the skill using Claude Code's skill-creator framework, then refined the remedy walkthrough to work conversationally rather than as a static checklist. The auditor asks one question at a time and checks in before moving on.
Validated with a live audit
I ran a real audit against a public skill to confirm the checks produced accurate findings and the output was clear to someone seeing it for the first time.
Proved it with a synthetic test
To show the auditor works, I built a test skill with deliberate vulnerabilities modelled on the EICAR standard (European Institute for Computer Antivirus Research). It's a synthetic file with known issues that anyone can run through the auditor themselves to verify the output before installing anything.
Took it public
Once the skill worked, I made a judgement call. If it solves a real problem for me, it solves it for others. I published to GitHub under an MIT license, designed and built a landing page to explain the risk to non-technical users, and used the tool on itself to signal trust. When I decided to learn plugin architecture, I extended the project rather than starting a new one. Each phase built on the last.
The Auditor Audits Itself
A tool that asks people to question what they install needs to earn their trust first.
What Worked
- Spotted a gap, filled it. No audit tool existed in the Claude Code skill ecosystem. Shipping one before the problem was widely recognized gave it a head start on trust.
- Designed for non-experts. Every check produces guidance for someone who wants to install a tool safely, not someone who knows what a threat model is. No security background required.
- Users finish what they start. The auditor walks through findings one at a time and checks in before moving on. People complete it because it doesn't drop a wall of results and disappear.
- Published under my name. A security tool from an unknown source asks users to trust it on faith. Putting my name on it means there's someone accountable if it's wrong.
What It Taught
- Open source is accountability, not just access. A security tool from an anonymous source asks users to trust it on faith. Publishing the source along with a public test means anyone can verify what the tool actually does before installing it. If a tool asks others to think critically about what they install, it has to go through the same scrutiny, particularly with AI tools that do not have conventional software licences, user agreements, or privacy clauses.
- A working artifact beats a description. I could have written about why AI skill safety matters and how it relates to the larger picture of AI governance. Instead I learned by doing and shipped something anyone can test for themselves, then run.
- Building without a title changed how I think about building. This project happened between roles. No team, no budget, no institutional backing. It also surfaced a habit I hadn't examined: I built privately by default. Taking an idea from the first line of code to design to public tool in one stretch made that visible to me. Sharing doesn't have to be a phase that happens after the work. It's how the work gets done.