AI Skill Safety Auditor
AIClaude Code lets you install AI-powered tools from simple text files. Those files can request permission to run commands, read your saved credentials, and send data to external servers — all before you've looked at what you're actually installing. I built and published a free tool that audits these files for risky behaviour before they run on your machine.
Try it free Why Claude Code Skills Are a Security Risk
Most digital tools you install come with a paper trail: a license agreement, a review process, some chain of accountability. Claude Code skills have none of that. They're markdown files, and if you are not careful, they can declare shell access, read your environment variables, and send data off-device before you've opened a single line.
The risk is not understanding the risk. Adoption is outpacing governance.
For organizations retooling around AI, governance conversations focus on platforms and systems. But the attack surface here is the individual: their machine, their credentials, their judgment call at the moment of install. Depending on your organization, particularly in work settings where employees or contractors work from home and bring their own machines, controls don't reach that far.
How Skill Safety Auditor Works
Skill Safety Auditor is a free tool that checks any Claude Code skill file against 14 security criteria — shell access, credential exposure, prompt injection, and source provenance — then returns a severity-graded report with plain-language remedies. Run it before installing a skill, or after, to audit skills already on disk.
Skill Safety Audit checks 14 things: what system access the skill is requesting, whether any bundled scripts make outbound network calls, whether the prompt contains hidden instructions, and whether the source traces back to an active, real repository.
The output is a structured report with three severity levels, based on the standard red, amber, and green alerts. A critical finding means don't install. Warnings come with step-by-step remedies written in plain language, delivered one step at a time, with a check-in after each so no one is handed a wall of instructions to work through alone.
You can run it before installing a skill, to review it before it ever touches your machine, or after, to audit skills already on disk.
What a Skill Safety Audit Looks Like
This is what a real audit looks like. Findings are grouped by severity. Critical results include a conversational step-by-step remedy, one step at a time, with check-ins between each.
tools: section in the frontmatter. Do you see Bash listed? Let me know and I'll walk you through next steps.14 Checks Across Four Categories
Built from these first principles: what can a skill actually do, what would a malicious actor exploit, and what can be verified without executing anything?
- Shell / Bash tool access
- File write access
- Credential-adjacent tools
- MCP server access
- Executable file presence (.py, .sh, .js)
- Outbound network calls
- Environment variable access
- Hidden or encoded instruction blocks
- Permission escalation attempts
- Conditional context triggers
- Instructions unrelated to stated purpose
- Public repository verification
- Repository activity and maintenance status
- Author identity signals
How Skill Safety Auditor Was Built
Identified the gap
I began by surveying the skills ecosystem and reviewed research on AI tooling attack surfaces to confirm the gap was real. Speaking to colleagues who use AI tools regularly also revealed they weren't aware of the risks.
Designed the taxonomy
Based on what I was looking for in a tool, I built the audit framework from scratch: four check categories, three severity tiers, 14 individual checks, each with plain-language remedies designed for non-technical readers.
Built and iterated
I built the skill using Claude Code's skill-creator framework, testing against edge cases and refining the remedy walkthrough to work conversationally rather than as a static checklist.
Validated with a live audit
After the build I ran a real audit against a public skill to confirm the checks worked correctly, the reporting was accurate, and the output was useful to someone seeing it for the first time.
Credibility through best practices
To demonstrate authenticity on the live website, I created and exposed fictional issues in a test file, based on standards established by the European Institute for Computer Antivirus Research (EICAR).
The Auditor Audits Itself
Convincing potential users to trust me while encouraging critical thinking at the same time was an opportunity to demonstrate public accountability in real time.
What Worked
- Seeing around corners. I identified an unaddressed security attack surface in a fast-growing ecosystem, then built and shipped a fix before it spread.
- Empathy as a design constraint. The remedies were written for someone who just wants to install a useful tool safely. Every check produces guidance a first-time user can act on without needing to understand what a threat model is.
- Product over tooling. I turned what could have been a static checklist (and create yet more jobs to be done) into a conversational, adaptive experience that responds to what the user finds and guides them through it.
What It Taught
- Ecosystem governance is a product category. Most AI safety work happens at the platform layer through model training, usage policies, and content filters. The open ecosystem of community-built and vibe-built tools operates one level down where governance infrastructure is largely absent.
- Trust is the hardest design problem. The tool asks users to think critically about what they install, which means it has to earn their trust while teaching them not to trust blindly. That tension shaped every decision.
- Credibility requires working transparently. I built the Skill Safety Audit because I needed it. And it's part of my toolkit now. Publishing the skill, and showing how others can test it before installing, was a transparent way to demonstrate credibility while providing a useful public service.