May 09, 2026 · Security

ToolFence

Microsoft just disclosed that a single prompt injection in Semantic Kernel can escalate to full RCE. ToolFence is a runtime firewall for AI agent tool calls that blocks this at the source.

Verdict

7/10

Effort

1-2 weeks

The Idea

ToolFence is a lightweight runtime firewall for AI agent tool calls. It sits between your AI agent (Claude Code, Cursor, custom agents) and the MCP servers or tools it connects to, enforcing permission policies on every invocation. You define allow/deny rules per tool, block sensitive data from leaving your environment, rate-limit tool calls, and log every action for audit. It ships as an npm package with a one-line wrapper around your MCP client, plus a hosted dashboard for viewing logs and configuring policies. Think of it as iptables for AI agents, configured in YAML and designed for developers, not network engineers.
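To make the "one-line wrapper" idea concrete, here is a minimal TypeScript sketch. Everything in it is hypothetical: the `Policy` shape, the `withToolFence` and `isAllowed` names, and the `ToolCaller` signature are assumptions for illustration, not a published ToolFence API and not the MCP SDK's types. The policy object stands in for whatever the YAML file would parse to.

```typescript
// Hypothetical stand-in for whatever "call a tool" function the MCP client exposes.
type ToolArgs = Record<string, unknown>;
type ToolCaller = (tool: string, args: ToolArgs) => Promise<unknown>;

// Assumed policy shape; in the design described above this would be loaded from YAML.
interface Policy {
  defaultAction: "allow" | "deny";
  allow: string[]; // tool names that are permitted
  deny: string[];  // tool names that are blocked (deny wins over allow)
}

class PolicyViolation extends Error {
  readonly tool: string;
  constructor(tool: string) {
    super(`ToolFence blocked call to "${tool}"`);
    this.name = "PolicyViolation";
    this.tool = tool;
  }
}

// Pure decision function: deny list first, then allow list, then the default.
function isAllowed(policy: Policy, tool: string): boolean {
  if (policy.deny.includes(tool)) return false;
  if (policy.allow.includes(tool)) return true;
  return policy.defaultAction === "allow";
}

// The wrapper returns a caller with the same signature,
// so the agent code that invokes tools does not need to change.
function withToolFence(call: ToolCaller, policy: Policy): ToolCaller {
  return async (tool, args) => {
    if (!isAllowed(policy, tool)) throw new PolicyViolation(tool);
    return call(tool, args);
  };
}

// Example usage with a fake tool backend.
const examplePolicy: Policy = {
  defaultAction: "deny",
  allow: ["read_file"],
  deny: ["shell_exec"],
};
const fakeCall: ToolCaller = async (tool) => `ran ${tool}`;
const guarded = withToolFence(fakeCall, examplePolicy); // the one-line wrap
```

Keeping the decision in a pure function (`isAllowed`) separate from the async wrapper makes the policy logic easy to unit-test without a live MCP server.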

Why Now

Microsoft disclosed CVE-2026-26030 on May 7, revealing that a single prompt injection in Semantic Kernel could escalate to full remote code execution on the host machine. Three AI coding agents (including Claude Code) were caught leaking secrets through prompt injection in a single VentureBeat investigation. Gartner projects 40% of enterprise apps will integrate AI agents by end of 2026, but only 14.4% deploy with full security approval. Snyk's ToxicSkills audit found 13.4% of community-published agent skills contain critical vulnerabilities. The gap between adoption velocity and security readiness is widening every week.

How to Build

Start with a Node.js middleware package that wraps the MCP client. Policy configuration lives in a YAML file alongside your project, defining per-tool rules: which tools can be called, what arguments are allowed (regex matching), rate limits, and data patterns that must be blocked from outbound payloads. The core engine is pattern matching plus a call interceptor. Add a Next.js dashboard for viewing the audit log, setting up Slack alerts, and managing policies across multiple agents. Use Supabase or Postgres for log storage. Distribute the core as an open-source npm package to drive adoption, then monetize the hosted dashboard.
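The "pattern matching plus a call interceptor" core might look roughly like the sketch below. It assumes a hypothetical in-memory policy shape (`FirewallPolicy`, `ToolRule`) and a hypothetical `evaluateCall` function; the tool names and regexes are illustrative only, and a real build would load the rules from the YAML file and run this check inside the MCP client wrapper.

```typescript
// Hypothetical rule shape for a single tool.
interface ToolRule {
  allowed: boolean;
  // Per-argument regexes. Patterns should be anchored (^...$) so they match the whole value.
  // Avoid the g/y flags: RegExp.prototype.test is stateful on such regexes via lastIndex.
  argPatterns?: Record<string, RegExp>;
}

interface FirewallPolicy {
  tools: Record<string, ToolRule>;
  // Patterns that must never appear in an outbound payload (for example, secrets).
  blockedData: RegExp[];
}

type Decision = { allowed: true } | { allowed: false; reason: string };

function evaluateCall(
  policy: FirewallPolicy,
  tool: string,
  args: Record<string, unknown>,
): Decision {
  // 1. Tool-level rule: unknown tools are denied by default.
  const rule = policy.tools[tool];
  if (!rule || rule.allowed !== true) {
    return { allowed: false, reason: `tool "${tool}" is not allowed` };
  }
  // 2. Argument-level rules: each constrained argument must be a matching string.
  for (const [name, pattern] of Object.entries(rule.argPatterns ?? {})) {
    const value = args[name];
    if (typeof value !== "string" || !pattern.test(value)) {
      return { allowed: false, reason: `argument "${name}" failed its pattern` };
    }
  }
  // 3. Outbound data check: scan the serialized payload for blocked patterns.
  const payload = JSON.stringify(args);
  for (const pattern of policy.blockedData) {
    if (pattern.test(payload)) {
      return { allowed: false, reason: `payload matched blocked pattern ${pattern}` };
    }
  }
  return { allowed: true };
}

// Illustrative policy; tool names and patterns are made up for this example.
const policy: FirewallPolicy = {
  tools: {
    // Only .ts files under src/, with no ".." segments possible.
    read_file: { allowed: true, argPatterns: { path: /^src\/[\w-]+(\/[\w-]+)*\.ts$/ } },
    send_message: { allowed: true },
    shell_exec: { allowed: false },
  },
  blockedData: [/AKIA[0-9A-Z]{16}/, /-----BEGIN [A-Z ]*PRIVATE KEY-----/],
};
```

Returning a `Decision` with a reason, rather than a bare boolean, gives the audit log and any Slack alert something useful to display.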

Revenue Model

Open-source core (free npm package) drives adoption and trust. Hosted dashboard at $29/month per workspace covers log retention, alerting, and team sharing. Enterprise tier at $149/month adds SSO, compliance report exports, and unlimited agent connections. Target 500 free users converting at 8-10% within six months, which is 40-50 paying workspaces. At $29 alone that comes to $1,160-1,450 MRR, so reaching the $1,500-2,000 MRR target requires a few of those workspaces on the Enterprise tier. The open-source play builds a moat through community contributions and ecosystem integrations that proprietary tools cannot replicate.

Effort

The npm middleware package is a 3-5 day build: call interceptor, policy parser, pattern matching, logging. The dashboard is another 5-7 days: log viewer, policy editor, alert configuration. Documentation and landing page round out the two weeks. The technical complexity is moderate. The real challenge is not in the middleware (which is straightforward proxy logic) but in covering edge cases across different MCP server implementations and keeping up with protocol changes as the spec evolves.
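Of the pieces listed, rate limiting (from the policy rules) and logging are small enough to sketch together. The following is one possible shape, not a prescribed design: the `RateLimiter` class, the `AuditEntry` fields, and the sliding-window approach are all assumptions, and a real build would write entries to Postgres or Supabase rather than an in-memory array.

```typescript
// Hypothetical audit record; field names are illustrative.
interface AuditEntry {
  timestamp: number; // milliseconds, from the injected clock
  tool: string;
  outcome: "allowed" | "rate_limited";
}

// Sliding-window limiter: at most `limit` calls per tool within any `windowMs` span.
class RateLimiter {
  readonly log: AuditEntry[] = [];
  private readonly calls = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  // The clock is injectable so the limiter can be tested without real waiting.
  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true if the call may proceed; every attempt is logged either way.
  tryAcquire(tool: string): boolean {
    const t = this.now();
    // Keep only timestamps still inside the window.
    const recent = (this.calls.get(tool) ?? []).filter((ts) => t - ts < this.windowMs);
    const ok = recent.length < this.limit;
    if (ok) recent.push(t);
    this.calls.set(tool, recent);
    this.log.push({ timestamp: t, tool, outcome: ok ? "allowed" : "rate_limited" });
    return ok;
  }
}
```

A caller would check `tryAcquire(tool)` before forwarding the invocation and reject the call when it returns `false`. Rejected attempts are deliberately not counted against the window, so a burst of blocked calls does not extend the lockout.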

Reddit Signal

A thread on r/ClaudeAI drew 20+ replies after a developer posted "Nobody checks what's inside Claude Code skills before installing them," with respondents acknowledging they had no solution beyond manually reading source code. VentureBeat reported three major AI coding agents leaked secrets through a single prompt injection, with all three vendors quietly paying bug bounties without publishing CVEs. Snyk's February 2026 ToxicSkills study scanned 3,984 skills from ClawHub and skills.sh, finding 534 (13.4%) contained at least one critical security issue including malware distribution and data exfiltration.

Risk

The moat is thin. Palo Alto, Datadog, and Snyk all have adjacent products and could extend into this space quickly. MCP itself is still maturing, meaning protocol changes could break the middleware. Developer adoption of a new middleware layer requires overcoming inertia. The biggest risk is timing: if a major incident drives enterprises to mandate agent security, they will buy from established vendors, not indie tools. The counter-play is open-source credibility and developer-first UX that enterprise tools struggle to match.

Verdict

The AI agent security gap is real and growing faster than the solutions. ToolFence targets the specific, underserved layer between "scan your prompts" (which existing tools cover) and "enterprise security platform" (which costs six figures). The timing is strong: Microsoft's CVE, the VentureBeat investigation, and Snyk's ToxicSkills research all landed in the past 90 days. A solo operator can ship the core middleware in a week and start building trust. The long-term play is becoming the default open-source agent firewall before enterprise vendors absorb the space.

Bottom Line

The timing is exceptional, given Microsoft's Semantic Kernel RCE disclosure, Snyk's ToxicSkills findings, and 88% of enterprises reporting AI agent security incidents. Runtime tool-call enforcement is a genuine gap between prompt-level testing and enterprise platforms. Open-source core plus hosted dashboard is the right GTM for a solo operator. Moat risk is real, but speed-to-community matters more at this stage.