SAFE MODE

Local guardrails for AI agents.

Safe Mode is a local MCP proxy that sits between your AI editor and your tools. It provides 15 detection engines, automatic rollback, and phone approvals, with no cloud required.

npx safemode init
Open source. No account required. Works offline.

HOW IT WORKS

One command. Every MCP call governed.

AI Editor -> Safe Mode Proxy -> 15 Engines -> MCP Servers

Safe Mode detects your MCP clients, patches their configs to route through the proxy, and runs every tool call through the detection engine pipeline. Original configs are backed up to ~/.safemode/backup/.
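As a rough illustration of the patching step, the sketch below rewrites a Claude Desktop-style `mcpServers` config so that each server launches through a wrapper command, after copying the original file to a backup directory. The `example-proxy` command, the function names, and the backup file naming are hypothetical placeholders chosen for this example; they are not Safe Mode's actual invocation or internals.

```python
import copy
import json
import shutil
from pathlib import Path

# Hypothetical wrapper command; the real proxy invocation is not documented here.
PROXY_PREFIX = ["example-proxy", "--"]


def patch_servers(config: dict) -> dict:
    """Return a copy of config whose command-based MCP servers launch via the wrapper."""
    patched = copy.deepcopy(config)
    for server in patched.get("mcpServers", {}).values():
        if "command" not in server:
            continue  # e.g. URL-based servers; left untouched in this sketch
        if server["command"] == PROXY_PREFIX[0]:
            continue  # already wrapped, so patching twice changes nothing
        original = [server["command"], *server.get("args", [])]
        server["command"] = PROXY_PREFIX[0]
        server["args"] = [*PROXY_PREFIX[1:], *original]
    return patched


def patch_file(config_path: Path, backup_dir: Path) -> None:
    """Copy config_path into backup_dir, then rewrite it in place."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(config_path, backup_dir / config_path.name)
    config = json.loads(config_path.read_text())
    config_path.write_text(json.dumps(patch_servers(config), indent=2))
```

Keeping `patch_servers` a pure function makes the rewrite easy to test without touching real client configs, and the backup copy is what makes the change reversible.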

SUPPORTED CLIENTS

Claude Desktop
Claude Code
Cursor
VS Code
Windsurf

Plus any MCP-compatible client.

FEATURES

15 Detection Engines

Counters, content scanners, ML security models, and rule engines. Under 25ms total latency.

5 Presets, 100+ Knobs

Start with coding, personal, strict, trading, or yolo. Tune every parameter individually.

Time Machine

Automatic file snapshots before every write. Instant rollback to any restore point.
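The snapshot-then-write idea can be sketched in a few lines. This is a minimal illustration under assumed names (`SnapshotStore`, `write`, `restore`) and an assumed on-disk layout; it is not how Safe Mode stores its restore points.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


class SnapshotStore:
    """Keeps a copy of each file as it was before a write (illustrative only)."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        # Each restore point is (target file, saved copy or None if the file was new).
        self.points: list[tuple[Path, Optional[Path]]] = []

    def write(self, target: Path, content: str) -> int:
        """Snapshot target, write content, and return the restore point id."""
        saved = None
        if target.exists():
            saved = self.root / f"{len(self.points)}-{target.name}"
            shutil.copy2(target, saved)
        self.points.append((target, saved))
        target.write_text(content)
        return len(self.points) - 1

    def restore(self, point_id: int) -> None:
        """Undo every write made at or after point_id, newest first."""
        while len(self.points) > point_id:
            target, saved = self.points.pop()
            if saved is None:
                target.unlink(missing_ok=True)  # the file did not exist before the write
            else:
                shutil.copy2(saved, target)


with tempfile.TemporaryDirectory() as tmp:
    store = SnapshotStore(Path(tmp) / "snapshots")
    notes = Path(tmp) / "notes.txt"
    notes.write_text("draft")
    point = store.write(notes, "overwritten by agent")
    store.restore(point)
    print(notes.read_text())  # prints "draft"
```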

Phone Approvals

Route high-risk actions to Telegram or Discord. Approve or deny from your phone.

Fully Local

No cloud required. No telemetry. No data leaves your machine unless you opt in.

Cloud Upgrade Path

Connect to TrustScope for +12 cloud engines, Agent DNA, compliance evidence, and team dashboards.

PRESETS

Pick a posture. Tune from there.

coding

Default

Balanced for software development. Scoped writes, terminal guards, budget caps.

Budget cap: $20/session

personal

For Claude Desktop personal use. No terminal, no git, filesystem scoped.

Budget cap: $10/session

strict

Maximum restrictions. Everything requires approval. Read-only filesystem.

Budget cap: $5/session

trading

Circuit breakers and hard caps for financial operations.

Budget cap: $50/session

yolo

Maximum autonomy. Catch catastrophes only. For sandboxed environments.

Budget cap: $100/session

Switch anytime with safemode preset <name>. Every knob is individually overridable.
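To make the per-session caps concrete, here is a small sketch of a budget check that uses the preset figures listed above. The `SessionBudget` class and the way a call's cost is estimated are assumptions for illustration; the source does not describe how Safe Mode measures spend.

```python
from dataclasses import dataclass

# Per-session caps in USD, taken from the preset descriptions above.
PRESET_BUDGET_USD = {
    "coding": 20.0,
    "personal": 10.0,
    "strict": 5.0,
    "trading": 50.0,
    "yolo": 100.0,
}


@dataclass
class SessionBudget:
    cap_usd: float
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> bool:
        """Record cost_usd if it fits under the cap; return False to signal a block."""
        if self.spent_usd + cost_usd > self.cap_usd:
            return False
        self.spent_usd += cost_usd
        return True


budget = SessionBudget(PRESET_BUDGET_USD["strict"])
print(budget.charge(3.0))  # True: 3.00 of 5.00 used
print(budget.charge(2.5))  # False: would exceed the cap, nothing is recorded
```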

GRANULAR CONTROL

19 categories. 92 knobs. Three levels.

Every knob is allow, approve, or block. Presets set sensible defaults. Override any knob individually.

Category              Knobs  Controls
Terminal              10     command exec, destructive commands, sudo, package installs, daemons
Filesystem            8      read, write, delete, symlinks, permissions
Git                   6      commit, push, force push, branch delete, rebase
Network               5      HTTP, WebSocket, DNS, domain allowlist/blocklist
Database              5      read, write, delete, schema change, admin
Financial             5      payments, transfers, subscriptions, refunds
API                   5      read, write, delete, admin, rate limit
Communication         5      email, messages, notifications, calendar
Cloud Infrastructure  5      instances, storage, network, IAM
Container             4      create, delete, image pull, volume mount
Package Management    4      install, uninstall, update, publish
Deployment            4      staging, production, rollback, scale
Data                  4      export, import, backup, transform
Scheduling            3      cron, timers, scheduled tasks
Authentication        3      credential read/write, sessions
Monitoring            3      logs, metrics, alerts
Browser               3      navigate, form submit, download
Physical              3      IoT commands, hardware, sensors
Custom                       your own rules

allow

Action proceeds without interruption

approve

Action paused until human approval

block

Action denied completely

Two knobs are hardcoded and cannot be overridden: destructive_commands and pipe_to_shell are always blocked.
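One way to picture the resolution order is: hardcoded blocks first, then individual overrides, then preset defaults. The sketch below assumes that order. Apart from `destructive_commands` and `pipe_to_shell`, the knob names are made up for the example, and the fallback to `approve` for unknown knobs is an assumption.

```python
from enum import Enum


class Level(Enum):
    ALLOW = "allow"
    APPROVE = "approve"
    BLOCK = "block"


# The two knobs the text says are always blocked.
HARDCODED_BLOCK = {"destructive_commands", "pipe_to_shell"}


def resolve(knob: str, preset: dict, overrides: dict) -> Level:
    """Pick the effective level for a knob."""
    if knob in HARDCODED_BLOCK:
        return Level.BLOCK  # cannot be overridden
    if knob in overrides:
        return overrides[knob]
    return preset.get(knob, Level.APPROVE)  # assumed fallback for unknown knobs


# Hypothetical knob names, for illustration only.
preset = {"git_push": Level.APPROVE, "file_read": Level.ALLOW}
overrides = {"git_push": Level.ALLOW, "pipe_to_shell": Level.ALLOW}

print(resolve("git_push", preset, overrides).value)       # allow
print(resolve("pipe_to_shell", preset, overrides).value)  # block
```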

DETECTION ENGINES

15 engines. Under 25ms.

Counter Engines (1-8)

Loop Killer, Oscillation Detector, Velocity Limiter, Action Growth, Cost Exposure, Latency Spike, Error Rate, Throughput Drop

Content Scanners (9-10)

PII Scanner, Secrets Scanner

Security Scanners (11-12)

Prompt Injection, Jailbreak Detector

Rule Engines (13-15)

Command Firewall, Budget Cap, Action Label Mismatch

ML security models (prompt injection & jailbreak) are optional. Enable with safemode ml --enable (~85MB download).
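The following sketch shows the general shape of an engine pipeline: each engine inspects a tool call and returns a verdict, and the most severe verdict wins. The two toy engines are loosely modeled on the Loop Killer and Secrets Scanner ideas. Their thresholds, the AWS-style key pattern, the "most severe wins" rule, and all names are assumptions for illustration, not Safe Mode's actual engines.

```python
import re
from collections import Counter, deque

SEVERITY = {"allow": 0, "approve": 1, "block": 2}


class LoopDetector:
    """Blocks a call seen more than max_repeats times within the last `window` calls."""

    def __init__(self, window: int = 10, max_repeats: int = 3) -> None:
        self.recent = deque(maxlen=window)  # sliding window of call signatures
        self.max_repeats = max_repeats

    def __call__(self, tool: str, payload: str) -> str:
        signature = f"{tool}:{payload}"
        self.recent.append(signature)
        repeats = Counter(self.recent)[signature]
        return "block" if repeats > self.max_repeats else "allow"


# Pattern for AWS-style access key IDs; a real scanner would cover many more formats.
AWS_KEY = re.compile(r"AKIA[0-9A-Z]{16}")


def secrets_scanner(tool: str, payload: str) -> str:
    """Ask for approval when the payload appears to contain an access key."""
    return "approve" if AWS_KEY.search(payload) else "allow"


def run_pipeline(engines, tool: str, payload: str) -> str:
    """Run every engine and return the most severe verdict."""
    verdicts = [engine(tool, payload) for engine in engines]
    return max(verdicts, key=SEVERITY.__getitem__, default="allow")


engines = [LoopDetector(window=5, max_repeats=2), secrets_scanner]
print(run_pipeline(engines, "read_file", "a.txt"))  # allow
print(run_pipeline(engines, "read_file", "a.txt"))  # allow
print(run_pipeline(engines, "read_file", "a.txt"))  # block: third identical call
```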

CLI

$ safemode init          # detect clients, pick preset, patch configs
$ safemode start         # start proxy
$ safemode status        # show engines, clients, preset
$ safemode summary       # session stats + restore points
$ safemode restore       # rollback to snapshot
$ safemode scan          # security scan current directory
$ safemode doctor        # health check + diagnostics
$ safemode activity -f   # live activity feed

Local first. Cloud when you need it.

Safe Mode works entirely offline. When you're ready for team visibility, compliance evidence, and advanced detection, connect to TrustScope with safemode connect.

+12 cloud engines

PII cloud, anomaly detection, Agent DNA, behavioral fingerprinting

Team dashboard

Centralized visibility across all agents and developers

Compliance evidence

Signed receipts, hash chains, and framework-mapped exports
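For readers unfamiliar with hash chains, the sketch below shows the generic technique: each receipt stores the hash of the previous one, so editing any earlier entry invalidates everything after it. This is a textbook illustration with made-up field names; signing is omitted, and nothing here reflects TrustScope's actual receipt format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev + body).encode()).hexdigest()


def append_receipt(chain: list, event: dict) -> None:
    """Append event to chain, linking it to the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev, "hash": _digest(prev, event)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or _digest(prev, entry["event"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


chain = []
append_receipt(chain, {"tool": "write_file", "verdict": "allow"})
append_receipt(chain, {"tool": "git_push", "verdict": "approve"})
print(verify(chain))  # True
chain[0]["event"]["verdict"] = "block"  # tamper with history
print(verify(chain))  # False
```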

Try it in sixty seconds.

One command. No account. No cloud. Just guardrails.

npx safemode init