SAFE MODE

Local guardrails for AI agents.

Safe Mode is a local MCP proxy that sits between your AI editor and your tools. 15 detection engines, automatic rollback, and phone approvals — no cloud required.

npx safemode init


Open source. No account required. Works offline.

HOW IT WORKS

One command. Every MCP call governed.

AI Editor → Safe Mode Proxy → 15 Engines → MCP Servers

Safe Mode detects your MCP clients, patches their configs to route through the proxy, and runs every tool call through the detection engine pipeline. Original configs are backed up to ~/.safemode/backup/.
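The patch step described above can be sketched roughly as follows. This is a minimal illustration, not Safe Mode's actual implementation: it assumes a client config that uses the common `mcpServers` layout with `command` and `args` per server, and the wrapper name `safemode-proxy` and its `--server` flag are hypothetical placeholders.

```python
import copy
import json
import shutil
from pathlib import Path

# Hypothetical wrapper command; the real proxy invocation may differ.
PROXY_COMMAND = "safemode-proxy"


def patch_servers(config: dict) -> dict:
    """Return a copy of an MCP client config with each server routed through the proxy."""
    patched = copy.deepcopy(config)
    for name, server in patched.get("mcpServers", {}).items():
        if "command" not in server:
            continue  # e.g. URL-based servers; left untouched in this sketch
        if server["command"] == PROXY_COMMAND:
            continue  # already patched, so patching twice changes nothing
        original = [server["command"], *server.get("args", [])]
        server["command"] = PROXY_COMMAND
        # Hypothetical argument layout: proxy flags, then the original command.
        server["args"] = ["--server", name, "--", *original]
    return patched


def patch_file(config_path: Path, backup_dir: Path) -> None:
    """Back up the original config file, then write the patched version in place."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(config_path, backup_dir / config_path.name)
    config = json.loads(config_path.read_text())
    config_path.write_text(json.dumps(patch_servers(config), indent=2))
```

Keeping the original command after a `--` separator means the proxy can launch the real server unchanged, and restoring the backup file undoes the patch completely.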

SUPPORTED CLIENTS

Claude Desktop

Claude Code

Cursor

VS Code

Windsurf

Plus any MCP-compatible client.

FEATURES

15 Detection Engines

Counters, content scanners, ML security models, and rule engines. Under 25ms total latency.

5 Presets, 100+ Knobs

Start with coding, personal, strict, trading, or yolo. Tune every parameter individually.

Time Machine

Automatic file snapshots before every write. Instant rollback to any restore point.

Phone Approvals

Route high-risk actions to Telegram or Discord. Approve or deny from your phone.

Fully Local

No cloud required. No telemetry. No data leaves your machine unless you opt in.

Cloud Upgrade Path

Connect to TrustScope for +12 cloud engines, Agent DNA, compliance evidence, and team dashboards.
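To make the Time Machine idea above concrete, here is a minimal sketch of snapshot-before-write with rollback. It is an illustration under assumptions, not Safe Mode's storage format: the `SnapshotStore` class, its method names, and the snapshot file naming are all hypothetical.

```python
import shutil
from pathlib import Path


class SnapshotStore:
    """Copies a file aside before each write so it can be restored later."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        # Each restore point is (target path, snapshot path or None if the file was new).
        self.points = []

    def guarded_write(self, target: Path, content: str) -> int:
        """Snapshot the target, write the new content, and return the restore point id."""
        snapshot = None
        if target.exists():
            snapshot = self.root / f"{len(self.points)}-{target.name}"
            shutil.copy2(target, snapshot)
        self.points.append((target, snapshot))
        target.write_text(content)
        return len(self.points) - 1

    def restore(self, point_id: int) -> None:
        """Put the target back to its state just before the given write."""
        target, snapshot = self.points[point_id]
        if snapshot is None:
            target.unlink(missing_ok=True)  # the file did not exist before that write
        else:
            shutil.copy2(snapshot, target)
```

Because every write records its own restore point, rolling back to an earlier id also discards the effect of later writes to the same file.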

PRESETS

Pick a posture. Tune from there.

coding

Default

Balanced for software development. Scoped writes, terminal guards, budget caps.

Budget cap: $20/session

personal

For Claude Desktop personal use. No terminal, no git, filesystem scoped.

Budget cap: $10/session

strict

Maximum restrictions. Everything requires approval. Read-only filesystem.

Budget cap: $5/session

trading

Circuit breakers and hard caps for financial operations.

Budget cap: $50/session

yolo

Maximum autonomy. Catch catastrophes only. For sandboxed environments.

Budget cap: $100/session

Switch anytime with safemode preset <name>. Every knob is individually overridable.
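As a rough illustration of how a per-session budget cap could behave, the sketch below uses the cap values listed for each preset above. How Safe Mode actually measures cost is not stated here, so the `SessionBudget` class and its `charge` method are assumptions made for the example.

```python
# Per-session caps in USD, taken from the preset descriptions above.
PRESET_BUDGET_USD = {
    "coding": 20.0,
    "personal": 10.0,
    "strict": 5.0,
    "trading": 50.0,
    "yolo": 100.0,
}


class SessionBudget:
    """Tracks spend for one session against a preset cap (hypothetical helper)."""

    def __init__(self, preset: str, override_usd=None) -> None:
        if preset not in PRESET_BUDGET_USD:
            raise ValueError(f"unknown preset: {preset}")
        # An explicit override replaces the preset default, mirroring knob overrides.
        self.cap = PRESET_BUDGET_USD[preset] if override_usd is None else override_usd
        self.spent = 0.0

    def charge(self, cost_usd: float) -> bool:
        """Record the cost and return True if it fits under the cap.

        If the call would exceed the cap, spend is left unchanged and False is returned.
        """
        if self.spent + cost_usd > self.cap:
            return False
        self.spent += cost_usd
        return True
```

A caller would check the return value before forwarding a tool call and treat `False` as a block.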

GRANULAR CONTROL

19 categories. 92 knobs. Three levels.

Every knob is set to allow, approve, or block. Presets set sensible defaults. Override any knob individually, apart from the two hardcoded knobs noted below.

Category	Knobs	Controls
Terminal	10	command exec, destructive commands, sudo, package installs, daemons
Filesystem	8	read, write, delete, symlinks, permissions
Git	6	commit, push, force push, branch delete, rebase
Network	5	HTTP, WebSocket, DNS, domain allowlist/blocklist
Database	5	read, write, delete, schema change, admin
Financial	5	payments, transfers, subscriptions, refunds
API	5	read, write, delete, admin, rate limit
Communication	5	email, messages, notifications, calendar
Cloud Infrastructure	5	instances, storage, network, IAM
Container	4	create, delete, image pull, volume mount
Package Management	4	install, uninstall, update, publish
Deployment	4	staging, production, rollback, scale
Data	4	export, import, backup, transform
Scheduling	3	cron, timers, scheduled tasks
Authentication	3	credential read/write, sessions
Monitoring	3	logs, metrics, alerts
Browser	3	navigate, form submit, download
Physical	3	IoT commands, hardware, sensors
Custom	—	your own rules

allow

Action proceeds without interruption

approve

Action paused until human approval

block

Action denied completely

Two knobs are hardcoded and cannot be overridden: destructive_commands and pipe_to_shell are always blocked.
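The resolution order described in this section (hardcoded blocks first, then individual overrides, then preset defaults) can be sketched as below. Only `destructive_commands` and `pipe_to_shell` come from the text; the `resolve` function, the other knob names in the usage example, and the fallback for unknown knobs are assumptions for illustration.

```python
from enum import Enum


class Level(Enum):
    ALLOW = "allow"      # action proceeds without interruption
    APPROVE = "approve"  # action paused until human approval
    BLOCK = "block"      # action denied completely


# The two knobs the text says are always blocked.
HARDCODED_BLOCKS = frozenset({"destructive_commands", "pipe_to_shell"})


def resolve(knob: str, preset_defaults: dict, overrides: dict) -> Level:
    """Return the effective level for a knob."""
    if knob in HARDCODED_BLOCKS:
        return Level.BLOCK  # overrides are ignored for hardcoded knobs
    if knob in overrides:
        return overrides[knob]
    # Assumption: a knob with no preset default falls back to requiring approval.
    return preset_defaults.get(knob, Level.APPROVE)
```

For example, with hypothetical knob names:

```python
defaults = {"git_push": Level.APPROVE, "file_read": Level.ALLOW}
overrides = {"git_push": Level.ALLOW, "pipe_to_shell": Level.ALLOW}

resolve("git_push", defaults, overrides)       # Level.ALLOW (override wins)
resolve("pipe_to_shell", defaults, overrides)  # Level.BLOCK (hardcoded)
```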

DETECTION ENGINES

15 engines. Under 25ms.

Counter Engines (1-8)

Loop Killer, Oscillation Detector, Velocity Limiter, Action Growth, Cost Exposure, Latency Spike, Error Rate, Throughput Drop

Content Scanners (9-10)

PII Scanner, Secrets Scanner

Security Scanners (11-12)

Prompt Injection, Jailbreak Detector

Rule Engines (13-15)

Command Firewall, Budget Cap, Action Label Mismatch

ML security models (prompt injection & jailbreak) are optional. Enable with safemode ml --enable (~85MB download).
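The page does not describe how the counter engines work internally. As one plausible reading of what a loop-detecting counter might do, the sketch below flags a tool call once the same tool has been invoked with identical arguments several times in a row. The `LoopDetector` class and its `max_repeats` threshold are hypothetical.

```python
import json
from collections import deque


class LoopDetector:
    """Flags a tool call repeated with identical arguments N times in a row."""

    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        # Only the last max_repeats calls are kept; older ones fall off the left.
        self.recent = deque(maxlen=max_repeats)

    def check(self, tool: str, arguments: dict) -> bool:
        """Record the call and return True if it completes a run of identical calls."""
        # Sorting keys makes argument order irrelevant when comparing calls.
        key = (tool, json.dumps(arguments, sort_keys=True))
        self.recent.append(key)
        return len(self.recent) == self.max_repeats and len(set(self.recent)) == 1
```

A counter like this needs only a small fixed-size buffer per session, which is consistent with engines of this kind adding very little latency.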

CLI

$ safemode init          # detect clients, pick preset, patch configs
$ safemode start         # start proxy
$ safemode status        # show engines, clients, preset
$ safemode summary       # session stats + restore points
$ safemode restore       # rollback to snapshot
$ safemode scan          # security scan current directory
$ safemode doctor        # health check + diagnostics
$ safemode activity -f   # live activity feed

Local first. Cloud when you need it.

Safe Mode works entirely offline. When you're ready for team visibility, compliance evidence, and advanced detection, connect to TrustScope with safemode connect.

+12 cloud engines

PII cloud, anomaly detection, Agent DNA, behavioral fingerprinting

Team dashboard

Centralized visibility across all agents and developers

Compliance evidence

Signed receipts, hash chains, and framework-mapped exports

Try it in sixty seconds.

One command. No account. No cloud. Just guardrails.

npx safemode init
