AI Agent Security Best Practices

AI agents are autonomous software with network access, file system access, and often the ability to execute code. That makes security non-optional. This guide covers osModa's layered security model: trust tiers for privilege control, osmoda-egress for network restriction, osmoda-keyd for key isolation, NixOS immutable configuration for system integrity, and the SHA-256 audit ledger for forensic accountability.

Last updated: May 2026

Security layers

Trust tiers — Three levels of agent privilege: Tier 0 (unrestricted), Tier 1 (sandboxed + declared capabilities), Tier 2 (max isolation + network restricted).
osmoda-egress — Domain allowlisting proxy. Tier 2 agents can only connect to explicitly approved domains. All other traffic is blocked and logged.
osmoda-keyd — ETH + SOL wallet management with key isolation and policy-gated signing. Keys never exposed to agent processes.
NixOS immutability — Declarative system config, atomic rollback. If compromised, revert the entire OS to a known-good state instantly.
Audit ledger — SHA-256 hash-chained log of every mutation. Tamper-evident forensic trail for incident response and compliance.

The Trust Model: Tier 0, Tier 1, Tier 2

The fundamental security question for any AI agent is: how much should this agent be allowed to do? osModa answers this with three trust tiers. Each tier defines a boundary of what the agent can access, execute, and connect to.

Tier 0 — Unrestricted

The osModa platform agent itself operates at Tier 0. It has full root SSH access, unrestricted network connectivity, and access to all 83 built-in tools. This tier is for infrastructure automation that you fully trust and control.

Use cases:
  - Deployment automation (SafeSwitch, nixos-rebuild)
  - System maintenance (log rotation, cleanup, backups)
  - Infrastructure orchestration (multi-server coordination)
  - Internal tools that only your team triggers

Access: Full root, all tools, unrestricted network
Risk: Compromise = full system access. Use only for trusted code.

Tier 1 — Sandboxed with Declared Capabilities

Tier 1 agents run in a sandbox. You explicitly declare which tools and network endpoints they can access. The agent cannot exceed its declared capabilities. This is the right tier for most production agents — they need some system access but should not have root.

Use cases:
  - Customer-facing chatbots that need database access
  - Research agents that scrape specific domains
  - Agents running third-party frameworks (LangGraph, CrewAI)
  - Agents that call external APIs with your credentials

Access: Declared tools only, declared network endpoints
Risk: Limited blast radius. Cannot access undeclared resources.

Tier 2 — Maximum Isolation

Tier 2 is for untrusted workloads. The agent is maximally sandboxed: network access is restricted to domains allowlisted via osmoda-egress, filesystem access is minimal, and the agent cannot interact with system services or other agents. Even if the agent is fully compromised, the damage is contained.

Use cases:
  - Agents processing untrusted user input
  - Agents executing user-provided code or scripts
  - Third-party agent plugins you do not fully trust
  - Agents in regulated environments (HIPAA, SOC 2)

Access: Allowlisted domains only, minimal filesystem, no system services
Risk: Minimal blast radius. Compromise contained to sandbox.
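
To make the tier boundaries concrete, here is a minimal sketch of how a tier-based tool check could be modeled. It is an illustration only, not osModa's implementation: the names TrustTier, AgentProfile, may_use_tool, and the SYSTEM_SERVICE_TOOLS set are all hypothetical, and the rule that Tier 2 agents are denied system-service tools is an assumption based on the tier descriptions above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import FrozenSet


class TrustTier(IntEnum):
    UNRESTRICTED = 0  # Tier 0: full root, all tools, unrestricted network
    SANDBOXED = 1     # Tier 1: declared tools and endpoints only
    ISOLATED = 2      # Tier 2: maximum isolation, no system services


# Hypothetical tool names standing in for "system services".
SYSTEM_SERVICE_TOOLS: FrozenSet[str] = frozenset({"service_restart", "shell_exec"})


@dataclass(frozen=True)
class AgentProfile:
    name: str
    tier: TrustTier
    declared_tools: FrozenSet[str] = frozenset()


def may_use_tool(agent: AgentProfile, tool: str) -> bool:
    """Return True if the agent's tier and declarations permit the tool."""
    if agent.tier == TrustTier.UNRESTRICTED:
        return True  # Tier 0 is trusted with every tool
    if agent.tier == TrustTier.ISOLATED and tool in SYSTEM_SERVICE_TOOLS:
        return False  # assumed: Tier 2 never touches system services
    return tool in agent.declared_tools  # default deny for anything undeclared


deployer = AgentProfile("deployer", TrustTier.UNRESTRICTED)
chatbot = AgentProfile("chatbot", TrustTier.SANDBOXED, frozenset({"db_query"}))
plugin = AgentProfile("plugin", TrustTier.ISOLATED, frozenset({"shell_exec", "summarize"}))
```

The design point is default deny: anything a Tier 1 or Tier 2 agent has not declared is refused, and the Tier 2 check runs before the declaration lookup so a misdeclared capability cannot widen the sandbox.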

osmoda-egress: Domain Allowlisting Proxy

The most dangerous thing an AI agent can do is send your data somewhere it should not go. Prompt injection attacks can trick agents into exfiltrating secrets, customer data, or proprietary information to attacker-controlled servers. osmoda-egress prevents this by acting as a domain allowlisting proxy.

When osmoda-egress is active (automatically enabled for Tier 2 agents, optional for Tier 1), all outbound network traffic from the agent is routed through the proxy. Only traffic to explicitly allowlisted domains passes through. Everything else is blocked and logged in the audit ledger.

Example: allowlisting for a customer support agent

# osmoda-egress allowlist configuration
# Only these domains are reachable by the agent

Allowed domains:
  - api.openai.com          # LLM API calls
  - api.anthropic.com       # LLM API calls
  - your-database.rds.amazonaws.com  # Database
  - api.stripe.com          # Payment processing
  - hooks.slack.com         # Alert notifications

Blocked (examples of what gets caught):
  - pastebin.com            # Blocked: data exfiltration vector
  - attacker-server.com     # Blocked: prompt injection target
  - *.ngrok.io              # Blocked: tunnel services

# All blocked attempts are logged in the audit ledger
# with full request details for forensic analysis.

osmoda-egress is the single most effective defense against data exfiltration. Even if an agent is fully compromised via prompt injection, it cannot send data anywhere that is not on your allowlist. The blocked traffic logs in the audit ledger alert you that something tried to reach a forbidden destination.
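
The allowlisting decision described above can be sketched as a default-deny hostname check. This is a simplified illustration, not the osmoda-egress implementation: the function names, the exact-match rule, and the in-memory blocked_log list (standing in for the audit ledger) are assumptions made for the example.

```python
from typing import List, Set

# Example allowlist, taken from the customer support configuration above.
ALLOWLIST: Set[str] = {
    "api.openai.com",
    "api.anthropic.com",
    "api.stripe.com",
    "hooks.slack.com",
}


def normalize(host: str) -> str:
    """Lowercase the hostname and drop whitespace and any trailing dot."""
    return host.strip().rstrip(".").lower()


def check_egress(host: str, allowlist: Set[str], blocked_log: List[str]) -> bool:
    """Allow only exact allowlist matches; record everything else as blocked."""
    name = normalize(host)
    if name in allowlist:
        return True
    blocked_log.append(name)  # stand-in for writing to the audit ledger
    return False


log: List[str] = []
check_egress("api.stripe.com", ALLOWLIST, log)   # allowed
check_egress("pastebin.com", ALLOWLIST, log)     # blocked and recorded
```

Exact matching is used here deliberately. A naive prefix or substring test would accept a lookalike such as api.openai.com.attacker-server.com, which the exact comparison rejects.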

osmoda-keyd: Key Isolation and Policy-Gated Signing

If your agents interact with blockchain networks (ETH, SOL), osmoda-keyd manages wallet keys with strict isolation. Private keys are stored in osmoda-keyd's isolated process memory — they are never exposed to agent processes and never written to the agent's filesystem.

Agents request signing operations through osmoda-keyd, which enforces policy-gated controls. You define what types of transactions an agent is allowed to sign, and osmoda-keyd refuses any transaction that violates the policy.

osmoda-keyd security model:

Key isolation:
  - Private keys stored in osmoda-keyd process memory only
  - Never exposed to agent processes
  - Never written to agent-accessible filesystem
  - Keys encrypted at rest

Policy-gated signing:
  - Define allowed transaction types per agent
  - Maximum transaction value limits
  - Allowed recipient addresses (allowlist)
  - Rate limiting (max transactions per hour)

Every signing request and its result (approved/denied)
is recorded in the audit ledger.

Even if a DeFi agent is compromised, osmoda-keyd prevents unauthorized transactions. The agent cannot extract the private key, cannot sign transactions that violate the policy, and every attempt (successful or denied) is logged for forensic review.
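
As a rough illustration of policy-gated signing, the sketch below checks a signing request against the three kinds of limits listed above: a recipient allowlist, a maximum value, and an hourly rate limit. The SigningPolicy class, its field names, and the denial messages are hypothetical and are not the osmoda-keyd API; the key itself never appears because the policy check runs before any signing would take place.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class SigningPolicy:
    max_value: int                  # largest allowed amount, in the chain's smallest unit
    allowed_recipients: FrozenSet[str]
    max_per_hour: int
    _approved_at: List[float] = field(default_factory=list, repr=False)

    def authorize(self, recipient: str, value: int, now: float) -> Tuple[bool, str]:
        """Return (approved, reason) for one signing request at time `now` (seconds)."""
        # Keep only approvals from the last hour for the rate limit.
        self._approved_at = [t for t in self._approved_at if now - t < 3600]
        if recipient not in self.allowed_recipients:
            return False, "recipient not allowlisted"
        if value > self.max_value:
            return False, "value exceeds limit"
        if len(self._approved_at) >= self.max_per_hour:
            return False, "rate limit exceeded"
        self._approved_at.append(now)
        return True, "approved"


policy = SigningPolicy(
    max_value=100,
    allowed_recipients=frozenset({"0xabc"}),  # placeholder address
    max_per_hour=2,
)
decision = policy.authorize("0xabc", 50, now=0.0)
```

In a real deployment both outcomes would be written to the audit ledger. Returning the reason alongside the decision makes denied attempts straightforward to record.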

NixOS: Immutable Configuration and Atomic Rollback

Traditional Linux distributions (Ubuntu, Debian) allow mutation at any time — install a package, change a config, apply a patch. Over time, the system drifts from its intended state. On NixOS, the entire system configuration is declared in a single file and applied atomically.

Security implications:

No configuration drift

The system state matches the declared configuration exactly. There are no mystery packages, no forgotten config changes, no security patches that were applied to one server but not another. What you declare is what you get.

Instant rollback if compromised

Every configuration change creates a new generation. If a compromise is detected, you can roll back using SafeSwitch: nixos-rebuild switch --rollback, which activates the previous generation. This reverts the entire system (packages, services, configurations), not just a single application. If the previous generation is also suspect, identify an earlier known-good generation before switching.

Reproducible security posture

If you need to rebuild a server from scratch (after a serious compromise), the NixOS configuration file reproduces the exact same system state. There are no manual steps to forget, no packages to reinstall by hand, no configurations to recreate from memory.

Incident response with NixOS:

# 1. Detect: audit ledger shows unauthorized change
# 2. Contain: identify the compromised generation
nixos-rebuild list-generations

# 3. Roll back (--rollback switches to the previous generation)
nixos-rebuild switch --rollback

# 4. Verify: system is back to known-good state
nixos-rebuild list-generations

# 5. Investigate: audit ledger shows exactly what changed
# Every config change, package install, service modification
# is recorded with timestamps and hash chain

Audit Ledger: Forensic Trail of Every Mutation

The SHA-256 hash-chained audit ledger is the last line of defense. Even if every other security measure is bypassed, the audit ledger records what happened. Each entry is cryptographically chained to the previous one, so modifying or deleting an entry breaks the chain and is detectable the next time the chain is verified.
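
The chaining idea can be shown in a few lines. The following is a generic hash-chain sketch, assuming JSON-serializable events and an all-zero genesis hash. The entry layout and function names are illustrative and do not describe the actual osModa ledger format.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed placeholder hash for the first entry's predecessor


def entry_hash(prev_hash: str, event: Dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(ledger: List[Dict], event: Dict) -> None:
    """Add an event, linking it to the hash of the current last entry."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify(ledger: List[Dict]) -> int:
    """Return the index of the first broken entry, or -1 if the chain is intact."""
    prev = GENESIS
    for i, entry in enumerate(ledger):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return i
        prev = entry["hash"]
    return -1


ledger: List[Dict] = []
append(ledger, {"type": "ssh_login", "result": "success"})
append(ledger, {"type": "tool_invocation", "tool": "file_write"})
append(ledger, {"type": "egress_blocked", "host": "pastebin.com"})
```

Editing or removing an entry in the middle changes the hash that every later entry depends on, so verify reports the first index where the chain no longer matches. A plain chain like this one cannot by itself reveal that trailing entries were cut off; detecting that requires keeping the latest hash somewhere outside the ledger.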

For security-relevant events, the audit ledger captures:

Authentication events:
  - SSH login (success and failure, source IP, key fingerprint)
  - SSH key additions and revocations
  - Dashboard login sessions

Agent actions:
  - Every tool invocation (file read/write, shell exec, API call)
  - Network connections (outbound and blocked by osmoda-egress)
  - Signing requests to osmoda-keyd (approved and denied)

System changes:
  - NixOS configuration changes (nixos-rebuild)
  - Package installations and removals
  - Service start/stop/crash/recovery
  - Trust tier modifications

Data operations:
  - Vector memory writes and searches
  - File creations, modifications, and deletions
  - MCP server registrations and tool connections

When investigating a security incident, the audit ledger tells you the complete story: what was the attack vector, what did the attacker access, what data was touched, and what changes were made — all with cryptographic proof of integrity. This is the forensic evidence that compliance auditors and incident responders need. See the Monitoring guide for how to access and query the ledger in practice.

Production Security Checklist

Before putting an agent into production, verify these security configurations:

[ ] Assign appropriate trust tier (0, 1, or 2) for each agent
[ ] Configure osmoda-egress domain allowlist for Tier 1/2 agents
[ ] Store secrets in secure env files with chmod 600
[ ] Verify osmoda-watch is supervising all agent processes
[ ] Configure osmoda-keyd policies if agents handle crypto wallets
[ ] Set up alerting via Telegram/Discord/Slack for crash events
[ ] Review audit ledger for unexpected events after first 24 hours
[ ] Verify NixOS configuration is committed and rollback-ready
[ ] Test rollback: switch to previous generation and back
[ ] Restrict SSH keys to only team members who need access
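
The secrets item in the checklist can be verified programmatically. The snippet below is a small sketch using only the Python standard library; the function name is hypothetical and the check assumes a POSIX filesystem.

```python
import os
import stat


def is_owner_only(path: str) -> bool:
    """True if the file mode is exactly 0o600 (owner read/write, nothing else)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == 0o600
```

Running a check like this against each env file before deployment catches a file that was left group-readable or world-readable.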

Frequently Asked Questions

Can an AI agent be hacked?

Any software can be compromised. AI agents have additional attack surfaces: prompt injection can cause agents to execute unintended actions, and agents with unrestricted network access can exfiltrate data. osModa mitigates these risks with trust tiers (limit what agents can do), osmoda-egress (limit where agents can connect), and the audit ledger (detect what happened if something goes wrong).

What is the difference between Tier 1 and Tier 2 isolation?

Tier 1 runs in a sandbox with declared capabilities — the agent can access specific tools and endpoints you configure. Tier 2 adds maximum isolation: network access is restricted to domains explicitly allowlisted via osmoda-egress, filesystem access is minimal, and the agent cannot interact with system services. Tier 2 is for agents handling completely untrusted input.

Does osmoda-egress block all outbound traffic by default?

Yes, for Tier 2 agents. osmoda-egress is a domain allowlisting proxy that is automatically enabled for Tier 2 agents and optional for Tier 1. When it is active, outbound network access is restricted to the domains you explicitly allowlist; traffic to any other domain is blocked and logged. This prevents data exfiltration to non-allowlisted destinations even if an agent is compromised via prompt injection.

How does osmoda-keyd protect wallet keys?

osmoda-keyd manages ETH and SOL wallets with key isolation. Private keys are stored separately from agent processes and are never exposed directly. Signing operations go through osmoda-keyd with policy-gated controls — you define what transactions an agent is allowed to sign, and osmoda-keyd enforces those policies.

Can I use osModa for HIPAA or SOC 2 compliance?

osModa provides infrastructure features that map to specific compliance controls: the SHA-256 hash-chained audit ledger for tamper-evident logging (SOC 2 CC7.2, HIPAA 164.312(b)), trust tiers for access control (SOC 2 CC6.1, HIPAA 164.312(a)), and NixOS atomic rollback for change management (SOC 2 CC8.1). See our compliance guides for detailed control mapping.

What happens if NixOS configuration is tampered with?

NixOS configuration is declarative and versioned. Every change creates a new generation, and the previous generation remains intact. If configuration is tampered with, you can immediately roll back to a known-good generation via SafeSwitch. The audit ledger records every configuration change, making unauthorized modifications detectable.

Security Built Into Every Server

osModa servers ship with trust tiers, osmoda-egress, osmoda-keyd, and the SHA-256 audit ledger pre-configured. NixOS immutable config and atomic rollback are the default. From $29/month.
