Clawdbot
AI-Administered Server Infrastructure
What This Is
This document describes a production infrastructure where AI agents are not just tools — they are operators. Two AI agents collaboratively manage a three-machine environment: an office server, a GPU compute node, and a web hosting platform. Together, they handle system administration, user communication, email, calendar management, security monitoring, and more.
Production since early 2026. This is not a demo or proof-of-concept — it serves real users, hosts public websites, runs research workloads, and manages itself with minimal human intervention.
What makes this different
Most AI-in-operations setups use AI as an assistant: a human asks a question, the AI answers. This system inverts that model:
- Dakota, the user-facing agent, runs 24/7 on Telegram. She reads emails, manages calendars, performs research, writes code, and sends voice messages, all autonomously.
- Claude Code, the system administrator, deploys containers, diagnoses issues, applies patches, and maintains documentation, triggered by events rather than prompts.
- A request broker enables asynchronous collaboration with human approval for consequential actions.
The human operator remains the decision-maker for anything consequential. But the day-to-day operational burden has shifted dramatically.
Architecture Overview
The infrastructure spans three machines with distinct roles, connected via SSH and managed by AI agents.
High-level: The Three Machines
The office server's AI agent stack (~20 Docker containers) is the hub: users reach it through the Telegram Bot API, the Gmail API, and a Cloudflare Tunnel, and it manages the other two machines over SSH. The Azure Web Server runs LibreChat, the Skill Gateway, and 6 websites across 41 Docker containers; the Azure A100 VM runs vLLM on an NVIDIA A100 80GB for model inference and research projects. Users also reach the web server directly over HTTPS.
Detail: Inside the Office Server
On the office server, Claude Code acts as the system administrator alongside the AI agent stack (OpenClaw): the OpenClaw gateway, the Dakota agent (Claude Opus 4.5), a sandbox container, LLM Guard, and the Graphiti knowledge graph. Supporting services are the Request Broker, Gmail Notify, and the Calendar Aggregator; operations are covered by the Wazuh SIEM and the maintenance timers. Telegram connects to the OpenClaw gateway via the Bot API, email flows through Gmail Notify and the Request Broker, the gateway calls the Anthropic API and routes content through LLM Guard and Graphiti, Claude Code reaches the gateway via the bridge CLI, and Dakota executes tools inside the sandbox container.
The Three Machines
| Machine | Role | Hardware | AI Agent |
|---|---|---|---|
| clawdbot (office) | AI agent hub, service gateway | Intel i7 NUC, 64 GB RAM, 1.8 TB NVMe | Dakota + Claude Code |
| Azure Web Server | Public web hosting, SaaS | Xeon 16 vCPU, 32 GB RAM | — (managed remotely) |
| Azure A100 VM | GPU inference & research | AMD EPYC 24 vCPU, 216 GB RAM, A100 80GB | — (managed remotely) |
The Physical Setup
The heart of the system is an ASUS NUC 13 mini PC — a compact, silent machine in an office. Despite its small footprint, it runs the entire AI agent stack (~20 Docker containers, two AI agents, security monitoring) and acts as the control plane for two Azure VMs.
The NUC (Intel i7-1360P, 64 GB RAM, 1.8 TB NVMe) is administered over SSH from a Windows 11 workstation and in turn manages both Azure VMs over SSH: the Web Server (16 vCPU, 32 GB RAM, 41 Docker containers) and the A100 VM (24 vCPU, 216 GB RAM, 80 GB GPU VRAM). It connects to the external services the agents depend on: the Anthropic API (Claude Opus 4.5), the Telegram Bot API, Google APIs (Gmail, Calendar), the Microsoft Graph API (Calendar, Email), ElevenLabs (text-to-speech), and Groq (speech-to-text).
Why a NUC and not the cloud?
Cost and control. The office server runs 24/7 on a home internet connection. Cloud VMs are used only where their specific capabilities are needed: GPU for AI inference, and a public IP with high bandwidth for web hosting. The NUC handles everything else at a fraction of the cost.
Containerization Philosophy
Every service runs in Docker, with one exception (the Kodi media center). Containers are configured with the following hardening; a configuration sketch follows the list:
- Read-only root filesystems where possible
- All capabilities dropped (cap_drop: ALL), with only the necessary ones added back
- Non-root execution inside containers
- Memory, CPU, and PID limits to prevent resource exhaustion
- Isolated Docker networks — services can only communicate where explicitly configured
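The production compose files are not reproduced here, but as a minimal sketch the same hardening can be expressed with the Docker SDK for Python (docker-py). The image name, network, and limit values are illustrative assumptions, not the actual clawdbot configuration.

```python
# Sketch: container hardening equivalent to the settings listed above,
# expressed via docker-py. Values are illustrative, not production config.
import docker

client = docker.from_env()

container = client.containers.run(
    "example/service:latest",        # hypothetical image
    detach=True,
    read_only=True,                  # read-only root filesystem
    tmpfs={"/tmp": "size=64m"},      # writable scratch space despite the RO root
    cap_drop=["ALL"],                # drop every capability...
    cap_add=["NET_BIND_SERVICE"],    # ...and add back only what the service needs
    user="1000:1000",                # non-root execution inside the container
    mem_limit="512m",                # memory ceiling
    nano_cpus=1_000_000_000,         # 1.0 CPU
    pids_limit=256,                  # cap the number of processes
    network="isolated_net",          # pre-created, isolated Docker network
)
```

Each of these flags has a direct docker-compose equivalent, which is where the real services are defined.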
Dakota: The User-Facing AI Agent
Dakota runs inside the OpenClaw gateway — an open-source AI agent framework (Node.js) that connects LLMs to messaging platforms with sandboxed tool access. OpenClaw handles the full agent lifecycle: message routing, tool execution, session management, and provider integration.
How Dakota Works
Capabilities
| Capability | Implementation | Description |
|---|---|---|
| Text conversation | Claude Opus 4.5 | Natural language interaction |
| Voice messages | Groq Whisper + ElevenLabs | Receives and sends voice |
| Email | Gmail API | Reads and sends autonomously |
| Calendar | MS Graph + Google Calendar | 14 calendars, 3 accounts |
| Web research | Perplexity + Playwright | Search and browser automation |
| Code execution | Sandboxed exec | Python, shell in isolation |
| Memory | Graphiti + FalkorDB | Temporal knowledge graph |
| Sub-agents | OpenClaw sessions | Delegates heavy tasks |
| Image generation | Azure OpenAI | DALL-E / Sora on request |
| Elevated access | Allowlisted CLIs | Claude Code, Codex, Gemini |
The Sandbox Model
Dakota operates in a double-containerized environment:
(Node.js)"] subgraph sandbox ["Sandbox Container"] TOOLS["Sandbox Tools:
exec, read, write, edit,
browser, python3, gh, curl"] FS["Sandboxed Filesystem
/workspace"] end end SHARED["Shared Directory
(host filesystem)"] ELEVATED["Elevated Exec
(host binaries)"] end OC_PROC --> sandbox OC_PROC -->|"allowlisted
commands only"| ELEVATED sandbox <-->|"bind mount"| SHARED
- Sandbox tools (default): Isolated container, no network, dropped capabilities, memory limits
- Elevated tools (explicit): Host-level access only from allowlisted senders with permission checks
Even if Dakota's prompt is compromised, the blast radius is contained to the sandbox. Host-level access requires passing through application-layer permission checks that cannot be bypassed by prompt injection.
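As an illustration of what such a check looks like, here is a minimal Python sketch of an application-layer gate for elevated execution. The real check lives in OpenClaw's Node.js code; the user IDs and function shown here are hypothetical, and the command list mirrors the allowlisted CLIs from the capability table.

```python
# Sketch of a code-level permission gate for elevated exec. The decision is
# based on sender metadata, which prompt content cannot alter.
import subprocess

ELEVATED_ALLOWLIST = {123456789}                    # hypothetical Telegram user IDs
ELEVATED_COMMANDS = {"claude", "codex", "gemini"}   # allowlisted CLIs

def run_elevated(sender_id: int, command: str, args: list[str]) -> str:
    if sender_id not in ELEVATED_ALLOWLIST:
        raise PermissionError("sender is not allowlisted for elevated exec")
    if command not in ELEVATED_COMMANDS:
        raise PermissionError(f"{command!r} is not an allowlisted binary")
    result = subprocess.run([command, *args], capture_output=True, text=True, timeout=300)
    return result.stdout
```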
Claude Code: The System Administrator
Claude Code runs on the host as the system administrator. Full host access: Docker, SSH, configuration files, deployments.
How Claude Code Operates
Not a long-running daemon — invoked on demand:
| Trigger | Mechanism | Use Case |
|---|---|---|
| Interactive SSH | Human runs claude | Major changes, debugging |
| Telegram bot | systemd service | Quick admin from mobile |
| Dakota request | systemd path watcher | Tasks she can't do herself |
| Email from operator | systemd path watcher | Deploy via email |
| Scheduled maintenance | systemd timers | Daily/weekly/monthly checks |
Each invocation is a fresh Claude Code instance with appropriate tool permissions scoped to the task.
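For illustration, an event handler spawned by one of these triggers might look like the Python sketch below. Only the claude -p invocation comes from this document; the function name, timeout, and example prompt are assumptions.

```python
# Sketch: a trigger handler (path watcher, timer, or bot) spawns a one-shot
# Claude Code instance in non-interactive print mode.
import subprocess

def handle_event(prompt: str) -> str:
    # Tool permissions are scoped per invocation by the caller; that wiring
    # is omitted from this sketch.
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True, text=True, timeout=900,
    )
    return result.stdout

# e.g. a scheduled maintenance timer might call:
# handle_event("Run the daily health checks and email the operator only if something is wrong.")
```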
Multi-Agent Collaboration
The two AI agents serve different roles, have different access levels, and communicate through structured protocols.
A typical exchange: Dakota needs a system change she can't make herself, so she writes a request JSON to the shared filesystem. inotify fires the dakota-request.path unit, and systemd spawns claude -p with scoped permissions. Claude Code evaluates the request (risk assessment, feasibility) and emails the operator: "Dakota requests X. Approve?" The operator replies by email in natural language; a Gmail poller detects the reply and Claude Code interprets the intent (approve, deny, or modify). If approved, Claude Code executes the change and tells Dakota over the bridge what happened; if denied, it sends Dakota the reason over the bridge; if it has questions, it sends the operator a follow-up email.
The Bridge Protocol
Claude Code can message Dakota via a bridge CLI. The channel is one-way: CC → Dakota. Every message is HMAC-SHA256 authenticated with a shared secret and a 5-minute replay window (a verification sketch follows the list). This prevents:
- Telegram messages from impersonating bridge commands
- Replay attacks on previously sent instructions
- External parties from injecting messages into the agent pipeline
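A minimal Python sketch of this scheme, assuming a timestamp-plus-HMAC envelope; the actual bridge CLI's message format is not reproduced here.

```python
# Sketch: HMAC-SHA256 authentication with a 5-minute replay window.
import hashlib
import hmac
import time

SECRET = b"shared-secret"        # in practice loaded from a secret store, never hard-coded
REPLAY_WINDOW = 300              # seconds

def sign(message: str) -> dict:
    ts = int(time.time())
    mac = hmac.new(SECRET, f"{ts}:{message}".encode(), hashlib.sha256).hexdigest()
    return {"ts": ts, "msg": message, "mac": mac}

def verify(envelope: dict) -> str:
    if abs(time.time() - envelope["ts"]) > REPLAY_WINDOW:
        raise ValueError("stale message (possible replay)")
    expected = hmac.new(SECRET, f'{envelope["ts"]}:{envelope["msg"]}'.encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["mac"]):
        raise ValueError("bad signature (message did not come from Claude Code)")
    return envelope["msg"]
```

A strict implementation would also track envelope IDs seen inside the window, so an identical message cannot be replayed within the 5 minutes.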
The Request Broker
When Dakota needs host access (an example request payload follows these steps):
- Dakota writes a structured JSON request to the shared directory
- systemd detects via inotify
- Claude Code instance spawned to evaluate
- Claude Code emails operator with summary
- Operator replies in natural language
- Claude Code interprets and acts
- Dakota notified via bridge
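As a sketch of the first step, writing such a request from inside Dakota's sandbox might look like the following. The directory path and JSON fields are hypothetical; the atomic rename just ensures the path watcher never sees a half-written file.

```python
# Sketch: Dakota drops a structured request into the shared directory that
# systemd watches via inotify. Path and field names are illustrative.
import json
import os
import time
import uuid

REQUEST_DIR = "/workspace/shared/requests"   # hypothetical bind-mounted path

def submit_request(summary: str, commands: list[str], reason: str) -> str:
    request = {
        "id": str(uuid.uuid4()),
        "created": int(time.time()),
        "summary": summary,      # one-line description for the approval email
        "commands": commands,    # what Dakota would like run on the host
        "reason": reason,        # context for Claude Code's risk assessment
    }
    tmp = os.path.join(REQUEST_DIR, f".{request['id']}.tmp")
    final = os.path.join(REQUEST_DIR, f"{request['id']}.json")
    with open(tmp, "w") as f:
        json.dump(request, f)
    os.rename(tmp, final)        # atomic: inotify fires on a complete file
    return request["id"]
```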
Request Broker State Machine
The Web Platform
The Azure web server hosts public-facing services: 41 Docker containers across 17 isolated networks.
| Category | Services |
|---|---|
| AI Chat Platforms | LibreChat (govgpt.nl, edugpt.nl, chat.civiqs.ai) |
| Skill Gateway | 8 microservices, including office, content, analytics, research, and video |
| Public Websites | 6+ sites including portfolio and client projects |
| Automation | n8n, Flowise |
| Security | Authelia (SSO/2FA), Wazuh SIEM, GeoIP blocking, fail2ban (7 jails) |
| Research | Firecrawl, SearXNG, Qdrant |
GPU Infrastructure
The Azure A100 VM serves as the GPU compute node for model inference and research; a client-side usage sketch follows the tables below.
| Component | Specification |
|---|---|
| GPU | NVIDIA A100 80GB PCIe |
| CPU | AMD EPYC 7V13 (24 vCPUs) |
| RAM | 216 GB DDR4 |
| Storage | 248 GB root + 1 TB data (685 GB model cache) |
| Software | vLLM 0.12.0, CUDA 12.4 |
Cached Models
| Model | Size | Purpose |
|---|---|---|
| Mistral-Large-Instruct-2411 | 229 GB | General instruction following |
| Qwen2.5-72B-Instruct | 136 GB | Multilingual instruction following |
| QwQ-32B | 62 GB | Reasoning tasks |
| DeepSeek-R1-Distill-Qwen-14B | 28 GB | Efficient reasoning |
| EuroLLM models | 18-43 GB | EU-language specialized |
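For illustration, a client on the office server could talk to this node through vLLM's OpenAI-compatible API, which vLLM exposes by default when started with vllm serve. The host, port, and model choice below are assumptions.

```python
# Sketch: calling the A100 VM's vLLM endpoint via the OpenAI client.
from openai import OpenAI

client = OpenAI(
    base_url="http://gpu-vm.internal:8000/v1",   # hypothetical internal address
    api_key="unused",                            # vLLM does not require a key by default
)

resp = client.chat.completions.create(
    model="Qwen/QwQ-32B",                        # one of the cached models
    messages=[{"role": "user", "content": "Summarize the EU AI Act in one paragraph."}],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```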
Security Architecture
Security is designed as defense-in-depth: seven layers, ranging from hard boundaries through probabilistic defenses to behavioral speed bumps. The key principle is that no single layer is trusted to be sufficient.
| # | Layer | Category | What it does |
|---|---|---|---|
| 1 | Code-level sender checks | Hard boundary | Telegram sender verified in code; elevated exec only from allowlisted users |
| 2 | Sandbox isolation | Hard boundary | Read-only root, no network, capabilities dropped, memory/CPU/PID limits |
| 3 | LLM Guard | Probabilistic | ML-based injection detection (DeBERTa), PII redaction, malicious URL scanning |
| 4 | Telegram allowlist | Probabilistic | Only authorized users can interact |
| 5 | Bridge authentication | Probabilistic | HMAC-SHA256 signed messages, 5-minute replay window |
| 6 | Content markers | Behavioral (speed bump) | External content wrapped in trust markers; 80 regex patterns for known injections |
| 7 | Prompt instructions | Behavioral (speed bump) | Identity anchoring, trust tiers |
Threat Model
The system acknowledges the Lethal Trifecta (Simon Willison's formulation): AI agent with private data + untrusted content + external actions. Dakota sits squarely in this trifecta.
Research reference: The October 2025 multi-lab paper (OpenAI, Anthropic, DeepMind) demonstrated that adaptive attacks bypass all published prompt-level defenses >90% of the time. This is why hard boundaries (code-level checks, sandbox isolation) are considered essential — prompt-level defenses alone are insufficient.
What Happens When LLM Guard Flags Something
LLM Guard operates in enforce mode — flagged content is blocked, not just logged:
- The incoming message is rejected before reaching the Claude API
- Dakota does not see or respond to the flagged message
- The event is logged and triggers a Wazuh SIEM alert (level 10+)
- The operator receives notification via email and Telegram
LLM Guard Technical Details
- Model: DeBERTa (Microsoft language model fine-tuned for classification)
- Execution: Runs locally on CPU via ONNX — no external API calls
- Threshold: 0.8 (raised from 0.6 to reduce false positives on voice transcriptions)
- Failure mode: fail-open. If LLM Guard goes down, Dakota keeps running and the hard boundaries remain active (see the sketch below)
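A minimal sketch of this enforce-plus-fail-open behavior, written against llm-guard's input-scanner interface; the alert helper, wiring, and exception handling are illustrative rather than the production code.

```python
# Sketch: block flagged input (enforce mode), but fail open if the scanner
# itself is unavailable. Hard boundaries are unaffected either way.
from llm_guard.input_scanners import PromptInjection

scanner = PromptInjection(threshold=0.8)   # raised from 0.6 to cut false positives

def log_and_alert(event: str) -> None:
    # Placeholder: in production this is a log line that Wazuh rules pick up.
    print(event)

def screen(message: str) -> tuple[bool, str]:
    """Return (allowed, reason) for an incoming message."""
    try:
        _, is_valid, risk = scanner.scan(message)
    except Exception as exc:               # scanner down: fail open
        log_and_alert(f"LLM Guard unavailable: {exc}")
        return True, "scanner unavailable, allowed (fail-open)"
    if not is_valid:
        log_and_alert(f"prompt injection flagged (risk={risk:.2f})")   # level 10+ SIEM alert
        return False, "blocked before reaching the Claude API"
    return True, "clean"
```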
SIEM and Monitoring
The entire infrastructure is monitored by Wazuh SIEM (deployed on both office server and web server):
- File integrity monitoring on critical configuration directories
- Custom alert rules for AI-specific events: sandbox execution, elevated command usage, LLM Guard injection flags, bridge authentication failures
- Active response: automatic IP blocking on SSH brute force
- Daily digest reports emailed to the operator
- Real-time alerts for high-severity events (level 10+) via email and Telegram
Automated Operations
Three-tier automated maintenance on systemd timers:
- Daily: alert-only; the operator gets an email only if issues are found
- Weekly: dangling image/volume cleanup, orphaned sandbox cleanup, stale file reports, log rotation, image age report, pending apt updates; always sends a report
- Monthly (15th, 03:00): database maintenance (VACUUM, OPTIMIZE), full disk breakdown, container resource audit, security posture review, package inventory diff, uptime/load report; always sends a report
Session Health Watchdog
AI session corruption is the hardest operational challenge. When the Anthropic API returns malformed responses during streaming (truncated JSON, merged SSE events), session history can become permanently corrupted.
The system runs a session health watchdog every 5 minutes that:
- Scans the tail of all session files for corruption patterns
- Backs up corrupted sessions before modifying them
- Truncates to the last valid boundary
- Automatically restarts the OpenClaw container after a repair (a simplified sketch follows)
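A simplified sketch of such a scan-and-repair pass, assuming JSONL session files (one JSON message per line); the real watchdog's corruption patterns, paths, and backup naming may differ.

```python
# Sketch: find the last valid line, back the file up, truncate past it.
import json
import shutil
import time
from pathlib import Path

SESSIONS = Path("/opt/openclaw/sessions")    # hypothetical location

def repair(session_file: Path) -> bool:
    lines = session_file.read_text().splitlines()
    last_valid = 0
    for i, line in enumerate(lines, start=1):
        try:
            json.loads(line)
            last_valid = i
        except json.JSONDecodeError:         # truncated JSON / merged SSE events
            break
    if last_valid == len(lines):
        return False                          # healthy, nothing to do
    backup = session_file.with_suffix(f".corrupt.{int(time.time())}")
    shutil.copy2(session_file, backup)        # back up before modifying
    session_file.write_text("\n".join(lines[:last_valid]) + "\n")
    return True                               # caller then restarts the OpenClaw container
```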
Secret Rotation
A quarterly/semi-annual/annual rotation schedule covers all secrets, with:
- A systemd timer that checks which secrets are due each month
- Automated email reminders to the operator
- A documented post-rotation checklist (the due-date check is sketched below)
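A sketch of how such a due-date check might work; the secret names, dates, and intervals are made up for illustration.

```python
# Sketch: decide which secrets are due for rotation this month.
from datetime import date, timedelta

ROTATION = {                       # secret -> (last rotated, interval in days); all hypothetical
    "telegram_bot_token": (date(2025, 11, 1), 90),     # quarterly
    "bridge_hmac_secret": (date(2025, 9, 15), 182),    # semi-annual
    "ssh_deploy_key":     (date(2025, 3, 1), 365),     # annual
}

def due_secrets(today: date | None = None) -> list[str]:
    today = today or date.today()
    return [name for name, (last, interval) in ROTATION.items()
            if today >= last + timedelta(days=interval)]

if __name__ == "__main__":
    for name in due_secrets():
        print(f"Rotation due: {name}")    # in production this feeds the reminder email
```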
Data Flows
Telegram Message Flow
Calendar Aggregation Flow
(6 calendars)"] MS2["Personal Outlook
(5 calendars)"] GC["Google Calendar"] end subgraph Aggregator ["Calendar Aggregator Service"] POLL[Poller] MERGE[Merge & Deduplicate] JSON["Output: availability.json"] end subgraph Consumer ["Consumer"] DAK2[Dakota reads via
shared filesystem] end MS1 -->|"MS Graph API
+ MSAL"| POLL MS2 -->|"MS Graph API
+ MSAL"| POLL GC -->|"Google Calendar API
+ OAuth2"| POLL POLL --> MERGE MERGE --> JSON JSON --> DAK2
Trade-offs and Design Decisions
What Works Well
| Decision | Benefit |
|---|---|
| Containerize everything | Consistent environments, easy rollback, strong isolation |
| AI agents as operators | Autonomous handling of routine tasks |
| systemd as orchestrator | Reliable event-driven automation, survives reboots |
| Defense-in-depth | No single layer is trusted |
| Human-in-the-loop | System changes require explicit approval |
| Knowledge graph memory | Dakota remembers context across conversations |
| One-way bridge | Prevents agent-to-agent loops |
What Requires Compromise
| Decision | Trade-off |
|---|---|
| Claude Opus 4.5 | High cost, API dependency |
| Single NUC | Single point of failure |
| Telegram interface | Message limits, platform dependency |
| No off-site git remote | Maximum security, but disk failure = total configuration loss |
| LLM Guard fail-open | Prioritizes availability over uninterrupted injection screening |
Risks
Technical Risks
| Risk | Severity | Mitigation |
|---|---|---|
| Anthropic API outage | High | Dakota offline, no fallback yet |
| Session corruption | Medium | Automated watchdog |
| NUC failure | Medium | No HA, manual recovery |
| Secret sprawl | Medium | Rotation schedule |
What Survives a NUC Failure
If the office server goes down (hardware failure, power outage, disk failure):
| Component | Status | Notes |
|---|---|---|
| Azure Web Server | Continues running | All 41 containers, websites, LibreChat — fully independent |
| Azure A100 VM | Continues running | vLLM serving, research workloads — fully independent |
| Dakota | Offline | Runs on the NUC; no failover |
| Claude Code | Offline | Runs on the NUC; no remote management |
| Email notifications | Offline | Gmail Notify, Request Broker — all on NUC |
| Security monitoring | Partially degraded | Azure VMs have their own Wazuh; NUC-side offline |
Recovery: No automated disaster recovery. Git repos and Docker volumes are local-only — a disk failure without backup means total configuration loss. This is an acknowledged trade-off: simplicity and security (no cloud exposure) over resilience.
AI-Specific Risks
| Risk | Severity | Mitigation |
|---|---|---|
| Prompt injection | High | 7-layer defense stack |
| Agent hallucination | Medium | Logged changes, approval for destructive ops |
| Cost unpredictability | Medium | Token limits, session management |
Opportunities
Near-term
- Local model fallback — A100 as backup when Anthropic unavailable
- Proactive monitoring — Trend analysis and preventive action
- Multi-agent delegation — Specialized agents for code review, security
Medium-term
- Self-healing infrastructure — Automatic failure detection and remediation
- Organizational knowledge base — Extend knowledge graph to decisions, policies
Long-term
- Replicable architecture — Reference architecture for small teams
- Federated agent networks — Cross-organization collaboration
Lessons Learned
1. Session corruption is the hardest operational problem
AI sessions accumulate state. Malformed API responses corrupt that state permanently. Fix: Automated monitoring and safe reset without losing learned behaviors.
2. AI agents lose learned behaviors on session reset
Deleting a corrupted session lost Dakota's learned TTS format. Fix: Document critical behaviors in config files, not session memory.
3. Defense-in-depth is not optional for AI agents
Adaptive attacks bypass prompt-level defenses >90% of the time. Fix: Hard boundaries (code checks, sandboxing) are the actual security.
4. systemd is underrated for AI orchestration
Path watchers and timers spawn instances on demand. No always-on process, automatic restart, built-in logging.
5. The one-way bridge is a feature, not a limitation
Prevents agent-to-agent feedback loops. The "dead drop" pattern works surprisingly well.
6. Timezone bugs in containerized services are deceptively hard
A UTC container consuming localized datetimes without timezone suffixes produces offset errors. Simple fix, complex debugging.
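For illustration, the failure mode looks like this in Python, using Europe/Amsterdam as an example local zone (an assumption, not necessarily the system's zone):

```python
# A naive local-time string interpreted as UTC inside the container shifts
# every event by the local UTC offset.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

naive = "2026-01-15T14:00:00"    # written by a host in Europe/Amsterdam, no offset suffix

wrong = datetime.fromisoformat(naive).replace(tzinfo=timezone.utc)                # container assumes UTC
right = datetime.fromisoformat(naive).replace(tzinfo=ZoneInfo("Europe/Amsterdam"))

print(wrong.astimezone(ZoneInfo("Europe/Amsterdam")))   # 2026-01-15 15:00+01:00, one hour late
print(right.astimezone(timezone.utc))                   # 2026-01-15 13:00+00:00, the intended instant
```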
Appendix: Service Inventory
Office Server (clawdbot)
| Service | Type | Purpose |
|---|---|---|
| OpenClaw | Docker | AI agent gateway (Dakota) |
| LLM Guard | Docker | Prompt injection / content safety scanner |
| Graphiti + FalkorDB | Docker | Temporal knowledge graph for agent memory |
| Request Broker | Docker | Email-based approval workflow |
| Gmail Notify | Docker | Email → Telegram push notifications |
| Calendar Aggregator | Docker | Multi-source calendar polling |
| SearXNG + Valkey | Docker | Self-hosted meta-search + cache |
| Nginx + Cloudflare Tunnel | Docker | Reverse proxy and secure ingress |
| Wazuh SIEM | Docker | Security monitoring (manager + indexer + dashboard) |
| Uptime Kuma | Docker | Website availability monitoring |
| CC Telegram Bot | systemd | Claude Code admin via Telegram |
| Session Health Watchdog | systemd timer | AI session corruption detection |
| Maintenance (3-tier) | systemd timers | Daily/weekly/monthly automated maintenance |
| Secret Rotation Reminder | systemd timer | Monthly secret rotation alerts |
Azure Web Server
| Service | Type | Purpose |
|---|---|---|
| LibreChat | Docker | Multi-provider AI chat platform |
| Skill Gateway (8 services) | Docker | Microservice API backend |
| 7 Static Websites | Docker (Nginx) | Public-facing web properties |
| n8n | Docker | Workflow automation |
| Flowise | Docker | LLM flow builder |
| Firecrawl (4 containers) | Docker | Web scraping service |
| Authelia + Redis | Docker | SSO / 2FA gateway |
| Wazuh SIEM (3 containers) | Docker | Security monitoring |
| Nginx | bare-metal | Reverse proxy with GeoIP |
Azure A100 VM
| Service | Type | Purpose |
|---|---|---|
| vLLM | Conda env | High-throughput model inference |
| Research projects | Various | EU AI Act, Verification Gap, EuroLLM |