Clawdbot
AI-Administered Server Infrastructure
What This Is
This document describes a production infrastructure where AI agents are not just tools — they are operators. Two AI agents collaboratively manage a three-machine environment: an office server, a GPU compute node, and a web hosting platform. Together, they handle system administration, user communication, email, calendar management, security monitoring, and more.
Production since early 2026. This is not a demo or proof-of-concept — it serves real users, hosts public websites, runs research workloads, and manages itself with minimal human intervention.
What makes this different
Most AI-in-operations setups use AI as an assistant: a human asks a question, the AI answers. This system inverts that model:
- Dakota, the user-facing agent, runs 24/7 on Telegram. She reads emails, manages calendars, performs research, writes code, and sends voice messages, all autonomously.
- Claude Code, the system administrator, deploys containers, diagnoses issues, applies patches, and maintains documentation, triggered by events rather than prompts.
- A request broker enables asynchronous collaboration with human approval for consequential actions.
The human operator remains the decision-maker for anything consequential. But the day-to-day operational burden has shifted dramatically.
Architecture Overview
The infrastructure spans three machines with distinct roles, connected via SSH and managed by AI agents.
High-level: The Three Machines
The office server's AI agent stack (~20 Docker containers) is the hub: users reach it through the Telegram Bot API, the Gmail API, and a Cloudflare Tunnel, and it manages the other two machines over SSH. The Azure Web Server runs LibreChat, the Skill Gateway, and 6 websites across 41 Docker containers; the Azure A100 VM runs vLLM on an NVIDIA A100 80GB for model inference and research projects. Users also reach the web server directly over HTTPS.
Detail: Inside the Office Server
On the office server, Claude Code acts as the system administrator alongside the AI agent stack (OpenClaw): the OpenClaw gateway, the Dakota agent (Claude Opus 4.5), a sandbox container, LLM Guard, and the Graphiti knowledge graph. Supporting services are the Request Broker, Gmail Notify, and the Calendar Aggregator; operations are covered by the Wazuh SIEM and the maintenance timers. Telegram connects to the OpenClaw gateway via the Bot API, email flows through Gmail Notify and the Request Broker, the gateway calls the Anthropic API and routes content through LLM Guard and Graphiti, Claude Code reaches the gateway via the bridge CLI, and Dakota executes tools inside the sandbox container.
The Three Machines
| Machine | Role | Hardware | AI Agent |
|---|---|---|---|
| clawdbot (office) | AI agent hub, service gateway | Intel i7 NUC, 64 GB RAM, 1.8 TB NVMe | Dakota + Claude Code |
| Azure Web Server | Public web hosting, SaaS | Xeon 16 vCPU, 32 GB RAM | — (managed remotely) |
| Azure A100 VM | GPU inference & research | AMD EPYC 24 vCPU, 216 GB RAM, A100 80GB | — (managed remotely) |
The Physical Setup
The heart of the system is an ASUS NUC 13 mini PC — a compact, silent machine in an office. Despite its small footprint, it runs the entire AI agent stack (~20 Docker containers, two AI agents, security monitoring) and acts as the control plane for two Azure VMs.
The NUC (Intel i7-1360P, 64 GB RAM, 1.8 TB NVMe) is administered over SSH from a Windows 11 workstation and in turn manages both Azure VMs over SSH: the Web Server (16 vCPU, 32 GB RAM, 41 Docker containers) and the A100 VM (24 vCPU, 216 GB RAM, 80 GB GPU VRAM). It connects to the external services the agents depend on: the Anthropic API (Claude Opus 4.5), the Telegram Bot API, Google APIs (Gmail, Calendar), the Microsoft Graph API (Calendar, Email), ElevenLabs (text-to-speech), and Groq (speech-to-text).
Why a NUC and not the cloud?
Cost and control. The office server runs 24/7 on a home internet connection. Cloud VMs are used only where their specific capabilities are needed: GPU for AI inference, and a public IP with high bandwidth for web hosting. The NUC handles everything else at a fraction of the cost.
Containerization Philosophy
Every service runs in Docker, with one exception (the Kodi media center). Containers are configured with the following hardening; a configuration sketch follows the list:
- Read-only root filesystems where possible
- All capabilities dropped (cap_drop: ALL), with only the necessary ones added back
- Non-root execution inside containers
- Memory, CPU, and PID limits to prevent resource exhaustion
- Isolated Docker networks — services can only communicate where explicitly configured
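The production compose files are not reproduced here, but as a minimal sketch the same hardening can be expressed with the Docker SDK for Python (docker-py). The image name, network, and limit values are illustrative assumptions, not the actual clawdbot configuration.

```python
# Sketch: container hardening equivalent to the settings listed above,
# expressed via docker-py. Values are illustrative, not production config.
import docker

client = docker.from_env()

container = client.containers.run(
    "example/service:latest",        # hypothetical image
    detach=True,
    read_only=True,                  # read-only root filesystem
    tmpfs={"/tmp": "size=64m"},      # writable scratch space despite the RO root
    cap_drop=["ALL"],                # drop every capability...
    cap_add=["NET_BIND_SERVICE"],    # ...and add back only what the service needs
    user="1000:1000",                # non-root execution inside the container
    mem_limit="512m",                # memory ceiling
    nano_cpus=1_000_000_000,         # 1.0 CPU
    pids_limit=256,                  # cap the number of processes
    network="isolated_net",          # pre-created, isolated Docker network
)
```

Each of these flags has a direct docker-compose equivalent, which is where the real services are defined.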
Dakota: The User-Facing AI Agent
Dakota runs inside the OpenClaw gateway — an open-source AI agent framework (Node.js) that connects LLMs to messaging platforms with sandboxed tool access. OpenClaw handles the full agent lifecycle: message routing, tool execution, session management, and provider integration.
How Dakota Works
Capabilities
| Capability | Implementation | Description |
|---|---|---|
| Text conversation | Claude Opus 4.5 | Natural language interaction |
| Voice messages | Groq Whisper + ElevenLabs | Receives and sends voice |
| Email | Gmail API | Reads and sends autonomously |
| Calendar | MS Graph + Google Calendar | 14 calendars, 3 accounts |
| Web research | Perplexity + Playwright | Search and browser automation |
| Code execution | Sandboxed exec | Python, shell in isolation |
| Memory | Graphiti + FalkorDB | Temporal knowledge graph |
| Sub-agents | OpenClaw sessions | Delegates heavy tasks |
| Image generation | Azure OpenAI | DALL-E / Sora on request |
| Elevated access | Allowlisted CLIs | Claude Code, Codex, Gemini |
The Sandbox Model
Dakota operates in a double-containerized environment:
(Node.js)"] subgraph sandbox ["Sandbox Container"] TOOLS["Sandbox Tools:
exec, read, write, edit,
browser, python3, gh, curl"] FS["Sandboxed Filesystem
/workspace"] end end SHARED["Shared Directory
(host filesystem)"] ELEVATED["Elevated Exec
(host binaries)"] end OC_PROC --> sandbox OC_PROC -->|"allowlisted
commands only"| ELEVATED sandbox <-->|"bind mount"| SHARED
- Sandbox tools (default): Isolated container, no network, dropped capabilities, memory limits
- Elevated tools (explicit): Host-level access only from allowlisted senders with permission checks
Even if Dakota's prompt is compromised, the blast radius is contained to the sandbox. Host-level access requires passing through application-layer permission checks that cannot be bypassed by prompt injection.
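As an illustration of what such a check looks like, here is a minimal Python sketch of an application-layer gate for elevated execution. The real check lives in OpenClaw's Node.js code; the user IDs and function shown here are hypothetical, and the command list mirrors the allowlisted CLIs from the capability table.

```python
# Sketch of a code-level permission gate for elevated exec. The decision is
# based on sender metadata, which prompt content cannot alter.
import subprocess

ELEVATED_ALLOWLIST = {123456789}                    # hypothetical Telegram user IDs
ELEVATED_COMMANDS = {"claude", "codex", "gemini"}   # allowlisted CLIs

def run_elevated(sender_id: int, command: str, args: list[str]) -> str:
    if sender_id not in ELEVATED_ALLOWLIST:
        raise PermissionError("sender is not allowlisted for elevated exec")
    if command not in ELEVATED_COMMANDS:
        raise PermissionError(f"{command!r} is not an allowlisted binary")
    result = subprocess.run([command, *args], capture_output=True, text=True, timeout=300)
    return result.stdout
```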
Claude Code: The System Administrator
Claude Code runs on the host as the system administrator. Full host access: Docker, SSH, configuration files, deployments.
How Claude Code Operates
Not a long-running daemon — invoked on demand:
| Trigger | Mechanism | Use Case |
|---|---|---|
| Interactive SSH | Human runs claude | Major changes, debugging |
| Telegram bot | systemd service | Quick admin from mobile |
| Dakota request | systemd path watcher | Tasks she can't do herself |
| Email from operator | systemd path watcher | Deploy via email |
| Scheduled maintenance | systemd timers | Daily/weekly/monthly checks |
Each invocation is a fresh Claude Code instance with appropriate tool permissions scoped to the task.
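For illustration, an event handler spawned by one of these triggers might look like the Python sketch below. Only the claude -p invocation comes from this document; the function name, timeout, and example prompt are assumptions.

```python
# Sketch: a trigger handler (path watcher, timer, or bot) spawns a one-shot
# Claude Code instance in non-interactive print mode.
import subprocess

def handle_event(prompt: str) -> str:
    # Tool permissions are scoped per invocation by the caller; that wiring
    # is omitted from this sketch.
    result = subprocess.run(
        ["claude", "-p", prompt],
        capture_output=True, text=True, timeout=900,
    )
    return result.stdout

# e.g. a scheduled maintenance timer might call:
# handle_event("Run the daily health checks and email the operator only if something is wrong.")
```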
Multi-Agent Collaboration
The two AI agents serve different roles, have different access levels, and communicate through structured protocols.
A typical exchange: Dakota needs a system change she can't make herself, so she writes a request JSON to the shared filesystem. inotify fires the dakota-request.path unit, and systemd spawns claude -p with scoped permissions. Claude Code evaluates the request (risk assessment, feasibility) and emails the operator: "Dakota requests X. Approve?" The operator replies by email in natural language; a Gmail poller detects the reply and Claude Code interprets the intent (approve, deny, or modify). If approved, Claude Code executes the change and tells Dakota over the bridge what happened; if denied, it sends Dakota the reason over the bridge; if it has questions, it sends the operator a follow-up email.
The Bridge Protocol
Claude Code can message Dakota via a bridge CLI. The channel is one-way: CC → Dakota. Every message is HMAC-SHA256 authenticated with a shared secret and a 5-minute replay window (a verification sketch follows the list). This prevents:
- Telegram messages from impersonating bridge commands
- Replay attacks on previously sent instructions
- External parties from injecting messages into the agent pipeline
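A minimal Python sketch of this scheme, assuming a timestamp-plus-HMAC envelope; the actual bridge CLI's message format is not reproduced here.

```python
# Sketch: HMAC-SHA256 authentication with a 5-minute replay window.
import hashlib
import hmac
import time

SECRET = b"shared-secret"        # in practice loaded from a secret store, never hard-coded
REPLAY_WINDOW = 300              # seconds

def sign(message: str) -> dict:
    ts = int(time.time())
    mac = hmac.new(SECRET, f"{ts}:{message}".encode(), hashlib.sha256).hexdigest()
    return {"ts": ts, "msg": message, "mac": mac}

def verify(envelope: dict) -> str:
    if abs(time.time() - envelope["ts"]) > REPLAY_WINDOW:
        raise ValueError("stale message (possible replay)")
    expected = hmac.new(SECRET, f'{envelope["ts"]}:{envelope["msg"]}'.encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["mac"]):
        raise ValueError("bad signature (message did not come from Claude Code)")
    return envelope["msg"]
```

A strict implementation would also track envelope IDs seen inside the window, so an identical message cannot be replayed within the 5 minutes.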
The Request Broker
When Dakota needs host access (an example request payload follows these steps):
- Dakota writes a structured JSON request to the shared directory
- systemd detects via inotify
- Claude Code instance spawned to evaluate
- Claude Code emails operator with summary
- Operator replies in natural language
- Claude Code interprets and acts
- Dakota notified via bridge
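As a sketch of the first step, writing such a request from inside Dakota's sandbox might look like the following. The directory path and JSON fields are hypothetical; the atomic rename just ensures the path watcher never sees a half-written file.

```python
# Sketch: Dakota drops a structured request into the shared directory that
# systemd watches via inotify. Path and field names are illustrative.
import json
import os
import time
import uuid

REQUEST_DIR = "/workspace/shared/requests"   # hypothetical bind-mounted path

def submit_request(summary: str, commands: list[str], reason: str) -> str:
    request = {
        "id": str(uuid.uuid4()),
        "created": int(time.time()),
        "summary": summary,      # one-line description for the approval email
        "commands": commands,    # what Dakota would like run on the host
        "reason": reason,        # context for Claude Code's risk assessment
    }
    tmp = os.path.join(REQUEST_DIR, f".{request['id']}.tmp")
    final = os.path.join(REQUEST_DIR, f"{request['id']}.json")
    with open(tmp, "w") as f:
        json.dump(request, f)
    os.rename(tmp, final)        # atomic: inotify fires on a complete file
    return request["id"]
```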
Request Broker State Machine
The Web Platform
The Azure web server hosts public-facing services: 41 Docker containers across 17 isolated networks.
| Category | Services |
|---|---|
| AI Chat Platforms | LibreChat (govgpt.nl, edugpt.nl, chat.civiqs.ai) |
| Skill Gateway | 8 microservices, including office, content, analytics, research, and video |
| Public Websites | 6+ sites including portfolio and client projects |
| Automation | n8n, Flowise |
| Security | Authelia (SSO/2FA), Wazuh SIEM, GeoIP blocking, fail2ban (7 jails) |
| Research | Firecrawl, SearXNG, Qdrant |
GPU Infrastructure
The Azure A100 VM serves as the GPU compute node for model inference and research; a client-side usage sketch follows the tables below.
| Component | Specification |
|---|---|
| GPU | NVIDIA A100 80GB PCIe |
| CPU | AMD EPYC 7V13 (24 vCPUs) |
| RAM | 216 GB DDR4 |
| Storage | 248 GB root + 1 TB data (685 GB model cache) |
| Software | vLLM 0.12.0, CUDA 12.4 |
Cached Models
| Model | Size | Purpose |
|---|---|---|
| Mistral-Large-Instruct-2411 | 229 GB | General instruction following |
| Qwen2.5-72B-Instruct | 136 GB | Multilingual instruction following |
| QwQ-32B | 62 GB | Reasoning tasks |
| DeepSeek-R1-Distill-Qwen-14B | 28 GB | Efficient reasoning |
| EuroLLM models | 18-43 GB | EU-language specialized |
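For illustration, a client on the office server could talk to this node through vLLM's OpenAI-compatible API, which vLLM exposes by default when started with vllm serve. The host, port, and model choice below are assumptions.

```python
# Sketch: calling the A100 VM's vLLM endpoint via the OpenAI client.
from openai import OpenAI

client = OpenAI(
    base_url="http://gpu-vm.internal:8000/v1",   # hypothetical internal address
    api_key="unused",                            # vLLM does not require a key by default
)

resp = client.chat.completions.create(
    model="Qwen/QwQ-32B",                        # one of the cached models
    messages=[{"role": "user", "content": "Summarize the EU AI Act in one paragraph."}],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```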
Security Architecture
Security is designed as defense-in-depth: seven layers, ranging from hard boundaries through probabilistic defenses to behavioral speed bumps. The key principle is that no single layer is trusted to be sufficient.
| # | Layer | Category | What it does |
|---|---|---|---|
| 1 | Code-level sender checks | Hard boundary | Telegram sender verified in code; elevated exec only from allowlisted users |
| 2 | Sandbox isolation | Hard boundary | Read-only root, no network, capabilities dropped, memory/CPU/PID limits |
| 3 | LLM Guard | Probabilistic | ML-based injection detection (DeBERTa), PII redaction, malicious URL scanning |
| 4 | Telegram allowlist | Probabilistic | Only authorized users can interact |
| 5 | Bridge authentication | Probabilistic | HMAC-SHA256 signed messages, 5-minute replay window |
| 6 | Content markers | Behavioral (speed bump) | External content wrapped in trust markers; 80 regex patterns for known injections |
| 7 | Prompt instructions | Behavioral (speed bump) | Identity anchoring, trust tiers |
Threat Model
The system acknowledges the Lethal Trifecta (Simon Willison's formulation): AI agent with private data + untrusted content + external actions. Dakota sits squarely in this trifecta.
Research reference: The October 2025 multi-lab paper (OpenAI, Anthropic, DeepMind) demonstrated that adaptive attacks bypass all published prompt-level defenses >90% of the time. This is why hard boundaries (code-level checks, sandbox isolation) are considered essential — prompt-level defenses alone are insufficient.
What Happens When LLM Guard Flags Something
LLM Guard operates in enforce mode — flagged content is blocked, not just logged:
- The incoming message is rejected before reaching the Claude API
- Dakota does not see or respond to the flagged message
- The event is logged and triggers a Wazuh SIEM alert (level 10+)
- The operator receives notification via email and Telegram
LLM Guard Technical Details
- Model: DeBERTa (Microsoft language model fine-tuned for classification)
- Execution: Runs locally on CPU via ONNX — no external API calls
- Threshold: 0.8 (raised from 0.6 to reduce false positives on voice transcriptions)
- Failure mode: fail-open. If LLM Guard goes down, Dakota keeps running and the hard boundaries remain active (see the sketch below)
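A minimal sketch of this enforce-plus-fail-open behavior, written against llm-guard's input-scanner interface; the alert helper, wiring, and exception handling are illustrative rather than the production code.

```python
# Sketch: block flagged input (enforce mode), but fail open if the scanner
# itself is unavailable. Hard boundaries are unaffected either way.
from llm_guard.input_scanners import PromptInjection

scanner = PromptInjection(threshold=0.8)   # raised from 0.6 to cut false positives

def log_and_alert(event: str) -> None:
    # Placeholder: in production this is a log line that Wazuh rules pick up.
    print(event)

def screen(message: str) -> tuple[bool, str]:
    """Return (allowed, reason) for an incoming message."""
    try:
        _, is_valid, risk = scanner.scan(message)
    except Exception as exc:               # scanner down: fail open
        log_and_alert(f"LLM Guard unavailable: {exc}")
        return True, "scanner unavailable, allowed (fail-open)"
    if not is_valid:
        log_and_alert(f"prompt injection flagged (risk={risk:.2f})")   # level 10+ SIEM alert
        return False, "blocked before reaching the Claude API"
    return True, "clean"
```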
SIEM and Monitoring
The entire infrastructure is monitored by Wazuh SIEM (deployed on both office server and web server):
- File integrity monitoring on critical configuration directories
- Custom alert rules for AI-specific events: sandbox execution, elevated command usage, LLM Guard injection flags, bridge authentication failures
- Active response: automatic IP blocking on SSH brute force
- Daily digest reports emailed to the operator
- Real-time alerts for high-severity events (level 10+) via email and Telegram
Automated Operations
Three-tier automated maintenance on systemd timers:
- Daily: alert-only; the operator gets an email only if issues are found
- Weekly: dangling image/volume cleanup, orphaned sandbox cleanup, stale file reports, log rotation, image age report, pending apt updates; always sends a report
- Monthly (15th, 03:00): database maintenance (VACUUM, OPTIMIZE), full disk breakdown, container resource audit, security posture review, package inventory diff, uptime/load report; always sends a report
Session Health Watchdog
AI session corruption is the hardest operational challenge. When the Anthropic API returns malformed responses during streaming (truncated JSON, merged SSE events), session history can become permanently corrupted.
The system runs a session health watchdog every 5 minutes that:
- Scans the tail of all session files for corruption patterns
- Backs up corrupted sessions before modifying them
- Truncates to the last valid boundary
- Automatically restarts the OpenClaw container after a repair (a simplified sketch follows)
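A simplified sketch of such a scan-and-repair pass, assuming JSONL session files (one JSON message per line); the real watchdog's corruption patterns, paths, and backup naming may differ.

```python
# Sketch: find the last valid line, back the file up, truncate past it.
import json
import shutil
import time
from pathlib import Path

SESSIONS = Path("/opt/openclaw/sessions")    # hypothetical location

def repair(session_file: Path) -> bool:
    lines = session_file.read_text().splitlines()
    last_valid = 0
    for i, line in enumerate(lines, start=1):
        try:
            json.loads(line)
            last_valid = i
        except json.JSONDecodeError:         # truncated JSON / merged SSE events
            break
    if last_valid == len(lines):
        return False                          # healthy, nothing to do
    backup = session_file.with_suffix(f".corrupt.{int(time.time())}")
    shutil.copy2(session_file, backup)        # back up before modifying
    session_file.write_text("\n".join(lines[:last_valid]) + "\n")
    return True                               # caller then restarts the OpenClaw container
```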
Secret Rotation
A quarterly/semi-annual/annual rotation schedule covers all secrets, with:
- A systemd timer that checks which secrets are due each month
- Automated email reminders to the operator
- A documented post-rotation checklist (the due-date check is sketched below)
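A sketch of how such a due-date check might work; the secret names, dates, and intervals are made up for illustration.

```python
# Sketch: decide which secrets are due for rotation this month.
from datetime import date, timedelta

ROTATION = {                       # secret -> (last rotated, interval in days); all hypothetical
    "telegram_bot_token": (date(2025, 11, 1), 90),     # quarterly
    "bridge_hmac_secret": (date(2025, 9, 15), 182),    # semi-annual
    "ssh_deploy_key":     (date(2025, 3, 1), 365),     # annual
}

def due_secrets(today: date | None = None) -> list[str]:
    today = today or date.today()
    return [name for name, (last, interval) in ROTATION.items()
            if today >= last + timedelta(days=interval)]

if __name__ == "__main__":
    for name in due_secrets():
        print(f"Rotation due: {name}")    # in production this feeds the reminder email
```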
Data Flows
Telegram Message Flow
Calendar Aggregation Flow
(6 calendars)"] MS2["Personal Outlook
(5 calendars)"] GC["Google Calendar"] end subgraph Aggregator ["Calendar Aggregator Service"] POLL[Poller] MERGE[Merge & Deduplicate] JSON["Output: availability.json"] end subgraph Consumer ["Consumer"] DAK2[Dakota reads via
shared filesystem] end MS1 -->|"MS Graph API
+ MSAL"| POLL MS2 -->|"MS Graph API
+ MSAL"| POLL GC -->|"Google Calendar API
+ OAuth2"| POLL POLL --> MERGE MERGE --> JSON JSON --> DAK2
Trade-offs and Design Decisions
What Works Well
| Decision | Benefit |
|---|---|
| Containerize everything | Consistent environments, easy rollback, strong isolation |
| AI agents as operators | Autonomous handling of routine tasks |
| systemd as orchestrator | Reliable event-driven automation, survives reboots |
| Defense-in-depth | No single layer is trusted |
| Human-in-the-loop | System changes require explicit approval |
| Knowledge graph memory | Dakota remembers context across conversations |
| One-way bridge | Prevents agent-to-agent loops |
What Requires Compromise
| Decision | Trade-off |
|---|---|
| Claude Opus 4.5 | High cost, API dependency |
| Single NUC | Single point of failure |
| Telegram interface | Message limits, platform dependency |
| No off-site git remote | Maximum security, but disk failure = total configuration loss |
| LLM Guard fail-open | Prioritizes availability over uninterrupted injection screening |
Risks
Technical Risks
| Risk | Severity | Mitigation |
|---|---|---|
| Anthropic API outage | High | Dakota offline, no fallback yet |
| Session corruption | Medium | Automated watchdog |
| NUC failure | Medium | No HA, manual recovery |
| Secret sprawl | Medium | Rotation schedule |
What Survives a NUC Failure
If the office server goes down (hardware failure, power outage, disk failure):
| Component | Status | Notes |
|---|---|---|
| Azure Web Server | Continues running | All 41 containers, websites, LibreChat — fully independent |
| Azure A100 VM | Continues running | vLLM serving, research workloads — fully independent |
| Dakota | Offline | Runs on the NUC; no failover |
| Claude Code | Offline | Runs on the NUC; no remote management |
| Email notifications | Offline | Gmail Notify, Request Broker — all on NUC |
| Security monitoring | Partially degraded | Azure VMs have their own Wazuh; NUC-side offline |
Recovery: No automated disaster recovery. Git repos and Docker volumes are local-only — a disk failure without backup means total configuration loss. This is an acknowledged trade-off: simplicity and security (no cloud exposure) over resilience.
AI-Specific Risks
| Risk | Severity | Mitigation |
|---|---|---|
| Prompt injection | High | 7-layer defense stack |
| Agent hallucination | Medium | Logged changes, approval for destructive ops |
| Cost unpredictability | Medium | Token limits, session management |
Opportunities
Near-term
- Local model fallback — A100 as backup when Anthropic unavailable
- Proactive monitoring — Trend analysis and preventive action
- Multi-agent delegation — Specialized agents for code review, security
Medium-term
- Self-healing infrastructure — Automatic failure detection and remediation
- Organizational knowledge base — Extend knowledge graph to decisions, policies
Long-term
- Replicable architecture — Reference architecture for small teams
- Federated agent networks — Cross-organization collaboration
Lessons Learned
1. Session corruption is the hardest operational problem
AI sessions accumulate state. Malformed API responses corrupt that state permanently. Fix: Automated monitoring and safe reset without losing learned behaviors.
2. AI agents lose learned behaviors on session reset
Deleting a corrupted session lost Dakota's learned TTS format. Fix: Document critical behaviors in config files, not session memory.
3. Defense-in-depth is not optional for AI agents
Adaptive attacks bypass prompt-level defenses >90% of the time. Fix: Hard boundaries (code checks, sandboxing) are the actual security.
4. systemd is underrated for AI orchestration
Path watchers and timers spawn instances on demand. No always-on process, automatic restart, built-in logging.
5. The one-way bridge is a feature, not a limitation
Prevents agent-to-agent feedback loops. The "dead drop" pattern works surprisingly well.
6. Timezone bugs in containerized services are deceptively hard
A UTC container consuming localized datetimes without timezone suffixes produces offset errors. Simple fix, complex debugging.
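For illustration, the failure mode looks like this in Python, using Europe/Amsterdam as an example local zone (an assumption, not necessarily the system's zone):

```python
# A naive local-time string interpreted as UTC inside the container shifts
# every event by the local UTC offset.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

naive = "2026-01-15T14:00:00"    # written by a host in Europe/Amsterdam, no offset suffix

wrong = datetime.fromisoformat(naive).replace(tzinfo=timezone.utc)                # container assumes UTC
right = datetime.fromisoformat(naive).replace(tzinfo=ZoneInfo("Europe/Amsterdam"))

print(wrong.astimezone(ZoneInfo("Europe/Amsterdam")))   # 2026-01-15 15:00+01:00, one hour late
print(right.astimezone(timezone.utc))                   # 2026-01-15 13:00+00:00, the intended instant
```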
Appendix: Service Inventory
Office Server (clawdbot)
| Service | Type | Purpose |
|---|---|---|
| OpenClaw | Docker | AI agent gateway (Dakota) |
| LLM Guard | Docker | Prompt injection / content safety scanner |
| Graphiti + FalkorDB | Docker | Temporal knowledge graph for agent memory |
| Request Broker | Docker | Email-based approval workflow |
| Gmail Notify | Docker | Email → Telegram push notifications |
| Calendar Aggregator | Docker | Multi-source calendar polling |
| SearXNG + Valkey | Docker | Self-hosted meta-search + cache |
| Nginx + Cloudflare Tunnel | Docker | Reverse proxy and secure ingress |
| Wazuh SIEM | Docker | Security monitoring (manager + indexer + dashboard) |
| Uptime Kuma | Docker | Website availability monitoring |
| CC Telegram Bot | systemd | Claude Code admin via Telegram |
| Session Health Watchdog | systemd timer | AI session corruption detection |
| Maintenance (3-tier) | systemd timers | Daily/weekly/monthly automated maintenance |
| Secret Rotation Reminder | systemd timer | Monthly secret rotation alerts |
Azure Web Server
| Service | Type | Purpose |
|---|---|---|
| LibreChat | Docker | Multi-provider AI chat platform |
| Skill Gateway (8 services) | Docker | Microservice API backend |
| 7 Static Websites | Docker (Nginx) | Public-facing web properties |
| n8n | Docker | Workflow automation |
| Flowise | Docker | LLM flow builder |
| Firecrawl (4 containers) | Docker | Web scraping service |
| Authelia + Redis | Docker | SSO / 2FA gateway |
| Wazuh SIEM (3 containers) | Docker | Security monitoring |
| Nginx | bare-metal | Reverse proxy with GeoIP |
Azure A100 VM
| Service | Type | Purpose |
|---|---|---|
| vLLM | Conda env | High-throughput model inference |
| Research projects | Various | EU AI Act, Verification Gap, EuroLLM |