bot0-daemon.md

bot0 Daemon - System Design

Overview

The daemon is the brain of bot0. It's a long-running Node.js process that lives on every machine where bot0 is installed. While Desktop and CLI are just interfaces, the daemon IS the agent — it thinks, plans, and acts.

Key principle: The daemon does the work. Everything else is UI or routing.


What Is The Daemon?

┌─────────────────────────────────────────────────────────────────────────────┐
│                              bot0 DAEMON                                     │
│                                                                             │
│  "A personal AI that lives on your machine"                                 │
│                                                                             │
│  • Runs as a background process (launchd on macOS, systemd on Linux)       │
│  • Always available, even when Desktop is closed                            │
│  • Connects to Hub for remote commands (Telegram, other machines)          │
│  • Executes tasks: chat, computer use, automations                          │
│  • Learns and adapts through the proactive engine                           │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Identity

Each daemon has a unique identity:

```typescript
interface DaemonIdentity {
  id: string;             // "d_abc123" - unique ID
  name: string;           // "macbook-pro" - human-readable
  userId: string;         // "u_xyz" - owner
  machineId: string;      // Hardware fingerprint
  capabilities: string[]; // ["chat", "computer_use", "browser", "code"]
  status: "online" | "offline" | "busy";
}
```

Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                              DAEMON PROCESS                                  │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                         COMMUNICATION LAYER                            │ │
│  │                                                                        │ │
│  │   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐              │ │
│  │   │ IPC Server  │    │ Hub Client  │    │ HTTP Server │              │ │
│  │   │             │    │             │    │ (optional)  │              │ │
│  │   │ Unix socket │    │ WebSocket   │    │ localhost   │              │ │
│  │   │ for local   │    │ for remote  │    │ for tools   │              │ │
│  │   └──────┬──────┘    └──────┬──────┘    └──────┬──────┘              │ │
│  │          │                  │                  │                      │ │
│  └──────────┼──────────────────┼──────────────────┼──────────────────────┘ │
│             │                  │                  │                        │
│             └──────────────────┼──────────────────┘                        │
│                                ▼                                           │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                         MESSAGE ROUTER                                 │ │
│  │                                                                        │ │
│  │   Receives messages from all sources, routes to appropriate handler   │ │
│  │                                                                        │ │
│  └─────────────────────────────────┬─────────────────────────────────────┘ │
│                                    │                                       │
│                                    ▼                                       │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                           AGENT LOOP                                   │ │
│  │                                                                        │ │
│  │   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐             │ │
│  │   │ RECEIVE │ → │  THINK  │ → │   ACT   │ → │ RESPOND │             │ │
│  │   │         │   │         │   │         │   │         │             │ │
│  │   │ Parse   │   │ Plan    │   │ Execute │   │ Stream  │             │ │
│  │   │ input   │   │ w/ LLM  │   │ tools   │   │ output  │             │ │
│  │   └─────────┘   └─────────┘   └─────────┘   └─────────┘             │ │
│  │                                                                        │ │
│  └─────────────────────────────────┬─────────────────────────────────────┘ │
│                                    │                                       │
│                                    ▼                                       │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                          CORE SERVICES                                 │ │
│  │                                                                        │ │
│  │   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐               │ │
│  │   │  LLM Client  │  │ Conversation │  │    Config    │               │ │
│  │   │  (Anthropic) │  │   Manager    │  │   Manager    │               │ │
│  │   └──────────────┘  └──────────────┘  └──────────────┘               │ │
│  │                                                                        │ │
│  │   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐               │ │
│  │   │    Logger    │  │  Lifecycle   │  │   Secrets    │               │ │
│  │   │              │  │   Manager    │  │   Store      │               │ │
│  │   └──────────────┘  └──────────────┘  └──────────────┘               │ │
│  │                                                                        │ │
│  └───────────────────────────────────────────────────────────────────────┘ │
│                                    │                                       │
│                                    ▼                                       │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                            TOOL LAYER                                  │ │
│  │                                                                        │ │
│  │   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐               │ │
│  │   │ Computer Use │  │    Shell     │  │   Browser    │               │ │
│  │   │ (screen,kbd) │  │  (commands)  │  │ (automation) │               │ │
│  │   └──────────────┘  └──────────────┘  └──────────────┘               │ │
│  │                                                                        │ │
│  │   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐               │ │
│  │   │ File System  │  │     Code     │  │   Memory     │               │ │
│  │   │  (read/write)│  │  (exec/edit) │  │   (store)    │               │ │
│  │   └──────────────┘  └──────────────┘  └──────────────┘               │ │
│  │                                                                        │ │
│  └───────────────────────────────────────────────────────────────────────┘ │
│                                    │                                       │
│                                    ▼                                       │
│  ┌───────────────────────────────────────────────────────────────────────┐ │
│  │                       PROACTIVE ENGINE (Future)                        │ │
│  │                                                                        │ │
│  │   Observes → Thinks → Acts autonomously (with permissions)            │ │
│  │   See: bot0-proactive-architecture.md                                 │ │
│  │                                                                        │ │
│  └───────────────────────────────────────────────────────────────────────┘ │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
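
As a rough illustration of the MESSAGE ROUTER box above, the sketch below dispatches messages to one handler per message type. The class name `MessageRouter`, the `RoutedMessage` shape, and the string-returning handlers are assumptions made for this example, not part of the daemon's actual code.

```typescript
// Hypothetical sketch: names and shapes are illustrative only.
type MessageType = "chat" | "task" | "command";

interface RoutedMessage {
  id: string;
  source: string; // e.g. "desktop", "cli", "hub"
  type: MessageType;
  content: string;
}

type Handler = (message: RoutedMessage) => string;

class MessageRouter {
  private handlers = new Map<MessageType, Handler>();

  // Register one handler per message type (a later call replaces the earlier one).
  on(type: MessageType, handler: Handler): void {
    this.handlers.set(type, handler);
  }

  // Dispatch to the registered handler, or fail loudly if none exists.
  route(message: RoutedMessage): string {
    const handler = this.handlers.get(message.type);
    if (!handler) {
      throw new Error(`No handler registered for type "${message.type}"`);
    }
    return handler(message);
  }
}

const router = new MessageRouter();
router.on("chat", (m) => `agent-loop:${m.content}`);
router.on("command", (m) => `command:${m.content}`);
```

Because every source (IPC, Hub, HTTP) would feed the same `route` call, the handlers never need to know which transport a message arrived on.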

The Agent Loop

The agent loop is the core intelligence of the daemon. It follows a simple pattern:

1. RECEIVE

Parse incoming message and extract intent.

```typescript
interface IncomingMessage {
  id: string;
  source: "desktop" | "cli" | "hub" | "telegram" | "scheduler";
  type: "chat" | "task" | "command";
  content: string;
  context?: {
    conversationId?: string;
    attachments?: Attachment[];
    metadata?: Record<string, unknown>;
  };
}
```

2. THINK

Call the LLM to understand intent and plan actions.

```typescript
// Simplified flow
async function think(message: IncomingMessage): Promise<Plan> {
  // 1. Load conversation history
  const history = await conversationManager.getHistory(message.context?.conversationId);

  // 2. Load relevant context (memory, recent actions)
  const context = await memoryStore.getRelevantContext(message.content);

  // 3. Build prompt with system instructions + tools
  const prompt = buildPrompt({
    systemPrompt: SYSTEM_PROMPT,
    tools: getAvailableTools(),
    history,
    context,
    message: message.content,
  });

  // 4. Call LLM
  const response = await llm.complete(prompt);

  // 5. Parse response into plan
  return parsePlan(response);
}
```

3. ACT

Execute the planned actions using tools.

```typescript
interface Plan {
  thinking: string;   // Internal reasoning
  actions: Action[];  // Tools to call
  response?: string;  // Direct response (if no tools needed)
}

interface Action {
  tool: string;       // "computer_use", "shell", "browser", etc.
  input: Record<string, unknown>;
  reasoning: string;  // Why this action
}

// Assumed alias: each action yields the ToolResult of its tool (see Tool Interface).
type ActionResult = ToolResult;

async function act(plan: Plan): Promise<ActionResult[]> {
  const results: ActionResult[] = [];

  for (const action of plan.actions) {
    const tool = getTool(action.tool);
    const result = await tool.execute(action.input);
    results.push(result);

    // If the tool returns an error or needs clarification, stop.
    // ToolResult signals both cases through its status field.
    if (result.status === "error" || result.status === "needs_input") {
      break;
    }
  }

  return results;
}
```

4. RESPOND

Stream response back to the client.

```typescript
async function respond(
  message: IncomingMessage,
  plan: Plan,
  results: ActionResult[]
): Promise<void> {
  // If simple response (no tools)
  if (plan.response && results.length === 0) {
    await streamResponse(message.source, plan.response);
    return;
  }

  // If tools were used, summarize results
  const summary = await llm.summarize({
    originalRequest: message.content,
    actions: plan.actions,
    results,
  });

  await streamResponse(message.source, summary);
}
```

Loop Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                            AGENT LOOP                                        │
│                                                                             │
│   User: "Find all TODO comments in my code and create a task list"         │
│                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────┐  │
│   │ RECEIVE                                                              │  │
│   │                                                                      │  │
│   │ • Parse message                                                      │  │
│   │ • Load conversation context                                          │  │
│   │ • Identify source (Desktop app)                                      │  │
│   └─────────────────────────────────────────────────────────────────────┘  │
│                                    │                                        │
│                                    ▼                                        │
│   ┌─────────────────────────────────────────────────────────────────────┐  │
│   │ THINK                                                                │  │
│   │                                                                      │  │
│   │ LLM Response:                                                        │  │
│   │ "I'll search for TODO comments using grep, then format as tasks"    │  │
│   │                                                                      │  │
│   │ Plan:                                                                │  │
│   │ 1. shell: grep -rn "TODO" --include="*.ts" ./src                    │  │
│   │ 2. code: Parse results and format as markdown task list             │  │
│   │ 3. file: Write to ./TASKS.md                                        │  │
│   └─────────────────────────────────────────────────────────────────────┘  │
│                                    │                                        │
│                                    ▼                                        │
│   ┌─────────────────────────────────────────────────────────────────────┐  │
│   │ ACT                                                                  │  │
│   │                                                                      │  │
│   │ Execute tools in sequence:                                           │  │
│   │                                                                      │  │
│   │ 1. shell.execute("grep -rn TODO...")                                │  │
│   │    → Found 23 TODO comments                                         │  │
│   │                                                                      │  │
│   │ 2. code.transform(results)                                          │  │
│   │    → Formatted as markdown checklist                                │  │
│   │                                                                      │  │
│   │ 3. file.write("./TASKS.md", content)                                │  │
│   │    → Created file                                                   │  │
│   └─────────────────────────────────────────────────────────────────────┘  │
│                                    │                                        │
│                                    ▼                                        │
│   ┌─────────────────────────────────────────────────────────────────────┐  │
│   │ RESPOND                                                              │  │
│   │                                                                      │  │
│   │ "I found 23 TODO comments across your codebase and created          │  │
│   │  TASKS.md with a checklist. Here's a summary:                       │  │
│   │  - 12 in src/components                                             │  │
│   │  - 8 in src/services                                                │  │
│   │  - 3 in src/utils"                                                  │  │
│   │                                                                      │  │
│   │ Stream response back to Desktop app                                  │  │
│   └─────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
│   Loop continues for next user message...                                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
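
To show how the four stages in the diagram fit together, here is a minimal, self-contained sketch of a single pass through the loop. It is not the daemon's implementation: `runOnce`, `SketchPlan`, the stub planner (standing in for the LLM call), and the stub `shell` tool are all hypothetical names introduced for this example.

```typescript
// Hypothetical sketch of one RECEIVE -> THINK -> ACT -> RESPOND pass.
type Status = "success" | "error" | "needs_input";

interface StepResult { status: Status; output: string; }
interface PlannedAction { tool: string; input: string; }
interface SketchPlan { actions: PlannedAction[]; response?: string; }

type ToolFn = (input: string) => Promise<StepResult>;
type Planner = (content: string) => Promise<SketchPlan>;

async function runOnce(
  content: string,
  plan: Planner,
  tools: Map<string, ToolFn>,
): Promise<string> {
  // THINK: ask the planner (the LLM in the real daemon) for a plan.
  const planned = await plan(content);

  // ACT: run tools in order, stopping at the first non-success result.
  const results: StepResult[] = [];
  for (const action of planned.actions) {
    const tool = tools.get(action.tool);
    if (!tool) {
      results.push({ status: "error", output: `unknown tool: ${action.tool}` });
      break;
    }
    const result = await tool(action.input);
    results.push(result);
    if (result.status !== "success") break;
  }

  // RESPOND: direct answer if no tools ran, otherwise join the tool outputs.
  if (results.length === 0) return planned.response ?? "";
  return results.map((r) => r.output).join("\n");
}

// Stub planner and tool so the sketch runs without an API key.
const stubPlanner: Planner = async (content) =>
  content.startsWith("run ")
    ? { actions: [{ tool: "shell", input: content.slice(4) }] }
    : { actions: [], response: `echo: ${content}` };

const stubTools = new Map<string, ToolFn>([
  ["shell", async (input) => ({ status: "success", output: `ran ${input}` })],
]);
```

In the real daemon the RESPOND step would stream an LLM-written summary rather than join raw tool output; the join here only keeps the example deterministic.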

Tools

Tools are the daemon's hands. They let it interact with the world.

Core Tools

| Tool | Description | Capabilities |
|------|-------------|--------------|
| computer_use | Control mouse/keyboard, take screenshots | Click, type, scroll, screenshot, OCR |
| shell | Execute shell commands | Run any CLI command, capture output |
| browser | Automate web browsers | Navigate, click, fill forms, extract data |
| file | Read/write files | Read, write, list, search files |
| code | Code operations | Parse, transform, execute code |
| memory | Persistent storage | Store/retrieve context, learn preferences |

Tool Interface

```typescript
interface Tool {
  name: string;
  description: string;
  inputSchema: JSONSchema;

  // Execute the tool
  execute(input: unknown): Promise<ToolResult>;

  // Check if tool is available (permissions, dependencies)
  isAvailable(): Promise<boolean>;
}

interface ToolResult {
  status: "success" | "error" | "needs_input";
  output: unknown;
  error?: string;
  artifacts?: Artifact[]; // Screenshots, files created, etc.
}
```
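
The roadmap mentions a tool registry without specifying it. One possible shape is sketched below, using a reduced version of the `Tool` interface; `ToolRegistry`, `RegisteredTool`, and the `echo` tool are hypothetical names chosen for this example.

```typescript
// Hypothetical registry sketch; RegisteredTool is a trimmed-down Tool.
interface RegisteredTool {
  name: string;
  description: string;
  execute(input: unknown): Promise<{
    status: "success" | "error" | "needs_input";
    output: unknown;
    error?: string;
  }>;
}

class ToolRegistry {
  private tools = new Map<string, RegisteredTool>();

  // Reject duplicates so two tools can never answer to the same name.
  register(tool: RegisteredTool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Look up a tool by the name the plan refers to.
  get(name: string): RegisteredTool {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool;
  }

  // Names and descriptions are what would be advertised to the LLM.
  describe(): { name: string; description: string }[] {
    return Array.from(this.tools.values()).map(({ name, description }) => ({
      name,
      description,
    }));
  }
}

const registry = new ToolRegistry();
registry.register({
  name: "echo",
  description: "Return the input unchanged",
  execute: async (input) => ({ status: "success", output: input }),
});
```

Failing on unknown names at lookup time keeps a hallucinated tool name from the LLM from silently turning into a no-op.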

Computer Use Tool (Detail)

The most powerful tool: it controls the machine's mouse, keyboard, and screen directly.

```typescript
interface ComputerUseTool extends Tool {
  name: "computer_use";

  actions: {
    screenshot(): Promise<{ image: Buffer; dimensions: Size }>;
    click(x: number, y: number): Promise<void>;
    doubleClick(x: number, y: number): Promise<void>;
    rightClick(x: number, y: number): Promise<void>;
    type(text: string): Promise<void>;
    keyPress(key: string, modifiers?: string[]): Promise<void>;
    scroll(direction: "up" | "down", amount: number): Promise<void>;
    moveMouse(x: number, y: number): Promise<void>;
    drag(fromX: number, fromY: number, toX: number, toY: number): Promise<void>;
  };
}
```

Implementation options:

  • macOS: AppleScript + Accessibility APIs
  • Cross-platform: nut-tree/nut.js
  • MLX grounding model for visual understanding

Shell Tool (Detail)

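Executes shell commands and captures their output. Sandboxed execution is a goal rather than a current feature (see Sandboxing (Future) under Security).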

```typescript
interface ShellTool extends Tool {
  name: "shell";

  execute(input: {
    command: string;
    cwd?: string;
    timeout?: number;
    env?: Record<string, string>;
  }): Promise<{
    stdout: string;
    stderr: string;
    exitCode: number;
  }>;
}
```

Safety considerations:

  • Timeout all commands
  • Sandbox dangerous operations
  • Log all executions
  • Require confirmation for destructive commands
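
Two of the considerations above (timeouts and confirmation for destructive commands) can be sketched as a pre-flight check that runs before a command is executed. The pattern list, the timeout constants, and the `reviewCommand` function are assumptions made for illustration; a pattern list like this is a convenience, not a security boundary.

```typescript
// Hypothetical pre-flight check. Patterns are illustrative, not exhaustive.
const DESTRUCTIVE_PATTERNS: RegExp[] = [
  /\brm\b/,                          // file deletion
  /\bsudo\b/,                        // privilege escalation
  /\bmkfs\b/,                        // filesystem creation
  /\bdd\b.*\bof=/,                   // raw writes to a device or file
  /\bgit\s+push\s+.*--force\b/,      // history rewrite on a remote
];

const DEFAULT_TIMEOUT_MS = 30000;  // assumed default
const MAX_TIMEOUT_MS = 300000;     // assumed upper bound

interface ShellRequest {
  command: string;
  timeout?: number; // milliseconds
}

interface ShellDecision {
  needsConfirmation: boolean;
  timeoutMs: number;
}

function reviewCommand(request: ShellRequest): ShellDecision {
  const needsConfirmation = DESTRUCTIVE_PATTERNS.some((pattern) =>
    pattern.test(request.command),
  );
  // Every command gets a timeout, and callers cannot exceed the maximum.
  const timeoutMs = Math.min(request.timeout ?? DEFAULT_TIMEOUT_MS, MAX_TIMEOUT_MS);
  return { needsConfirmation, timeoutMs };
}
```

A command flagged here would map naturally onto a `needs_input` tool result, leaving the decision with the user instead of the model.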

Browser Tool (Detail)

Automate web interactions.

```typescript
interface BrowserTool extends Tool {
  name: "browser";

  actions: {
    navigate(url: string): Promise<void>;
    click(selector: string): Promise<void>;
    type(selector: string, text: string): Promise<void>;
    screenshot(): Promise<Buffer>;
    extractText(selector?: string): Promise<string>;
    waitFor(selector: string, timeout?: number): Promise<void>;
    evaluate(script: string): Promise<unknown>;
  };
}
```

Implementation: Playwright or Puppeteer


Communication

IPC Server (Local)

Desktop and CLI communicate with the daemon via a Unix socket.

~/.bot0/daemon.sock

Protocol: JSON-RPC over Unix socket

```typescript
// Request
{
  "jsonrpc": "2.0",
  "id": "msg_123",
  "method": "chat",
  "params": {
    "message": "Hello",
    "conversationId": "conv_abc"
  }
}

// Response (streaming)
{
  "jsonrpc": "2.0",
  "id": "msg_123",
  "result": {
    "type": "stream",
    "content": "Hello! How can I help you today?"
  }
}
```
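
A Unix socket delivers a byte stream, so the daemon and its clients need some way to tell where one JSON-RPC message ends and the next begins. The document does not say how messages are framed; the sketch below assumes one JSON object per line, which works because `JSON.stringify` never emits a raw newline. The function names are hypothetical.

```typescript
// Hypothetical framing sketch: newline-delimited JSON is an assumption.
interface RpcRequest {
  jsonrpc: "2.0";
  id: string;
  method: string;
  params?: Record<string, unknown>;
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: string;
  result?: unknown;
  error?: { code: number; message: string };
}

// Serialize one message and terminate it with a newline.
function encodeMessage(message: RpcRequest | RpcResponse): string {
  return JSON.stringify(message) + "\n";
}

// Split buffered socket data into complete messages plus any trailing partial line.
function decodeFrames(buffer: string): { messages: unknown[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // last element is an incomplete line (or "")
  const messages = lines
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as unknown);
  return { messages, rest };
}

// Narrow an unknown decoded value to a request before dispatching it.
function isRpcRequest(value: unknown): value is RpcRequest {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return v.jsonrpc === "2.0" && typeof v.id === "string" && typeof v.method === "string";
}
```

The caller keeps `rest` and prepends it to the next chunk read from the socket, so a message split across two reads is still parsed exactly once.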

Hub Connection (Remote)

WebSocket to Hub for remote commands.

```typescript
interface HubConnection {
  url: string;               // wss://hub.bot0.dev
  token: string;             // JWT from Bytespace auth
  reconnect: boolean;        // Auto-reconnect on disconnect
  heartbeatInterval: number; // Send heartbeat every N seconds
}
```

Messages from Hub:

  • Task dispatch from other daemons
  • Commands from Telegram/Slack/WhatsApp
  • System messages (updates, alerts)
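
The `reconnect` and `heartbeatInterval` fields imply some policy for when to retry and when to treat a connection as dead, which the design leaves open. One common approach, sketched here with assumed constants and hypothetical function names, is capped exponential backoff plus a missed-heartbeat threshold.

```typescript
// Hypothetical policy sketch; all constants are assumptions.

// Delay before reconnect attempt N (0-based): 1s, 2s, 4s, ... capped at maxMs.
function reconnectDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  const exponent = Math.min(Math.max(attempt, 0), 30); // keep the power bounded
  return Math.min(baseMs * Math.pow(2, exponent), maxMs);
}

// Treat the connection as stale once `missedAllowed` heartbeat intervals
// have passed without hearing from the Hub.
function isConnectionStale(
  lastSeenMs: number,
  nowMs: number,
  heartbeatIntervalSec: number,
  missedAllowed = 2,
): boolean {
  return nowMs - lastSeenMs > heartbeatIntervalSec * 1000 * missedAllowed;
}
```

Adding random jitter to the delay would be a sensible refinement if many daemons could reconnect to the Hub at the same moment; it is omitted here to keep the arithmetic easy to follow.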

Data Storage

All daemon data lives in ~/.bot0/:

~/.bot0/
├── config.json           # User configuration
├── daemon.pid            # PID file (running daemon)
├── daemon.sock           # Unix socket
├── secrets.enc           # Encrypted API keys
├── logs/
│   ├── daemon.log        # Main log file
│   └── daemon.log.1      # Rotated logs
├── conversations/
│   └── {id}.json         # Conversation history
├── memory/
│   ├── context.json      # Learned context
│   └── preferences.json  # User preferences
└── cache/
    └── ...               # Temporary files

Configuration

```typescript
interface DaemonConfig {
  // Identity
  name: string;

  // Connection
  hubUrl: string;

  // Features
  features: {
    computerUse: boolean;
    proactiveEngine: boolean;
  };

  // Limits
  limits: {
    maxConcurrentTasks: number;
    maxConversationHistory: number;
    maxToolExecutionTime: number;
  };

  // Logging
  logLevel: "debug" | "info" | "warn" | "error";
}
```
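
Startup loads `config.json` or falls back to defaults. A partial file is the likely common case, so the sketch below merges user overrides over defaults one nesting level deep. The default values, the units, and the names `SketchConfig` and `resolveConfig` are placeholders invented for this example; the document does not specify them.

```typescript
// Hypothetical sketch: default values and units below are placeholders.
interface SketchConfig {
  name: string;
  hubUrl: string;
  features: { computerUse: boolean; proactiveEngine: boolean };
  limits: {
    maxConcurrentTasks: number;
    maxConversationHistory: number;
    maxToolExecutionTime: number; // assumed to be milliseconds
  };
  logLevel: "debug" | "info" | "warn" | "error";
}

type ConfigOverrides = Partial<Omit<SketchConfig, "features" | "limits">> & {
  features?: Partial<SketchConfig["features"]>;
  limits?: Partial<SketchConfig["limits"]>;
};

const DEFAULTS: SketchConfig = {
  name: "my-machine",
  hubUrl: "wss://hub.bot0.dev",
  features: { computerUse: false, proactiveEngine: false },
  limits: {
    maxConcurrentTasks: 3,
    maxConversationHistory: 100,
    maxToolExecutionTime: 60000,
  },
  logLevel: "info",
};

// `raw` is the text of config.json, or null when the file does not exist.
function resolveConfig(raw: string | null): SketchConfig {
  const user: ConfigOverrides = raw ? JSON.parse(raw) : {};
  return {
    ...DEFAULTS,
    ...user,
    // Merge nested sections key by key so a partial override keeps the rest.
    features: { ...DEFAULTS.features, ...user.features },
    limits: { ...DEFAULTS.limits, ...user.limits },
  };
}
```

Merging the nested sections separately matters: a plain top-level spread would let `{"limits": {"maxConcurrentTasks": 8}}` erase the other two limits.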

Lifecycle

Startup

1. Check if already running (read PID file)
   └─ If running, exit with message

2. Load configuration
   └─ Create ~/.bot0 if not exists
   └─ Load config.json or use defaults

3. Initialize core services
   └─ Logger
   └─ Secrets store
   └─ LLM client

4. Start IPC server
   └─ Create Unix socket at ~/.bot0/daemon.sock
   └─ Begin accepting connections

5. Connect to Hub (if configured)
   └─ Authenticate with JWT
   └─ Register daemon identity
   └─ Start heartbeat

6. Write PID file
   └─ ~/.bot0/daemon.pid

7. Log "Daemon started"
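
Step 1 has a subtlety: a PID file can outlive a daemon that crashed, so the file's existence alone does not prove a daemon is running. The sketch below separates that decision from file and process access so it can be tested in isolation. `checkPidFile` and `PidStatus` are hypothetical names; in Node.js the `isAlive` callback could be built on `process.kill(pid, 0)`, which sends no signal and throws if the process does not exist.

```typescript
// Hypothetical sketch of the "already running?" decision in step 1.
interface PidStatus {
  state: "not_running" | "stale" | "running";
  pid?: number;
}

// `contents` is the text of daemon.pid, or null when the file is missing.
// `isAlive` reports whether a process with the given PID currently exists.
function checkPidFile(
  contents: string | null,
  isAlive: (pid: number) => boolean,
): PidStatus {
  if (contents === null) return { state: "not_running" };

  const trimmed = contents.trim();
  // A PID file that is not a positive integer is left over from a bad write.
  if (!/^\d+$/.test(trimmed)) return { state: "stale" };
  const pid = Number(trimmed);
  if (pid <= 0) return { state: "stale" };

  // Live process: exit with a message. Dead process: safe to clean up and start.
  return isAlive(pid) ? { state: "running", pid } : { state: "stale", pid };
}
```

On `stale`, the daemon would remove the old PID and socket files and continue with startup; on `running`, it would exit as step 1 describes.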

Shutdown

1. Receive SIGTERM or SIGINT

2. Stop accepting new tasks

3. Wait for active tasks to complete (with timeout)

4. Disconnect from Hub
   └─ Send "going offline" message

5. Close IPC server
   └─ Notify connected clients

6. Cleanup
   └─ Remove PID file
   └─ Remove socket file
   └─ Flush logs

7. Exit
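
Step 3 (wait for active tasks, with a timeout) is the part of shutdown most likely to hang, so it is worth sketching. The helper below races the outstanding tasks against a timer and reports which one finished first. `drainWithTimeout` is a hypothetical name, and the document does not specify the timeout value.

```typescript
// Hypothetical sketch of shutdown step 3.
function drainWithTimeout(
  tasks: Promise<unknown>[],
  timeoutMs: number,
): Promise<"drained" | "timed_out"> {
  // A failed task still counts as finished: it must not abort shutdown.
  const settled = Promise.all(
    tasks.map((task) => task.then(() => undefined, () => undefined)),
  );

  const timeout = new Promise<"timed_out">((resolve) => {
    const timer = setTimeout(() => resolve("timed_out"), timeoutMs);
    // Cancel the timer once tasks settle so it cannot keep the process alive.
    settled.then(() => clearTimeout(timer));
  });

  return Promise.race([settled.then(() => "drained" as const), timeout]);
}
```

On `timed_out` the daemon would proceed with steps 4 to 7 anyway, ideally logging which tasks were abandoned.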

Running as Service

macOS (launchd):

```xml
<!-- ~/Library/LaunchAgents/com.bot0.daemon.plist -->
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.bot0.daemon</string>
  <key>ProgramArguments</key>
  <array>
    <string>/usr/local/bin/bot0-daemon</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
```

Linux (systemd):

```ini
# ~/.config/systemd/user/bot0-daemon.service
[Unit]
Description=bot0 Daemon
After=network.target

[Service]
ExecStart=/usr/local/bin/bot0-daemon
Restart=always

[Install]
WantedBy=default.target
```

Security

Permissions

The daemon runs with user permissions. It can do anything the user can do.

Sensitive operations require confirmation:

  • Sending emails
  • Deleting files
  • Running commands with sudo
  • Accessing sensitive APIs

Secrets Storage

API keys are stored encrypted at ~/.bot0/secrets.enc:

  • Anthropic API key
  • Bytespace auth token
  • Third-party API keys

Encryption: OS keychain (macOS Keychain, Linux Secret Service) or encrypted file with user passphrase.

Sandboxing (Future)

Consider sandboxing tool execution:

  • Shell commands in container/VM
  • Browser in separate profile
  • File access restricted to certain directories

Roadmap

Phase 1: Foundation

  • IPC server (Unix socket)
  • Configuration management
  • Lifecycle (start/stop/status)
  • Logging
  • PID file management

Phase 2: Agent Core ← IMPLEMENTED

  • LLM integration (Anthropic SDK)
  • Tool registry and execution
  • Basic agent loop with tools
  • HTTP test server
  • Streaming responses

See: Agent Loop Documentation

Phase 3: Tools ← IN PROGRESS

  • Shell tool
  • File tools (read, write, list, glob)
  • Code tool
  • Memory tool

Phase 4: Computer Use

  • Screenshot capture
  • Mouse/keyboard control
  • Visual grounding (MLX model)

Phase 5: Hub Connection

  • WebSocket client
  • Authentication
  • Remote task execution
  • Heartbeat/presence

Phase 6: Proactive Engine

  • Observation layer
  • Evaluation/thinking
  • Permission system
  • Autonomous actions

Open Questions

  1. LLM choice: Always Anthropic, or support multiple providers?

  2. Tool approval: How to handle tools that need user confirmation?

  3. Conversation storage: How long to keep history? Local vs cloud?

  4. Multi-daemon: Can one user run multiple daemons on different machines?

  5. Offline mode: What can the daemon do without internet?

  6. Resource limits: How to prevent runaway tasks from consuming all resources?


References