# ctx0 — System Architecture

## Overview
ctx0 is a context management system that gives AI agents persistent memory. It works with any agent (bot0, Claude Code, Cursor, Gemini CLI, etc.) and provides a unified way to capture, store, retrieve, and organize context across sessions.
**Core insight:** ctx0 is not a separate AI — it's bot0 configured to manage a context "codebase". Same engine, specialized purpose.

```
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ Claude Code = Agent Loop + Code Repo + Coding Tools │
│ ctx0 = Agent Loop + Context Store + Memory Tools │
│ │
│ Same pattern. Different purpose. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```

📖 Related documentation:
- bot0 Architecture — The agent runtime
- bot0 Proactive Engine — Autonomous agent capabilities
## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│ ctx0 SYSTEM │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ ANY MAIN AGENT │ │
│ │ bot0 │ Claude Code │ Cursor │ Gemini CLI │ OpenCode │ etc. │ │
│ │ │ │
│ │ Has: Session Tracker (builds changelog) │ │
│ │ Has: Extractor tool (retrieves context) │ │
│ │ Triggers: ctx0 dump on compaction/idle/clear │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────┼───────────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ EXTRACTOR │ │ LIBRARIAN │ │ CURATOR │ │
│ │ (retrieve) │ │ (archive) │ │ (organize) │ │
│ │ │ │ │ │ │ │
│ │ On-demand │ │ On dump trigger │ │ Periodic/sync │ │
│ │ context fetch │ │ session→vault │ │ vault cleanup │ │
│ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │
│ │ │ │ │
│ └─────────────────────┼─────────────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ ctx0 STORAGE (Supabase) │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │
│ │ │ PostgreSQL │ │ Supabase │ │ pgvector │ │ │
│ │ │ │ │ Storage │ │ │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ • Tree struct │ │ • Files │ │ • Embeddings │ │ │
│ │ │ • Metadata │ │ • Attachments │ │ • Semantic srch │ │ │
│ │ │ • SQL queries │ │ • Large data │ │ • Deduplication │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │
│ │ │ │
│ │ Self-hostable: Supabase Cloud OR self-hosted instance │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────┴─────────────┐ │
│ │ SYNC LAYER │ │
│ │ (conflict resolution) │ │
│ └─────────────┬─────────────┘ │
│ │ │
│ ┌────────────────────┴────────────────────┐ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ E2B SANDBOX │ │ LOCAL DAEMON │ │
│ │ (cloud, ephemeral)│ OR │ (on-prem) │ │
│ │ │ │ │ │
│ │ Spun up per op │ │ Always available │ │
│ │ Fully isolated │ │ Lower latency │ │
│ └──────────┬──────────┘ └──────────┬──────────┘ │
│ │ │ │
│ └──────────────┬───────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ CODEBASE EXPORT │ │
│ │ │ │
│ │ Generates a crawlable folder structure from the database │ │
│ │ Agents can use standard file tools (read, grep, glob) │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```

## 1. Trigger System

ctx0 dumps are triggered automatically and work with any agent that supports hooks.
### Trigger Events
| Trigger | When | Priority |
|---|---|---|
| Pre-compaction | Before context window compacts | High — capture before data is lost |
| Idle timeout | Chat idle for 5 minutes | Medium — natural pause point |
| Session end | User closes chat or clears context | High — session complete |
| New session | User starts new chat | Medium — archive previous |
| Manual | User runs /archive or similar | High — explicit intent |
| Task complete | Agent signals task done | High — natural checkpoint |
### Hook Implementation

```typescript
// Universal trigger interface — adapters for each agent
interface Ctx0Trigger {
  type: 'compaction' | 'idle' | 'session_end' | 'new_session' | 'manual' | 'task_complete'
  sessionId: string
  timestamp: number
}

// bot0 implementation
class Bot0Ctx0Hook {
  private idleTimer: NodeJS.Timeout | null = null

  // Hook into agent loop
  onMessageComplete() {
    this.resetIdleTimer()
  }

  onCompactionStart() {
    // Dump BEFORE compaction happens
    this.triggerDump('compaction')
  }

  onSessionEnd() {
    this.triggerDump('session_end')
  }

  private resetIdleTimer() {
    if (this.idleTimer) clearTimeout(this.idleTimer)
    this.idleTimer = setTimeout(() => {
      this.triggerDump('idle')
    }, 5 * 60 * 1000) // 5 minutes
  }

  private async triggerDump(type: Ctx0Trigger['type']) {
    const session = this.sessionTracker.export()
    // Fill in the remaining trigger fields from the tracked session
    const trigger: Ctx0Trigger = { type, sessionId: session.sessionId, timestamp: Date.now() }
    await ctx0.librarian.archive(session, trigger)
  }
}

// Claude Code adapter (via CLAUDE.md instruction + MCP)
// Gemini CLI adapter (via config)
// Cursor adapter (via rules file)
```
### Agent-Specific Configuration

```yaml
# ctx0 trigger config for different agents
bot0:
  hooks:
    compaction: native      # Built-in hook
    idle_timeout: 300       # 5 minutes
    session_end: native
  tracker: native           # Built-in session tracker

claude_code:
  hooks:
    compaction: mcp         # Via MCP server
    idle_timeout: 300
    session_end: mcp
  tracker: mcp              # MCP-based tracker
  instruction: |
    # Add to CLAUDE.md
    You have access to ctx0 memory.
    Before any context compaction, call the ctx0_dump tool.
    When finishing a task, call ctx0_archive.

cursor:
  hooks:
    compaction: extension   # VS Code extension
    idle_timeout: 300
    session_end: extension
  tracker: extension
  rules_file: .cursorrules

gemini_cli:
  hooks:
    compaction: plugin
    idle_timeout: 300
    session_end: plugin
  tracker: plugin
```
## 2. Session Tracker
Every agent runs a Session Tracker that builds a detailed changelog programmatically. This runs in the background during the session.
### What Gets Tracked

```typescript
interface SessionLog {
  sessionId: string
  startedAt: number
  endedAt?: number

  // Conversation
  messages: Message[]

  // File operations
  filesRead: FileAccess[]
  filesWritten: FileWrite[]
  filesModified: FileEdit[]

  // Tool usage
  toolCalls: ToolCall[]

  // Subagents
  subagents: SubagentRun[]

  // External interactions
  apiCalls: ApiCall[]
  webSearches: WebSearch[]

  // Data touched
  dataAccessed: DataAccess[]

  // Outcomes
  tasksCompleted: Task[]
  decisions: Decision[]
  artifacts: Artifact[]

  // Metadata
  agent: string // 'bot0', 'claude_code', etc.
  context: {
    workingDir: string
    project?: string
    user: string
  }
}

interface FileWrite {
  path: string
  content: string
  timestamp: number
  reason?: string
}

interface ToolCall {
  tool: string
  input: any
  output: any
  timestamp: number
  duration: number
}

interface SubagentRun {
  type: string // 'explore', 'plan', 'execute'
  prompt: string
  result: any
  toolCalls: ToolCall[]
  timestamp: number
  duration: number
}

interface Decision {
  description: string
  reasoning: string
  alternatives?: string[]
  timestamp: number
}
```
### Session Tracker Implementation

```typescript
class SessionTracker {
  private log: SessionLog

  constructor(sessionId: string, agent: string) {
    this.log = {
      sessionId,
      agent,
      startedAt: Date.now(),
      messages: [],
      filesRead: [],
      filesWritten: [],
      filesModified: [],
      toolCalls: [],
      subagents: [],
      apiCalls: [],
      webSearches: [],
      dataAccessed: [],
      tasksCompleted: [],
      decisions: [],
      artifacts: [],
      context: {}
    }
  }

  // Called by agent loop
  trackMessage(message: Message) {
    this.log.messages.push(message)
  }

  trackToolCall(call: ToolCall) {
    this.log.toolCalls.push(call)

    // Auto-categorize
    if (call.tool === 'read_file') {
      this.log.filesRead.push({ path: call.input.path, timestamp: call.timestamp })
    }
    if (call.tool === 'write_file') {
      this.log.filesWritten.push({
        path: call.input.path,
        content: call.input.content,
        timestamp: call.timestamp
      })
    }
    // ... etc
  }

  trackSubagent(run: SubagentRun) {
    this.log.subagents.push(run)
  }

  trackDecision(decision: Decision) {
    this.log.decisions.push(decision)
  }

  // Export for dump
  export(): SessionLog {
    this.log.endedAt = Date.now()
    return this.log
  }

  // Serialize for storage
  serialize(): string {
    return JSON.stringify(this.log, null, 2)
  }
}
```
## 3. Subagents
### 3.1 Extractor (Retrieve)
The Extractor is called by the main agent when it needs context from the vault. It crawls the vault, runs semantic search, and returns relevant information.
```typescript
// Extractor — retrieves context on demand
const EXTRACTOR_PROMPT = `
You are the ctx0 Extractor. Your job is to find relevant context
from the user's memory vault.

## Your capabilities
- Search the vault using file operations (read, glob, grep)
- Run semantic searches using the vector index
- Execute learned queries for common patterns
- Build and return focused context bundles

## Your process
1. Understand what context is needed
2. Check learned queries for a match (fast path)
3. If no match, search the vault intelligently
4. Return concise, relevant context
5. Optionally, save this as a new learned query

## Output format
Return a structured context bundle:
- summary: Brief overview of what you found
- items: Array of relevant memories/files
- sources: Paths to source files
- confidence: How confident you are this is complete

## Rules
- Be concise — the main agent has limited context
- Prioritize recency and relevance
- Include source paths so main agent can dig deeper
- If you create a new learned query, note it
`

class Extractor {
  private agent: Bot0Agent

  constructor() {
    this.agent = new Bot0Agent({
      systemPrompt: EXTRACTOR_PROMPT,
      tools: [
        'read_file', 'glob', 'grep',
        'vector_search',
        'run_learned_query', 'save_learned_query'
      ],
      maxTokens: 4096 // Keep responses focused
    })
  }

  async retrieve(query: string, scope?: string[]): Promise<ContextBundle> {
    const result = await this.agent.run(`
      Find context for: "${query}"
      Scope: ${scope?.join(', ') || 'all'}
    `)
    return this.parseResult(result)
  }
}
```
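The `ContextBundle` returned by `retrieve()` is referenced but never defined above. A minimal sketch, inferred from the output format described in `EXTRACTOR_PROMPT`; field names beyond those four are assumptions:

```typescript
// Hypothetical ContextBundle shape. It mirrors the "Output format"
// section of EXTRACTOR_PROMPT and is not a confirmed ctx0 type.
interface ContextItem {
  path: string      // Vault path the memory came from
  excerpt: string   // Relevant slice of the content
}

interface ContextBundle {
  summary: string                        // Brief overview of what was found
  items: ContextItem[]                   // Relevant memories/files
  sources: string[]                      // Paths the main agent can read for depth
  confidence: 'low' | 'medium' | 'high'  // How complete the bundle likely is
  learnedQuery?: string                  // Name of a newly saved learned query, if any
}
```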
### Learned Queries
The Extractor learns common queries and saves them for fast retrieval:
```typescript
interface LearnedQuery {
  id: string
  name: string
  naturalLanguage: string[] // Variations that match
  queryPlan: QueryPlan      // How to execute
  params?: ParamDef[]
  useCount: number
  avgLatencyMs: number
  createdAt: number
  lastUsed: number
}

interface QueryPlan {
  steps: QueryStep[]
}

type QueryStep =
  | { type: 'glob', pattern: string }
  | { type: 'grep', pattern: string, paths: string }
  | { type: 'vector_search', query: string, collection: string }
  | { type: 'read', path: string }
  | { type: 'filter', condition: string }
  | { type: 'join', on: string }

// Example learned query
const investorEmailsQuery: LearnedQuery = {
  id: 'lq_001',
  name: 'investor_emails',
  naturalLanguage: [
    'investor emails',
    'emails from investors',
    'what have investors said',
    'investor communications'
  ],
  queryPlan: {
    steps: [
      { type: 'glob', pattern: 'vault/contacts/*.md' },
      { type: 'grep', pattern: 'tags:.*investor', paths: '$prev' },
      { type: 'read', path: '$prev' }, // Get contact emails
      { type: 'vector_search', query: 'emails from $contacts', collection: 'syncs.gmail' },
      { type: 'filter', condition: 'date > NOW() - :days days' }
    ]
  },
  params: [{ name: 'days', type: 'number', default: 7 }],
  useCount: 47,
  avgLatencyMs: 120,
  createdAt: 1706400000,
  lastUsed: 1706486400
}
```
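The `$prev` placeholders imply that steps are chained, each consuming the previous step's output. A minimal interpreter sketch; the `globFiles`, `grepFiles`, `readFiles`, `vectorSearch`, `applyFilter`, and `joinResults` helpers are assumptions, not part of ctx0:

```typescript
// Sketch of a QueryPlan interpreter. '$prev' in a step means "use the
// previous step's output". All helper functions here are assumed.
async function runQueryPlan(plan: QueryPlan, params: Record<string, unknown> = {}): Promise<unknown> {
  let prev: unknown = null

  for (const step of plan.steps) {
    switch (step.type) {
      case 'glob':          // Expand a glob into matching vault paths
        prev = await globFiles(step.pattern)
        break
      case 'grep':          // Filter paths by content match
        prev = await grepFiles(step.pattern, step.paths === '$prev' ? (prev as string[]) : [step.paths])
        break
      case 'read':          // Read file contents
        prev = await readFiles(step.path === '$prev' ? (prev as string[]) : [step.path])
        break
      case 'vector_search': // Semantic search over a collection
        prev = await vectorSearch(step.query, step.collection, prev)
        break
      case 'filter':        // Apply a condition, substituting :params
        prev = applyFilter(prev, step.condition, params)
        break
      case 'join':          // Combine result sets on a shared key
        prev = joinResults(prev, step.on)
        break
    }
  }
  return prev
}
```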
### 3.2 Librarian (Archive)
The Librarian is triggered on dump. It takes the full session log and archives it properly.
```typescript
const LIBRARIAN_PROMPT = `
You are the ctx0 Librarian. Your job is to archive completed sessions
into the vault.

## Your input
You receive a complete session log containing:
- All messages from the conversation
- All files read, written, modified
- All tool calls and their results
- All subagent runs
- Decisions made
- Artifacts created

## Your process
1. Read and understand the session
2. Extract key information:
   - Facts learned (new information about the world)
   - Decisions made (and their reasoning)
   - Contact updates (new info about people)
   - Project updates (progress, blockers, changes)
   - Preferences expressed (how user likes things)
   - Tasks completed
3. Update existing vault entries (don't duplicate)
4. Create new entries where needed
5. Write a session summary to the archive

## Rules
- UPDATE existing files, don't create duplicates
- Keep vault entries concise and scannable
- Always include dates on new information
- Link related items (project ↔ contact ↔ decision)
- Don't archive transient/ephemeral information
- Preserve important reasoning and context
`

class Librarian {
  private agent: Bot0Agent

  constructor() {
    this.agent = new Bot0Agent({
      systemPrompt: LIBRARIAN_PROMPT,
      tools: [
        'read_file', 'write_file', 'edit_file',
        'glob', 'grep',
        'create_entry', 'update_entry', 'link_entries'
      ],
      maxTokens: 16000 // Needs space to process large sessions
    })
  }

  async archive(session: SessionLog, trigger: Ctx0Trigger): Promise<ArchiveResult> {
    // First, write raw session to workspace
    const sessionPath = `workspace/sessions/${session.sessionId}/`
    await this.writeSessionRaw(sessionPath, session)

    // Then, have agent process and archive
    // (endedAt is set by SessionTracker.export() before archive is called)
    const result = await this.agent.run(`
      Archive the session at ${sessionPath}

      Trigger: ${trigger.type}
      Session ID: ${session.sessionId}
      Duration: ${session.endedAt! - session.startedAt}ms
      Messages: ${session.messages.length}
      Tool calls: ${session.toolCalls.length}
      Files modified: ${session.filesModified.length}
      Decisions: ${session.decisions.length}

      Process this session and update the vault accordingly.
    `)

    return this.parseResult(result)
  }
}
```
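`ArchiveResult` is likewise undefined above. One plausible shape, inferred from the Librarian's process; all field names are assumptions:

```typescript
// Hypothetical ArchiveResult shape, inferred from the Librarian's
// responsibilities. Not a confirmed ctx0 type.
interface ArchiveResult {
  sessionId: string
  entriesUpdated: string[]  // Vault paths edited in place
  entriesCreated: string[]  // New vault paths
  linksAdded: Array<{ from: string; to: string }>
  summaryPath: string       // Where the session summary was written
}
```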
### 3.3 Curator (Organize)
The Curator runs periodically to maintain vault health. It processes syncs, dedupes, links, and organizes.
```typescript
const CURATOR_PROMPT = `
You are the ctx0 Curator. Your job is to maintain and organize the vault.

## Your responsibilities
1. Process incoming syncs (gmail, slack, calendar, etc.)
2. Deduplicate entries (merge "Sarah" and "Sarah Chen")
3. Link related items (connect decisions to projects)
4. Archive old/stale content
5. Update embeddings for new content
6. Clean up orphaned files
7. Maintain vault statistics

## When you run
- Hourly: Process new syncs
- Daily: Deduplication pass
- Weekly: Full organization pass
- On-demand: User request

## Rules
- Never delete without archiving first
- Preserve all version history
- Maintain referential integrity (links)
- Log all changes for audit trail
`

class Curator {
  private agent: Bot0Agent

  constructor() {
    this.agent = new Bot0Agent({
      systemPrompt: CURATOR_PROMPT,
      tools: [
        'read_file', 'write_file', 'edit_file', 'move_file',
        'glob', 'grep',
        'vector_search', 'generate_embedding',
        'merge_entries', 'link_entries', 'archive_entry'
      ],
      maxTokens: 32000 // Heavy processing
    })
  }

  // Scheduled runs
  async runHourly() {
    await this.agent.run('Process all pending syncs and update vault.')
  }

  async runDaily() {
    await this.agent.run('Run deduplication pass across all contacts and entries.')
  }

  async runWeekly() {
    await this.agent.run('Full organization: dedupe, link, archive stale, update stats.')
  }
}
```
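The hourly/daily/weekly cadence can be wired to any scheduler. A sketch using node-cron (the library choice is an assumption; ctx0 does not prescribe one):

```typescript
import cron from 'node-cron'

const curator = new Curator()

// Hourly sync processing, at minute 0 of every hour
cron.schedule('0 * * * *', () => curator.runHourly())

// Daily deduplication pass at 03:00
cron.schedule('0 3 * * *', () => curator.runDaily())

// Weekly full organization pass, Sundays at 04:00
cron.schedule('0 4 * * 0', () => curator.runWeekly())
```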
## 4. Storage Backend

ctx0 needs storage that supports versioning, vectors, buckets (file storage), folder structures, and arbitrary data types.
### Recommended: Supabase (PostgreSQL + Storage + pgvector)
Why Supabase:
- PostgreSQL is battle-tested and universal
- pgvector is native — no separate vector service
- S3-compatible storage built-in
- Real-time subscriptions
- Row-level security for multi-tenant
- Open source — can self-host
- Learned queries are just SQL (no translation layer)

```
┌─────────────────────────────────────────────────────────────────────────────┐
│ ctx0 STORAGE STACK │
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ SUPABASE │ │
│ │ │ │
│ │ PostgreSQL │ │
│ │ ├── entries (tree structure, metadata, frontmatter) │ │
│ │ ├── sessions (session logs, status) │ │
│ │ ├── learned_queries (SQL templates) │ │
│ │ ├── sync_connections (service credentials) │ │
│ │ ├── sync_data (raw sync data) │ │
│ │ └── embeddings (pgvector) │ │
│ │ │ │
│ │ Storage (S3-compatible) │ │
│ │ ├── vault/{user}/*.md (file contents) │ │
│ │ ├── sessions/{user}/{id}/ (session archives) │ │
│ │ └── syncs/{user}/{service}/ (raw sync dumps) │ │
│ │ │ │
│ │ Real-time │ │
│ │ └── Subscriptions for live sync │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ Self-hosting: Users can run their own Supabase instance │
│ or use Supabase Cloud (managed) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Supabase Schema
```sql
-- Enable pgvector
CREATE EXTENSION IF NOT EXISTS vector;

-- Users
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email TEXT UNIQUE NOT NULL,
  name TEXT,
  settings JSONB DEFAULT '{}',
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Vault entries (tree structure)
CREATE TABLE entries (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id) ON DELETE CASCADE,

  -- Tree structure
  path TEXT NOT NULL,         -- "/contacts/sarah-chen"
  parent_path TEXT NOT NULL,  -- "/contacts"
  name TEXT NOT NULL,         -- "sarah-chen"
  entry_kind TEXT NOT NULL,   -- "file" | "folder"

  -- Content (for files)
  content_ref TEXT,           -- Storage bucket reference
  content_preview TEXT,       -- First 500 chars for quick access
  content_hash TEXT,          -- For change detection

  -- Metadata
  entry_type TEXT,            -- "contact" | "project" | "note" | etc.
  tags TEXT[] DEFAULT '{}',
  frontmatter JSONB,          -- Parsed YAML frontmatter

  -- Relationships
  links UUID[] DEFAULT '{}',  -- Linked entry IDs

  -- Versioning
  version INT DEFAULT 1,

  -- Vector embedding (pgvector)
  embedding VECTOR(1536),

  -- Timestamps
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW(),
  accessed_at TIMESTAMPTZ DEFAULT NOW(),
  access_count INT DEFAULT 0,

  -- Constraints
  UNIQUE(user_id, path)
);

-- Entry versions (history)
CREATE TABLE entry_versions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  entry_id UUID REFERENCES entries(id) ON DELETE CASCADE,
  version INT NOT NULL,
  content_ref TEXT NOT NULL,
  changed_at TIMESTAMPTZ DEFAULT NOW(),
  changed_by TEXT,            -- Agent or user identifier
  change_summary TEXT
);

-- Sessions (workspace)
CREATE TABLE sessions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id) ON DELETE CASCADE,
  session_id TEXT NOT NULL,
  agent TEXT NOT NULL,        -- "bot0" | "claude_code" | etc.

  -- Session data
  log_ref TEXT,               -- Storage reference to full log
  summary TEXT,

  -- Stats
  message_count INT DEFAULT 0,
  tool_call_count INT DEFAULT 0,
  files_modified INT DEFAULT 0,

  -- Status
  status TEXT DEFAULT 'active', -- "active" | "archived" | "processing"

  -- Timestamps
  started_at TIMESTAMPTZ DEFAULT NOW(),
  ended_at TIMESTAMPTZ,
  archived_at TIMESTAMPTZ,

  UNIQUE(user_id, session_id)
);

-- Learned queries (SQL templates)
CREATE TABLE learned_queries (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id) ON DELETE CASCADE,
  name TEXT NOT NULL,
  natural_language TEXT[] NOT NULL, -- Variations that match
  sql_template TEXT NOT NULL,       -- Actual SQL with :params
  params JSONB,                     -- Param definitions

  -- Usage stats
  use_count INT DEFAULT 0,
  avg_latency_ms INT,
  last_used TIMESTAMPTZ,
  created_at TIMESTAMPTZ DEFAULT NOW(),

  UNIQUE(user_id, name)
);

-- Sync connections
CREATE TABLE sync_connections (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id) ON DELETE CASCADE,
  service TEXT NOT NULL,            -- "gmail" | "slack" | etc.
  status TEXT DEFAULT 'active',
  credentials_encrypted TEXT,       -- Encrypted credentials
  config JSONB,
  last_sync TIMESTAMPTZ,
  UNIQUE(user_id, service)
);

-- Sync data (raw)
CREATE TABLE sync_data (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id UUID REFERENCES users(id) ON DELETE CASCADE,
  connection_id UUID REFERENCES sync_connections(id),
  service TEXT NOT NULL,
  external_id TEXT NOT NULL,
  data JSONB NOT NULL,
  embedding VECTOR(1536),
  processed BOOLEAN DEFAULT false,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  UNIQUE(user_id, service, external_id)
);

-- Indexes
CREATE INDEX idx_entries_user ON entries(user_id);
CREATE INDEX idx_entries_path ON entries(user_id, path);
CREATE INDEX idx_entries_parent ON entries(user_id, parent_path);
CREATE INDEX idx_entries_type ON entries(user_id, entry_type);
CREATE INDEX idx_entries_tags ON entries USING GIN(tags);
CREATE INDEX idx_entries_embedding ON entries USING ivfflat(embedding vector_cosine_ops);
CREATE INDEX idx_sessions_user ON sessions(user_id);
CREATE INDEX idx_sessions_status ON sessions(user_id, status);
CREATE INDEX idx_sync_data_user ON sync_data(user_id, service);
CREATE INDEX idx_sync_data_processed ON sync_data(user_id, processed);
CREATE INDEX idx_sync_data_embedding ON sync_data USING ivfflat(embedding vector_cosine_ops);

-- Row Level Security
ALTER TABLE entries ENABLE ROW LEVEL SECURITY;
ALTER TABLE sessions ENABLE ROW LEVEL SECURITY;
ALTER TABLE learned_queries ENABLE ROW LEVEL SECURITY;
ALTER TABLE sync_connections ENABLE ROW LEVEL SECURITY;
ALTER TABLE sync_data ENABLE ROW LEVEL SECURITY;

-- Policies (users can only access their own data)
CREATE POLICY entries_policy ON entries FOR ALL USING (user_id = auth.uid());
CREATE POLICY sessions_policy ON sessions FOR ALL USING (user_id = auth.uid());
CREATE POLICY queries_policy ON learned_queries FOR ALL USING (user_id = auth.uid());
CREATE POLICY sync_conn_policy ON sync_connections FOR ALL USING (user_id = auth.uid());
CREATE POLICY sync_data_policy ON sync_data FOR ALL USING (user_id = auth.uid());
```
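Semantic search over `entries` would typically go through a Postgres function exposed as a Supabase RPC. A client-side sketch, assuming a `match_entries` function (not defined in the schema above) that orders by cosine distance on the `embedding` column:

```typescript
import { createClient } from '@supabase/supabase-js'
import OpenAI from 'openai'

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!)
const openai = new OpenAI()

// Sketch: embed the query, then rank entries by vector similarity.
// Assumes a `match_entries` SQL function (ORDER BY embedding <=> $1
// LIMIT $2) has been created separately; it is not part of the schema above.
async function semanticSearch(userId: string, query: string, limit = 10) {
  const res = await openai.embeddings.create({
    model: 'text-embedding-3-small', // Must match the model used at write time
    input: query
  })

  const { data, error } = await supabase.rpc('match_entries', {
    query_embedding: res.data[0].embedding,
    match_count: limit,
    p_user_id: userId
  })
  if (error) throw error
  return data
}
```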
### Storage Bucket Structure

```
ctx0-storage/
├── {userId}/
│ ├── vault/ # Vault file contents
│ │ ├── contacts/
│ │ │ └── sarah-chen.md
│ │ ├── projects/
│ │ │ └── bot0/
│ │ └── ...
│ │
│ ├── sessions/ # Session archives
│ │ └── {sessionId}/
│ │ ├── log.json
│ │ ├── files/
│ │ └── summary.md
│ │
│ └── syncs/ # Raw sync data
│ ├── gmail/
│ ├── slack/
│ └── calendar/
```

### Self-Hosting Configuration
```yaml
# ctx0-config.yaml

# Option 1: Supabase Cloud (managed)
storage:
  provider: supabase
  url: https://your-project.supabase.co
  anon_key: your-anon-key
  service_key: your-service-key   # For admin operations

# Option 2: Self-hosted Supabase
# storage:
#   provider: supabase
#   url: http://localhost:54321   # Local Supabase
#   anon_key: local-anon-key
#   service_key: local-service-key

# Option 3: Direct PostgreSQL (advanced)
# storage:
#   provider: postgres
#   connection_string: postgresql://user:pass@host:5432/ctx0
#   storage_backend: s3           # or 'local' for filesystem
#   s3_endpoint: http://localhost:9000
#   s3_bucket: ctx0-storage
```
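Whichever option is used, the client code stays the same; only the URL and keys change. A bootstrap sketch, where `loadConfig` is an assumed helper that parses the YAML above:

```typescript
import { createClient, SupabaseClient } from '@supabase/supabase-js'

// Sketch: build a supabase-js client from ctx0-config.yaml.
// `loadConfig` is an assumed helper, not part of ctx0.
function createStorageClient(): SupabaseClient {
  const { storage } = loadConfig('ctx0-config.yaml')
  // Service key for backend paths (Librarian, Curator); interactive,
  // user-facing paths would use anon_key and rely on RLS instead.
  return createClient(storage.url, storage.service_key)
}
```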
## 5. Execution Environment (E2B or Local)
Subagents (Librarian, Curator, Extractor) need a filesystem to work with. This can run:
- E2B — Ephemeral cloud sandbox (per operation)
- Local daemon — On-prem, user's own machine
### Modular Execution Pattern

```
┌─────────────────────────────────────────────────────────────────────────────┐
│ EXECUTION ENVIRONMENT (MODULAR) │
│ │
│ OPTION A: E2B (Cloud) OPTION B: Local Daemon │
│ ───────────────────── ────────────────────── │
│ │
│ ┌─────────────────────────┐ ┌─────────────────────────┐ │
│ │ E2B Sandbox │ │ Local bot0 Daemon │ │
│ │ │ │ │ │
│ │ • Ephemeral Linux VM │ │ • Runs on user machine │ │
│ │ • Spun up per operation│ │ • Always available │ │
│ │ • Auto-terminated │ │ • Lower latency │ │
│ │ • Fully isolated │ │ • No cloud cost │ │
│ │ │ │ │ │
│ │ /vault/ (exported) │ │ ~/.ctx0/vault/ │ │
│ │ bot0 subagent │ │ bot0 subagent │ │
│ │ Standard file tools │ │ Standard file tools │ │
│ │ │ │ │ │
│ └───────────┬─────────────┘ └───────────┬─────────────┘ │
│ │ │ │
│ └──────────────┬──────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ SYNC LAYER │ │
│ │ │ │
│ │ • Diff changes vs DB state │ │
│ │ • Resolve conflicts │ │
│ │ • Apply updates atomically │ │
│ │ • Update embeddings (batched) │ │
│ │ │ │
│ └──────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ SUPABASE │ │
│ │ (source of truth) │ │
│ └──────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Execution Environment Interface
```typescript
// Abstract interface — works with E2B or local
interface ExecutionEnvironment {
  // Initialize with vault export
  setup(userId: string): Promise<void>

  // Run a subagent (Librarian, Curator, Extractor)
  runAgent(agent: SubagentConfig, task: string): Promise<AgentResult>

  // Get filesystem changes
  getChanges(): Promise<FileChange[]>

  // Cleanup (terminate E2B, or just return for local)
  cleanup(): Promise<void>
}

// E2B Implementation
class E2BEnvironment implements ExecutionEnvironment {
  private sandbox!: Sandbox

  async setup(userId: string) {
    // Spin up ephemeral E2B sandbox
    this.sandbox = await Sandbox.create({ template: 'ctx0-agent' })

    // Export vault from Supabase to sandbox filesystem
    const { data: entries } = await supabase.from('entries').select('*').eq('user_id', userId)
    for (const entry of entries) {
      const { data: content } = await supabase.storage.from('vault').download(entry.content_ref)
      await this.sandbox.filesystem.write(entry.path, content)
    }
  }

  async runAgent(agent: SubagentConfig, task: string) {
    // Run bot0 subagent inside sandbox
    const result = await this.sandbox.process.start({
      cmd: 'bot0',
      args: ['--agent', agent.name, '--task', task]
    })
    return result
  }

  async getChanges(): Promise<FileChange[]> {
    // Diff sandbox filesystem against original export
    return await this.sandbox.filesystem.diff('/vault')
  }

  async cleanup() {
    await this.sandbox.close() // Terminate and discard
  }
}

// Local Daemon Implementation
class LocalEnvironment implements ExecutionEnvironment {
  private vaultPath: string = '~/.ctx0/vault'

  async setup(userId: string) {
    // Sync from Supabase to local filesystem
    await ctx0Cli.sync({ direction: 'pull', userId })
  }

  async runAgent(agent: SubagentConfig, task: string) {
    // Run via local bot0 daemon
    const daemon = await connectToLocalDaemon()
    return await daemon.runSubagent(agent, task)
  }

  async getChanges(): Promise<FileChange[]> {
    // Diff local filesystem against last sync state
    return await ctx0Cli.diff()
  }

  async cleanup() {
    // Nothing to clean up for local
  }
}

// Factory — choose based on configuration
function createEnvironment(config: Ctx0Config): ExecutionEnvironment {
  if (config.execution === 'e2b') {
    return new E2BEnvironment()
  } else {
    return new LocalEnvironment()
  }
}
```
### Configuration

```yaml
# ctx0-config.yaml
execution:
  # Option 1: E2B (cloud, ephemeral)
  provider: e2b
  api_key: e2b_xxx

  # Option 2: Local daemon
  # provider: local
  # daemon_socket: /tmp/bot0.sock

  # Option 3: Hybrid (prefer local, fallback to E2B)
  # provider: hybrid
  # prefer: local
  # fallback: e2b
```
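The factory above only covers `e2b` and `local`; the commented-out hybrid option suggests a wrapper along these lines (a sketch, not part of the spec):

```typescript
// Sketch of the 'hybrid' provider: prefer the local daemon, fall back
// to an E2B sandbox if the daemon is unreachable.
class HybridEnvironment implements ExecutionEnvironment {
  private active!: ExecutionEnvironment

  async setup(userId: string) {
    try {
      this.active = new LocalEnvironment()
      await this.active.setup(userId) // Throws if the daemon is down
    } catch {
      this.active = new E2BEnvironment()
      await this.active.setup(userId)
    }
  }

  runAgent(agent: SubagentConfig, task: string) { return this.active.runAgent(agent, task) }
  getChanges() { return this.active.getChanges() }
  cleanup() { return this.active.cleanup() }
}
```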
## 6. Sync Layer
The Sync Layer handles bidirectional sync between the execution environment (E2B or local) and Supabase. It detects changes, resolves conflicts, and applies updates atomically.
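The conflict check below hinges on remembering which version of each entry was exported. A sketch of that bookkeeping, using the `entries` schema from Section 4:

```typescript
// Sketch: snapshot entry versions at export time so the sync layer can
// later detect concurrent edits (DB version ahead of base version).
async function snapshotBaseVersions(userId: string): Promise<Map<string, number>> {
  const { data, error } = await supabase
    .from('entries')
    .select('path, version')
    .eq('user_id', userId)
  if (error) throw error
  return new Map(data.map(e => [e.path, e.version]))
}
```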
### Sync Flow

```
┌─────────────────────────────────────────────────────────────────────────────┐
│ SYNC FLOW │
│ │
│ 1. EXPORT (Supabase → Execution Env) │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Supabase E2B / Local │
│ ┌────────────────────┐ ┌────────────────────┐ │
│ │ entries table │ ────── export ─────► │ /vault/ │ │
│ │ storage bucket │ │ ├── contacts/ │ │
│ └────────────────────┘ │ ├── projects/ │ │
│ │ └── ... │ │
│ Records: base_version for each file └────────────────────┘ │
│ │
│ 2. WORK (Agent operates on filesystem) │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ bot0 subagent uses standard file tools: │
│ • read_file, write_file, edit_file │
│ • glob, grep │
│ • No special database tools needed │
│ │
│ 3. DIFF (Detect changes) │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Sync Layer computes diff: │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ Changes: │ │
│ │ • M /vault/contacts/sarah-chen.md (modified) │ │
│ │ • A /vault/decisions/new-decision.md (added) │ │
│ │ • D /vault/scratch/temp.md (deleted) │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ 4. CONFLICT CHECK │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ For each modified file: │
│ • Check if DB version > base_version (someone else changed it) │
│ • If yes → conflict resolution │
│ • If no → safe to apply │
│ │
│ 5. APPLY (Execution Env → Supabase) │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ E2B / Local Supabase │
│ ┌────────────────────┐ ┌────────────────────┐ │
│ │ Changes │ ────── apply ──────► │ entries table │ │
│ │ │ │ storage bucket │ │
│ │ • Upload content │ │ embeddings queue │ │
│ │ • Update metadata │ └────────────────────┘ │
│ │ • Bump versions │ │
│ └────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Conflict Resolution
```typescript
// Conflict resolution strategies
interface ConflictStrategy {
  resolve(
    base: EntryVersion,     // Original version when agent started
    current: EntryVersion,  // Current DB version (changed by someone else)
    incoming: EntryVersion  // Agent's changes
  ): ResolvedEntry
}

// Field-level merge (default for structured content)
class FieldLevelMerge implements ConflictStrategy {
  resolve(base: EntryVersion, current: EntryVersion, incoming: EntryVersion) {
    const merged = { ...current }

    // For each field the agent changed
    for (const [field, value] of Object.entries(incoming.frontmatter)) {
      const baseValue = base.frontmatter[field]
      const currentValue = current.frontmatter[field]

      if (baseValue === currentValue) {
        // Field wasn't changed by others, safe to apply agent's change
        merged.frontmatter[field] = value
      } else if (Array.isArray(currentValue) && Array.isArray(value)) {
        // Both arrays — merge (union)
        merged.frontmatter[field] = [...new Set([...currentValue, ...value])]
      } else {
        // True conflict — log and prefer current (DB wins)
        this.logConflict(field, currentValue, value)
      }
    }

    // Merge content (append-style for notes sections)
    merged.content = this.mergeContent(base.content, current.content, incoming.content)

    return merged
  }
}

// Append-only (for logs, history)
class AppendOnly implements ConflictStrategy {
  resolve(base: EntryVersion, current: EntryVersion, incoming: EntryVersion) {
    // Just append new content, never overwrite
    return { ...current, content: current.content + '\n' + incoming.newContent }
  }
}

// Last-write-wins (for ephemeral/scratch)
class LastWriteWins implements ConflictStrategy {
  resolve(base: EntryVersion, current: EntryVersion, incoming: EntryVersion) {
    return incoming // Agent's version wins
  }
}

// Strategy selection based on entry type
function getStrategy(entryType: string): ConflictStrategy {
  switch (entryType) {
    case 'contact':
    case 'project':
      return new FieldLevelMerge()
    case 'log':
    case 'history':
      return new AppendOnly()
    case 'scratch':
    case 'temp':
      return new LastWriteWins()
    default:
      return new FieldLevelMerge()
  }
}
```
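A concrete walk-through of `FieldLevelMerge`: the agent adds a tag while another session has already changed the same contact's role (all values are illustrative):

```typescript
// Illustrative walk-through of FieldLevelMerge (made-up values,
// loosely typed for brevity).
const base: any = { frontmatter: { role: 'Engineer', tags: ['investor'] }, content: '' }
const current: any = { frontmatter: { role: 'CTO', tags: ['investor'] }, content: '' } // changed elsewhere
const incoming: any = { frontmatter: { role: 'Engineer', tags: ['investor', 'advisor'] }, content: '' }

const merged = new FieldLevelMerge().resolve(base, current, incoming)
// role: baseValue !== currentValue and neither side is an array,
//       so it is logged as a conflict and the DB value 'CTO' stays.
// tags: both sides are arrays, so they merge as a union:
//       ['investor', 'advisor']
```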
### Sync Layer Implementation

```typescript
class SyncLayer {
  private supabase: SupabaseClient

  // Apply changes from execution environment to database
  async applyChanges(
    userId: string,
    changes: FileChange[],
    baseVersions: Map<string, number>
  ): Promise<SyncResult> {
    const results: SyncResult = { applied: [], conflicts: [], errors: [] }

    // Load current DB state for every touched path
    const { data: entries } = await this.supabase
      .from('entries')
      .select('*')
      .eq('user_id', userId)
      .in('path', changes.map(c => c.path))

    const currentVersions = new Map(entries.map(e => [e.path, e]))

    for (const change of changes) {
      const base = baseVersions.get(change.path)
      const current = currentVersions.get(change.path)

      try {
        if (change.type === 'added' || !current) {
          // New file — just insert
          await this.insertEntry(userId, change)
          results.applied.push(change.path)
        } else if (change.type === 'deleted') {
          // Archive, don't hard delete
          await this.archiveEntry(current.id)
          results.applied.push(change.path)
        } else if (change.type === 'modified') {
          if (current.version === base) {
            // No conflict — apply directly
            await this.updateEntry(current.id, change)
            results.applied.push(change.path)
          } else {
            // Conflict — resolve
            const strategy = getStrategy(current.entry_type)
            const resolved = strategy.resolve(
              { version: base, ...change.baseContent },
              current,
              change
            )
            await this.updateEntry(current.id, resolved)
            results.conflicts.push({ path: change.path, resolution: 'merged' })
          }
        }
      } catch (error) {
        results.errors.push({ path: change.path, error })
      }
    }

    // Queue embedding updates (batched, async)
    await this.queueEmbeddingUpdates(results.applied)

    return results
  }

  // Queue embeddings to be updated (batched hourly)
  private async queueEmbeddingUpdates(paths: string[]) {
    await this.supabase
      .from('embedding_queue')
      .insert(paths.map(path => ({ path, status: 'pending' })))
  }
}
```
### Embedding Updates (Batched)

```typescript
// Embeddings are expensive — batch them hourly
class EmbeddingProcessor {
  async processQueue() {
    // Get pending entries (content is needed as embedding input)
    const { data: queue } = await supabase
      .from('embedding_queue')
      .select('path, content')
      .eq('status', 'pending')
      .limit(100)

    // Generate embeddings in batch
    const embeddings = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: queue.map(q => q.content)
    })

    // Update entries with embeddings
    for (let i = 0; i < queue.length; i++) {
      await supabase
        .from('entries')
        .update({ embedding: embeddings.data[i].embedding })
        .eq('path', queue[i].path)
    }

    // Mark as processed
    await supabase
      .from('embedding_queue')
      .update({ status: 'processed' })
      .in('path', queue.map(q => q.path))
  }
}

// Run hourly via cron
// 0 * * * * node process-embeddings.js
```
## 7. Codebase Export

The export generates a crawlable folder structure from the database that any agent can use with standard file tools.
### How It Works

```
┌─────────────────────────────────────────────────────────────────────────────┐
│ CODEBASE EXPORT │
│ │
│ Supabase Export Filesystem │
│ ──────── ────── ────────── │
│ │
│ entries table /vault/ │
│ ├── /contacts/sarah ─────► ctx0 export ─────► ├── contacts/ │
│ ├── /contacts/mike │ ├── sarah │
│ ├── /projects/bot0 │ └── mike │
│ └── /knowledge/prefs ├── projects/ │
│ └── knowledge/│
│ storage bucket │
│ └── vault/*.md ─────► download ─────► (file contents)│
│ │
│ Result: A folder that looks exactly like a codebase │
│ Agents use: read_file, glob, grep — standard tools │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Export Types
```typescript
// Different export formats
interface ExportOptions {
  format: 'full' | 'minimal' | 'selective'
  scope?: string[]          // Paths to include
  since?: number            // Only entries updated since
  maxSize?: number          // Max total size in bytes
  includeVersions?: boolean // Include version history
  flatten?: boolean         // Flatten to single folder
}
```

```bash
# Full export — everything
# Used for: local agent access, backup
ctx0 export --full --output ~/.ctx0/vault

# Minimal export — just structure + previews
# Used for: quick context, sharing
ctx0 export --minimal --output /tmp/ctx0-context

# Selective export — specific paths
# Used for: focused work
ctx0 export --scope contacts,projects/bot0 --output /tmp/ctx0-subset

# Since export — only recent changes
# Used for: incremental sync
ctx0 export --since 2026-01-20 --output /tmp/ctx0-recent
```
### CLI Commands

```bash
# Export full vault to local filesystem
ctx0 export --output ~/.ctx0/vault

# Export specific paths
ctx0 export --scope "contacts,projects/bot0" --output ./context

# Export for sharing (minimal, no sensitive)
ctx0 export --minimal --exclude "syncs,scratch" --output ./share

# Sync local changes back to cloud
ctx0 sync --push

# Pull latest from cloud
ctx0 sync --pull

# Real-time sync (watches for changes)
ctx0 sync --watch
```
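Under the hood, `ctx0 export` is a walk over the `entries` table plus bucket downloads. A minimal sketch using Node's `fs` and the schema from Section 4 (error handling and incremental logic omitted):

```typescript
import { promises as fs } from 'fs'
import * as path from 'path'

// Sketch of `ctx0 export`: mirror the entries table and storage bucket
// into a local folder. Assumes a configured supabase client (Section 4).
async function exportVault(userId: string, outDir: string) {
  const { data: entries, error } = await supabase
    .from('entries')
    .select('path, entry_kind, content_ref')
    .eq('user_id', userId)
  if (error) throw error

  for (const entry of entries) {
    const target = path.join(outDir, entry.path)
    if (entry.entry_kind === 'folder') {
      await fs.mkdir(target, { recursive: true })
    } else if (entry.content_ref) {
      // File contents live in the storage bucket, keyed by content_ref
      const { data: blob } = await supabase.storage.from('vault').download(entry.content_ref)
      await fs.mkdir(path.dirname(target), { recursive: true })
      await fs.writeFile(target, Buffer.from(await blob!.arrayBuffer()))
    }
  }
}
```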
### Generated Structure

```
~/.ctx0/vault/ # The "codebase" agents crawl
├── CONTEXT.md # Instructions for agents
├── me.md # User identity
│
├── contacts/
│ ├── _index.md # Auto-generated index
│ ├── sarah-chen.md
│ └── mike-ycombinator.md
│
├── projects/
│ ├── _index.md
│ └── bot0/
│ ├── overview.md
│ ├── decisions.md
│ └── architecture.md
│
├── knowledge/
│ ├── _index.md
│ ├── company.md
│ └── preferences/
│ └── coding-style.md
│
├── decisions/
│ ├── _index.md
│ └── 2026-01-28-series-a-exclusivity.md
│
├── skills/
│ ├── _index.md
│ ├── pptx/SKILL.md
│ └── email/SKILL.md
│
└── .ctx0/
├── config.yaml
├── queries.yaml # Learned queries
└── stats.json # Vault statistics
```

## 8. Main Agent Integration
How the main agent (bot0, Claude Code, etc.) uses ctx0.
### Tool Interface

```typescript
// Tools available to any main agent
const ctx0Tools = [
  {
    name: 'ctx0_retrieve',
    description: `
      Get context from your memory vault. Spawns the Extractor subagent
      to intelligently find and return relevant information.

      Use when:
      - Starting a task and need background info
      - Encountering a name/reference you don't recognize
      - Need to check past decisions or preferences
    `,
    input_schema: {
      type: 'object',
      properties: {
        query: {
          type: 'string',
          description: 'What context do you need? Natural language.'
        },
        scope: {
          type: 'array',
          items: { type: 'string' },
          description: 'Limit to specific areas: contacts, projects, decisions, etc.'
        }
      },
      required: ['query']
    }
  },
  {
    name: 'ctx0_remember',
    description: `
      Store important information for later. Writes to working memory
      immediately. The Librarian will archive it to the vault on session end.

      Use when:
      - User shares important information
      - A decision is made
      - You learn something new about a contact/project
    `,
    input_schema: {
      type: 'object',
      properties: {
        content: { type: 'string' },
        type: {
          type: 'string',
          enum: ['fact', 'decision', 'contact_update', 'project_update', 'preference', 'insight']
        },
        tags: { type: 'array', items: { type: 'string' } },
        linkedTo: {
          type: 'array',
          items: { type: 'string' },
          description: 'Paths of related entries to link'
        }
      },
      required: ['content', 'type']
    }
  },
  {
    name: 'ctx0_query',
    description: `
      Execute a learned query by name. Faster than ctx0_retrieve
      for common patterns.

      Use when:
      - You know a learned query exists for this pattern
      - Speed is important
      - Running a repeatable query
    `,
    input_schema: {
      type: 'object',
      properties: {
        name: { type: 'string' },
        params: { type: 'object' }
      },
      required: ['name']
    }
  },
  {
    name: 'ctx0_archive_now',
    description: `
      Immediately archive the current session. Triggers the Librarian.

      Use when:
      - Completing an important task
      - User explicitly asks to save progress
      - Before starting a very different task
    `,
    input_schema: {
      type: 'object',
      properties: {
        summary: {
          type: 'string',
          description: 'Optional summary of what should be archived'
        }
      }
    }
  }
]
```
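A round-trip might look like this in practice (illustrative values; `agent.callTool` stands in for whatever tool-dispatch mechanism the host agent uses):

```typescript
// Illustrative ctx0_retrieve round-trip. `agent.callTool` is a stand-in
// for the host agent's tool dispatch; the values are made up.
const bundle = await agent.callTool('ctx0_retrieve', {
  query: 'Who is Sarah Chen and what did we last decide with her?',
  scope: ['contacts', 'decisions']
})
// bundle ≈ {
//   summary: 'Sarah Chen is a contact tagged investor; last decision ...',
//   items: [...],
//   sources: ['/contacts/sarah-chen', '/decisions/2026-01-28-series-a-exclusivity'],
//   confidence: 'high'
// }
```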
### System Prompt Addition

```markdown
## Memory System (ctx0)

You have access to ctx0, your persistent memory vault.

### Tools

- `ctx0_retrieve` — Get context (spawns Extractor to find relevant info)
- `ctx0_remember` — Store important information
- `ctx0_query` — Run a learned query by name
- `ctx0_archive_now` — Archive current session immediately

### When to use ctx0

**RETRIEVE context when:**
- Starting a complex task → get background
- Someone is mentioned → who are they?
- A project is referenced → what's the status?
- Making a decision → check past decisions

**REMEMBER information when:**
- User shares a fact → store it
- A decision is made → record reasoning
- Contact info changes → update it
- Preference expressed → note it

### Best practices

1. **Retrieve before assuming** — Don't guess who "Sarah" is
2. **Remember proactively** — If it seems important, store it
3. **Use learned queries** — Faster for common patterns
4. **Link related items** — Connect decisions to projects, contacts to projects, etc.
```
## 9. Data Flow Summary

```
┌─────────────────────────────────────────────────────────────────────────────┐
│ COMPLETE DATA FLOW │
│ │
│ INGRESS (data coming in) │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ 1. User works with main agent (bot0, Claude Code, etc.) │
│ 2. Session Tracker records everything in background │
│ 3. Agent uses ctx0_retrieve to get context (Extractor) │
│ 4. Agent uses ctx0_remember to note important things │
│ │
│ DUMP (session → vault) │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ 5. Trigger fires (compaction/idle/session end/manual) │
│ 6. Full session log dumped to workspace/sessions/{id}/ │
│ 7. Librarian subagent processes session: │
│ - Extracts facts, decisions, updates │
│ - Updates existing vault entries │
│ - Creates new entries │
│ - Links related items │
│ 8. Session marked as archived │
│ │
│ MAINTENANCE (ongoing) │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ 9. External syncs arrive (gmail, slack, etc.) │
│ 10. Curator processes syncs: │
│ - Extracts relevant information │
│ - Updates contacts, projects │
│ - Deduplicates entries │
│ - Maintains links and embeddings │
│ │
│ EGRESS (data going out) │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ 11. Codebase export generates crawlable folder structure │
│ 12. Agents access via file tools or API │
│ 13. Learned queries provide fast access patterns │
│ 14. Passport export for pasting into other chats │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```

## 10. Implementation Phases
### Phase 1: Core (MVP)
- Supabase project setup (PostgreSQL + Storage + pgvector)
- Database schema migration
- Basic entry CRUD (create, read, update, archive)
- Storage bucket structure
- Session tracker for bot0
- Librarian subagent (basic)
- Codebase export (full)
- CLI: init, export, sync
- Local execution environment
### Phase 2: Intelligence
- Extractor subagent
- Vector embeddings (pgvector)
- Semantic search
- Learned queries system (SQL templates)
- Curator subagent (basic)
- Embedding batch processor
### Phase 3: Integrations
- E2B execution environment
- Sync layer with conflict resolution
- Claude Code adapter (MCP)
- Cursor adapter (extension)
- Gmail sync
- Slack sync
- Calendar sync
### Phase 4: Scale & Self-Hosting
- Self-hosted Supabase support
- Direct PostgreSQL mode (advanced)
- Hot/cold data splitting
- Incremental sync
- Team vaults
- API for external agents
- Hybrid execution (local + E2B fallback)
## 11. Open Questions

- Embedding model — OpenAI text-embedding-3-small? Local model for privacy?
- Conflict resolution — What if two agents update the same entry simultaneously? ✅ Resolved: field-level merge for structured content, append-only for logs, last-write-wins for scratch (Section 6)
- Privacy tiers — Are some entries more sensitive? Encryption at rest?
- Query learning threshold — How many uses before a query is "learned"?
- Sync deduplication — How to handle the same email appearing in multiple syncs?
- Export freshness — How often to regenerate the codebase export? On-demand vs scheduled?
- Cross-device — How to handle multiple devices with local exports?
- E2B template — What should the ctx0-agent E2B sandbox template include?
- Local daemon design — Socket vs HTTP? How to manage lifecycle?
## 12. Summary
| Component | Purpose |
|---|---|
| Session Tracker | Records everything during agent sessions |
| Trigger System | Fires ctx0 dump on compaction/idle/end |
| Extractor | Retrieves context on-demand (bot0 subagent) |
| Librarian | Archives sessions to vault (bot0 subagent) |
| Curator | Maintains vault health (bot0 subagent) |
| Supabase (PostgreSQL) | Tree structure, metadata, relationships, learned queries (SQL) |
| Supabase Storage | File content, large data, archives (S3-compatible) |
| pgvector | Semantic search, embeddings (native PostgreSQL) |
| E2B Environment | Ephemeral cloud sandboxes for subagent execution |
| Local Environment | On-prem execution via local bot0 daemon |
| Sync Layer | Bidirectional sync with conflict resolution |
| Codebase Export | Generates crawlable folder for agents |
| Learned Queries | SQL templates that get smarter with use |
### Architecture Principles
- Modular execution — E2B (cloud) or local daemon, swappable
- Self-hostable — Users can run their own Supabase instance
- Database-first — PostgreSQL is source of truth, filesystem is export
- Agent-agnostic — Works with bot0, Claude Code, Cursor, Gemini CLI, etc.
- Conflict-safe — Field-level merge prevents data loss
ctx0 = bot0 configured to manage a context "codebase"
Same engine. Different purpose. Your memory, everywhere.