
Kanipi v1.8.0

multi-channel agent gateway

TL;DR
Kanipi routes messages from 10 channels (Telegram, Discord, WhatsApp, Email, Web, Twitter/X, Facebook, Reddit, Mastodon, Bluesky) to containerized Claude Code agents. Each agent runs in Docker with full SDK access (subagents, tools, skills, MCP). The gateway handles routing, queueing, IPC, media, memory injection, social actions, and a dashboard portal. Agents are configurations, not code. 65 source files, 13,980 LOC. 43 test files, 13,170 LOC. 798 tests. 97 specs.

Architecture

TypeScript (ESM, NodeNext). Gateway polls channels for messages, stores them in SQLite, queues per group, spawns a Docker container per agent invocation. Agent reads messages from IPC files as XML, uses Claude Code SDK, writes results to stdout as JSON. Gateway streams responses back to the originating channel. Social channels use a unified inbound event model with verb-typed events (message, reply, post, react, repost, follow, etc.).

Channel (tg/discord/wa/email/web/twitter/reddit/mastodon/bluesky/facebook)
    |  message / social event (verb-typed)
    v
  Impulse Gate  ---  per-JID weight accumulation, flush on threshold
    |
    v
  SQLite DB  ------  store message + attachments + chat metadata
    |
    v
  GroupQueue  -----  per-group FIFO, one agent at a time
    |
    v
  Router  ---------  JID -> group lookup, prompt assembly
    |                 system messages, diary + episode injection
    |                 {sender} template expansion (auto-threading)
    v
  Container  ------  docker run, IPC file pipe, unified home mount
    |                 groups/{folder}/ mounted as /home/node
    v
  Claude Code  ----  SDK: tools, subagents, skills, MCP
    |
    v
  IPC (SIGUSR1)  --  file-based request/response
    |                 agent -> gateway actions
    v
  Action Registry    Zod-validated, tier-authorized
    |                 core + social actions (post, reply, react, repost...)
    v
  Reply Router  ---  per-sender batching, chunk chaining, reply threading
    |
    v
  Channel  --------  send text, files, typing indicators, <status> updates

Key modules:

| Module | Purpose |
| --- | --- |
| index.ts | main loop, channel init, message routing, delegatePerSender |
| config.ts | typed constants from .env + env vars |
| db.ts | SQLite (better-sqlite3): messages, groups, sessions, tasks, auth |
| container-runner.ts | docker lifecycle, IPC file pipe, mount assembly |
| group-queue.ts | per-group message queueing, concurrency control, circuit breaker |
| router.ts | JID resolution, prompt formatting, XML history, {sender} templates |
| action-registry.ts | unified action system (Zod schemas, authorization, MCP/command flags) |
| actions/social.ts | 22 social actions (post, reply, react, repost, follow, ban, pin...) |
| impulse.ts | impulse gate: per-JID event weight accumulation before agent flush |
| ipc.ts | container-gateway IPC (request-response + file-based) |
| task-scheduler.ts | cron-based scheduled tasks |
| dashboards/ | dashboard portal (self-registration) + status dashboard |
| channels/ | telegram, whatsapp, discord, email, web, twitter, reddit, mastodon, bluesky, facebook |
| actions/ | core (messaging, tasks, groups, session, inject) + social actions |

Channels

Ten channel adapters implementing a shared interface (Channel in types.ts): connect(), sendMessage(), sendDocument?(), ownsJid(). Each channel is enabled by config presence — no token, no channel. JIDs use URI-like prefixes for cross-channel routing. Social channels register their PlatformClient at connect time, enabling dynamic action availability.
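
A minimal sketch of the shared interface and prefix-based JID ownership described above. Only the four method names come from the text; the parameter lists, the `name` field, and the helpers `makePrefixChannel` and `findChannel` are assumptions for illustration, not the definitions in types.ts.

```typescript
// Hypothetical shape: method names are from the text, signatures are assumed.
interface Channel {
  name: string;
  connect(): Promise<void>;
  sendMessage(jid: string, text: string): Promise<void>;
  sendDocument?(jid: string, filePath: string): Promise<void>;
  ownsJid(jid: string): boolean;
}

// Stub adapter that claims JIDs by prefix and records what it was asked to send.
function makePrefixChannel(name: string, prefix: string, sent: string[]): Channel {
  return {
    name,
    async connect(): Promise<void> {},
    async sendMessage(jid: string, text: string): Promise<void> {
      sent.push(`${name} <- ${jid}: ${text}`);
    },
    ownsJid: (jid: string): boolean => jid.startsWith(prefix),
  };
}

// Cross-channel routing: the first connected channel that owns the JID wins.
function findChannel(channels: Channel[], jid: string): Channel | undefined {
  return channels.find((c) => c.ownsJid(jid));
}
```

With `tg:` and `discord:` stubs registered, `findChannel(channels, "tg:-123456789")` resolves to the Telegram stub, and a JID with an unregistered prefix resolves to `undefined`.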

Chat channels

Bidirectional messaging channels. Users send messages, agent responds in the same conversation.

| Channel | Library | Enabled by | JID prefix | Notes |
| --- | --- | --- | --- | --- |
| Telegram | grammy | TELEGRAM_BOT_TOKEN | tg: | long-poll, markdown-to-HTML, 4096-char chunking, typing indicator |
| Discord | discord.js | DISCORD_BOT_TOKEN | discord: | gateway mode, 2000-char split |
| WhatsApp | baileys | store/auth/creds.json | (native) | QR pairing, read receipts, markdown conversion |
| Email | imapflow + nodemailer | EMAIL_IMAP_HOST | email: | IMAP IDLE real-time, SMTP reply threading via email_threads table |
| Web (Slink) | built-in | always (HTTP POST) | web: | HMAC JWT auth, rate limiting, SSE streaming, sloth.js widget |

Social channels

Social platform adapters. Inbound events use a unified verb model (message, reply, post, react, repost, follow, join, edit, delete, close). Outbound uses the social action catalog. Each channel registers a PlatformClient at connect time — actions are only available while the channel is connected.

| Channel | Library | Enabled by | JID prefix | Notes |
| --- | --- | --- | --- | --- |
| Twitter/X | agent-twitter-client | TWITTER_USERNAME | twitter: | mentions, replies, DMs; watcher polls timeline |
| Reddit | snoowrap | REDDIT_CLIENT_ID | reddit: | subreddit watching, post/comment/reply, moderation actions |
| Mastodon | masto.js | MASTODON_INSTANCE_URL | mastodon: | streaming API, notifications, post/reply/boost |
| Bluesky | @atproto/api | BLUESKY_IDENTIFIER | bluesky: | AT Protocol, notifications, post/reply/repost |
| Facebook | FB Graph API (fetch) | FACEBOOK_PAGE_ID | facebook: | Page API, post/comment/react, webhook events |

Social actions

22 social actions available via IPC: post, reply, react, repost, follow, unfollow, set_profile, delete_post, edit_post, ban, unban, timeout, mute, block, pin, unpin, lock, unlock, hide, approve, set_flair, kick. Each action declares which platforms support it. Agents address actions by JID — the registry routes to the correct platform client.

Impulse gate

Social channels produce high-volume events (joins, edits, deletes) that shouldn't each trigger an agent. The impulse gate accumulates event weights per JID and flushes to the agent when the threshold is reached (default 100) or a max hold timer expires (default 5 min). Configurable per-verb weights — joins, edits, and deletes default to weight 0 (dropped). Direct messages default to weight 100 (immediate flush).
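
The accumulate-and-flush behaviour can be sketched as below. This is an illustration under stated assumptions, not the code in impulse.ts: the class name, the event shape, the `tick` method, and the fallback weight for unlisted verbs are all hypothetical. The threshold (100), max hold (5 min), and zero-weight drop come from the text.

```typescript
type Verb =
  | "message" | "reply" | "post" | "react" | "repost"
  | "follow" | "join" | "edit" | "delete" | "close";

interface SocialEvent { jid: string; verb: Verb; at: number } // at: ms timestamp
interface Slot { weight: number; since: number; events: SocialEvent[] }

const DEFAULT_WEIGHT = 10; // assumption: weight for verbs with no configured value

class ImpulseGate {
  private pending = new Map<string, Slot>();
  private weights: Partial<Record<Verb, number>>;
  private threshold: number;
  private maxHoldMs: number;

  constructor(weights: Partial<Record<Verb, number>>, threshold = 100, maxHoldMs = 5 * 60_000) {
    this.weights = weights;
    this.threshold = threshold;
    this.maxHoldMs = maxHoldMs;
  }

  // Returns the batch to hand to the agent, or null while the gate is still holding.
  push(ev: SocialEvent): SocialEvent[] | null {
    const w = this.weights[ev.verb] ?? DEFAULT_WEIGHT;
    if (w === 0) return null; // weight 0 means the event is dropped
    const slot = this.pending.get(ev.jid) ?? { weight: 0, since: ev.at, events: [] };
    slot.weight += w;
    slot.events.push(ev);
    this.pending.set(ev.jid, slot);
    return slot.weight >= this.threshold ? this.flush(ev.jid) : null;
  }

  // Call periodically: flushes every JID whose oldest held event exceeds maxHoldMs.
  tick(now: number): SocialEvent[][] {
    const batches: SocialEvent[][] = [];
    for (const [jid, slot] of this.pending) {
      if (now - slot.since >= this.maxHoldMs) batches.push(this.flush(jid));
    }
    return batches;
  }

  private flush(jid: string): SocialEvent[] {
    const slot = this.pending.get(jid);
    this.pending.delete(jid);
    return slot ? slot.events : [];
  }
}
```

With `{ message: 100, react: 40, join: 0 }`, a direct message flushes immediately, three reactions on one JID flush together on the third, and joins never reach the agent.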

All channels support /chatid and /ping commands. Forward metadata (origin, reply-to) extracted per channel and rendered as nested XML in agent prompt. Per-channel output styles (formatting conventions) activated via SDK settings. Local channel (local: JID prefix) provides internal routing for delegation responses.

Agent Runtime

Each agent runs Claude Code inside a Docker container. The gateway writes start.json to IPC (prompt + secrets), agent reads it and runs the SDK query. Follow-up messages arrive as IPC files with SIGUSR1 wake signal (500ms fallback poll). MAX_CONCURRENT_CONTAINERS limits parallel agents (default 5). IDLE_TIMEOUT controls container keepalive (default 30min; set to 0 for chat-bound sessions that never idle out).

Workspace mounts

Unified home directory: groups/{folder}/ is mounted as /home/node (both cwd and HOME). .claude/ state lives inside the group folder. Media paths use container-relative ~/media/... format.

| Mount | Path | Access |
| --- | --- | --- |
| home | /home/node | group workdir + .claude/ state (rw, ro setup files tier 2+, full ro tier 3) |
| groups | ~/groups | all groups (root tier 0 only, for skill sync) |
| share | /workspace/share | world-level shared state (rw root/world, ro rest) |
| web | /workspace/web | vite-served output (rw root/world, ro rest) |
| ipc | /workspace/ipc | IPC requests/responses (rw all tiers) |
| self | /workspace/self | kanipi source (ro, root tier 0 only) |
| agent-runner | /workspace/agent-runner | agent-runner TS source (ro, hot-patch via bunx tsc) |
| extra | ~/{name} | allowlisted additional mounts (ro, configured per instance) |

Action registry

Agents request actions via IPC files. Gateway validates with Zod schemas and dispatches. Core actions: messaging (send_message, send_file, send_reply, inject_message), tasks (schedule_task, cancel_task), groups (delegate_group, set_routing_rules, add_route, delete_route), session (reset_session). Plus 22 social actions (see Channels). Each action is a single source of truth for IPC dispatch, MCP tools, and commands. Authorization enforced per action and per tier.

4-tier permissions

| Tier | Folder pattern | Access |
| --- | --- | --- |
| 0 (root) | root, main | full rw, ~/self and ~/groups visible, can delegate to any world |
| 1 (world) | {world} (no /) | rw home/share/web, ro setup files, world-scoped delegation |
| 2 (agent) | {world}/{name} | rw home + temp dirs, ro setup files (CLAUDE.md/SOUL.md/skills), world-scoped |
| 3 (worker) | {world}/{a}/{b}/... | full ro home, rw overlays for .claude/projects + media + tmp only |

Tier derived from folder depth (count / separators). Tier 2+ use overlay mounts for ro enforcement. Tier 3 gets minimal rw overlays for SDK operation. Root world (tier 0) can delegate to any folder; all others restricted to same-world descendants.
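
A sketch of tier derivation and the delegation check, assuming the folder names `root` and `main` mark tier 0 (as in the table) and that "same-world descendants" means a path-prefix test. `tierOf` and `canDelegate` are hypothetical names; `worldOf` is named elsewhere in this document, but its body here is a guess. The max chain depth is not modelled.

```typescript
const ROOT_FOLDERS = new Set(["root", "main"]); // tier 0 folder names, per the table

// World = first folder segment.
function worldOf(folder: string): string {
  const slash = folder.indexOf("/");
  return slash === -1 ? folder : folder.slice(0, slash);
}

// Tier from folder depth: count "/" separators, clamp at 3.
function tierOf(folder: string): 0 | 1 | 2 | 3 {
  if (ROOT_FOLDERS.has(folder)) return 0;
  const depth = folder.split("/").length - 1;
  if (depth === 0) return 1;
  if (depth === 1) return 2;
  return 3;
}

// Root may delegate anywhere; everyone else only to itself or its own descendants.
function canDelegate(from: string, to: string): boolean {
  if (tierOf(from) === 0) return true;
  return to === from || to.startsWith(from + "/");
}
```

For example, `tierOf("atlas/support")` is 2, and `canDelegate("atlas", "yonder/notes")` is false because the target sits in another world.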

Container commands

runContainerCommand supports dual modes: agent mode (full SDK ceremony with prompt assembly, session, memory injection) and raw mode (bash command, captures stdout). Raw mode used by task scheduler for git pulls and maintenance. Same sandbox, different command.

Agent output processing

<think> blocks are stripped from output before delivery — agents use them for silent reasoning visible only in logs. <status> blocks are extracted and sent as interim updates to the user while the agent continues working. <internal> blocks are stripped entirely.
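
One way to sketch this post-processing step with regular expressions. The function name and return shape are hypothetical, and the sketch assumes each block is complete (opened and closed) in the text it receives, which a streaming implementation would have to handle separately.

```typescript
interface ProcessedOutput {
  text: string;       // what is delivered to the user
  statuses: string[]; // interim updates, in order of appearance
}

function processAgentOutput(raw: string): ProcessedOutput {
  const statuses: string[] = [];
  const text = raw
    // Silent reasoning and internal notes never reach the user.
    .replace(/<think>[\s\S]*?<\/think>/g, "")
    .replace(/<internal>[\s\S]*?<\/internal>/g, "")
    // Status blocks are pulled out and collected separately.
    .replace(/<status>([\s\S]*?)<\/status>/g, (_match: string, body: string) => {
      statuses.push(body.trim());
      return "";
    })
    // Collapse the blank runs left behind by removed blocks.
    .replace(/\n{3,}/g, "\n\n")
    .trim();
  return { text, statuses };
}
```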

Skills and identity

Skills seeded from container/skills/ to .claude/skills/ inside the group folder on first spawn. SOUL.md defines agent personality per group. CLAUDE.md defines behavior and instructions. Migration system: MIGRATION_VERSION + numbered files in container/skills/self/migrations/; /migrate skill syncs groups when version advances.

Hot-patching and DinD

Agent-runner TypeScript source is mounted into containers and recompiled with bunx tsc at runtime. Changes to agent behavior deploy without rebuilding the container image. When gateway runs in Docker, HOST_GROUPS_DIR, HOST_DATA_DIR, and HOST_APP_DIR translate container-internal paths to host paths for child container mounts.

Error handling

Circuit breaker: 3 consecutive failures per group open the breaker. A new user message resets it. Agent errors advance the cursor, notify the user, and evict the session. error_max_turns recovery resumes with maxTurns=3 and asks the agent to summarize progress.
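
The breaker logic can be sketched as a per-group failure counter. This is a minimal illustration, not the code in group-queue.ts: the class and method names are hypothetical, and resetting on success is an assumption drawn from the word "consecutive".

```typescript
class GroupBreaker {
  private failures = new Map<string, number>();
  private limit: number;

  constructor(limit = 3) {
    this.limit = limit;
  }

  recordFailure(group: string): void {
    this.failures.set(group, (this.failures.get(group) ?? 0) + 1);
  }

  // Assumption: a successful run breaks the "consecutive" streak.
  recordSuccess(group: string): void {
    this.failures.delete(group);
  }

  // A new user message resets the breaker for that group.
  onUserMessage(group: string): void {
    this.failures.delete(group);
  }

  // Open breaker: the queue should stop spawning agents for this group.
  isOpen(group: string): boolean {
    return (this.failures.get(group) ?? 0) >= this.limit;
  }
}
```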

Memory Layers

Seven memory layers with different persistence, scope, and injection mechanisms. Push layers are injected by the gateway. Pull layers are searched by the agent. The pattern: markdown files with summary: YAML frontmatter, selected and injected as XML. Progressive compression turns conversations into durable knowledge.

| Layer | Storage | Scope | Injection |
| --- | --- | --- | --- |
| Messages | SQLite | per-group | push: stdin XML history (recent N messages) |
| Session | SDK JSONL | per-container | push: Claude Code native --resume |
| Managed | CLAUDE.md + MEMORY.md | per-group | push: Claude Code native read |
| Diary | diary/*.md | per-group | push: gateway injects 14 most recent as <diary> XML |
| User context | users/*.md | per-group | push: <user> pointer per message sender, agent reads file |
| Facts | facts/*.md | per-world | pull: agent searches via /recall or grep |
| Episodes | episodes/*.md | per-group | push: gateway injects most recent day/week/month as <episodes> XML |

Knowledge stores

Four directories follow the same pattern: markdown files with summary: YAML frontmatter. facts/ (per-world shared knowledge), diary/ (daily work log), users/ (per-sender context), episodes/ (compressed session history). All four are indexed by /recall and searchable by the agent.

Episode system

Session transcripts (.jl files) are compressed into daily summaries, daily into weekly, weekly into monthly. The /compact-memories skill runs on cron in isolated containers:

.claude/projects/-home-node/*.jl → episodes/20260310.md (day) — 0 2 * * *
daily episodes →  episodes/2026-W11.md  (week)  — 0 3 * * 1
weekly episodes → episodes/2026-03.md   (month) — 0 4 1 * *
diary daily    →  diary/week/2026-W11.md        — 0 3 * * 1
diary weekly   →  diary/month/2026-03.md        — 0 4 1 * *

Gateway injects most recent day, week, and month episode summaries on session start. Each file tracks sources: for traceability back to originals. Diary week/month summaries are not injected (14-day daily covers it) but are searchable via /recall.
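
The file names in the schedule above follow three key formats: a compact date for days, an ISO-8601 week for weeks, and year-month for months. A sketch of deriving them from a date, assuming UTC (the real skill may use the configured TIMEZONE); all function names are hypothetical.

```typescript
const pad = (n: number, width = 2): string => String(n).padStart(width, "0");

function dayKey(d: Date): string {
  return `${d.getUTCFullYear()}${pad(d.getUTCMonth() + 1)}${pad(d.getUTCDate())}`;
}

function monthKey(d: Date): string {
  return `${d.getUTCFullYear()}-${pad(d.getUTCMonth() + 1)}`;
}

// ISO-8601 week: the week belongs to the year that contains its Thursday.
function weekKey(d: Date): string {
  const t = new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()));
  const weekday = t.getUTCDay() || 7;         // Mon=1 .. Sun=7
  t.setUTCDate(t.getUTCDate() + 4 - weekday); // move to that week's Thursday
  const yearStart = Date.UTC(t.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((t.getTime() - yearStart) / 86_400_000 + 1) / 7);
  return `${t.getUTCFullYear()}-W${pad(week)}`;
}

function episodePaths(d: Date): { day: string; week: string; month: string } {
  return {
    day: `episodes/${dayKey(d)}.md`,
    week: `episodes/${weekKey(d)}.md`,
    month: `episodes/${monthKey(d)}.md`,
  };
}
```

For 10 March 2026 this yields `episodes/20260310.md`, `episodes/2026-W11.md`, and `episodes/2026-03.md`, matching the names in the schedule.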

Recall

/recall skill searches across all knowledge stores. Two implementations:

| Version | Method | When |
| --- | --- | --- |
| v1 | Explore subagent greps summary: across store dirs, LLM judges relevance | corpus < ~300 files |
| v2 | recall CLI: FTS5 + sqlite-vec hybrid search (RRF fusion), then Explore judges candidates | corpus > ~300 files |

v2 uses per-store SQLite databases (.local/recall/*.db) with lazy mtime-based indexing. Embeddings via Ollama (nomic-embed-text, 768-dim). RRF weights: 0.7 vector, 0.3 BM25. Config in .recallrc (TOML). Agent expands query into ~10 search terms, calls recall "term" for each, then spawns Explore to judge scored candidates.
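
Weighted reciprocal rank fusion can be sketched as follows. The 0.7/0.3 weights come from the text; the smoothing constant k = 60 is the value commonly used for RRF and is an assumption here, as is the function name. How the recall CLI actually combines the two rankings is not specified in this document.

```typescript
interface Fused { id: string; score: number }

// Each input is a list of document ids, best match first.
function rrfFuse(
  vectorRanked: string[],
  bm25Ranked: string[],
  wVector = 0.7,
  wBm25 = 0.3,
  k = 60, // assumption: conventional RRF constant
): Fused[] {
  const scores = new Map<string, number>();
  const add = (ids: string[], weight: number): void => {
    ids.forEach((id, i) => {
      // rank is 1-based, so the top hit contributes weight / (k + 1)
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + i + 1));
    });
  };
  add(vectorRanked, wVector);
  add(bm25Ranked, wBm25);
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A document that appears in both lists accumulates both contributions, so it can outrank one that is slightly higher in a single list.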

System messages (new-session, new-day) injected as XML. Last 2 previous sessions included as <previous_session> elements. PreCompact hook nudges agent to write diary entries before context compression.

Groups and Routing

Groups are the organizational unit. Each group maps to a folder, a channel JID, and an agent configuration. Groups are registered via CLI. Unregistered chats are silently dropped. Multiple JIDs can share one folder (multi-channel groups).

Worlds

World = first folder segment. worldOf('atlas/support') === 'atlas'. Authorization is world-scoped: cross-world actions are denied. /workspace/share is mounted per-world (rw for root/world, ro for deeper groups).

Hierarchical delegation

Parent groups delegate to children via routing rules. Five rule types: command (prefix match), pattern (regex, max 200 chars), keyword (case-insensitive), sender (regex match on sender name), default (fallback / catch-all). Rules evaluated in tier order; first match wins. Delegation is parent-to-child or parent-to-self, same world (except tier 0 root can delegate anywhere), max chain depth 2.
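
A sketch of first-match-wins rule evaluation over the five rule types. The rule shapes and function names are hypothetical, and the rule list is assumed to arrive already sorted in evaluation order. Treating an over-long pattern as a non-match is also an assumption; the gateway may instead reject it when the rule is created.

```typescript
type Rule =
  | { type: "command"; prefix: string; target: string }
  | { type: "pattern"; regex: string; target: string }
  | { type: "keyword"; word: string; target: string }
  | { type: "sender"; regex: string; target: string }
  | { type: "default"; target: string };

interface Inbound { text: string; sender: string }

const MAX_PATTERN_LENGTH = 200;

function ruleMatches(rule: Rule, msg: Inbound): boolean {
  switch (rule.type) {
    case "command":
      return msg.text.startsWith(rule.prefix);
    case "pattern":
      return rule.regex.length <= MAX_PATTERN_LENGTH && new RegExp(rule.regex).test(msg.text);
    case "keyword":
      return msg.text.toLowerCase().includes(rule.word.toLowerCase());
    case "sender":
      return new RegExp(rule.regex).test(msg.sender);
    case "default":
      return true;
  }
}

// First matching rule wins; undefined means no delegation.
function resolveTarget(rules: Rule[], msg: Inbound): string | undefined {
  return rules.find((r) => ruleMatches(r, msg))?.target;
}
```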

Auto-threading

Route targets support RFC 6570 {sender} templates. At routing time, {sender} expands to the sender name, creating per-sender child folders automatically. Enables one-group-per-user patterns where each person gets their own agent context, diary, and memory.
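
Template expansion might look like the sketch below. The function name and the sanitization step (lowercasing and replacing anything outside `a-z0-9_-`) are assumptions, added because the expanded value becomes a folder name; the document does not say how the gateway normalizes sender names.

```typescript
// Expand {sender} in a route target into a folder-safe segment.
function expandTarget(template: string, sender: string): string {
  const safe =
    sender
      .toLowerCase()
      .replace(/[^a-z0-9_-]+/g, "-") // no slashes or dots: blocks path traversal
      .replace(/^-+|-+$/g, "") || "unknown";
  return template.split("{sender}").join(safe);
}
```

With the target `atlas/support/{sender}`, a message from "Alice B." would route to `atlas/support/alice-b`, giving that sender a dedicated child folder.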

Reply routing

delegatePerSender batches messages by sender before forwarding to child groups. Reply threading tracks lastSentId per chunk sequence so multi-message responses chain correctly on platforms that support reply threading. Escalation responses from local: JIDs carry origin JID and messageId for proper attribution.
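
Chunk chaining can be sketched as below: split a long response at a platform limit, then send each chunk as a reply to the previously sent one. The `Send` callback and both function names are hypothetical; splitting at the last newline before the limit is one reasonable strategy, not necessarily the gateway's.

```typescript
// Split text into chunks of at most `limit` characters, preferring newline boundaries.
function chunkText(text: string, limit: number): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    let cut = rest.lastIndexOf("\n", limit);
    if (cut <= 0) cut = limit; // no usable newline: hard split
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut).replace(/^\n/, "");
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

// Hypothetical channel send: resolves to the id of the message it created.
type Send = (text: string, replyTo?: string) => Promise<string>;

// Each chunk replies to the one before it; returns the last sent id.
async function sendChained(
  text: string,
  limit: number,
  send: Send,
  replyTo?: string,
): Promise<string | undefined> {
  let lastSentId = replyTo;
  for (const chunk of chunkText(text, limit)) {
    lastSentId = await send(chunk, lastSentId);
  }
  return lastSentId;
}
```

On a channel with a 4096-character limit, `sendChained(response, 4096, send, incomingMessageId)` would thread the first chunk under the user's message and every later chunk under its predecessor.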

Chat-bound sessions

Setting IDLE_TIMEOUT=0 keeps containers alive indefinitely, binding them to the chat. Useful for persistent agents that maintain conversational state across arbitrarily spaced messages. Cross-channel preemption: if a different JID needs the same folder, idle containers are closed.

CLI

kanipi config <instance> group list         # registered + discovered
kanipi config <instance> group add [folder] # register group (folder only, no JID)
kanipi config <instance> group rm  <folder> # unregister (folder stays on disk)
kanipi config <instance> user  add|rm|list|passwd  # manage web auth
kanipi config <instance> mount add|rm|list          # manage container mounts

Media Processing

Attachment pipeline: download, detect MIME (magic bytes via file-type), enrich, deliver. Handlers run in parallel per attachment type. Configurable per instance: MEDIA_ENABLED, VOICE_TRANSCRIPTION_ENABLED, VIDEO_TRANSCRIPTION_ENABLED.

Voice transcription

Whisper service (kanipi-whisper Docker image) transcribes voice messages before agent delivery. Per-group language hints via .whisper-language file (one BCP-47 code per line). Parallel passes: auto-detect + each configured language. Output labeled [voice/auto→en] or [voice/cs].

File sending

Agents send files via send_file IPC action. Path-safe: must be under group dir. Telegram routes photos/videos/audio to native methods (inline display). WhatsApp routes by MIME. Discord uses AttachmentBuilder.
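
The "must be under group dir" rule can be sketched as a lexical path check. The function name is hypothetical, and the sketch does not resolve symlinks: a real check would also compare real paths (for example via `fs.realpath`) to catch symlink escapes.

```typescript
import * as path from "node:path";

// True only if `requested` resolves to a location strictly inside `groupDir`.
function isUnderGroupDir(groupDir: string, requested: string): boolean {
  const root = path.resolve(groupDir);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);
  if (rel === "") return false;                                      // the dir itself is not a file
  if (rel === ".." || rel.startsWith(".." + path.sep)) return false; // climbs out of the dir
  return !path.isAbsolute(rel);                                      // different root or drive
}
```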

File commands

/put, /get, /ls for bidirectional file transfer between chat users and group workspace. Deny globs, symlink escape protection, atomic writes. Disabled by default (FILE_TRANSFER_ENABLED).

Capability introspection

Gateway writes .gateway-caps TOML manifest to group dir before each spawn. Agent reads it to answer capability questions accurately (voice, video, media limits, web host).

Web Interface

Vite dev server managed by the entrypoint (not the TS gateway). Serves from web/ directory. /pub/ prefix = public (no auth). /priv/ = requires auth. Auth: argon2-hashed local accounts + JWT sessions. OAuth providers: GitHub, Discord, Telegram.

Slink (web channel)

POST /pub/s/<token> accepts messages via HTTP. Optional Authorization: Bearer <jwt> for higher rate limits (SLINK_AUTH_RPM default 60/min vs SLINK_ANON_RPM 10/min). SSE streaming at /_sloth/stream for agent-to-browser push. sloth.js widget for embedding in web pages.
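
The two-tier rate limit can be sketched with a fixed one-minute window per client key. Only the 60/min and 10/min defaults come from the text; the class, the `allowSlink` helper, and the choice of a fixed window (rather than a sliding window or token bucket) are assumptions.

```typescript
class RpmLimiter {
  private windows = new Map<string, { start: number; count: number }>();
  private rpm: number;

  constructor(rpm: number) {
    this.rpm = rpm;
  }

  // `now` is a ms timestamp; returns false once the key has used its quota this window.
  allow(key: string, now: number): boolean {
    const w = this.windows.get(key);
    if (!w || now - w.start >= 60_000) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (w.count >= this.rpm) return false;
    w.count += 1;
    return true;
  }
}

const authLimiter = new RpmLimiter(60); // SLINK_AUTH_RPM default
const anonLimiter = new RpmLimiter(10); // SLINK_ANON_RPM default

// Requests with a valid bearer JWT get the higher quota.
function allowSlink(clientKey: string, hasValidJwt: boolean, now: number): boolean {
  return (hasValidJwt ? authLimiter : anonLimiter).allow(clientKey, now);
}
```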

Dashboard Portal

Self-registering dashboard system at /dash/. Each dashboard calls registerDashboard() with name, title, description, and handler. The portal index lists all registered dashboards. Dashboards receive a DashboardContext with access to the group queue and connected channels. Requires auth (/dash/ is behind the auth boundary).
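
A sketch of what self-registration could look like. `registerDashboard` and `DashboardContext` are named in the text, but the signature, the fields of the context, and the `portalIndex` helper are assumptions for illustration.

```typescript
// Hypothetical context: the text only says it exposes the group queue and channels.
interface DashboardContext {
  channels: string[];
  pendingFor(group: string): number;
}

interface Dashboard {
  name: string;
  title: string;
  description: string;
  handler: (ctx: DashboardContext) => string; // returns HTML
}

const dashboards = new Map<string, Dashboard>();

function registerDashboard(d: Dashboard): void {
  if (dashboards.has(d.name)) throw new Error(`dashboard already registered: ${d.name}`);
  dashboards.set(d.name, d);
}

// The portal index lists every registered dashboard under /dash/.
function portalIndex(): string[] {
  return [...dashboards.values()].map((d) => `/dash/${d.name}/ ${d.title}: ${d.description}`);
}
```

A dashboard module would call `registerDashboard` once at import time, so adding a dashboard needs no change to the portal itself.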

Status dashboard

Built-in at /dash/status/. Shows gateway uptime, memory usage, connected channels, registered groups (with active/idle state), running containers (cached 5s via docker ps), queue state per group (pending messages, pending tasks, failure count), and scheduled tasks. JSON API at /dash/status/api/state. HTML auto-refreshes every 10s.

Task Scheduling

SQLite-backed scheduler with three modes: cron, interval, once. Agents request scheduled tasks via schedule_task IPC action. Two context modes: group (shares conversation history) or isolated (fresh context per run). Optional command field for raw-mode tasks (bash, not agent). Run log with per-task status and last result. TIMEZONE validated via Intl.DateTimeFormat.
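
Timezone validation with `Intl.DateTimeFormat` relies on the constructor throwing a `RangeError` for an unknown IANA zone. A minimal sketch, with a hypothetical function name and the UTC fallback the config table describes:

```typescript
// Returns the zone if the runtime recognizes it, otherwise "UTC".
function resolveTimezone(tz: string | undefined): string {
  if (!tz) return "UTC";
  try {
    // Throws RangeError when `tz` is not a valid IANA time zone name.
    new Intl.DateTimeFormat("en-US", { timeZone: tz });
    return tz;
  } catch {
    return "UTC";
  }
}
```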

Products as Configurations

A product is a group configured for a specific role. Same gateway, different CLAUDE.md (behavior) + SOUL.md (persona) + skills (capabilities) + mounts (data) + tasks. The gateway runs groups, not products.

Atlas

Code support agent. Mounted repos, workspace knowledge, support persona. Searches code, researches via subagents, answers architecture questions. Uses facts/ memory for persistent knowledge, /recall for retrieval.

Channels: Telegram, Discord · shipped

Yonder

Research associate and knowledge mapper. Message from phone, agent researches topics, builds knowledge pages, maps connections. Vite serves results live.

Channels: Telegram, Web · shipped

Quick Start

Prerequisites

Node.js 22+, Docker, bun. Anthropic credentials: CLAUDE_CODE_OAUTH_TOKEN (from claude login) or ANTHROPIC_API_KEY.

Docker deployment

make image                                    # gateway image
make agent-image                              # agent image
./kanipi create foo                           # seed /srv/data/kanipi_foo/
edit /srv/data/kanipi_foo/.env                # set tokens
./kanipi config foo group add tg:-123456789   # register main group
./kanipi foo                                  # start

Bare metal / development

npm install && make build
npx tsx src/cli.ts create foo
edit /srv/data/kanipi_foo/.env
make agent-image                              # agent container still needs docker
npx tsx src/cli.ts config foo group add tg:-123456789
npm run dev

Instance layout

/srv/data/kanipi_foo/
  .env                    # config (tokens, ports, flags)
  store/                  # SQLite DB, whatsapp auth
  groups/main/            # root group workdir
    .claude/              # skills, projects, MEMORY.md
    logs/                 # conversation logs
    diary/                # agent daily notes
    episodes/             # compressed session history (day/week/month)
    facts/                # world-shared knowledge
    users/                # per-sender context files
    media/                # uploaded/generated files
    tmp/                  # scratch space
    CLAUDE.md             # agent behavior
    SOUL.md               # agent personality
  data/ipc/main/          # IPC requests/responses per group
  web/pub/                # public web (no auth)
  web/priv/               # private web (auth required)

Key config

| Key | Purpose |
| --- | --- |
| ASSISTANT_NAME | instance name |
| TELEGRAM_BOT_TOKEN | enables telegram |
| DISCORD_BOT_TOKEN | enables discord |
| EMAIL_IMAP_HOST | enables email (IMAP IDLE) |
| TWITTER_USERNAME | enables twitter/x (+ _PASSWORD, _EMAIL) |
| REDDIT_CLIENT_ID | enables reddit (+ _CLIENT_SECRET, _USERNAME, _PASSWORD) |
| MASTODON_INSTANCE_URL | enables mastodon (+ _ACCESS_TOKEN) |
| BLUESKY_IDENTIFIER | enables bluesky (+ _PASSWORD) |
| FACEBOOK_PAGE_ID | enables facebook (+ _PAGE_ACCESS_TOKEN) |
| CONTAINER_IMAGE | agent docker image name |
| CLAUDE_CODE_OAUTH_TOKEN | passed to agent containers |
| IDLE_TIMEOUT | container keepalive (ms, default 1800000; 0 = never) |
| MAX_CONCURRENT_CONTAINERS | parallel agent limit (default 5) |
| MEDIA_ENABLED | attachment pipeline (default false) |
| VITE_PORT | enables web serving + dashboard |
| AUTH_SECRET | JWT secret for sessions and dashboard |
| WHISPER_BASE_URL | whisper service URL |
| TIMEZONE | cron timezone (validated, fallback UTC) |

systemd deployment

sudo cp /srv/data/kanipi_foo/kanipi_foo.service /etc/systemd/system/
sudo systemctl enable --now kanipi_foo

Multi-instance

Each instance runs independently with its own data dir, agent image tag, and systemd service. Per-instance image tags (kanipi-agent-foo:latest) allow independent upgrades. Set CONTAINER_IMAGE per instance in .env.

Roadmap

| Phase | Status | Key items |
| --- | --- | --- |
| Phase 1 | shipped | 5 chat channels, routing, actions, diary, auth, scheduling, IPC, skills |
| Phase 2 | shipped | 5 social channels (Twitter, Reddit, Mastodon, Bluesky, Facebook), social events + actions, impulse gate |
| Phase 3 | almost done | 4-tier permissions, recall v2, episodes, auto-threading, reply routing, chat-bound sessions, dashboard portal, status dashboard |
| Phase 4 | deferred | Gmail API, instance repos, memory viewer, evangelist product, per-platform action grants |
| Phase 5 | future | agent-to-agent messaging, IPC-to-MCP proxy, cross-channel identity, workflows |
| Future | planned | Go gateway rewrite (single binary, native concurrency, same interfaces) |

Stats

| Metric | Value |
| --- | --- |
| Source files | 65 TypeScript |
| Source LOC | 13,980 |
| Test files | 43 files · 13,170 LOC |
| Test:code ratio | 0.94:1 |
| Spec files | 97 specification documents |
| Tests | 798 |
| Chat channels | Telegram, WhatsApp, Discord, Email, Web (Slink) |
| Social channels | Twitter/X, Reddit, Mastodon, Bluesky, Facebook |
| Social actions | 22 (post, reply, react, repost, follow, ban, pin...) |
| Memory layers | 7 (messages, session, managed, diary, user context, facts, episodes) |
| Permission tiers | 4 (root, world, agent, worker) |
| Config | .env + env vars, SQLite-backed group state |
| Deploy | systemd per instance, kanipi create CLI |
| Base | NanoClaw fork (upstream v1.1.3) |
| Runtime tested | yes (v1.8.0) |