Last Updated: 2026-04-05 Status: Production — 25-persona phased cognitive pipeline + self-research, 715 tests green
Start Command:
source venv/bin/activate && python bot_unified.pyConfig: projects.yaml (gitignored) — see projects.example.yaml for schema
Loaded from projects.yaml at startup. Validated with validate_config().
File: .env (gitignored)
SESSION_BACKEND—apiormaxSESSION_MODEL— defaultclaude-sonnet-4-6SHELLACK_BOT=1— suppresses Slack MCP in claude subprocess
File: projects.yaml (gitignored)
- All project definitions, channel routing, coding standards
- Env var overrides:
{PROJECT}_PROJECT_PATH,{PROJECT}_BUNDLE_ID SHELLACK_CONFIGoverrides config file path
- Self-research auto-dispatch —
tools/self_research.pyruns a Haiku-driven loop (up to 5 steps) to investigate codebase questions. Iteratively runs safe read-only commands, accumulates findings, injects into enriched context. Fires automatically on moderate+ when Toolkeeper doesn't gather output. - Toolkeeper persona —
tools/personas/toolkeeper.pyauto-executes safe read-only commands (cat, grep, git log, etc.) to gather context before the main agent call. Safety-hardened with shell injection blocking ($(), backticks, process substitution, eval, exec). - Agent-manager enabled by default — 3-tier Haiku classification (simple/moderate/complex) now runs on every message. Simple questions skip pipeline entirely (~14s), moderate gets pipeline + self-research (~75s), complex gets full 9-phase pipeline (~3min).
- Cost tracking — real token counts extracted from Anthropic API responses, wired through pipeline and micro-loops via
_usagemetadata pattern. - Skeptic dynamic targeting — micro-loop
revision_targetfield allows Skeptic to route revisions to any persona, not just the statictotarget. - Discussion log improvements —
_summarize_slotshows meaningful previews (verdict, command count, findings) instead of dict keys. Persona JSON fences stripped in base class. - Markdown→mrkdwn tables —
_convert_tables()wraps markdown tables in code blocks for Slack rendering. - 14 per-persona unit tests — Connector, DataScientist, Empathizer, Dreamer, GrowthCoach, Insights, Inspector, Tester, VisualUX, Rogue, Hacker, Prioritizer, Simplifier, DevilsAdvocate.
- Classification prompt tuning — "uptime?" → simple (14s), "how does triage work?" → moderate (reads source), "trace lifecycle" → complex (full pipeline).
- Pipeline metadata passthrough — underscore-prefixed keys (e.g.
_project_path) pass through to all personas regardless ofreadsdeclaration.
- Toolkeeper
_project_pathwas filtered by reads list → always ran in cwd instead of project dir - Self-research trigger was inverted (fired when Toolkeeper said "no tools needed" = wrong direction)
- Toolkeeper
run()override silently dropped_usage→ cost always 0 - Agent-manager was opt-in (default OFF) → everything defaulted to moderate
- Enriched context fed to Toolkeeper made Haiku think it already had enough info
- Shell injection vectors:
$(), backticks,<(),system(),sed -i,eval,exec,source, pipe to sh/bash _StubPersonaundefined type annotation in test- Unused
fieldimport in pipeline.py - Persona JSON wrapped in markdown fences →
{"raw": text}fallback → bad discussion display
projects.yaml→orchestrator_config.py(YAML loader) → module-level exports- One
ProjectAgentper channel, pre-warmed at startup - 3-tier classification:
agent_manager.py→ Haiku classifies → simple/moderate/complex - Pipeline:
pipeline.py→ 25 personas across 9 phases, tier-gated activation - Self-research:
self_research.py→ iterative Haiku loop for codebase investigation - Toolkeeper: auto-executes safe commands, hardened safety whitelist
- Single-turn:
APIBackend→ThinkingIndicator→ gray attachment with response inline run:sessions:MaxBackendorAPIBackend→SlackSessionstreaming
- Self-research capability — auto-dispatch for context gaps (#22)
- Memory calibration agent (#21)
- Action buttons trigger run: sessions (#20)
- XML leak fix for cross-chunk boundaries in streaming
- Computer Use (#15), Visual UX screenshots (#16), Log Access (#17)
- Anthropic client consolidation (single global instance)
- Landing page: shellack.dev deployed on Cloudflare Workers
- LLM-driven agent transitions — mid-conversation routing (#14, low priority)
- Parallel-within-phase execution (stretch goal for latency)
Shellack/
├── bot_unified.py
├── orchestrator_config.py # YAML loader
├── projects.yaml # YOUR config (gitignored)
├── projects.example.yaml # template for forks
├── hooks/pre-commit # secret scanning
├── agents/
│ ├── agent_factory.py
│ ├── project_agent.py
│ └── sub_agents.py
├── tools/
│ ├── pipeline.py # TurnContext, Phase, run_pipeline orchestration
│ ├── self_research.py # Multi-step Haiku codebase investigation
│ ├── personas/ # 25 cognitive personas (one file each)
│ │ ├── __init__.py # Persona base class + registry + fence stripping
│ │ ├── toolkeeper.py # Auto-execute safe commands for context
│ │ ├── strategist.py # Phase 3: task decomposition
│ │ ├── historian.py # Phase 3: prior decisions
│ │ ├── architect.py # Phase 4: structural proposals
│ │ ├── skeptic.py # Phase 6: assumption challenge
│ │ ├── infosec.py # Phase 7: defensive security
│ │ ├── inspector.py # Phase 8: completeness check
│ │ ├── learner.py # Phase 9: lesson extraction
│ │ └── ... (25 total)
│ ├── token_cart.py # Haiku-powered context compaction
│ ├── triage.py # Complexity classification (unused — agent_manager used instead)
│ ├── agent_manager.py # Haiku classifier + model selection (enabled by default)
│ ├── agent_discussion.py # Phase-grouped discussion log with emoji personas
│ ├── registry.py # .shellack/registry.md management
│ ├── thread_memory.py # Cross-thread persistence (24h TTL)
│ ├── cost_tracker.py # Per-turn/thread cost tracking
│ ├── slack_session.py # Streaming + _md_to_mrkdwn + table conversion
│ ├── session_backend.py
│ ├── thinking_indicator.py
│ └── lifecycle.py
├── tests/ # 715 tests
└── .env # credentials (SECRET)
ps aux | grep bot_unified
source venv/bin/activate && python bot_unified.py
venv/bin/pytest -qLast session: 2026-04-05