OpenAI-compatible reverse proxy with token-scoped policies for budgets, rate limits, model access, content rules, and multi-provider routing.
make build # compile to ./bin/tokenomics
make test # go test ./...
make lint # golangci-lint run ./...
make tidy # go mod tidy

Always run `make test` after adding or modifying features. Fix failures before committing.
cmd/            CLI commands (Cobra): serve, token, init, remote
internal/
  config/       YAML config loading, provider definitions, logging config
  events/       Event emitter interface, webhook delivery
  policy/       Policy parsing, rules engine, PII detection
  proxy/        HTTP handler, rate limiting, stats, logging
  remote/       Remote config server and client for centralized token sync
  session/      Usage tracking (memory or Redis)
  store/        BoltDB token storage, encryption
  tls/          Certificate generation
  tokencount/   tiktoken-based token counting
examples/       Provider configs, sample policies, webhook collector
docs/           Feature documentation
- Go 1.21+, modules at `github.com/rickcrawford/tokenomics`
- CLI uses Cobra with `cmd/root.go` as the entrypoint
- Config loaded via Viper from `config.yaml` or `$HOME/.tokenomics/config.yaml`
- Env prefix: `TOKENOMICS_` (e.g. `TOKENOMICS_HASH_KEY`)
- `.tokenomics` directory: Always relative to the current working directory where the command runs (e.g. `tokenomics serve` from project root creates `.tokenomics/`). Override with `TOKENOMICS_DIR` env var or `dir:` in config.yaml to use a different location (including absolute paths like `~/.tokenomics`).
- Policies are JSON, stored AES-256-GCM encrypted in BoltDB
- Rules use object format: `{"type":"regex|keyword|pii", "action":"fail|warn|log|mask", "scope":"input|output|both"}`
- Event emitter uses `Emitter` interface; pass `nil` for no-op in tests
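
The rule object format above can be sketched as a Go struct with validation. This is a minimal illustration, not the project's code: `Rule`, `ParseRule`, and the lookup maps are hypothetical names, and only the three documented fields are modeled. Real rules may carry more fields, such as a pattern.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Rule mirrors the documented object format. The type and its field set
// are assumptions for illustration, not the project's actual struct.
type Rule struct {
	Type   string `json:"type"`   // regex, keyword, or pii
	Action string `json:"action"` // fail, warn, log, or mask
	Scope  string `json:"scope"`  // input, output, or both
}

var (
	validTypes   = map[string]bool{"regex": true, "keyword": true, "pii": true}
	validActions = map[string]bool{"fail": true, "warn": true, "log": true, "mask": true}
	validScopes  = map[string]bool{"input": true, "output": true, "both": true}
)

// ParseRule decodes one rule and rejects values outside the documented sets.
func ParseRule(data []byte) (Rule, error) {
	var r Rule
	if err := json.Unmarshal(data, &r); err != nil {
		return Rule{}, fmt.Errorf("decode rule: %w", err)
	}
	switch {
	case !validTypes[r.Type]:
		return Rule{}, fmt.Errorf("unknown rule type %q", r.Type)
	case !validActions[r.Action]:
		return Rule{}, fmt.Errorf("unknown rule action %q", r.Action)
	case !validScopes[r.Scope]:
		return Rule{}, fmt.Errorf("unknown rule scope %q", r.Scope)
	}
	return r, nil
}

func main() {
	r, err := ParseRule([]byte(`{"type":"pii","action":"mask","scope":"both"}`))
	fmt.Println(r, err)
}
```

Validating at parse time means a typo in a policy fails when the token is created, not on the first proxied request.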
- Shared HTTP Client: Always use a shared `*http.Client` on the `Handler` struct (initialized in `NewHandler()`). Never create `&http.Client{}` inside request handlers. Per-request clients skip the shared client's configuration, and any that build their own `Transport` bypass Go's connection pooling.
- Body Reads: Use `io.LimitReader` for all body reads (request and response). Reuse `maxRequestBodySize` (10 MB) and `maxResponseBodySize` (32 MB) constants to prevent unbounded memory allocation.
- SSE Parsing: Use `bytes.Buffer` with incremental `ReadBytes('\n')` for O(1)-per-chunk processing instead of string concatenation and `strings.Split`.
- Content Accumulation: Cap assistant response content (`contentBuilder.Len() < maxMemoryContentSize`) and user message accumulation (`partsSize < maxMemoryContentSize`). Default cap is 512 KB.
- Persistent Loggers: Use `sync.Once` to initialize file handles once (e.g., debug log) instead of opening/closing on every call.
- File Handles: Close stale file handles when paths change (e.g., date rollover in `DirMemoryWriter.getFile()`).
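
The SSE parsing and content cap guidelines can be sketched together. This is a minimal example under stated assumptions: `sseAccumulator` and its fields are hypothetical, and the constant reuses the name and 512 KB default from the list above. It only handles `data:` lines and the `[DONE]` sentinel.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// maxMemoryContentSize matches the 512 KB default cap described above.
const maxMemoryContentSize = 512 * 1024

// sseAccumulator is a hypothetical helper, not the project's actual type.
type sseAccumulator struct {
	pending bytes.Buffer    // bytes of a line not yet terminated by '\n'
	content strings.Builder // capped copy of the data payloads
}

// Write consumes one network chunk and returns the "data:" payloads it
// completed. A partial trailing line stays buffered for the next chunk.
func (a *sseAccumulator) Write(chunk []byte) []string {
	a.pending.Write(chunk)
	var events []string
	// Only read while a full line is available, so partial lines are
	// never consumed early.
	for bytes.IndexByte(a.pending.Bytes(), '\n') >= 0 {
		line, _ := a.pending.ReadBytes('\n')
		line = bytes.TrimRight(line, "\r\n")
		if !bytes.HasPrefix(line, []byte("data:")) {
			continue // blank separators, comments, other fields
		}
		payload := bytes.TrimSpace(line[len("data:"):])
		if len(payload) == 0 || string(payload) == "[DONE]" {
			continue
		}
		events = append(events, string(payload))
		// Stop retaining content once the cap is reached. The final
		// write may overshoot the cap by one payload.
		if a.content.Len() < maxMemoryContentSize {
			a.content.Write(payload)
		}
	}
	return events
}

func main() {
	var acc sseAccumulator
	// One event split across two chunks, followed by the stream sentinel.
	fmt.Println(acc.Write([]byte("data: {\"a\"")))
	fmt.Println(acc.Write([]byte(":1}\n\ndata: [DONE]\n\n")))
}
```

The buffer keeps only the unfinished tail between chunks, so memory stays bounded by the longest line plus the capped content.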
Update docs when adding features. Keep docs concise and scannable. Reference files:
- `docs/CONFIGURATION.md` for config fields
- `docs/POLICIES.md` for policy schema and rules
- `docs/EVENTS.md` for webhook event types
- `docs/TOKEN_MANAGEMENT.md` for CLI token commands
- `docs/AGENT_INTEGRATION.md` for init command usage
- `README.md` features table for new capabilities
- `docs/WEB.md` for embedded admin routes and UX
- `docs/ADMIN_UI.md` for admin tabs, policy editor UX, and embedded docs workflow

Documentation update rule:
- For every new feature or behavior change, update the relevant docs in the same change.
- If admin UX or in-app instructions change, update embedded docs content in `cmd/web/admin/assets/docs.json` in the same change.
- If policy behavior changes, update `docs/POLICIES.md`.
- If configuration or CLI behavior changes, update `docs/CONFIGURATION.md` and `README.md` where applicable.

- No em dashes. Use periods or commas instead.
- Keep prose short. Tables over paragraphs where possible.
- No emojis unless explicitly asked.