aictl 🤖

AI agent in your terminal — 53 built-in cloud models across 8 providers, plus any model available through Ollama, native GGUF inference via llama.cpp, or native MLX inference on Apple Silicon

Project website: aictl.app — source in website/.

User guides: https://aictl.app/guides.html

Note

aictl is a general-purpose AI agent. Dedicated coding capabilities may be added in the future. If you need an AI agent specialized in software development today, consider Claude Code, Codex, or opencode, which are purpose-built for that workflow.

Install

curl -sSf https://aictl.app/install.sh | sh

The installer downloads a prebuilt binary for your platform from the latest GitHub release and places it in ~/.local/bin/aictl. If aictl is already installed at ~/.cargo/bin/aictl (e.g. from a prior cargo install), the installer updates it in place at that location instead of the default ~/.local/bin/. Set AICTL_INSTALL_DIR to pick a different location explicitly. If no prebuilt binary exists for your platform, the installer falls back to building from source with cargo install.

Supported platforms

Prebuilt binaries are published for:

OS	Architectures
Linux	`x86_64`, `aarch64`
macOS	`x86_64`, `aarch64` (Apple Silicon)

Native Windows is not supported — aictl depends on a POSIX shell (sh) and Unix tools (date, pbcopy, etc.) for its built-in tool calls. Windows users can run aictl inside WSL using the Linux binary, which works normally.

Other platforms (FreeBSD, other BSDs, uncommon Linux architectures) can still build from source via the cargo install fallback path, provided a Rust toolchain is available.

Prerequisites

Installing a prebuilt binary has no prerequisites beyond curl. Building from source (either via the installer fallback or manually) requires a Rust toolchain that supports edition 2024 (Rust 1.85 or newer).

From source

git clone git@github.com:pwittchen/aictl.git
cd aictl
cargo install --path .

This installs the aictl binary to ~/.cargo/bin/.

Build without installing

cargo build --release

The binary will be at target/release/aictl.

Optional feature flags

Native local-model inference is gated behind cargo features so a plain cargo build / cargo install keeps a lightweight default (no C++ toolchain or Metal Toolchain required). Opt in per backend:

Feature	What it enables	Platform	Extra build-time requirements
`gguf`	Native GGUF inference via `llama-cpp-2`	All	`cmake` + a working C/C++ compiler (Xcode Command Line Tools on macOS, `build-essential` on Debian/Ubuntu)
`mlx`	Native MLX inference via `mlx-rs` (Apple's MLX framework)	macOS + Apple Silicon only	Full Xcode (not just CLT) with the Metal Toolchain installed
`redaction-ner`	Layer-C Named Entity Recognition for the redaction pipeline via `gline-rs` (GLiNER ONNX models through the `ort` crate; bundled ONNX Runtime binary, no system install)	All	None

Examples:

# GGUF only
cargo build --release --features gguf
cargo install --path . --features gguf

# MLX only (macOS Apple Silicon)
cargo build --release --features mlx
cargo install --path . --features mlx

# All three (GGUF + MLX + NER-backed redaction)
cargo build --release --features "gguf mlx redaction-ner"
cargo install --path . --features "gguf mlx redaction-ner"

Without these features, the corresponding slash commands (/gguf, /mlx) and CLI flags (--pull-gguf-model, --pull-mlx-model, --pull-ner-model, etc.) still work for model management (download / list / remove); only the inference path is disabled, and trying to run a local model or enable NER-backed redaction prints a clear error telling you which feature to rebuild with.

The prebuilt binaries published on GitHub Releases (downloaded by install.sh) ship with --features gguf enabled on every platform — so one-liner installs get native GGUF inference out of the box where the platform supports it. The macOS Apple Silicon (aarch64) release additionally ships with --features mlx and includes a sibling mlx.metallib file alongside the binary (MLX needs the Metal library at runtime); every other platform's release contains just the aictl binary.

Uninstall

Binary release (installed via `install.sh`)

The install script places the binary at ~/.local/bin/aictl (or $AICTL_INSTALL_DIR if you set it). Remove it with:

rm ~/.local/bin/aictl

From source (installed via `cargo install`)

Cargo tracks its own installs, so the clean way is:

cargo uninstall aictl

This removes ~/.cargo/bin/aictl. If cargo uninstall doesn't find it (e.g. installed under a different crate name), delete the binary directly:

rm ~/.cargo/bin/aictl
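
Before removing anything, it can help to see which of the locations mentioned above actually hold a binary. The sketch below is a hypothetical helper (`find_aictl_installs` is not part of aictl) and checks only the paths this README names:

```shell
# Hypothetical helper: list every documented install location that
# currently contains an executable named aictl.
find_aictl_installs() {
  for dir in "${AICTL_INSTALL_DIR:-}" "$HOME/.local/bin" "$HOME/.cargo/bin"; do
    # Skip the empty entry produced when AICTL_INSTALL_DIR is unset.
    if [ -n "$dir" ] && [ -x "$dir/aictl" ]; then
      printf '%s\n' "$dir/aictl"
    fi
  done
}

find_aictl_installs
```

If more than one path is printed, both an installer copy and a cargo copy are present and each needs its own removal step.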

Remove configuration and data (optional)

aictl stores all state under ~/.aictl/ — config file, saved agents, saved sessions. To wipe it completely:

rm -rf ~/.aictl

Skip this step if you plan to reinstall and want to keep your API keys, agents, and session history.

Usage

aictl [--version] [--help] [--update] [--uninstall] [--config] [--provider <PROVIDER>] [--model <MODEL>] [--message <MESSAGE>] [--auto] [--quiet] [--unrestricted] [--incognito] [--agent <NAME>] [--list-agents] [--skill <NAME>] [--list-skills] [--session <ID|NAME>] [--list-sessions] [--clear-sessions] [--lock-keys] [--unlock-keys] [--clear-keys] [--pull-gguf-model <SPEC>] [--list-gguf-models] [--remove-gguf-model <NAME>] [--clear-gguf-models] [--pull-mlx-model <SPEC>] [--list-mlx-models] [--remove-mlx-model <NAME>] [--clear-mlx-models] [--pull-ner-model <SPEC>] [--list-ner-models] [--remove-ner-model <NAME>] [--clear-ner-models]

Omit --message to enter interactive REPL mode with persistent conversation history.
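
For scripting, the single-shot flags compose into a small wrapper. The function below is a sketch: `ask` is a hypothetical name, and it assumes `aictl` is on your PATH and already configured with a provider and key. It uses only flags listed under Parameters.

```shell
# Hypothetical wrapper: send one message and print only the final answer.
# --quiet requires --auto, so both are passed together.
ask() {
  aictl --auto --quiet --message "$*"
}

# Example: ask "Summarize the README in this directory"
```

Because `--auto` skips tool confirmation prompts, a wrapper like this is best used with the default security restrictions left enabled.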

REPL Commands

The interactive REPL supports slash commands:

Command	Description
`/agent`	Manage agents (create manually, create with AI, view/load/delete, unload)
`/clear`	Clear conversation context
`/compact`	Summarize conversation into a compact context
`/retry`	Remove the last user/assistant exchange and retry with the same prompt (useful when a response goes off track)
`/context`	Show context usage (token and message counts vs limits)
`/copy`	Copy last response to clipboard
`/help`	Show available commands
`/info`	Show setup info (provider, model, behavior, memory, agent, version, OS, binary size)
`/gguf`	Manage native GGUF models (view downloaded, pull, remove, clear all)
`/mlx`	Manage native MLX models (Apple Silicon; view downloaded, pull, remove, clear all)
`/memory`	Switch memory mode: long-term (all messages) or short-term (sliding window)
`/security`	Show current security policy (blocked commands, CWD jail, timeouts, etc.)
`/session`	Manage sessions (show current info, set name, view/load/delete saved, clear all)
`/skills`	Manage skills (create manually, create with AI, view/invoke/delete) — one-turn markdown playbooks
`/stats`	Manage usage statistics — view today/month/overall (sessions, calls, tokens, estimated cost) or clear all
`/behavior`	Switch between auto and human-in-the-loop mode during the session
`/model`	Switch model and provider during the session (persists to `~/.aictl/config`)
`/ping`	Validate every configured API key and probe provider connectivity (cloud providers + Ollama daemon)
`/tools`	Show available tools
`/keys`	Manage API key storage — lock (config → keyring), unlock (keyring → config), or clear (both stores)
`/config`	Re-run the interactive configuration wizard
`/update`	Update to the latest version
`/uninstall`	Remove the aictl binary from `~/.cargo/bin/` and `~/.local/bin/` (asks for confirmation)
`/version`	Check current version against the latest available
`/exit`	Exit the REPL

Any unrecognized /<name> that matches a saved skill (see Skills below) runs that skill for the next turn: /<skill-name> runs it with a default trigger, /<skill-name> <task> routes <task> as the user message.

Press Esc during any LLM call or tool execution to interrupt the operation and return to the prompt. Conversation history is rolled back so the interrupted turn has no effect.

Parameters

Only --version (-v) and --help (-h) have short flags. All other options use long form only, by convention.

Flag	Description
`--version`, `-v`	Print version information
`--help`, `-h`	Print help
`--update`	Update to the latest version
`--uninstall`	Remove the aictl binary from `~/.cargo/bin/aictl`, `~/.local/bin/aictl`, and `$AICTL_INSTALL_DIR/aictl` (if set) and exit. Leaves `~/.aictl/` untouched
`--config`	Interactive configuration wizard — set provider, model, and API keys step by step
`--provider`	LLM provider (`openai`, `anthropic`, `gemini`, `grok`, `mistral`, `deepseek`, `kimi`, `zai`, `ollama`, `gguf`, or `mlx`). Falls back to `AICTL_PROVIDER` in `~/.aictl/config`
`--model`	Model name (e.g. `gpt-4o`). Falls back to `AICTL_MODEL` in `~/.aictl/config`
`--message`	Message to send (omit for interactive mode)
`--agent`	Load a saved agent by name (works in both single-shot and interactive modes)
`--list-agents`	Print saved agents from `~/.aictl/agents/` and exit
`--skill`	Invoke a saved skill by name for a single turn. In single-shot mode the skill body is injected as a transient system prompt for the `--message` call only; in REPL mode it applies to the first user turn, then the REPL reverts to normal
`--list-skills`	Print saved skills from `~/.aictl/skills/` and exit
`--auto`	Run in autonomous mode (skip tool confirmation prompts)
`--quiet`	Suppress tool calls and reasoning, only print the final answer (requires `--auto`)
`--unrestricted`	Disable all security restrictions (use with caution)
`--incognito`	Start interactive REPL without saving any session (disables `/session`). Falls back to `AICTL_INCOGNITO` in `~/.aictl/config`
`--session`	Load a saved session by uuid or name on startup (interactive mode only)
`--list-sessions`	Print saved sessions from `~/.aictl/sessions/` and exit
`--clear-sessions`	Remove all saved sessions and exit
`--lock-keys`	Migrate plain-text API keys from `~/.aictl/config` into the system keyring and exit
`--unlock-keys`	Migrate API keys from the system keyring back into `~/.aictl/config` and exit
`--clear-keys`	Remove API keys from both `~/.aictl/config` and the system keyring and exit
`--pull-gguf-model`	Download a native GGUF model (spec: `hf:owner/repo/file.gguf`, `owner/repo:file.gguf`, or `https://…/file.gguf`). Saves it under `~/.aictl/models/gguf/` and exits
`--list-gguf-models`	Print all downloaded native GGUF models and exit
`--remove-gguf-model`	Remove a downloaded native GGUF model by name and exit
`--clear-gguf-models`	Remove every downloaded native GGUF model and exit
`--pull-mlx-model`	Download a native MLX model (spec: `mlx:owner/repo` or `owner/repo`). Saves it under `~/.aictl/models/mlx/<name>/` and exits
`--list-mlx-models`	Print all downloaded native MLX models and exit
`--remove-mlx-model`	Remove a downloaded native MLX model by name and exit
`--clear-mlx-models`	Remove every downloaded native MLX model and exit
`--pull-ner-model`	Download a redaction NER model (spec: `owner/repo` or `hf:owner/repo`; default shape: `onnx-community/gliner_small-v2.1`). Saves it under `~/.aictl/models/ner/<name>/` and exits. Inference requires the `redaction-ner` cargo feature; management works on every build
`--list-ner-models`	Print all downloaded NER models and exit
`--remove-ner-model`	Remove a downloaded NER model by name and exit
`--clear-ner-models`	Remove every downloaded NER model and exit

CLI flags take priority over config file values.

Sessions

In interactive mode, each REPL run is a session. A new uuid is generated at startup and the conversation is persisted to ~/.aictl/sessions/<uuid> as JSON after every agent turn and compaction. Session names (optional, unique) are stored in ~/.aictl/sessions/.names. On exit, the session uuid (and name, if set) is printed.

Use /session to show current session info, assign a readable name, browse saved sessions (load or delete with confirmation), or clear all sessions. Pass --session <uuid|name> to resume an existing session on startup. Incognito mode (--incognito or AICTL_INCOGNITO=true) runs the REPL without creating or saving any session file; /session is disabled and displays a notice.
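
Since each session is a file named after its uuid, the most recently used one can be found from the shell. This is a sketch under two assumptions: `latest_session` is a hypothetical helper, and the newest modification time is taken to mean the last session you worked in.

```shell
# Hypothetical helper: print the uuid of the most recently modified session.
latest_session() {
  # ls -t sorts by modification time, newest first; dotfiles such as
  # .names are not listed.
  ls -t "${1:-$HOME/.aictl/sessions}" 2>/dev/null | head -n 1
}

# Example: aictl --session "$(latest_session)"
```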

Agents

Agents are reusable system prompt extensions that specialize the LLM for dedicated tasks or behaviors. Agent prompts are stored as plain text files in ~/.aictl/agents/.

Use /agent to open the agent menu:

Create agent manually — enter a name and type or paste the agent prompt text directly
Create agent with AI — provide a name and brief description; the LLM generates the full agent prompt
View all agents — browse saved agents, view their prompt, load an agent, or delete it
Unload agent — remove the currently loaded agent (only shown when one is loaded)

Agents can also be loaded from the command line with --agent <name>, which works in both single-shot and interactive modes.

Agent names may contain only letters, numbers, underscores, and dashes. When an agent is loaded, its prompt is appended to the system prompt and the agent name appears in magenta brackets before the input prompt (e.g. [my-agent] ❯).

Skills

Skills are markdown playbooks invoked on demand for a single turn — unlike agents, which persist for the whole session. A skill encodes a repeatable procedure ("run the commit workflow", "review the pending diff") that the LLM should follow this one time; after the turn completes, the skill is gone. Skills live under ~/.aictl/skills/<name>/SKILL.md (overridable via AICTL_SKILLS_DIR).

Each SKILL.md starts with YAML frontmatter (name, description) followed by the markdown body:

---
name: commit
description: Commit staged changes with a clear, project-style message.
---

When the user asks you to commit:
1. Run `git status` and `git diff --cached` to see what's staged.
2. ...

Use /skills to open the skill menu:

Create skill manually — enter a name and description, then type or paste the body
Create skill with AI — provide a name and one-line description; the LLM drafts the body
View all skills — browse saved skills with view / invoke / delete actions

Invoke a skill directly by typing /<skill-name> at the REPL prompt. /commit runs the skill with a default trigger so the body alone drives the turn; /commit review the staged diff routes the trailing text as the user message. --skill <name> works the same way in single-shot and REPL modes. --list-skills prints saved skills and exits.

Skill names may contain only letters, numbers, underscores, and dashes and must not collide with a built-in slash command (e.g. help, exit, agent) — such names are rejected at save time. The skill body is merged into the base system prompt for the turn (rather than sent as a separate system message) so every provider, including those that accept only a single top-level system field, sees the skill alongside the tool catalog.
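
A skill can also be created by hand, since it is just a directory with a SKILL.md file. The helper below is a hypothetical sketch (`new_skill` is not an aictl command). It applies the character rule stated above but does not check for collisions with built-in slash commands.

```shell
# Hypothetical helper: write a minimal SKILL.md for the given name.
new_skill() {
  name=$1
  description=$2
  # Names may contain only letters, numbers, underscores, and dashes.
  case $name in
    ''|*[!A-Za-z0-9_-]*)
      echo "invalid skill name: $name" >&2
      return 1 ;;
  esac
  dir="${AICTL_SKILLS_DIR:-$HOME/.aictl/skills}/$name"
  mkdir -p "$dir" || return 1
  cat > "$dir/SKILL.md" <<EOF
---
name: $name
description: $description
---

Describe the procedure the LLM should follow here.
EOF
}

# Example: new_skill commit "Commit staged changes with a clear message."
```

Afterwards, `aictl --list-skills` should show the new entry, and the body can be edited in any text editor.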

Configuration

Configuration is loaded from ~/.aictl/config. This is a single global config file.

Additionally, aictl loads a project prompt file from the current working directory (default: AICTL.md). If present, its contents are appended to the system prompt, allowing per-project instructions for the agent. The filename can be customized via AICTL_PROMPT_FILE in ~/.aictl/config. When the configured/default file is missing, aictl falls back to CLAUDE.md and then AGENTS.md so existing project instructions for other tools are reused automatically; the fallback chain can be disabled with AICTL_PROMPT_FALLBACK=false.
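
The lookup order described above can be sketched as a shell function. `resolve_prompt_file` is a hypothetical name, and for brevity the sketch reads the two settings from environment variables, whereas aictl itself reads them from ~/.aictl/config.

```shell
# Hypothetical helper mirroring the documented lookup order.
resolve_prompt_file() {
  primary="${AICTL_PROMPT_FILE:-AICTL.md}"
  if [ -f "$primary" ]; then
    printf '%s\n' "$primary"
    return 0
  fi
  # The fallback chain is skipped when AICTL_PROMPT_FALLBACK=false.
  if [ "${AICTL_PROMPT_FALLBACK:-true}" != "false" ]; then
    for candidate in CLAUDE.md AGENTS.md; do
      if [ -f "$candidate" ]; then
        printf '%s\n' "$candidate"
        return 0
      fi
    done
  fi
  # Nothing found: no project prompt is appended.
  return 1
}
```

Run it from a project directory to see which file, if any, would supply the per-project instructions.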

The quickest way to get started is the interactive wizard:

aictl --config

It walks you through selecting a provider, model, and entering API keys. You can also edit ~/.aictl/config manually at any time.

Basic configuration

You need to configure an API key for the provider and model you want to use. The AICTL_MEMORY and AICTL_INCOGNITO parameters are optional.

Key	Description
`AICTL_PROVIDER`	Default provider (`openai`, `anthropic`, `gemini`, `grok`, `mistral`, `deepseek`, `kimi`, `zai`, `ollama`, `gguf`, or `mlx`)
`AICTL_MODEL`	Default model name
`AICTL_MEMORY`	Memory mode: `long-term` (all messages, default) or `short-term` (sliding window)
`AICTL_INCOGNITO`	Start interactive REPL without saving sessions. Accepts `true` or `false` (default: `false`)
`AICTL_PROMPT_FILE`	Filename for the project prompt file loaded from the current directory (default: `AICTL.md`)
`AICTL_PROMPT_FALLBACK`	When the primary prompt file is missing, fall back to `CLAUDE.md` then `AGENTS.md`. Accepts `true` or `false` (default: `true`)
`AICTL_TOOLS_ENABLED`	Enable or disable all tool calls. When `false`, the LLM can only respond with plain text (default: `true`)
`AICTL_AUTO_COMPACT_THRESHOLD`	Context usage percentage at which the REPL auto-compacts the conversation. Accepts an integer in `1..=100` (default: `80`)
`AICTL_LLM_TIMEOUT`	Per-call LLM response timeout in seconds. Applied to every provider (remote APIs, Ollama, native GGUF/MLX) and to the compaction and agent-generation calls. `0` disables the timeout. Default: `30`
`AICTL_MAX_ITERATIONS`	Maximum number of LLM calls allowed in a single agent turn before the loop aborts. Accepts a positive integer (default: `20`)
`AICTL_SKILLS_DIR`	Override the location of the skills directory (default: `~/.aictl/skills`)

API keys

FIRECRAWL_API_KEY is optional and needed only if you want to use the search_web tool.

Not all API keys are required. Provide only the key for the provider you selected with AICTL_PROVIDER and AICTL_MODEL.

To use multiple LLM providers, provide a key for each of them.

Key	Description
`LLM_OPENAI_API_KEY`	API key for OpenAI
`LLM_ANTHROPIC_API_KEY`	API key for Anthropic
`LLM_GEMINI_API_KEY`	API key for Google Gemini
`LLM_GROK_API_KEY`	API key for xAI Grok
`LLM_MISTRAL_API_KEY`	API key for Mistral
`LLM_DEEPSEEK_API_KEY`	API key for DeepSeek
`LLM_KIMI_API_KEY`	API key for Kimi (Moonshot AI)
`LLM_ZAI_API_KEY`	API key for Z.ai
`LLM_OLLAMA_HOST`	Ollama server URL (default: `http://localhost:11434`)
`FIRECRAWL_API_KEY`	API key for Firecrawl (`search_web` tool)

Where to get API keys

Each provider issues API keys through its own developer console. Sign up, create a key, then paste it into ~/.aictl/config (or run aictl --config).

Provider	Console URL
OpenAI	platform.openai.com/api-keys
Anthropic	console.anthropic.com/settings/keys
Google Gemini	aistudio.google.com/app/apikey
xAI Grok	console.x.ai
Mistral	console.mistral.ai/api-keys
DeepSeek	platform.deepseek.com/api_keys
Kimi (Moonshot)	platform.moonshot.ai/console/api-keys
Z.ai	z.ai/manage-apikey/apikey-list
Firecrawl	firecrawl.dev/app/api-keys

Ollama, native GGUF, and native MLX run locally and require no API key.

Secure key storage (system keyring)

By default, API keys live as plain text in ~/.aictl/config. aictl can also store them in the OS-native keyring — macOS Keychain or Linux Secret Service (gnome-keyring / KWallet via D-Bus) — and reads them transparently from whichever store has them.

The active backend appears in the welcome banner (keys: Keychain (2 locked · 1 plain · 0 both)) and /security shows the per-key location.

Migration is done from inside the REPL via the /keys interactive menu:

lock keys — copies every plain-text key found in ~/.aictl/config into the system keyring and removes the plain-text copy
unlock keys — copies every keyring entry back into ~/.aictl/config and deletes it from the keyring
clear keys — removes the keys from both stores (asks for confirmation)

The same operations are available as one-shot CLI flags: --lock-keys, --unlock-keys, --clear-keys.

When the keyring backend is unavailable (e.g. headless Linux without a Secret Service daemon), aictl falls back to plain-text storage automatically and the banner shows keys: plain text in yellow.

Security configuration (optional)

Key	Description
`AICTL_SECURITY`	Master security switch (default: `true`)
`AICTL_SECURITY_INJECTION_GUARD`	Block user prompts that look like prompt-injection attempts (default: `true`)
`AICTL_SECURITY_CWD_RESTRICT`	Restrict file tools to working directory (default: `true`)
`AICTL_SECURITY_SHELL_ALLOWED`	Comma-separated whitelist of allowed shell commands (empty = all except blocked)
`AICTL_SECURITY_SHELL_BLOCKED`	Additional blocked shell commands (added to built-in defaults)
`AICTL_SECURITY_BLOCK_SUBSHELL`	Block `$()`, backticks, and process substitution (default: `true`)
`AICTL_SECURITY_BLOCKED_PATHS`	Additional blocked file paths (added to built-in defaults)
`AICTL_SECURITY_ALLOWED_PATHS`	Paths allowed outside the working directory
`AICTL_SECURITY_SHELL_TIMEOUT`	Shell command timeout in seconds (default: `30`)
`AICTL_SECURITY_MAX_WRITE`	Max file write size in bytes (default: `1048576` = 1 MB)
`AICTL_SECURITY_DISABLED_TOOLS`	Comma-separated tool names to disable (e.g. `exec_shell,search_web`)
`AICTL_SECURITY_BLOCKED_ENV`	Additional env vars to scrub from shell subprocesses
`AICTL_SECURITY_AUDIT_LOG`	Append one JSON line per tool invocation to `~/.aictl/audit/<session-id>` (default: `true`)
`AICTL_SECURITY_REDACTION`	Outbound-message redaction mode: `off` (default), `redact`, or `block`. In `redact` mode each credential/PII match is swapped for `[REDACTED:<KIND>]` on the wire; in `block` mode the turn aborts with a scrubbed error.
`AICTL_SECURITY_REDACTION_LOCAL`	Also redact when sending to local providers (Ollama / GGUF / MLX). Default `false` — data never leaves the machine for these, so there's no privacy gain.
`AICTL_REDACTION_DETECTORS`	Comma-separated subset of built-in detectors (empty = all): `api_key, aws, jwt, private_key, connection_string, credit_card, iban, email, phone, high_entropy`.
`AICTL_REDACTION_EXTRA_PATTERNS`	Semicolon-separated `NAME=REGEX` pairs. Each match is replaced with `[REDACTED:NAME]` (e.g. `CUSTOMER_ID=CUST-\d{8};TICKET=JIRA-\d{4,}`).
`AICTL_REDACTION_ALLOW`	Semicolon-separated regexes; any detection whose span is covered by an allowlist hit is dropped. Useful for documentation examples or internal IDs that trip the entropy scanner.
`AICTL_REDACTION_NER`	Enable the optional Layer-C NER pass (person / location / organization). Requires the `redaction-ner` cargo feature and a pulled model. Default `false`.
`AICTL_REDACTION_NER_MODEL`	NER model spec (`owner/repo` or `hf:owner/repo`). Default: `onnx-community/gliner_small-v2.1`.
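
To make the effect of `redact` mode with `AICTL_REDACTION_EXTRA_PATTERNS` concrete, the sketch below applies the two example patterns from the table using `sed`. It is an illustration of the substitution only, not aictl's redaction pipeline: `redact_example` is a hypothetical name, and `[0-9]` stands in for `\d`, which `sed -E` does not support.

```shell
# Illustration only: replace each match of the two example patterns
# with [REDACTED:NAME], as redact mode is described to do.
redact_example() {
  sed -E \
    -e 's/CUST-[0-9]{8}/[REDACTED:CUSTOMER_ID]/g' \
    -e 's/JIRA-[0-9]{4,}/[REDACTED:TICKET]/g'
}

echo "Refund CUST-00421337 per JIRA-10482" | redact_example
# prints: Refund [REDACTED:CUSTOMER_ID] per [REDACTED:TICKET]
```

Testing a custom pattern this way before adding it to the config is a cheap check that it matches what you expect and nothing more.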

Create ~/.aictl/config (see .aictl/config in this repo for a reference example):

AICTL_PROVIDER=anthropic
AICTL_MODEL=claude-sonnet-4-20250514
LLM_ANTHROPIC_API_KEY=sk-ant-...
FIRECRAWL_API_KEY=fc-...

The file format supports comments (#), quoted values, and optional export prefixes.
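
The three syntax features can be combined in one file. The fragment below is illustrative and assumes they behave as in a typical shell-style env file; the values are examples, not recommendations.

```shell
# ~/.aictl/config (illustrative values)

# Comments start with '#'.
AICTL_PROVIDER=anthropic

# Values may be quoted.
AICTL_MODEL="claude-sonnet-4-20250514"

# An optional export prefix is accepted.
export AICTL_MEMORY=short-term
```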

Providers

aictl supports eleven LLM providers — eight remote APIs plus Ollama, native GGUF inference via llama.cpp, and native MLX inference on Apple Silicon:

OpenAI

Requires LLM_OPENAI_API_KEY. Supported models with cost estimates (input/output per 1M tokens):

Model	Input	Output
`gpt-4.1-nano`	$0.10	$0.40
`gpt-4.1-mini`	$0.40	$1.60
`gpt-4.1`	$2.00	$8.00
`gpt-4o-mini`	$0.15	$0.60
`gpt-4o`	$2.50	$10.00
`gpt-5-mini`	$0.25	$2.00
`gpt-5`	$1.25	$10.00
`gpt-5.2`	$1.75	$14.00
`gpt-5.2-pro`	$30.00	$180.00
`gpt-5.4-nano`	$0.20	$1.25
`gpt-5.4-mini`	$0.75	$4.50
`gpt-5.4`	$2.50	$15.00
`gpt-5.4-pro`	$60.00	$270.00
`o4-mini`	$1.10	$4.40
`o3`	$2.00	$8.00
`o1`	$15.00	$60.00

GPT-5.2 and GPT-5.4 use dual-tier pricing that doubles above the 272K context threshold; the table shows the short-context rates. The cost meter in aictl always reports the short-context price.
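
The per-1M-token rates translate into a per-call estimate as (input tokens × input rate + output tokens × output rate) / 1,000,000. The helper below is a sketch of that arithmetic only; `estimate_cost` is a hypothetical name, and it is not aictl's cost meter.

```shell
# Hypothetical helper: estimate the cost of one call in US dollars.
# Arguments: input tokens, output tokens, input price, output price
# (prices per 1M tokens, as listed in the tables).
estimate_cost() {
  LC_ALL=C awk -v in_tok="$1" -v out_tok="$2" -v in_price="$3" -v out_price="$4" \
    'BEGIN { printf "%.4f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# gpt-4o at $2.50 / $10.00: 12,000 input tokens and 800 output tokens
estimate_cost 12000 800 2.50 10.00
# prints: 0.0380
```

The same function works for any row in the provider tables; only the two price arguments change.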

Anthropic

Requires LLM_ANTHROPIC_API_KEY. Supported models with cost estimates (input/output per 1M tokens):

Model	Input	Output
`claude-haiku-*` (3.x)	$0.25	$1.25
`claude-haiku-4-*`	$1.00	$5.00
`claude-sonnet-*`	$3.00	$15.00
`claude-opus-4-5-*` / `claude-opus-4-6-*` / `claude-opus-4-7-*`	$5.00	$25.00
`claude-opus-4-*` (older)	$15.00	$75.00

Google Gemini

Requires LLM_GEMINI_API_KEY. Supported models with cost estimates (input/output per 1M tokens):

Model	Input	Output
`gemini-3.1-pro-preview`	$2.00	$12.00
`gemini-3.1-flash-lite-preview`	$0.25	$1.50
`gemini-2.5-pro`	$1.25	$10.00
`gemini-2.5-flash`	$0.15	$0.60

Gemini 3.1 Pro uses dual-tier pricing that doubles above a 200K context threshold; the table shows the short-context rates. gemini-2.0-flash has been removed from the model list because Google is shutting it down on June 1, 2026.

xAI Grok

Requires LLM_GROK_API_KEY. Supported models with cost estimates (input/output per 1M tokens):

Model	Input	Output
`grok-4`	$3.00	$15.00
`grok-4-fast-reasoning` / `grok-4-fast-non-reasoning`	$0.20	$0.50
`grok-4-1-fast-reasoning` / `grok-4-1-fast-non-reasoning`	$0.20	$0.50
`grok-3`	$3.00	$15.00
`grok-3-mini`	$0.30	$0.50

Grok 4 Fast variants ship with a 2M-token context window, the largest available across frontier models.

Mistral

Requires LLM_MISTRAL_API_KEY. Supported models with cost estimates (input/output per 1M tokens):

Model	Input	Output
`mistral-large-latest`	$2.00	$6.00
`mistral-medium-latest`	$0.40	$2.00
`mistral-small-latest`	$0.10	$0.30
`codestral-latest`	$0.30	$0.90

DeepSeek

Requires LLM_DEEPSEEK_API_KEY. Supported models with cost estimates (input/output per 1M tokens):

Model	Input	Output
`deepseek-chat`	$0.27	$1.10
`deepseek-reasoner`	$0.55	$2.19

Kimi

Requires LLM_KIMI_API_KEY. Supported models with cost estimates (input/output per 1M tokens):

Model	Input	Output
`kimi-k2.5`	$0.60	$2.00
`kimi-k2-0905-preview`	$0.60	$2.00
`kimi-k2-0711-preview`	$0.60	$2.00
`kimi-k2-turbo-preview`	$0.60	$2.00
`kimi-k2-thinking`	$0.60	$2.00
`kimi-k2-thinking-turbo`	$0.60	$2.00
`moonshot-v1-128k`	$0.60	$2.00
`moonshot-v1-32k`	$0.60	$2.00
`moonshot-v1-8k`	$0.60	$2.00

Z.ai

Requires LLM_ZAI_API_KEY. Supported models with cost estimates (input/output per 1M tokens):

Model	Input	Output
`glm-5.1`	$1.40	$4.40
`glm-5-turbo`	$1.20	$4.00
`glm-5`	$0.72	$2.30
`glm-4.7`	$0.39	$1.75
`glm-4.7-flash`	$0.06	$0.40

Ollama

Ollama runs models locally — no API key required. Install Ollama from ollama.com, pull a model, and start the server:

ollama pull llama3.2
ollama serve

Then configure aictl to use it:

AICTL_PROVIDER=ollama
AICTL_MODEL=llama3.2:latest

Available models are detected automatically from your local Ollama instance via the REST API. The /model command shows only models you have pulled locally. If Ollama is not running, it will not appear in the model menu.

By default, aictl connects to http://localhost:11434. To use a different address, set LLM_OLLAMA_HOST in ~/.aictl/config.
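
To check what a given Ollama server would report, you can query it directly. The sketch below assumes model detection corresponds to Ollama's `GET /api/tags` endpoint, which lists locally pulled models; whether aictl calls exactly this endpoint is an assumption. Both function names are hypothetical.

```shell
# Build the model-listing URL from LLM_OLLAMA_HOST, falling back to the
# documented default address.
ollama_tags_url() {
  printf '%s/api/tags\n' "${LLM_OLLAMA_HOST:-http://localhost:11434}"
}

# List locally pulled models as JSON (requires a running Ollama server).
list_ollama_models() {
  curl -sf "$(ollama_tags_url)"
}
```

If `list_ollama_models` fails or returns an empty list, the Ollama entries will likely be missing from the /model menu as well.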

All Ollama models are free (self-hosted), so cost estimation shows $0.00.

Any model string can be passed via --model; cost estimation uses pattern matching on the model name and falls back to zero if unrecognized.

Native GGUF (llama.cpp) — experimental

Experimental. Native GGUF inference is a new, work-in-progress feature. It runs, it works, and it talks to the same tools the API providers do — but expect rough edges: small models struggle with tool-call formatting, chat templates are hard-coded to ChatML (so some models respond in a less natural style than their native template would produce), generation parameters are fixed, and performance tuning (GPU offload, context reuse across turns, speculative decoding) has not been wired up yet. The API-provider path remains the recommended default for day-to-day use. Please report issues at github.com/pwittchen/aictl/issues.

aictl can run GGUF models in-process via llama-cpp-2 — no Ollama server required. By default no local models are available; they must be downloaded explicitly by the user, one at a time, into ~/.aictl/models/gguf/.

Native inference is gated behind the gguf cargo feature. Prebuilt binaries published on GitHub Releases (the ones install.sh downloads) ship with --features gguf enabled, so users who install via the one-liner get native GGUF inference out of the box — no extra steps required.

When building from source, the gguf feature is off by default to keep a plain cargo install aictl / cargo build working without a C/C++ toolchain. Opt in explicitly:

cargo install --path . --features gguf
# or
cargo build --release --features gguf

Building with --features gguf requires cmake and a working C/C++ compiler (Xcode Command Line Tools on macOS, build-essential on Debian/Ubuntu). The install-script fallback path (cargo install --git ..., triggered when no prebuilt binary exists for your platform) does not pass --features gguf and will therefore produce a binary without native inference — in that case, rebuild manually with the command above.

Model management (works in every build, even without --features gguf):

# Pull a GGUF model from Hugging Face
aictl --pull-gguf-model hf:bartowski/Llama-3.2-3B-Instruct-GGUF/Llama-3.2-3B-Instruct-Q4_K_M.gguf

# Shorthand form
aictl --pull-gguf-model bartowski/Llama-3.2-3B-Instruct-GGUF:Llama-3.2-3B-Instruct-Q4_K_M.gguf

# Direct URL
aictl --pull-gguf-model https://example.com/path/model.gguf

# List, remove, clear
aictl --list-gguf-models
aictl --remove-gguf-model Llama-3.2-3B-Instruct-Q4_K_M
aictl --clear-gguf-models
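
The three spec forms point at the same kind of resource. The sketch below shows one way to read them. Both helpers are hypothetical, the Hugging Face `resolve/main` URL layout is an assumption about what aictl fetches, and the local-name rule is inferred from the examples in this README.

```shell
# Hypothetical helper: turn a pull spec into a download URL.
gguf_spec_to_url() {
  spec=$1
  case $spec in
    https://*)
      printf '%s\n' "$spec" ;;
    hf:*/*/*)
      rest=${spec#hf:}          # owner/repo/file.gguf
      file=${rest##*/}          # file.gguf
      repo=${rest%/*}           # owner/repo
      printf 'https://huggingface.co/%s/resolve/main/%s\n' "$repo" "$file" ;;
    */*:*)
      repo=${spec%%:*}          # owner/repo
      file=${spec#*:}           # file.gguf
      printf 'https://huggingface.co/%s/resolve/main/%s\n' "$repo" "$file" ;;
    *)
      echo "unrecognized spec: $spec" >&2
      return 1 ;;
  esac
}

# Hypothetical helper: the local model name appears to be the file name
# without its .gguf extension.
gguf_model_name() {
  file=${1##*[/:]}
  printf '%s\n' "${file%.gguf}"
}
```

For the first example above, `gguf_model_name` yields `Llama-3.2-3B-Instruct-Q4_K_M`, which is the name passed to `--remove-gguf-model`.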

Inside the REPL, /gguf opens an interactive menu with the same operations (view downloaded / pull / remove / clear all). Downloads stream to ~/.aictl/models/gguf/<name>.gguf.part with a progress bar and are atomically renamed on completion, so an interrupted download never leaves a half-written model in place.
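
The write-then-rename pattern behind this guarantee can be sketched in a few lines of shell. `write_atomically` is a hypothetical helper, not aictl's downloader. Deleting the partial file on failure is a choice made for this sketch; aictl may handle leftover `.part` files differently.

```shell
# Sketch of the write-then-rename pattern. The producer command's output
# goes to "<dest>.part"; the final name only appears after success.
write_atomically() {
  dest=$1
  shift
  if "$@" > "$dest.part"; then
    mv "$dest.part" "$dest"    # rename is atomic on the same filesystem
  else
    rm -f "$dest.part"         # this sketch discards the partial file
    return 1
  fi
}

# Example: write_atomically model.gguf curl -fsSL https://example.com/path/model.gguf
```

Because readers only ever look for the final name, they see either the complete file or no file at all.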

Once a model is downloaded it appears in the /model picker under the Native GGUF header, alongside Ollama models. Configure it as the default:

AICTL_PROVIDER=gguf
AICTL_MODEL=Llama-3.2-3B-Instruct-Q4_K_M

Inference runs on a tokio::task::spawn_blocking task, so it doesn't block the async runtime. Cost always shows $0.00. Messages are flattened into a ChatML-style prompt, which works well for modern instruction-tuned models; per-model chat templates may be added later. If you try to use a GGUF model in a build without --features gguf, aictl prints a clear error telling you to rebuild.

Tested GGUF models

The following models have been verified end-to-end (download, load, inference, tool calls) via the /gguf pull menu's predefined catalog:

Model	Pull command
`Qwen3-4B-Q4_K_M`	`aictl --pull-gguf-model lmstudio-community/Qwen3-4B-GGUF:Qwen3-4B-Q4_K_M.gguf`

Native MLX (Apple Silicon) — experimental

Experimental. Native MLX inference is a new feature limited to macOS on Apple Silicon (aarch64). Architecture coverage is currently Llama-family — Llama 3.x, Qwen 2.5, Mistral 7B v0.3, DeepSeek-R1 Distill Qwen — plus Gemma 2. Phi-3.5 and MoE models are rejected with a clear error. Llama 3.1/3.2 RoPE scaling is not yet applied (quality degrades past ~8K context), top-p sampling is omitted (temperature only), and the chat-template renderer falls back to ChatML when the per-model jinja template fails to render. Please report issues at github.com/pwittchen/aictl/issues.

aictl can run MLX models in-process via mlx-rs — no Python, no mlx_lm, no separate server. Quantized 4-bit weights from the mlx-community Hugging Face organization are loaded directly via safetensors. By default no local MLX models are available; they must be downloaded explicitly by the user into ~/.aictl/models/mlx/<name>/.

The macOS Apple Silicon prebuilt binary on GitHub Releases ships with --features mlx enabled and includes a sibling mlx.metallib file placed next to the binary at install time (MLX's first runtime fallback is <exec_dir>/mlx.metallib). Other platform releases contain only the aictl binary — they don't support MLX.

Native inference is gated behind the mlx cargo feature. When building from source, the mlx feature is off by default. Opt in explicitly (Apple Silicon only):

cargo install --path . --features mlx
# or
cargo build --release --features mlx

Building with --features mlx requires the Xcode Metal Toolchain (full Xcode, not just the Command Line Tools). Install via Xcode → Settings → Components, or xcodebuild -downloadComponent MetalToolchain. Verify with xcrun --find metal.

Model management (works in every build, even without --features mlx and even on non-Apple-Silicon hosts):

# Pull an MLX model from Hugging Face (mlx-community)
aictl --pull-mlx-model mlx:mlx-community/Llama-3.2-3B-Instruct-4bit

# Shorthand form
aictl --pull-mlx-model mlx-community/Qwen2.5-7B-Instruct-4bit

# List, remove, clear
aictl --list-mlx-models
aictl --remove-mlx-model mlx-community__Llama-3.2-3B-Instruct-4bit
aictl --clear-mlx-models

Inside the REPL, /mlx opens an interactive menu with the same operations plus a curated catalog of popular mlx-community repos. Downloads stream multi-file safetensors directories with a per-file progress bar.

Once a model is downloaded it appears in the /model picker under the MLX (Apple Silicon) header. Configure it as the default:

AICTL_PROVIDER=mlx
AICTL_MODEL=mlx-community__Llama-3.2-3B-Instruct-4bit

Inference runs on a tokio::task::spawn_blocking task, so it doesn't block the async runtime. Cost always shows $0.00. If you try to use an MLX model in a build without --features mlx, or on a non-Apple-Silicon host, aictl prints a clear error explaining the constraint.

Tested MLX models

The following models have been verified end-to-end (download, load, inference, tool calls) on Apple Silicon:

Model	Pull command
`mlx-community__DeepSeek-R1-Distill-Qwen-7B-4bit`	`aictl --pull-mlx-model mlx-community/DeepSeek-R1-Distill-Qwen-7B-4bit`
`mlx-community__Llama-3.2-3B-Instruct-4bit`	`aictl --pull-mlx-model mlx-community/Llama-3.2-3B-Instruct-4bit`
`mlx-community__gemma-2-9b-it-4bit`	`aictl --pull-mlx-model mlx-community/gemma-2-9b-it-4bit`
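
The pull commands and local model names above suggest a simple mapping from a Hugging Face repo spec to the on-disk name. The sketch below is an illustration inferred from those examples, not aictl's actual code: `local_mlx_name` and `model_dir` are hypothetical helpers, and the rule (drop an optional `mlx:` prefix, replace `/` with `__`) is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: derive the local model name from a pull spec.
/// Assumes the optional `mlx:` prefix is dropped and `/` becomes `__`,
/// as the examples in this section suggest.
fn local_mlx_name(spec: &str) -> String {
    let repo = spec.strip_prefix("mlx:").unwrap_or(spec);
    repo.replace('/', "__")
}

/// Hypothetical helper: where a pulled model would live under the home
/// directory, following the `~/.aictl/models/mlx/<name>/` layout.
fn model_dir(home: &Path, name: &str) -> PathBuf {
    home.join(".aictl").join("models").join("mlx").join(name)
}
```

Under these assumptions, both `mlx:mlx-community/Llama-3.2-3B-Instruct-4bit` and the shorthand `mlx-community/Llama-3.2-3B-Instruct-4bit` map to `mlx-community__Llama-3.2-3B-Instruct-4bit`, the value used for AICTL_MODEL.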

Cost estimates

The per-token tables above tell you what each model charges; they don't tell you what a realistic workday actually costs. For that, see LLM_PRICING.md — it models two usage patterns (chat assistant and coding agent) and reports daily and monthly totals for every model in the catalog.

The headline numbers for intensive use (150 chat turns/day or 50 coding tasks/day, 22 working days/month, cached pricing):

Usage pattern	Cheapest	Flagship cluster	Opus 4.6
Chat	$2.64/mo (grok-4-fast)	~$35–$48/mo	$69.74/mo
Coding agent	$34.76/mo (grok-4-fast)	~$460–$525/mo	$874.50/mo

A few things worth knowing before you budget:

Intensive coding agent use is roughly 60× more expensive than chat use on any given model, because the agent loop re-sends the growing conversation history each iteration and produces long, code-heavy outputs. Tool call count is not the dominant factor.
Prompt caching cuts costs roughly in half, but the "cached" column is only reliable for Anthropic — aictl explicitly writes to Anthropic's prompt cache via cache_control markers. OpenAI, Gemini, Grok, DeepSeek, and Kimi cache automatically server-side, so you'll hit cached rates during sustained sessions but not after idle gaps longer than the provider's TTL (typically 5–10 minutes). Z.ai GLM and Mistral have no cache handling in aictl, so they always bill at the full rate.
The cost meter that aictl prints after every turn reflects actual cached vs. fresh tokens from each provider's response, so it's more accurate than any estimate. If you want to know what your specific workload really costs, run a few typical sessions and watch the per-turn summary.

Agent Loop & Tool Calling

aictl runs an agent loop: the LLM can invoke tools, see their results, and continue reasoning until it produces a final answer.

By default, every tool call requires confirmation (y/N prompt). Use --auto to skip confirmation and run autonomously.

Available tools:

Tool	Description
`exec_shell`	Execute a shell command via `sh -c`
`read_file`	Read the contents of a file
`write_file`	Write content to a file (first line = path, rest = content)
`remove_file`	Remove (delete) a file (regular files only, not directories)
`create_directory`	Create a directory and any missing parent directories
`list_directory`	List files and directories at a path with `[FILE]`/`[DIR]`/`[LINK]` prefixes
`search_files`	Search file contents by pattern (grep regex) with optional directory scope
`edit_file`	Apply a targeted find-and-replace edit to a file (exact unique match required)
`diff_files`	Compare two text files and return a unified diff with 3 lines of context. First line is the "before" path, second line is the "after" path. Works in-process via an LCS DP table — no external `diff` binary, no platform drift. Refuses to diff files longer than 2000 lines each
`search_web`	Search the web via Firecrawl API (requires `FIRECRAWL_API_KEY`)
`find_files`	Find files matching a glob pattern (e.g. `**/*.rs`) with optional base directory
`fetch_url`	Fetch a URL and return readable text content (HTML tags stripped)
`extract_website`	Fetch a URL and extract only the main readable content (strips scripts, styles, nav, boilerplate)
`fetch_datetime`	Get the current date, time, timezone, and day of week
`fetch_geolocation`	Get geolocation data for an IP address (city, country, timezone, coordinates, ISP) via ip-api.com
`read_image`	Read an image from a file path or URL for vision analysis (PNG, JPEG, GIF, WebP, BMP, TIFF, SVG, ICO)
`generate_image`	Generate an image from a text description via DALL-E, Imagen, or Grok (auto-selects provider based on available keys; saves PNG to current directory)
`read_document`	Read a PDF, DOCX, or spreadsheet and extract content as markdown text. Supports `.pdf`, `.docx`, `.xlsx`, `.xls`, `.ods`. PDF text extracted directly; DOCX converted to markdown; spreadsheets converted to markdown tables (one per sheet)
`git`	Run a restricted `git` subcommand (no shell). Allows `status`, `diff`, `log`, `blame`, `commit` with a per-subcommand flag allowlist. Dangerous flags (`-c`, `-C`, `--ext-diff`, `--upload-pack`, `--exec-path`, `--no-verify`, `--amend`, `--git-dir`, `--work-tree`) and all other subcommands are rejected. Env vars that could redirect the subprocess (`GIT_DIR`, `GIT_SSH_COMMAND`, `GIT_CONFIG_*`, editor/askpass) are scrubbed
`run_code`	Execute a short code snippet in a chosen interpreter and return stdout/stderr. First line is the language (`python`, `node`, `ruby`, `perl`, `lua`, `bash`, `sh`); remaining lines are piped to the interpreter on stdin (no temp file). Useful for quick calculations, data transforms, and one-off logic checks. Shares the shell timeout, env scrubber, and CWD pin with `exec_shell`. Not a true sandbox
`lint_file`	Run a language-appropriate linter/formatter on a single file and return its diagnostics. Input is a file path; the linter is auto-selected from the extension (`.rs` → `rustfmt --check`, `.py` → `ruff`/`flake8`/`pyflakes`/`py_compile`, `.js`/`.ts` → `eslint`/`node --check`/`tsc`, `.go` → `gofmt`/`go vet`, `.sh` → `shellcheck`, `.rb` → `rubocop`/`ruby -c`, `.json` → `jq empty`, `.yaml` → `yamllint`, `.toml` → `taplo`, `.md` → `markdownlint`/`prettier`, `.lua` → `luacheck`, `.c`/`.cpp` → `clang-format`/`cppcheck`, `.html`/`.css` → `prettier`). The first candidate installed on `PATH` wins. No auto-fix — the file is never modified. Shares the shell timeout, env scrubber, and CWD pin with `exec_shell`
`json_query`	Query or transform JSON with jq-like expressions. First line is the jq filter (e.g. `.`, `.users[].name`, `.items \| length`, `map(select(.price > 10))`); remaining lines are inline JSON, or `@path/to/file.json` to load from a file in the working directory. Output is the pretty-printed filter result. Non-zero exits are reported as `[exit N]`. Requires `jq` on `PATH`. The filter is passed as a positional argument after `--` (no shell interpolation, no flag reinterpretation); `@path` is validated against the CWD jail before the bytes are piped to `jq` on stdin
`calculate`	Evaluate a math expression safely without any `eval` or shell subprocess. Pass the expression as input (e.g. `2 + 3 * 4`, `sqrt(16) + sin(pi/2)`, `(1 + 2) ^ 10`). Supports int/float/scientific/hex/binary literals; `+ - * / %`, `^` / `**` (power, right-assoc), unary `+`/`-`; constants `pi`, `e`, `tau`; functions `sqrt`, `cbrt`, `abs`, `exp`, `ln`, `log2`, `log10`, `log`, `sin`, `cos`, `tan`, `asin`, `acos`, `atan`, `sinh`, `cosh`, `tanh`, `floor`, `ceil`, `round`, `trunc`, `sign`, `min`, `max`, `pow`, `atan2`. Integer-valued results render without a decimal point; `inf` / `-inf` / `nan` are returned verbatim. Recursion depth is bounded
`csv_query`	Filter and project CSV/TSV with a SQL-like query language. First line is the query: `SELECT (* \| col, col, ...) FROM (csv \| tsv) [WHERE <cond> [AND\|OR <cond> ...]] [ORDER BY <col> [ASC\|DESC]] [LIMIT <N>]`. Remaining lines are inline CSV/TSV (with header row) or `@path/to/file.csv` to load from disk. Conditions support `=`, `!=`, `<>`, `<`, `<=`, `>`, `>=`, `LIKE` / `NOT LIKE` (with `%` wildcard), `IS NULL`, `IS NOT NULL`. Numeric comparison is used when both operands parse as numbers; otherwise string comparison. `AND` binds tighter than `OR`; no parentheses. Output is a Markdown-style pipe table. Fully in-process — no external binary required
`list_processes`	List running processes with structured filtering. Invokes `ps` directly (no shell) and parses the output in-process. Input is `key=value` pairs (empty = top 20 by %CPU): `name=<substring>` (command + args match), `user=<username>`, `pid=<N>`, `min_cpu=<N>`, `min_mem=<N>`, `port=<N>` (processes listening on TCP/UDP via `lsof`), `sort=cpu\|mem\|pid\|name` (default `cpu` desc), `limit=<N>` (default 20). Output is a Markdown table with PID, USER, %CPU, %MEM, RSS, COMMAND
`check_port`	Test whether a TCP port on a given host accepts connections. Pure tokio — no shell, no `nc`/`telnet`. Input is `<host>:<port> [timeout=<ms>]`; host may be DNS name, IPv4, or bracketed IPv6 (`[::1]:8080`); an `http://` / `https://` URL is also accepted with the port inferred (80/443) when omitted. Default timeout 3000ms, max 30000ms. Returns "Reachable — ... accepted TCP in ms" or "Unreachable — ..." with a reason (refused, timed out, DNS failure, unreachable)
`system_info`	Return structured OS, CPU, memory, and disk information as Markdown. Cross-platform for macOS (`sysctl`, `vm_stat`, `sw_vers`, `df`) and Linux (`/proc/cpuinfo`, `/proc/meminfo`, `/etc/os-release`, `df`). Input is optional `key=value` pairs (empty = all sections): `section=os\|cpu\|memory\|disk\|all`, `path=<directory>` (disk section only; defaults to the security working directory). Reports OS pretty name, arch, kernel, hostname; CPU model and logical/physical core counts; memory total/used/available; disk mount, filesystem, total/used/available
`archive`	Create, extract, or list `tar.gz` / `tgz` / `tar` / `zip` archives in-process — no `tar` / `gzip` / `unzip` subprocess needed. Three modes: `create <format> <output>` followed by one input path per line (directories added recursively, symlinks skipped); `extract <archive> <destination-dir>` (format inferred from extension); `list <archive>`. Extraction refuses entries with `..` components, absolute paths, or symlinks (zip-slip / tar-slip guard). All referenced paths are validated against the CWD jail
`checksum`	Compute SHA-256 and/or MD5 cryptographic digests of a file. Input is a bare file path (returns both digests) or `sha256 <path>` / `md5 <path>` to pick one algorithm. The file is streamed through the hashers so arbitrarily large files work without loading them into memory. Output is one `SHA-256: <hex>` and/or `MD5: <hex>` line — consistent across platforms (no `shasum` vs `sha256sum` drift)
`clipboard`	Read from or write to the system clipboard. Input is either `read` (or empty) to fetch the current clipboard contents, or `write` on the first line followed by the content on subsequent lines. Content is piped on stdin so arbitrary bytes round-trip safely. Cross-platform: macOS uses `pbcopy` / `pbpaste`; Linux prefers Wayland (`wl-copy` / `wl-paste`) with X11 (`xclip` / `xsel`) fallback. Write size capped at 1 MB
`notify`	Send a desktop notification. First line is the title (required, max 256 bytes); remaining lines are the body (optional, max 4096 bytes). Cross-platform: macOS uses the bundled `osascript`; Linux uses `notify-send` from libnotify. Useful in `--auto` mode or for long-running tasks to signal completion without the user watching the terminal
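
The `diff_files` row mentions an in-process LCS DP table. As a rough illustration of that technique (not the tool's actual implementation), the sketch below builds a longest-common-subsequence table over lines and walks it to mark each line as unchanged, removed, or added. The function names are hypothetical, and unified-diff hunk headers, the 3-line context window, and the 2000-line limit are left out.

```rust
/// Fill the LCS table: table[i][j] is the LCS length of a[i..] and b[j..].
fn lcs_table(a: &[&str], b: &[&str]) -> Vec<Vec<usize>> {
    let mut table = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            table[i][j] = if a[i] == b[j] {
                table[i + 1][j + 1] + 1
            } else {
                table[i + 1][j].max(table[i][j + 1])
            };
        }
    }
    table
}

/// Walk the table and emit one entry per line, prefixed with
/// ' ' (unchanged), '-' (only in `before`), or '+' (only in `after`).
fn diff_lines(before: &str, after: &str) -> Vec<String> {
    let a: Vec<&str> = before.lines().collect();
    let b: Vec<&str> = after.lines().collect();
    let table = lcs_table(&a, &b);
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            out.push(format!(" {}", a[i]));
            i += 1;
            j += 1;
        } else if table[i + 1][j] >= table[i][j + 1] {
            // Dropping a[i] keeps the longest common tail.
            out.push(format!("-{}", a[i]));
            i += 1;
        } else {
            out.push(format!("+{}", b[j]));
            j += 1;
        }
    }
    // Whatever is left on either side is a pure removal or addition.
    out.extend(a[i..].iter().map(|l| format!("-{}", l)));
    out.extend(b[j..].iter().map(|l| format!("+{}", l)));
    out
}
```

The table costs O(n × m) memory for inputs of n and m lines, which is one plausible reason for a per-file line cap.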

Image capabilities by provider

The read_image (vision/analysis) and generate_image tools depend on provider support:

Provider	Image analysis (`read_image`)	Image generation (`generate_image`)
OpenAI	All models	DALL-E 3
Anthropic	All models	--
Gemini	All models	Imagen 4.0
Grok	All models	Grok 2 Image
Mistral	All models	--
DeepSeek	--	--
Kimi	kimi-k2.5 and moonshot-v1 variants	--
Z.ai	-- (requires GLM vision models not in catalog)	--
Ollama	Model-dependent (e.g. llava, llama3.2-vision)	--

Image generation fallback: generate_image auto-selects a provider based on available API keys. The active provider is tried first (if it supports generation), then falls back through OpenAI, Gemini, and Grok in order. This means you can generate images even when your active chat provider (e.g. Anthropic or Mistral) doesn't offer a generation API — as long as you have at least one of LLM_OPENAI_API_KEY, LLM_GEMINI_API_KEY, or LLM_GROK_API_KEY configured.
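
The fallback order described above can be sketched as a small selection function. This is an illustration under assumptions, not aictl's code: the `Provider` enum, `generates_images`, and `pick_image_provider` are hypothetical names, and `has_key` stands in for whatever check aictl performs on the configured API keys.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Provider {
    OpenAi,
    Anthropic,
    Gemini,
    Grok,
    Mistral,
}

impl Provider {
    /// Per the table above, only these three offer image generation.
    fn generates_images(self) -> bool {
        matches!(self, Provider::OpenAi | Provider::Gemini | Provider::Grok)
    }
}

/// Try the active provider first, then OpenAI, Gemini, and Grok in order.
/// `has_key` reports whether an API key is configured for a provider.
fn pick_image_provider(active: Provider, has_key: impl Fn(Provider) -> bool) -> Option<Provider> {
    let order = [active, Provider::OpenAi, Provider::Gemini, Provider::Grok];
    order
        .iter()
        .copied()
        .find(|p| p.generates_images() && has_key(*p))
}
```

With Anthropic active and only a Gemini key configured, this sketch selects Gemini; with no generation-capable key at all it returns `None`.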

The tool-calling mechanism uses a custom XML format in the LLM response text (not provider-native tool APIs):

<tool name="exec_shell">
ls -la /tmp
</tool>
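
To make the format concrete, here is a minimal sketch of how a tool call in this shape could be pulled out of response text. It is an assumption-laden illustration rather than the parser in aictl's tools module: `parse_tool_call` is a hypothetical name, only the first call is extracted, and attributes other than `name` are not handled.

```rust
/// Extract the first `<tool name="...">...</tool>` call from a response.
/// Returns (tool name, trimmed body), or None when no complete call exists.
fn parse_tool_call(response: &str) -> Option<(&str, &str)> {
    let open = "<tool name=\"";
    let start = response.find(open)? + open.len();
    let rest = &response[start..];
    // The name runs up to the closing quote and angle bracket.
    let name_end = rest.find("\">")?;
    let name = &rest[..name_end];
    let body_start = name_end + 2;
    let body_end = rest[body_start..].find("</tool>")? + body_start;
    Some((name, rest[body_start..body_end].trim()))
}
```

For the example above, the sketch yields the name `exec_shell` and the body `ls -la /tmp`; text without a complete tool block yields `None`, which an agent loop could treat as the final answer.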

The agent loop runs for up to 20 iterations. LLM reasoning is printed to stderr; the final answer goes to stdout. Token usage, estimated cost, and execution time are always displayed after each response.

Security

All tool calls pass through a configurable security policy (src/security.rs) before execution. By default:

Shell command blocking: dangerous commands are blocked (rm, sudo, dd, mkfs, nc, etc.). Command substitution ($(...), backticks) is blocked. Compound commands (|, &&, ||, ;) are split and each segment is validated independently.
CWD jail: file tools (read_file, write_file, remove_file, edit_file, create_directory, list_directory, search_files, find_files) can only operate within the working directory. Path traversal via .. is defeated by canonicalization.
Blocked paths: sensitive paths are always blocked (~/.ssh, ~/.gnupg, ~/.aictl, ~/.aws, ~/.config/gcloud, /etc/shadow, /etc/sudoers).
Environment scrubbing: shell subprocesses receive a clean environment — vars matching *_KEY, *_SECRET, *_TOKEN, *_PASSWORD are stripped so API keys cannot leak.
Shell timeout: commands are killed after 30 seconds (configurable).
Write size limit: file writes are capped at 1 MB (configurable).
Output sanitization: tool results are sanitized to prevent prompt injection via <tool> tags.
Injection guard: user prompts are scanned before being sent to the LLM. Inputs containing instruction-override phrases ("ignore previous instructions", "disable security", etc.) or forged role/tool tags (<tool …>, <|system|>, ### System:, etc.) are blocked with a clear error. Disable with AICTL_SECURITY_INJECTION_GUARD=false.
Audit log: every tool invocation appends one JSON line to ~/.aictl/audit/<session-id> (JSONL) with timestamp, tool name, truncated input, and an outcome tag (executed + result_summary, denied_by_policy + reason, denied_by_user, disabled, duplicate) — separate from session history so a reviewer can reconstruct exactly what the model ran. The filename mirrors the session file under ~/.aictl/sessions/. Skipped in incognito mode and single-shot runs. Disable with AICTL_SECURITY_AUDIT_LOG=false.
Sensitive-data redaction (opt-in): every outbound message body can be screened for credentials and PII before it reaches a remote provider. Enable with AICTL_SECURITY_REDACTION=redact to swap matches for [REDACTED:<KIND>] on the wire, or =block to abort the turn on any hit. Layer A: regex detectors for API keys (OpenAI / Anthropic / Google / GitHub / Stripe / Slack / HuggingFace / Groq), AWS access keys, JWTs (with base64-header sanity check), PEM private keys, DB/AMQP connection strings, emails, context-gated phones, credit cards (Luhn), IBANs (mod-97). Layer B: Shannon-entropy scanner for opaque tokens. Layer C (optional redaction-ner cargo feature + pulled GLiNER model): person / location / organization detection. User-supplied AICTL_REDACTION_EXTRA_PATTERNS and AICTL_REDACTION_ALLOW tune the detectors. Local providers (Ollama / GGUF / MLX) bypass by default. Every redaction event lands in the audit log; the persisted session file always keeps the user's original text.
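
The redaction bullet mentions a Luhn check for credit card candidates. The sketch below shows the standard Luhn checksum as one way such a gate could work after a regex match; `luhn_valid` is a hypothetical name, and the handling of spaces and dashes is an assumption rather than documented aictl behavior.

```rust
/// Luhn checksum: true when the digits form a valid card-style number.
/// Spaces and dashes are ignored; any other non-digit fails the check.
fn luhn_valid(candidate: &str) -> bool {
    let mut sum = 0u32;
    let mut count = 0usize;
    for c in candidate.chars().rev() {
        if c == ' ' || c == '-' {
            continue;
        }
        let digit = match c.to_digit(10) {
            Some(d) => d,
            None => return false,
        };
        // Double every second digit from the right; a two-digit result
        // is folded back to one digit by subtracting 9.
        let value = if count % 2 == 1 {
            let doubled = digit * 2;
            if doubled > 9 { doubled - 9 } else { doubled }
        } else {
            digit
        };
        sum += value;
        count += 1;
    }
    count >= 2 && sum % 10 == 0
}
```

A checksum gate like this cuts false positives: a random 16-digit order number passes the regex but usually fails Luhn, so it would not be redacted.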

Security denials are returned to the LLM as tool results (displayed in red) so it can adapt. Use --unrestricted to disable all security checks. Individual settings are configurable via AICTL_SECURITY_* keys in ~/.aictl/config. The audit log and redaction layer are observability and privacy controls, not tool-call enforcement, so --unrestricted leaves them running unless the config key turns them off.
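
As an illustration of the shell command blocking rule listed above, the sketch below rejects command substitution, splits a compound command on `|`, `&&`, `||`, and `;`, and checks the first word of each segment against a blocklist. It is a deliberately naive approximation, not the logic in src/security.rs: `validate_shell` and `BLOCKED` are hypothetical names, the blocklist holds only the five commands named in this section, and separators inside quotes are not handled.

```rust
/// Hypothetical blocklist with the commands named in this section.
const BLOCKED: [&str; 5] = ["rm", "sudo", "dd", "mkfs", "nc"];

/// Validate a shell command line: reject command substitution, then split
/// on compound operators and check the first word of every segment.
fn validate_shell(command: &str) -> Result<(), String> {
    if command.contains("$(") || command.contains('`') {
        return Err("command substitution is blocked".to_string());
    }
    // Normalise all four operators to ';' so a single split covers them.
    let flat = command
        .replace("&&", ";")
        .replace("||", ";")
        .replace('|', ";");
    for segment in flat.split(';') {
        if let Some(program) = segment.split_whitespace().next() {
            if BLOCKED.contains(&program) {
                return Err(format!("blocked command: {}", program));
            }
        }
    }
    Ok(())
}
```

The `Err` string plays the role of the denial reason that is handed back to the LLM as a tool result, so the model can pick a different approach.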

Examples

# With defaults configured in ~/.aictl/config, just run:
aictl

# Or send a single message:
aictl --message "What is Rust?"

# Override provider/model from the command line:
aictl --provider openai --model gpt-4o --message "What is Rust?"

# Agent with tool calls (interactive confirmation)
aictl --message "List files in the current directory"

# Autonomous mode (no confirmation prompts)
aictl --auto --message "What OS am I running?"

# Quiet mode (only final answer, no tool calls or reasoning)
aictl --auto --quiet --message "What OS am I running?"

Tests

cargo test

Unit tests cover core logic across six modules: commands (slash command parsing), config (config file parsing), tools (tool-call XML parsing), ui (formatting helpers), llm (cost estimation and model matching), and security (shell validation, path validation, output sanitization). The session module handles persistence of REPL conversations under ~/.aictl/sessions/.

Roadmap

See ROADMAP.md for planned features and future direction, including new tools, UX improvements, desktop app plans, and coding agent capabilities.

Architecture

See ARCH.md for detailed ASCII diagrams covering:

Module structure
Startup flow
Agent loop
Tool execution dispatch
LLM provider abstraction
UI layer
End-to-end data flow

Claude Code Skills

This project includes Claude Code skills for common workflows. Run them as slash commands in a Claude Code session:

Skill	Description
`/commit`	Commit staged and unstaged changes with a clear commit message
`/update-docs`	Update README.md, CLAUDE.md, and ARCH.md to match the current project state
`/evaluate-rust-quality`	Audit code quality, idiomatic Rust usage, and best practices
`/evaluate-rust-security`	Audit security posture, injection risks, and credential handling
`/evaluate-rust-performance`	Audit performance patterns, allocations, and CLI responsiveness
`/project-stats-report`	Generate a project statistics report (LOC, commit activity, contributors, etc.)

Evaluation reports are saved to .claude/reports/ with timestamped filenames.

License

This project is licensed under the PolyForm Noncommercial License 1.0.0. It is free to use for non-commercial purposes, including personal use, research, education, and use by non-profit organizations. For commercial use, please contact piotr@wittchen.io.
