Download here
OpenGuider is an Electron desktop AI assistant designed to help you complete real UI tasks on your machine.
It combines chat, planning, screenshot context, pointer hints, and optional voice features in one desktop workflow.
- Turkish guide: OpenGuider - Configuration (TR)
- English guide: OpenGuider - Configuration (EN)
- Converts your goal into a step-by-step execution plan.
- Uses screenshot context to reason about what is currently on screen.
- Gives coordinate-based pointer guidance for "click here" style help.
- Keeps session history so long tasks remain coherent across messages.
- Supports multiple model providers so you can switch based on speed/cost/quality.
- Adds optional speech-to-text and text-to-speech for hands-free usage.
OpenGuider supports:
- Claude
- OpenAI
- Gemini
- Groq
- OpenRouter
- Ollama (local)
Why this matters:
- You can optimize for latency, pricing, or reasoning quality per task.
- You can fail over to another provider if one API is unavailable.
- You can use local models (Ollama) for privacy-sensitive workflows.
Instead of only returning plain text responses, OpenGuider can:
- build a structured plan,
- track current step,
- replan when state changes,
- and continue until completion.
This makes the app useful for real multi-step operations, not just simple Q/A.
OpenGuider can reason with screenshot context to produce actionable guidance:
- identify likely UI regions,
- map instructions to on-screen targets,
- emit pointer hints with coordinates.
This is the core of "guide me while I use my apps" behavior.
Speech-to-text options:
- AssemblyAI
- Whisper-compatible endpoints
Text-to-speech options:
- Google TTS
- OpenAI TTS
- ElevenLabs
You can run chat-only, voice-only, or hybrid flows depending on your setup.
- Landing page: https://mo-tunn.github.io/OpenGuider/
- Latest release: https://github.com/mo-tunn/OpenGuider/releases/latest
- Windows installer: OpenGuider-windows-setup-latest.exe
- macOS installer (DMG): OpenGuider-macos-installer-latest.dmg
- Linux installer: OpenGuider-linux-latest.zip
- Open the latest release page: https://github.com/mo-tunn/OpenGuider/releases/latest
- Download your platform artifact:
- Windows:
OpenGuider-windows-setup-latest.exe - macOS:
OpenGuider-macos-installer-latest.dmg - Linux:
OpenGuider-linux-latest.zip
- Windows:
- Extract and run the app.
- Install dependencies:
npm install - Start the app:
npm run start
Open Settings in the app and configure in this order.
Pick one provider first (you can add others later):
- Claude / OpenAI / Gemini / Groq / OpenRouter / Ollama
Then set:
- provider API key (if required),
- default model,
- and any provider-specific endpoint fields.
Tip:
- Start with a single stable provider before enabling all options.
Choose a model based on your use case:
- Fast and cheap for short daily guidance.
- Stronger reasoning model for complex multi-step planning.
- Recommended default for daily usage:
google/gemini-3.1-flash-image-preview(via OpenRouter).
If model output quality is inconsistent, switch to a more capable model.
If you want microphone-driven workflows:
- Select your STT provider.
- Set language options.
- Verify system microphone permissions.
- Run a short recognition test.
Practical low-cost default:
- STT provider: Groq
- STT model:
whisper-large-v3-turbo
For spoken responses:
- Select TTS provider.
- Pick voice.
- Test output volume and speaking speed.
Suggested ElevenLabs voice IDs:
pNInz6obpgDQGcFmaJgB(male)EXAVITQu4vr4xnSDxMaL(female)
Send a simple prompt first, for example:
- "Open settings and guide me to configure notifications step by step."
Then try a planning-style prompt:
- "Help me complete this task in 5 steps and wait for confirmation after each step."
After your main provider works, add backups for reliability:
- Primary provider for default usage.
- Secondary provider for fallback.
- Local Ollama profile for offline/private runs.
For best results, write goals in this format:
- Context: what app/page you are in.
- Objective: what you want to complete.
- Constraints: things to avoid or mandatory requirements.
Good example:
- "I am in Figma settings. Help me enable autosave and version history safely. Give one step at a time and wait."
- If the AI response is generic:
- check selected model/provider,
- include clearer UI context,
- provide a fresh screenshot context by retrying the step.
- If voice does not work:
- verify OS microphone permission,
- verify API key for STT/TTS provider,
- test with a shorter input phrase.
- If pointer hints are off:
- capture a fresh screenshot and retry,
- avoid heavily zoomed/scaled UI when possible.
OpenGuider is designed as a local-first desktop app. This section explains what data is stored, what may be sent to external providers, and how to operate safely.
- App settings (provider choices, model selection, preferences) are stored in the Electron
userDatadirectory. - Session/task history is stored locally to keep multi-step context coherent.
- Logs are written locally for debugging and runtime diagnostics.
- API keys are stored with secure storage (
keytar) when available; otherwise encrypted fallback storage is used.
Depending on your configuration, OpenGuider may send:
- prompts and conversation context to your selected LLM provider,
- voice audio/text to selected STT/TTS providers,
- screenshot-derived context when screen-aware guidance is used.
Important:
- data is sent only to providers you explicitly configure,
- there is no hidden relay server by default between your app and providers.
- Screenshots are used to improve on-screen guidance and step suggestions.
- For privacy-sensitive tasks, avoid including confidential content on screen before capture.
- If required by policy, disable screen-aware workflows and use text-only guidance.
- Use a dedicated provider API key for OpenGuider (do not reuse high-privilege keys).
- Rotate API keys periodically.
- Never commit
.envor key files to Git. - Prefer local model usage (Ollama) for highly sensitive workflows.
- Review logs before sharing them publicly in issues.
- OpenGuider is open-source, so security behavior is auditable.
- Compliance posture depends on your selected providers and their data policies.
- If your team has strict requirements, define an approved provider/model list and disable non-approved endpoints.
If you want to support OpenGuider by contributing code, docs, tests, or design updates, this section is for you.
main: stable branch used for production-ready updates.feature/<short-name>: new features.fix/<short-name>: bug fixes.docs/<short-name>: README/docs-only changes.chore/<short-name>: maintenance and tooling updates.
Examples:
feature/voice-hotkeysfix/linux-build-artifactdocs/readme-configuration-guide
- Fork the repository (or create a branch if you are a direct collaborator).
- Create a new branch from
main. - Keep commits focused and descriptive.
- Run tests locally:
npm run test. - Push your branch and open a Pull Request.
- Explain what changed and why.
- Include test notes (what you ran and results).
- Add screenshots/GIF for UI changes.
- Keep scope small and review-friendly.
- Rebase/merge latest
mainif needed before final review.
- Improve docs and onboarding examples.
- Report reproducible bugs with logs/steps.
- Propose UX improvements for panel/widget flows.
- Help test releases on Windows/macOS/Linux.
- Run with inspector:
npm run dev - Run tests:
npm run test
- Build all platform targets on your current OS:
npm run dist - Build only Windows NSIS installer (
.exe):npm run dist:win - Build only macOS installer (
.dmg):npm run dist:mac - Build only Linux packages (
.AppImage+.deb):npm run dist:linux - Output artifacts are written to
release/
flowchart LR
User[User]
UI[Renderer UI\nPanel + Widget + Settings]
Preload[preload.js\nSecure IPC Bridge]
Main[main.js\nElectron Main Process]
Agent[src/agent/*\nPlanner + Orchestrator]
AI[src/ai/*\nProvider Clients]
Session[src/session/*\nSession State + Persistence]
Screen[src/screenshot.js\nScreen Capture]
Voice[src/tts/* + STT adapters]
User --> UI
UI --> Preload
Preload --> Main
Main --> Agent
Agent --> AI
Agent <--> Session
Main --> Screen
Main --> Voice
Agent --> UI
main.js: app lifecycle, tray/shortcuts, IPC routing, orchestration entrypoint.preload.js: secure boundary between renderer and main process APIs.renderer/*: user-facing UI surfaces (panel, widget, settings, cursor overlay).src/agent/*: planning, evaluation, replanning, and task progression logic.src/ai/*: model-provider abstractions and structured response handling.src/session/*: session model, history continuity, state persistence.
- API keys are persisted via OS-protected secure storage (
keytar) when available. - If keychain is unavailable, encrypted fallback storage is used through Electron safe storage.
- Renderer runs with
contextIsolation: trueandnodeIntegration: false. - Application data is stored in Electron
userDatapath under a stable app identity (OpenGuider) so updates keep local settings/history.
- Push a semantic version tag (example:
v0.2.0). - GitHub Actions runs
.github/workflows/release-build.yml. - Installers are attached to the release:
OpenGuider-windows-setup-latest.exeOpenGuider-macos-installer-latest.dmgOpenGuider-linux-latest.zip
This project is licensed under the Apache License 2.0.
See LICENSE for full terms.
Copyright (C) Metehan Kızılcık
If you create a derivative project, keep these Apache 2.0 basics:
- Include the full Apache 2.0 license text in a
LICENSEfile. - Keep copyright notices (including
Metehan Kızılcık). - Mark significant modifications clearly in changed files.
OpenGuider was originally inspired by Clicky.

