OpenGuider

Download here

OpenGuider is an Electron desktop AI assistant designed to help you complete real UI tasks on your machine.

It combines chat, planning, screenshot context, pointer hints, and optional voice features in one desktop workflow.

Quick Configuration Guides (PDF)

Turkish guide: OpenGuider - Configuration (TR)
English guide: OpenGuider - Configuration (EN)

What OpenGuider Does

Converts your goal into a step-by-step execution plan.
Uses screenshot context to reason about what is currently on screen.
Gives coordinate-based pointer guidance for "click here" style help.
Keeps session history so long tasks remain coherent across messages.
Supports multiple model providers so you can switch based on speed/cost/quality.
Adds optional speech-to-text and text-to-speech for hands-free usage.

Feature Breakdown

1) Multi-Provider AI Layer

OpenGuider supports:

Claude
OpenAI
Gemini
Groq
OpenRouter
Ollama (local)

Why this matters:

You can optimize for latency, pricing, or reasoning quality per task.
You can fail over to another provider if one API is unavailable.
You can use local models (Ollama) for privacy-sensitive workflows.
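
The failover idea above can be sketched as a loop over configured providers. This is a minimal illustration, not OpenGuider's actual client code: the `{ name, complete(prompt) }` provider shape and the function name `completeWithFailover` are assumptions made for the example.

```javascript
// Hypothetical failover helper: tries each provider client in order.
// The provider shape { name, complete(prompt) } is an assumption for
// illustration, not OpenGuider's real client interface.
async function completeWithFailover(providers, prompt) {
  const errors = [];
  for (const provider of providers) {
    try {
      const text = await provider.complete(prompt);
      return { provider: provider.name, text };
    } catch (err) {
      // Record the failure and move on to the next configured provider.
      errors.push(`${provider.name}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed: ${errors.join('; ')}`);
}

// Stub clients standing in for real API calls.
const stubProviders = [
  { name: 'primary', complete: async () => { throw new Error('503 unavailable'); } },
  { name: 'local-ollama', complete: async (p) => `echo: ${p}` },
];

completeWithFailover(stubProviders, 'hello').then((result) => {
  console.log(result.provider, result.text);
});
```

Ordering the array by preference (primary first, local model last) keeps the policy in configuration rather than in code.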

2) Planning and Task Orchestration

Instead of only returning plain text responses, OpenGuider can:

build a structured plan,
track current step,
replan when state changes,
and continue until completion.

This makes the app useful for real multi-step operations, not just simple Q&A.
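
To make the plan / track / replan cycle concrete, here is a small sketch of a plan object and the operations on it. All names (`createPlan`, `completeCurrentStep`, `replan`, `isComplete`) and the data shape are hypothetical; the real logic lives in src/agent/* and may differ.

```javascript
// Hypothetical plan tracker; names and data shape are illustrative only.
function createPlan(goal, stepTitles) {
  return {
    goal,
    steps: stepTitles.map((title) => ({ title, status: 'pending' })),
    current: 0, // index of the step the user is working on
  };
}

function isComplete(plan) {
  return plan.current >= plan.steps.length;
}

// Mark the current step done and advance; returns a new plan object.
function completeCurrentStep(plan) {
  if (isComplete(plan)) return plan;
  const steps = plan.steps.map((s, i) =>
    i === plan.current ? { ...s, status: 'done' } : s
  );
  return { ...plan, steps, current: plan.current + 1 };
}

// Keep finished steps, replace everything else (e.g. after the screen changed).
function replan(plan, newRemainingTitles) {
  const done = plan.steps.filter((s) => s.status === 'done');
  const pending = newRemainingTitles.map((title) => ({ title, status: 'pending' }));
  return { ...plan, steps: [...done, ...pending], current: done.length };
}

let plan = createPlan('Enable notifications', [
  'Open Settings',
  'Open Notifications',
  'Turn notifications on',
]);
plan = completeCurrentStep(plan);
plan = replan(plan, ['Search for "Notifications"', 'Turn notifications on']);
console.log(plan.steps.map((s) => `${s.status}: ${s.title}`));
```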

3) Screen-Aware Guidance

OpenGuider can reason with screenshot context to produce actionable guidance:

identify likely UI regions,
map instructions to on-screen targets,
emit pointer hints with coordinates.

This is the core of "guide me while I use my apps" behavior.
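
A pointer hint is only useful if its coordinates land on the right pixel. The sketch below shows one way to convert a point in screenshot pixel space into display coordinates, assuming the screenshot covers exactly one display. The function name and argument shapes are assumptions for illustration; in Electron, display bounds of this kind can be read from the `screen` module.

```javascript
// Convert a point given in screenshot pixels to display coordinates.
// Assumes the screenshot covers exactly one display with known bounds.
function screenshotToDisplayPoint(point, screenshotSize, displayBounds) {
  const scaleX = displayBounds.width / screenshotSize.width;
  const scaleY = displayBounds.height / screenshotSize.height;
  return {
    x: Math.round(displayBounds.x + point.x * scaleX),
    y: Math.round(displayBounds.y + point.y * scaleY),
  };
}

// A 3840x2160 capture of a display that reports 1920x1080 (scale factor 2).
const hint = screenshotToDisplayPoint(
  { x: 1000, y: 500 },
  { width: 3840, height: 2160 },
  { x: 0, y: 0, width: 1920, height: 1080 }
);
console.log(hint); // { x: 500, y: 250 }
```

A mismatch between capture size and display size is one plausible reason hints drift on scaled or zoomed UIs.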

4) Voice Input and Output

Speech-to-text options:

AssemblyAI
Whisper-compatible endpoints

Text-to-speech options:

Google TTS
OpenAI TTS
ElevenLabs

You can run chat-only, voice-only, or hybrid flows depending on your setup.

Live Preview

Downloads

Landing page: https://mo-tunn.github.io/OpenGuider/
Latest release: https://github.com/mo-tunn/OpenGuider/releases/latest
Windows installer: OpenGuider-windows-setup-latest.exe
macOS installer (DMG): OpenGuider-macos-installer-latest.dmg
Linux installer: OpenGuider-linux-latest.zip

Installation

Option A: Download Prebuilt App (Recommended)

Open the latest release page: https://github.com/mo-tunn/OpenGuider/releases/latest
Download your platform artifact:
- Windows: OpenGuider-windows-setup-latest.exe
- macOS: OpenGuider-macos-installer-latest.dmg
- Linux: OpenGuider-linux-latest.zip
Run the installer (Windows, macOS), or extract the archive and run the app (Linux).

Option B: Run From Source

Install dependencies: npm install
Start the app: npm run start

Configuration Guide (Detailed)

Open Settings in the app and configure in this order.

Step 1: Choose Your Main LLM Provider

Pick one provider first (you can add others later):

Claude / OpenAI / Gemini / Groq / OpenRouter / Ollama

Then set:

provider API key (if required),
default model,
and any provider-specific endpoint fields.

Tip:

Start with a single stable provider before enabling all options.
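
The fields above can be pictured as a small settings object with a validation pass. This is a sketch only: the field names (`provider`, `model`, `apiKey`) and the rule that Ollama needs no API key are assumptions for illustration, not the app's actual settings schema.

```javascript
// Provider list taken from this README; identifiers are illustrative.
const PROVIDERS = ['claude', 'openai', 'gemini', 'groq', 'openrouter', 'ollama'];

// Assumption: a local Ollama instance does not need an API key.
const KEYLESS_PROVIDERS = new Set(['ollama']);

// Returns a list of human-readable problems; an empty list means "looks OK".
function validateProviderSettings(settings) {
  const problems = [];
  if (!PROVIDERS.includes(settings.provider)) {
    problems.push(`unknown provider: ${settings.provider}`);
  }
  if (!settings.model) {
    problems.push('default model is not set');
  }
  if (!KEYLESS_PROVIDERS.has(settings.provider) && !settings.apiKey) {
    problems.push('API key is required for this provider');
  }
  return problems;
}

console.log(validateProviderSettings({ provider: 'openrouter', model: '' }));
```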

Step 2: Select the Default Model

Choose a model based on your use case:

Fast and cheap for short daily guidance.
Stronger reasoning model for complex multi-step planning.
Recommended default for daily usage: google/gemini-3.1-flash-image-preview (via OpenRouter).

If model output quality is inconsistent, switch to a more capable model.

Step 3: Configure Voice (Optional)

If you want microphone-driven workflows:

Select your STT provider.
Set language options.
Verify system microphone permissions.
Run a short recognition test.

Practical low-cost default:

STT provider: Groq (used as a Whisper-compatible endpoint)
STT model: whisper-large-v3-turbo
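
For readers curious what a Whisper-compatible request looks like, the sketch below builds (but does not send) a transcription request for the model named above. The endpoint URL and form fields follow Groq's OpenAI-compatible audio API as publicly documented at the time of writing; treat them as assumptions and verify against the provider's current documentation. The function name is hypothetical, and the global `FormData`, `Blob`, and `fetch` require Node 18 or newer.

```javascript
// Builds a multipart transcription request without sending it.
// URL and field names are assumptions based on Groq's OpenAI-compatible API.
function buildTranscriptionRequest(apiKey, audioBytes, filename) {
  const form = new FormData();
  form.append('file', new Blob([audioBytes]), filename);
  form.append('model', 'whisper-large-v3-turbo');
  return {
    url: 'https://api.groq.com/openai/v1/audio/transcriptions',
    options: {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}

// Sending it needs a valid key and network access, so it is shown as a comment:
// const { url, options } = buildTranscriptionRequest(key, bytes, 'clip.wav');
// const response = await fetch(url, options);
// const result = await response.json();
```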

For spoken responses:

Select TTS provider.
Pick voice.
Test output volume and speaking speed.

Suggested ElevenLabs voice IDs:

pNInz6obpgDQGcFmaJgB (male)
EXAVITQu4vr4xnSDxMaL (female)

Step 4: Validate the Setup

Send a simple prompt first, for example:

"Open settings and guide me to configure notifications step by step."

Then try a planning-style prompt:

"Help me complete this task in 5 steps and wait for confirmation after each step."

Step 5: Add Secondary Providers (Optional)

After your main provider works, add backups for reliability:

Primary provider for default usage.
Secondary provider for fallback.
Local Ollama profile for offline/private runs.

How To Use OpenGuider Effectively

For best results, write goals in this format:

Context: what app/page you are in.
Objective: what you want to complete.
Constraints: things to avoid or mandatory requirements.
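
If you reuse the same structure often, the three parts can be assembled by a tiny helper. This is purely an illustration; `formatGoal` is a hypothetical name and nothing in OpenGuider requires prompts to be built this way.

```javascript
// Assemble a goal prompt from the three parts described above.
function formatGoal({ context, objective, constraints = [] }) {
  const lines = [`Context: ${context}`, `Objective: ${objective}`];
  if (constraints.length > 0) {
    lines.push(`Constraints: ${constraints.join('; ')}`);
  }
  return lines.join('\n');
}

console.log(
  formatGoal({
    context: 'I am on the system settings page.',
    objective: 'Turn on notifications for my mail app.',
    constraints: ['one step at a time', 'wait for my confirmation'],
  })
);
```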

Good example:

"I am in Figma settings. Help me enable autosave and version history safely. Give one step at a time and wait."

Troubleshooting

If the AI response is generic:
- check selected model/provider,
- include clearer UI context,
- retry the step so that a fresh screenshot is captured.
If voice does not work:
- verify OS microphone permission,
- verify API key for STT/TTS provider,
- test with a shorter input phrase.
If pointer hints are off:
- capture a fresh screenshot and retry,
- avoid heavily zoomed/scaled UI when possible.

Security and Data Handling

OpenGuider is designed as a local-first desktop app. This section explains what data is stored, what may be sent to external providers, and how to operate safely.

Data Stored Locally

App settings (provider choices, model selection, preferences) are stored in the Electron userData directory.
Session/task history is stored locally to keep multi-step context coherent.
Logs are written locally for debugging and runtime diagnostics.
API keys are stored with secure storage (keytar) when available; otherwise encrypted fallback storage is used.
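
The "keytar first, encrypted fallback second" behavior can be sketched as follows. This is not the project's implementation: `createKeyStore` and the `fallbackFile` object (anything with `read(name)` and `write(name, buffer)`) are hypothetical. The `keytar` and `safeStorage` dependencies are injected so the logic can run with stubs; in the app they would come from `require('keytar')` and `require('electron').safeStorage`.

```javascript
// Hypothetical key store with a keychain-first, encrypted-fallback policy.
function createKeyStore({ keytar, safeStorage, fallbackFile, service = 'OpenGuider' }) {
  return {
    // Returns which backend stored the secret: 'keytar' or 'safeStorage'.
    async save(account, secret) {
      if (keytar) {
        try {
          await keytar.setPassword(service, account, secret);
          return 'keytar';
        } catch (err) {
          // Keychain unavailable: fall through to the encrypted fallback.
        }
      }
      if (!safeStorage.isEncryptionAvailable()) {
        throw new Error('No secure storage available');
      }
      fallbackFile.write(account, safeStorage.encryptString(secret));
      return 'safeStorage';
    },

    async load(account) {
      if (keytar) {
        try {
          const value = await keytar.getPassword(service, account);
          if (value !== null) return value;
        } catch (err) {
          // Ignore and try the fallback below.
        }
      }
      const encrypted = fallbackFile.read(account);
      return encrypted ? safeStorage.decryptString(encrypted) : null;
    },
  };
}
```

Refusing to store the key when neither backend is available is a deliberate choice in this sketch: it avoids silently writing a plaintext secret to disk.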

Data Sent to External Services

Depending on your configuration, OpenGuider may send:

prompts and conversation context to your selected LLM provider,
voice audio/text to selected STT/TTS providers,
screenshot-derived context when screen-aware guidance is used.

Important:

data is sent only to providers you explicitly configure,
there is no hidden relay server by default between your app and providers.

Screenshot and UI Context Handling

Screenshots are used to improve on-screen guidance and step suggestions.
For privacy-sensitive tasks, avoid including confidential content on screen before capture.
If required by policy, disable screen-aware workflows and use text-only guidance.

Operational Security Best Practices

Use a dedicated provider API key for OpenGuider (do not reuse high-privilege keys).
Rotate API keys periodically.
Never commit .env or key files to Git.
Prefer local model usage (Ollama) for highly sensitive workflows.
Review logs before sharing them publicly in issues.
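
Before pasting logs into an issue, a quick redaction pass can catch the most obvious secrets. The patterns below are illustrative and cover only `Bearer` tokens and common `sk-` style keys; they are not exhaustive, so a manual review is still needed. `redactSecrets` is a hypothetical helper, not part of OpenGuider.

```javascript
// Replace obvious credentials in log text with placeholders.
// Patterns are illustrative and intentionally conservative.
function redactSecrets(logText) {
  return logText
    .replace(/\bBearer\s+[A-Za-z0-9._\-]+/g, 'Bearer [REDACTED]')
    .replace(/\bsk-[A-Za-z0-9_\-]{8,}/g, '[REDACTED_KEY]');
}

console.log(redactSecrets('request failed, key=sk-abcdef1234567890'));
// request failed, key=[REDACTED_KEY]
```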

Privacy and Compliance Notes

OpenGuider is open-source, so security behavior is auditable.
Compliance posture depends on your selected providers and their data policies.
If your team has strict requirements, define an approved provider/model list and disable non-approved endpoints.
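
An approved provider/model list can be as simple as a lookup table checked before any request is made. The sketch below is one possible shape; the `APPROVED` table, the `'*'` wildcard convention, and `isApproved` are assumptions for illustration, and OpenGuider does not necessarily ship such a mechanism.

```javascript
// Hypothetical allowlist: provider -> permitted model names ('*' = any model).
const APPROVED = {
  openrouter: ['google/gemini-3.1-flash-image-preview'],
  ollama: ['*'],
};

function isApproved(provider, model, approved = APPROVED) {
  // Use an own-property check so names like "constructor" are not accepted.
  if (!Object.prototype.hasOwnProperty.call(approved, provider)) return false;
  const models = approved[provider];
  return models.includes('*') || models.includes(model);
}

console.log(isApproved('ollama', 'any-local-model')); // true
console.log(isApproved('openai', 'some-model'));      // false
```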

Support and Contribution

If you want to support OpenGuider by contributing code, docs, tests, or design updates, this section is for you.

Branching Strategy for Contributors

main: stable branch used for production-ready updates.
feature/<short-name>: new features.
fix/<short-name>: bug fixes.
docs/<short-name>: README/docs-only changes.
chore/<short-name>: maintenance and tooling updates.

Examples:

feature/voice-hotkeys
fix/linux-build-artifact
docs/readme-configuration-guide
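
The naming scheme can be checked mechanically, for example in a local hook script. The pattern below assumes short names are lowercase words joined by hyphens, which is inferred from the examples above rather than stated as a rule; `isValidBranchName` is a hypothetical helper.

```javascript
// Accepts "main" or "<type>/<short-name>" with a lowercase, hyphenated name.
// The kebab-case requirement is an assumption inferred from the examples.
const BRANCH_PATTERN = /^(feature|fix|docs|chore)\/[a-z0-9]+(-[a-z0-9]+)*$/;

function isValidBranchName(name) {
  return name === 'main' || BRANCH_PATTERN.test(name);
}

console.log(isValidBranchName('feature/voice-hotkeys')); // true
console.log(isValidBranchName('hotfix/urgent'));         // false
```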

Recommended Contribution Flow

Fork the repository (or create a branch if you are a direct collaborator).
Create a new branch from main.
Keep commits focused and descriptive.
Run tests locally: npm run test.
Push your branch and open a Pull Request.

Pull Request Checklist

Explain what changed and why.
Include test notes (what you ran and results).
Add screenshots/GIF for UI changes.
Keep scope small and review-friendly.
Rebase/merge latest main if needed before final review.

Ways to Help Beyond Code

Improve docs and onboarding examples.
Report reproducible bugs with logs/steps.
Propose UX improvements for panel/widget flows.
Help test releases on Windows/macOS/Linux.

Development

Run with inspector: npm run dev
Run tests: npm run test

Build Installers (Windows/macOS/Linux)

Build all platform targets on your current OS: npm run dist
Build only Windows NSIS installer (.exe): npm run dist:win
Build only macOS installer (.dmg): npm run dist:mac
Build only Linux packages (.AppImage + .deb): npm run dist:linux
Output artifacts are written to release/

Architecture

flowchart LR
  User[User]
  UI[Renderer UI\nPanel + Widget + Settings]
  Preload[preload.js\nSecure IPC Bridge]
  Main[main.js\nElectron Main Process]
  Agent[src/agent/*\nPlanner + Orchestrator]
  AI[src/ai/*\nProvider Clients]
  Session[src/session/*\nSession State + Persistence]
  Screen[src/screenshot.js\nScreen Capture]
  Voice[src/tts/* + STT adapters]

  User --> UI
  UI --> Preload
  Preload --> Main
  Main --> Agent
  Agent --> AI
  Agent <--> Session
  Main --> Screen
  Main --> Voice
  Agent --> UI

Component Roles

main.js: app lifecycle, tray/shortcuts, IPC routing, orchestration entrypoint.
preload.js: secure boundary between renderer and main process APIs.
renderer/*: user-facing UI surfaces (panel, widget, settings, cursor overlay).
src/agent/*: planning, evaluation, replanning, and task progression logic.
src/ai/*: model-provider abstractions and structured response handling.
src/session/*: session model, history continuity, state persistence.
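
To illustrate what a "secure boundary" in preload.js typically means, the sketch below exposes a single `invoke` function that only forwards allowlisted IPC channels. It is not the project's preload script: `createBridge`, the `openGuider` global, and the `chat:send` channel name are hypothetical. `ipcRenderer` is passed in so the logic can be exercised without Electron.

```javascript
// Builds the object a preload script would expose to the renderer.
// Only channels in `allowedChannels` are forwarded to the main process.
function createBridge(ipcRenderer, allowedChannels) {
  const allowed = new Set(allowedChannels);
  return {
    invoke(channel, payload) {
      if (!allowed.has(channel)) {
        return Promise.reject(new Error(`Blocked IPC channel: ${channel}`));
      }
      return ipcRenderer.invoke(channel, payload);
    },
  };
}

// In a real preload.js (requires Electron, so shown as a comment):
// const { contextBridge, ipcRenderer } = require('electron');
// contextBridge.exposeInMainWorld('openGuider', createBridge(ipcRenderer, ['chat:send']));
```

Keeping the allowlist in the preload script means the renderer never holds a raw `ipcRenderer` reference.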

Security Notes

API keys are persisted via OS-protected secure storage (keytar) when available.
If the OS keychain is unavailable, encrypted fallback storage is used through Electron's safeStorage API.
Renderer runs with contextIsolation: true and nodeIntegration: false.
Application data is stored in Electron userData path under a stable app identity (OpenGuider) so updates keep local settings/history.
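
For reference, the renderer hardening flags mentioned above are set through a window's `webPreferences`. The helper below returns such an options object; `secureWindowOptions` is a hypothetical name, and the actual window setup in main.js may include more options.

```javascript
// Returns BrowserWindow options with the isolation flags described above.
function secureWindowOptions(preloadPath) {
  return {
    webPreferences: {
      preload: preloadPath,    // script that exposes the limited IPC bridge
      contextIsolation: true,  // renderer and preload run in separate JS contexts
      nodeIntegration: false,  // no Node.js APIs inside the renderer
    },
  };
}

// In the main process (requires Electron, so shown as a comment):
// const path = require('path');
// const { BrowserWindow } = require('electron');
// const win = new BrowserWindow(secureWindowOptions(path.join(__dirname, 'preload.js')));
```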

GitHub Release Automation

Push a semantic version tag (example: v0.2.0).
GitHub Actions runs .github/workflows/release-build.yml.
Installers are attached to the release:
- OpenGuider-windows-setup-latest.exe
- OpenGuider-macos-installer-latest.dmg
- OpenGuider-linux-latest.zip

License

This project is licensed under the Apache License 2.0.
See LICENSE for full terms.

Copyright (C) Metehan Kızılcık

If you create a derivative project, keep these Apache 2.0 basics:

Include the full Apache 2.0 license text in a LICENSE file.
Keep copyright notices (including Metehan Kızılcık).
Mark significant modifications clearly in changed files.

Acknowledgement

OpenGuider was originally inspired by Clicky.
