An intelligent screenshot analysis tool powered by OpenAI's GPT-4 Vision API. This application captures screenshots and uses AI to provide insightful analysis, text extraction, UI descriptions, and content summaries.
Since you've already added your OPENAI_API_KEY, you're almost ready to go! Just follow these final steps:
pip install -r requirements.txt# Test basic installation
python setup_test.py
# Test vision functionality
python test_openai_vision.pypython run.py- AI-Powered Analysis: Uses GPT-4 Vision to analyze screenshots intelligently
- Multiple Analysis Modes:
- General: Overall analysis and insights
- Text: Extract and organize visible text
- UI: Describe interface elements and layout
- Summary: Provide concise content summaries
- Real-time Processing: Automatic screenshot capture and analysis
- Interactive Interface: Clean, modern overlay window
- Cursor Following: Window follows your cursor for convenient access
- Custom Prompts: Define your own analysis prompts
- Python 3.8 or higher
- OpenAI API key
- Windows 10/11 (tested platform)
-
Clone the repository:
git clone <repository-url> cd cluecursor
-
Install dependencies:
pip install -r requirements.txt
-
Set up OpenAI API key:
# Windows set OPENAI_API_KEY=your-api-key-here # Linux/Mac export OPENAI_API_KEY=your-api-key-here
python run.py- ESC: Close application
- Ctrl+A: Manual screenshot analysis
- Ctrl+M: Cycle through analysis modes
- Ctrl+P: Custom analysis with your own prompt
- Click "Analyze Now": Trigger immediate analysis
- Click "Mode": Change analysis mode
- Automatic: Analysis runs every 5 seconds
- General: Comprehensive analysis focusing on main content, purpose, and notable patterns
- Text: Extracts and organizes all visible text while preserving structure
- UI: Describes interface elements, buttons, menus, and layout patterns
- Summary: Provides concise summaries of key information and main topics
The application can be configured by modifying constants in src/imports.py:
DEFAULT_WINDOW_SIZE: Initial window dimensionsMAX_TOKENS: Maximum tokens for AI responsesTEMPERATURE: AI response creativity (0.0-1.0)
The application uses OpenAI's GPT-4 Vision API. Ensure you have:
- Valid OpenAI API key
- Sufficient API credits
- Internet connection
-
"OpenAI API key not found":
- Set the
OPENAI_API_KEYenvironment variable - Restart your terminal/command prompt
- Set the
-
"OpenAI library not available":
pip install openai
-
High API usage:
- Increase the analysis interval in
src/screen_capture.py - Use manual mode instead of automatic
- Increase the analysis interval in
- Use specific analysis modes for better performance
- Adjust window size for optimal text display
- Use custom prompts for targeted analysis
cluecursor/
├── src/
│ ├── main_app.py # Main application orchestrator
│ ├── openai_processor.py # OpenAI API integration
│ ├── screen_capture.py # Screenshot capture and analysis
│ ├── ui_components.py # User interface components
│ ├── cursor_tracker.py # Cursor tracking functionality
│ └── imports.py # Import management
├── requirements.txt # Python dependencies
├── run.py # Application entry point
└── README.md # This file
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
[Add your license information here]
- OpenAI for the GPT-4 Vision API
- Python community for excellent libraries
- Contributors and users
Note: This application requires an active internet connection and valid OpenAI API credentials to function properly.