Skip to content

duckling-ui/duckling

Duckling

Tests Deploy Docs Publish Docker Images License: MIT Python 3.10+ Node.js 22+ React Flask Code style: black TypeScript

A modern, user-friendly browser-based graphical interface for Docling - the powerful document conversion library by IBM.

Duckling Screenshot

Features

  • Drag-and-Drop Upload: Simply drag your documents onto the interface
  • URL-Based Conversion: Convert documents directly from URLs with automatic image extraction for HTML pages
  • Multi-file and folder upload: Convert multiple files or URLs at once from one drop zone—choose a folder, use Choose files…, or drag and drop; unsupported extensions are skipped client-side, and the API rejects a batch if nothing can be converted
  • Multi-Format Support: Convert PDFs, Word docs, PowerPoints, Excel files, HTML, Markdown, images, and more
  • Multiple Export Formats: Export to Markdown, HTML, JSON, DocTags, Document Tokens, RAG Chunks, or plain text
  • Image & Table Extraction: Extract embedded images and tables with CSV export
  • Image Preview Gallery: View extracted images as thumbnails with full-size lightbox viewer
  • RAG-Ready Chunking: Generate document chunks optimized for RAG applications
  • Advanced OCR: Multiple OCR backends with GPU acceleration support
  • Real-Time Progress: Watch your document conversion progress in real-time
  • Conversion History: Access previously converted documents and reload them without re-conversion
  • Document Persistence: Processed documents are stored on disk and can be reloaded from history
  • Content Deduplication: Same file with identical settings reuses stored output (instant cache hits)
  • Highly Configurable: Customize OCR, tables, images, performance, and chunking settings
  • Beautiful Dark Theme: Easy on the eyes with a modern, professional design
  • Accessibility: Keyboard-friendly slide-overs named with aria-label, focus trap and restore, labeled controls and progress, focusable scroll regions for long content, link underlines (not color alone), and docs styling/scripts described in Accessibility

Supported Input Formats

Format Extensions Description
PDF .pdf Portable Document Format
Word .docx Microsoft Word documents
PowerPoint .pptx Microsoft PowerPoint presentations
Excel .xlsx Microsoft Excel spreadsheets
HTML .html, .htm Web pages
Markdown .md, .markdown Markdown files
Images .png, .jpg, .jpeg, .tiff, .gif, .webp, .bmp Direct image OCR
AsciiDoc .asciidoc, .adoc Technical documentation
PubMed XML .xml Scientific articles
USPTO XML .xml Patent documents

Export Formats

Format Extension Description
Markdown .md Formatted text with headers, lists, links
HTML .html Web-ready format with styling
JSON .json Full document structure
Plain Text .txt Simple text without formatting
DocTags .doctags Tagged document format
Document Tokens .tokens.json Token-level representation
RAG Chunks .chunks.json Chunks for RAG applications

Quick Start

Prerequisites

  • Python 3.10+ (3.13 recommended)
  • Node.js 20.19+ or 22+ (22 recommended for Vite 7)
  • npm or yarn

Installation

  1. Clone the repository:

    git clone https://github.com/duckling-ui/duckling.git
    cd duckling
  2. Set up the backend:

    cd backend
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -r requirements.txt
  3. Set up the frontend:

    cd ../frontend
    npm install

Running the Application

  1. Start the backend (in one terminal):

    cd backend
    source venv/bin/activate
    python duckling.py

    The API will be available at http://localhost:5001

  2. Start the frontend (in another terminal):

    cd frontend
    npm run dev

    The UI will be available at http://localhost:3000

  3. Batch and folder uploads: Drag a folder or multiple files onto the drop zone, click the zone to pick a folder, or use Choose files… for loose files—see the Quick Start guide.

Using Docker

Documentation (MkDocs)

Duckling uses MkDocs + mkdocs-static-i18n for multilingual docs. Production docs at duckling-ui.org are built with mike for multi-version support. With Material’s integrated TOC (toc.integrate), translate section headings in localized Markdown (for example docs/fr/user-guide/features.md) so the sidebar matches each locale. Supported formats and Screenshots are mirrored under docs/{de,fr,es}/user-guide/formats.md and screenshots.md (UI strings, captions, tab titles; image paths ../../assets/...). Homepage feature tiles in docs/{de,fr,es}/index.md should link to those locale-specific heading anchors (for example #glisser-deposer on the French Features page), not English slugs. The de, fr, and es trees mirror the English layout (getting started, user guide, API, architecture, deployment, contributing, changelog): keep them in sync when you add or change docs.

  • Build (strict):
./scripts/docs-build.sh
  • Serve locally:
./scripts/docs-serve.sh

Quick Start (Build Locally):

docker-compose up --build

Using Pre-built Images:

# Download and run pre-built images
curl -O https://raw.githubusercontent.com/duckling-ui/duckling/main/docker-compose.prebuilt.yml
docker-compose -f docker-compose.prebuilt.yml up -d

Production Deployment:

docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d

Access the application at http://localhost:3000

Build Docker Images:

# Build images locally
./scripts/docker-build.sh

# Build and push to registry
./scripts/docker-build.sh --push --registry yourusername

When PRs are merged to main, images are automatically published to Docker Hub and GitHub Container Registry. See Docker Deployment Guide for details and required secrets.

Configuration

Backend Environment Variables

Create a .env file in the backend directory:

FLASK_ENV=development
SECRET_KEY=your-secret-key
DEBUG=True
MAX_CONTENT_LENGTH=104857600  # 100MB

Conversion Settings

Settings can be configured through the UI or via the API. All settings are organized into categories:

OCR Settings

Setting Description Default
enabled Enable/disable OCR true
backend OCR engine (easyocr, tesseract, ocrmac, rapidocr) easyocr
language Primary language for recognition en
force_full_page_ocr OCR entire page vs detected regions false
use_gpu Enable GPU acceleration (EasyOCR) false
confidence_threshold Minimum confidence for results 0.5

Supported Languages: English, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Russian, Japanese, Chinese (Simplified/Traditional), Korean, Arabic, Hindi, Thai, Vietnamese, Turkish, Ukrainian, Czech, Greek, Hebrew, Indonesian, Malay, Swedish, Danish, Finnish, Norwegian

Table Settings

Setting Description Default
enabled Enable table detection true
mode Detection mode (fast/accurate) accurate
do_cell_matching Match cell content to structure true
structure_extraction Preserve table structure true

Image Settings

Setting Description Default
extract Extract embedded images true
classify Classify and tag images true
generate_page_images Create page images false
generate_picture_images Extract pictures as files true
generate_table_images Extract tables as images true
images_scale Scale factor (0.1-4.0) 1.0

Performance Settings

Setting Description Default
device Processing device (auto/cpu/cuda/mps) auto
num_threads CPU threads (1-32) 4
document_timeout Max processing time (seconds) null

RAG Chunking Settings

Setting Description Default
enabled Enable document chunking false
max_tokens Maximum tokens per chunk 512
merge_peers Merge undersized chunks true

API Endpoints

Conversion

Endpoint Method Description
/api/convert POST Upload and convert a document
/api/convert/batch POST Batch convert multiple documents
/api/convert/<job_id>/status GET Get conversion status
/api/convert/<job_id>/result GET Get conversion result
/api/convert/<job_id>/images GET List extracted images
/api/convert/<job_id>/images/<id> GET Download extracted image
/api/convert/<job_id>/tables GET List extracted tables
/api/convert/<job_id>/tables/<id>/csv GET Download table as CSV
/api/convert/<job_id>/chunks GET Get document chunks
/api/export/<job_id>/<format> GET Download converted file

Settings

Endpoint Method Description
/api/settings GET/PUT Get/update all settings
/api/settings/reset POST Reset to defaults
/api/settings/formats GET List supported formats
/api/settings/ocr GET/PUT OCR settings
/api/settings/tables GET/PUT Table settings
/api/settings/images GET/PUT Image settings
/api/settings/performance GET/PUT Performance settings
/api/settings/chunking GET/PUT Chunking settings
/api/settings/output GET/PUT Output settings

History

Endpoint Method Description
/api/history GET List conversion history
/api/history/<job_id> GET Get history entry
/api/history/stats GET Get conversion statistics (includes processing time, OCR/format breakdowns, queue depth)
/api/history/search GET Search history

Development

Running Tests

Backend tests:

cd backend
pytest

Frontend tests:

cd frontend
npm test

CI runs the same frontend test command; for Vitest + React components that listen on window after async data loads, see Frontend Tests in CONTRIBUTING.md (effect scheduling / act).

Project Structure

duckling/
├── backend/
│   ├── duckling.py         # Flask application
│   ├── config.py           # Configuration & defaults
│   ├── models/             # Database models
│   ├── routes/             # API endpoints
│   │   ├── convert.py      # Conversion endpoints
│   │   ├── settings.py     # Settings endpoints
│   │   └── history.py      # History endpoints
│   ├── services/           # Business logic
│   │   ├── converter.py    # Docling integration
│   │   ├── file_manager.py # File handling
│   │   └── history.py      # History management
│   └── tests/              # Backend tests
├── frontend/
│   ├── src/
│   │   ├── components/     # React components
│   │   │   ├── DropZone.tsx
│   │   │   ├── ConversionProgress.tsx
│   │   │   ├── ExportOptions.tsx
│   │   │   ├── SettingsPanel.tsx
│   │   │   └── HistoryPanel.tsx
│   │   ├── hooks/          # Custom hooks
│   │   │   ├── useConversion.ts
│   │   │   └── useSettings.ts
│   │   ├── services/       # API client
│   │   └── types/          # TypeScript types
│   └── tests/              # Frontend tests
├── uploads/                # Temporary uploads
├── outputs/                # Converted files
└── docker-compose.yml      # Docker configuration

Documentation

Comprehensive documentation is available using MkDocs:

In-App Documentation

The documentation is available directly in the Duckling UI. Click the Documentation button in the header to open the docs panel.

UI + Docs Languages

  • UI: English (en), Spanish (es), French (fr), German (de)
  • Docs: Built and served with language prefixes at /api/docs/site/en/, /api/docs/site/es/, /api/docs/site/fr/, /api/docs/site/de/

To add a new language:

  • UI: add frontend/src/locales/<lang>/common.json and wire it into frontend/src/i18n.ts
  • Docs: add docs/<lang>/... (mirroring the English structure), then add a language entry under the i18n plugin in mkdocs.yml

Note: The documentation must be built before it can be viewed in the UI. You can either:

  1. Build manually (recommended for development): Install backend/requirements.txt, then from the repo root run mkdocs build. That file includes the MkDocs stack ( pymdown-extensions 10.21.2+ ) so Pygments does not fail on fenced code without a title.

  2. Build from the UI: Click the "Build Documentation" button in the docs panel if the site hasn't been built yet.

  3. Auto-build: The backend will automatically attempt to build the docs if MkDocs is installed and the site doesn't exist.

View Documentation Locally

# MkDocs is included in backend/requirements.txt — activate that venv.

# Serve documentation with live reload (for editing)
mkdocs serve

Then open http://localhost:8000 in your browser.

Build Documentation

mkdocs build

The built documentation will be in the site/ directory and served at /api/docs/site/ by the backend.

Documentation Sections

Contributing

See CONTRIBUTING.md for guidelines.

Security

See SECURITY.md for security policies.

  • API hardening: History document reload (GET /api/history/{job_id}/load) validates job_id using a strict allowlist and enforces safe output directory resolution to prevent path traversal.

License

This project is licensed under the MIT License - see LICENSE for details.

Acknowledgments

About

UI for Docling

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors