A library for converting HTML into PDFs using ReportLab
Updated Jan 19, 2026 - Python
A collection of PDF parsing libraries, including AI-based tools (docling, Claude, OpenAI, Gemini, Meta's Llama Vision, unstructured-io) as well as pdfminer, PyMuPDF, and pdfplumber, for efficient snapshot, text, table, and metadata extraction.
Search and replace text in PDF files with PyPDF.
Batch-convert PDFs to text and extract data from PDFs in Python.
A script to convert MS Office PPT/PPTX files to PDF files and then merge all the PDF files into a single PDF file.
This project aims to simplify and summarize scientific papers, convert the summary to an audio format as a podcast, and create a PowerPoint presentation from the paper. This helps researchers, academics, and students alike.
Pydf - Pyrogram Document File Bot, a modular Telegram bot that provides PDF tools and works using PyPDF2.
A lightweight Python CLI tool to merge multiple PDF files into one. Built using the pypdf library, this script prompts users for input and merges selected PDFs into a single output file with a configurable name.
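The merge workflow described above can be sketched in a few lines. This is only an illustration, not the listed tool's actual code: the function names `normalize_output_name` and `merge_pdfs` and the default file name `merged.pdf` are assumptions, and the sketch relies on `PdfWriter.append` from pypdf 3 or later.

```python
from pathlib import Path


def normalize_output_name(name: str, default: str = "merged.pdf") -> str:
    """Return a usable output file name, adding a .pdf suffix when missing."""
    name = name.strip()
    if not name:
        return default  # hypothetical default, not taken from the listed tool
    return name if name.lower().endswith(".pdf") else name + ".pdf"


def merge_pdfs(paths: list[str], output_name: str) -> Path:
    """Append every input PDF, in order, to a single output file."""
    # Imported lazily so the helper above works without pypdf installed.
    from pypdf import PdfWriter

    out = Path(normalize_output_name(output_name))
    writer = PdfWriter()
    for path in paths:
        writer.append(path)  # appends all pages of this file
    with out.open("wb") as fh:
        writer.write(fh)
    writer.close()
    return out
```

A call such as `merge_pdfs(["a.pdf", "b.pdf"], "combined")` would write `combined.pdf` in the current directory, assuming both input files exist and are readable.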
Extract meaningful words from a collection of PDF documents and count their frequencies
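As a rough sketch of the counting step, assuming the text has already been extracted from each PDF: the stop-word list, the minimum word length, and the name `word_frequencies` are all illustrative assumptions rather than details of the listed project.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def word_frequencies(texts, min_length=3):
    """Count lowercase alphabetic words across all texts, skipping short and stop words."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if len(word) >= min_length and word not in STOP_WORDS:
                counts[word] += 1
    return counts
```

Calling `word_frequencies(pages).most_common(10)` on a list of extracted page strings would then give the ten most frequent remaining words.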
📄 Extract BibTeX entries from PDFs automatically, generating a complete bibliography without manual input or reliance on external APIs.
Built a FastAPI backend for an AI chatbot with RAG capabilities, supporting website, PDF, and text data ingestion. Implemented FAISS-based retrieval with embeddings and integrated a Groq LLM for context-aware responses.
Open WebUI tool for extracting text from PDFs and images using Tesseract OCR. Supports text-based and scanned PDFs, multi-language OCR (English + Swedish), fully offline.
PDF Merger Pro is a terminal-based (CLI) utility written in Python for merging multiple PDF files into one.
This application is built for employers who want to screen candidates against a particular job description.
Automatically fills out reimbursement requests for ÉTS student clubs.
Upload PDFs → ask questions → get grounded answers.
A Pure-Python library built as a PDF toolkit. It is capable of: extracting document information (title, author, …), splitting documents page by page, merging documents page by page, cropping pages, merging multiple pages into a single page, encrypting and decrypting PDF files, and more! By being Pure-Python, it should run on any Python platform withou…