pypdf

Here are 180 public repositories matching this topic...

xhtml2pdf / xhtml2pdf

A library for converting HTML into PDFs using ReportLab

python pdf pdf-converter html-to-pdf pdf-generation reportlab pypdf html-pdf html-to-pdf-converter reportlab-pdf html-pdf-converter

Updated Jan 19, 2026
Python

genieincodebottle / parsemypdf

A collection of PDF parsing approaches, from AI-based tools (Docling, Claude, OpenAI, Gemini, Meta's Llama Vision, unstructured-io) to libraries such as pdfminer, PyMuPDF, and pdfplumber, for efficient snapshot, text, table, and metadata extraction.

ocr openai claude camelot pymupdf pypdf ocr-python markitdown gemini-pro gemini-ai llama-parse omniai unstructured-io docling llama-vision mistral-ocr smoldocling llama4

Updated Mar 27, 2026
Python

hoehermann / pypdf_strreplace

Search and replace text in PDF files with PyPDF.

pdf pypdf pdf-document-processor

Updated Oct 22, 2025
Python

shine-jayakumar / Extract-Data-From-PDF-In-Python

Batch-convert PDFs to text and extract data from PDFs in Python

Updated Sep 29, 2021
Python

lukefire5156 / PPTs_TO_PDFs_AND_Merger

A script to convert MS Office PPT/PPTX files to PDF files and then merge all the PDF files into a single PDF file.

python python-script pptx pypdf2 pdf-merge ppt pdf-merger merge-pdf merger pypdf ppts pypdf2-lib ppts-to-pdf ppt-merger ppts-to-pdf-merger merge-study-material merge-ppts-to-pdf comptypes-lib

Updated Dec 27, 2020
Python

xreedev / Research-Asist-Tool

This project aims to simplify and summarize scientific data, convert it to an audio format as a podcast, and create a PowerPoint presentation from the paper. This helps researchers, academics, and students alike.

react python js word2vec requests bart summarization beautifulsoup tf-idf pptx vectorization btech scientific-papers pypdf btech-project btech-projects

Updated Aug 27, 2024
JavaScript

nuhmanpk / pyDF-Bot

Pydf (Pyrogram Document File Bot), a modular Telegram bot that provides PDF tools using PyPDF2

bot pdf tools telegram telegram-bot pypdf2 pypdf pyrogram pyrogram-bot pypdf2-lib

Updated Mar 24, 2022
Python

SaurabhSSB / PDFMergerCLI

A lightweight Python CLI tool to merge multiple PDF files into one. Built using the pypdf library, this script prompts users for input and merges selected PDFs into a single output file with a configurable name.

cli open-source pdf python-script file-management command-line-tool pdf-merger merge-pdf pypdf document-automation

Updated Jun 11, 2025
Python

nanxstats / pdf-word-extraction

Extract meaningful words from a collection of PDF documents and count their frequencies

natural-language-processing spacy wordcloud research-paper pypdf ftfy

Updated Jun 23, 2024
Python

peinan / pdfchat

Gradio demo of LLM chatbot using RAGs

pdf chatbot openai gradio pypdf rag huggingface qdrant llm langchain

Updated Apr 16, 2024
Python

Tobi208 / pypdf-cli

A Python-based CLI that allows for comfortable every-day PDF manipulation with pypdf.

python cli pdf edit command-line-tool pypdf

Updated May 11, 2024
Python

rehanvhora778 / bibtex-extraction

📄 Extract BibTeX entries from PDFs automatically, generating a complete bibliography without manual input or reliance on external APIs.

pdf automation latex bibtex analysis data-visualization pyhton metadata-extraction academic-writing bibliometrics pypdf bibliometric-analysis reference-management research-tools langchain

Updated Apr 16, 2026
Python

kumar-kiran-24 / chatbot

Built a FastAPI backend for an AI chatbot with RAG capabilities, supporting website, PDF, and text data ingestion. Implemented FAISS-based retrieval with embeddings and integrated a Groq LLM for context-aware responses.

python websoup pypdf rag huggingface groq streamlit llm generative-ai langchain rag-chatbot

Updated Apr 1, 2026
Python

TrueLipstick / TrueLipstick-pdf-image-ocr-extractor

Open WebUI tool for extracting text from PDFs and images using Tesseract OCR. Supports text-based and scanned PDFs, multi-language OCR (English + Swedish), fully offline.

multilingual python docker pdf ocr tool tesseract text-extraction fitz pymupdf pypdf pdf-extraction image-ocr open-webui

Updated Mar 8, 2026
Python

ilramdhan / PDF-Merger-Pro

PDF Merger Pro is a terminal-based (CLI) utility written in Python for merging multiple PDF files into one.

python cli pdf pyinstaller pdf-merger tqdm pyhton3 pypdf

Updated Jan 27, 2026
Python

farvath / Resume-Parser-and-Analysis

This application is built for employers evaluating candidates against a particular job description.

nlp natural-language-processing torch nltk doc2vec gensim-word2vec pypdf gemini-api streamlit huggingface-transformers

Updated Jun 20, 2024
Python

ClubCedille / rapport_eirik

Automatic filling of reimbursement requests for student clubs at ÉTS.

python pdf pdf-files pdf-document pdf-generation pypdf2 pypdf

Updated Feb 11, 2022
Python

leomarkcastro / Puzzle-Maker

A puzzle generator, game, and PDF maker. Pretty advanced stuff for me. Uses Sciter for the UI and pygame for the puzzle game and puzzle images.

python pygame sciter tiscript fpdf pypdf pysciter

Updated Nov 22, 2020
Python

ARUNAGIRINATHAN-K / pdf-RAG-question-answering

Upload PDFs → ask questions → get grounded answers.

python pipeline pypdf faiss huggingface streamlit sentence-transformers langchain rag-chatbot

Updated Jan 14, 2026
Python

Ranjan2104 / Create-Audio-Book-from-pdf

A pure-Python library built as a PDF toolkit. It is capable of extracting document information (title, author, …), splitting documents page by page, merging documents page by page, cropping pages, merging multiple pages into a single page, encrypting and decrypting PDF files, and more! By being pure-Python, it should run on any Python platform withou…

python3 pyttsx3 pypdf

Updated May 24, 2021
Python
