Linux Mint - Community

add an OCR text layer to PDF files

https://github.com/jbarlow83/OCRmyPDF
6 reviews

OCRmyPDF adds an OCR text layer to a regular PDF that contains only images and writes the result as a searchable PDF/A file.

It uses the Tesseract OCR engine and so supports all the languages that Tesseract does.
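As a rough illustration of the language support, the sketch below lists the language packs a local Tesseract install reports and turns them into the `-l` option that ocrmypdf accepts (codes joined with `+`). The helper names and the sample listing are made up for this example, and the header line skipped by the parser is an assumption about the usual `tesseract --list-langs` output.

```python
import subprocess


def parse_tesseract_langs(listing: str) -> list[str]:
    """Extract language codes from `tesseract --list-langs` style output."""
    codes = []
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of"):
            continue
        codes.append(line)
    return codes


def installed_tesseract_langs() -> list[str]:
    """Ask the local tesseract binary which language packs it can see."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Some releases may print the listing on stderr, so read both streams.
    return parse_tesseract_langs(result.stdout + "\n" + result.stderr)


def language_args(codes: list[str]) -> list[str]:
    """Build the ocrmypdf language option, e.g. ["-l", "deu+eng"]."""
    return ["-l", "+".join(codes)] if codes else []


# Sample listing for illustration only; real output depends on the system.
sample = 'List of available languages in "/usr/share/tessdata/" (2):\ndeu\neng\n'
print(language_args(parse_tesseract_langs(sample)))
```

A language only works if the matching Tesseract language pack is installed, so checking the list first avoids a failed run.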

Some other main features:

* Places OCR text accurately below the image to ease copy / paste
* Keeps the exact resolution of the original embedded images
* When possible, inserts OCR information as a lossless operation without rendering vector information
* Keeps file size about the same
* If requested, deskews and/or cleans the image before performing OCR
* Validates input and output files
* Provides debug mode to enable easy verification of the OCR results
* Processes pages in parallel when more than one CPU core is available
* Battle-tested on thousands of PDFs, with a test suite and continuous integration
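The optional features in the list map onto command-line switches. The sketch below assembles an ocrmypdf command with `--deskew`, `--clean`, and `--jobs`; the function name and the file names are hypothetical, and `--clean` is assumed to need the separate unpaper program to be installed. It only prints the command; running it is left to the reader.

```python
import shlex


def build_ocrmypdf_command(input_pdf, output_pdf, deskew=False, clean=False, jobs=None):
    """Return the argument list for an ocrmypdf run (nothing is executed)."""
    cmd = ["ocrmypdf"]
    if deskew:
        cmd.append("--deskew")  # straighten skewed pages before OCR
    if clean:
        cmd.append("--clean")  # clean pages before OCR; assumed to need unpaper
    if jobs is not None:
        cmd += ["--jobs", str(jobs)]  # cap the number of parallel workers
    cmd += [input_pdf, output_pdf]
    return cmd


# Hypothetical file names, used only to show the resulting command line.
cmd = build_ocrmypdf_command("scan.pdf", "scan_ocr.pdf", deskew=True, jobs=2)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a system where ocrmypdf is installed.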

Latest reviews

eluke 10 months ago

Mint 22.1. Installed from Software Manager, works fine. It is a command-line tool, so it does not appear in the menu, but it works perfectly from the command line: "ocrmypdf inputfile.pdf outputfile.pdf". Documentation: https://ocrmypdf.readthedocs.io/ (CoolHappyGuy's link 404s for me)

CoolHappyGuy 1 year ago

Installed but does not launch. Instead of installing from Software Manager, I recommend this approach. It also describes how ocrmypdf operates: https://www.talido.com/blog/how-to-ocr-pdf-files-on-linux-using-ocrmypdf/

zuzu 2 years ago

Not starting. Installed from software manager.

kezerd 2 years ago

Mint 21.2. Installed from software manager. Would not start. Not anywhere in menu.

SkidMark 3 years ago

Mint 19.3. Worked amazingly well. Had a PDF with fine technical details that was stored as an image with encryption. Printed to PDF to remove the encryption, then used this to OCR it to a new file and make the text searchable. Worked like a champ!

advolex 4 years ago

I needed this app for the German lawyers' electronic mailbox service (beA), and it worked PERFECTLY with Tesseract, which I had installed beforehand. Really a great relief!!