Scantools for Linux - add OCR to an existing PDF or create OCR'd PDFs from scans


Post by Krokkie

Scantools for Linux - convert to PDF with OCR

Some users in the community may be interested in producing OCR'd PDFs. There are already solutions for this (such as pdfbeads or pdf.py), but how about adding OCR on the fly while converting an existing scan to PDF, or adding OCR to a PDF you already have?

Scantools is a set of Linux PDF/A tools with the ability to perform OCR.

Scantools
https://cplx.vm.uni-freiburg.de/scantools/

Downloads here
https://software.opensuse.org/package/scantools


Usage Examples:


Add OCR to existing PDF with ocrPDF

ocrPDF book.pdf -o bookocr.pdf

Produce OCR'd PDF from a JPG scan with image2pdf

image2pdf scan.jpg -p A4 -r fit -b -o scanocr.pdf
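
If you have a whole folder of PDFs, the ocrPDF call above could be wrapped in a small shell function. This is only a sketch: it assumes ocrPDF takes its arguments exactly as in the example above (input file, then -o and the output file). The function name ocr_all, the "-ocr" output suffix and the OCR_CMD variable are my own inventions for illustration, not part of Scantools.

```shell
#!/bin/sh
# Sketch: add an OCR layer to every PDF in a directory.
# Assumption: ocrPDF is called as "ocrPDF input.pdf -o output.pdf",
# as in the single-file example above.
# OCR_CMD is a hypothetical override for dry runs (e.g. OCR_CMD=echo).

ocr_all() {
    dir=${1:-.}                      # default to the current directory
    for f in "$dir"/*.pdf; do
        [ -e "$f" ] || continue      # the glob matched nothing
        case "$f" in
            *-ocr.pdf) continue ;;   # skip output from an earlier run
        esac
        "${OCR_CMD:-ocrPDF}" "$f" -o "${f%.pdf}-ocr.pdf"
    done
}
```

After sourcing the file, something like "ocr_all ~/scans" would write book-ocr.pdf next to book.pdf and leave the originals untouched. Running "OCR_CMD=echo ocr_all ~/scans" first just prints the arguments each call would get, so you can check the file names before anything is written.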