How to edit the text layer of a PDF?

hacecalor · Post by **hacecalor** » 29 Sep 2016, 08:34

Hi all,

I'm in the process of building my scanner, so I haven't completed any projects yet, but I have made a couple of PDFs using photos from the Internet Archive.

My question is about how one would modify the text layer of a PDF. Google doesn't seem to be very helpful about it.

I'm using Tesseract with hOCR via ocrmypdf and the results are good, but need a few corrections here and there.

It'd be nice if there were a tool that let you see the text layer of a PDF and edit it on the fly. Anyone know of a program (that runs on Linux, preferably free) that can do something like this?

qqmxdpo · Post by **qqmxdpo** » 29 Sep 2016, 10:35

Hi
I've also run some experiments on how the OCR in various programs identifies the words in pictures. It is not 100% successful, sorry.

L.Willms · Post by **L.Willms** » 05 Mar 2018, 03:36

hacecalor wrote: ↑29 Sep 2016, 08:34 I'm in the process of building my scanner, so I haven't completed any projects yet, but I have made a couple of PDFs using photos from the Internet Archive.
[...]
It'd be nice if there were a tool that let you see the text layer of a PDF and edit it on the fly.

I know that ABBYY FineReader 14 can do this, and of course Adobe Acrobat Pro, in some version later than 8. I have Acrobat Pro 8, which can OCR an image PDF but provides no means to edit the recognized text. I know that later versions can do that, but I don't know from which version on that capability is provided.

b0bcat · Post by **b0bcat** » 06 Mar 2018, 01:52

I may be sending you on a wild goose chase but this may have some leads:

https://github.com/manisandro/gImageReader/

I haven't looked at it for some time, but I recall that a fairly recent version could take an image PDF as input and output it with a text layer instead of OCR text only, and (maybe) the recognized text could also be edited. Whether that applies to a text layer as opposed to text-only output, I don't know. Happy exploring!
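
If a GUI route doesn't pan out, one other angle: the hOCR mentioned in the original question is plain HTML, with each recognized word in a `span` of class `ocrx_word`, so small corrections can in principle be scripted before the text layer is built. Below is only a rough sketch of that idea in Python. It assumes you can get at the hOCR file as an intermediate and that you rebuild the PDF from the corrected hOCR with a separate tool (not shown here). The function name `correct_hocr_words` and the `corrections` mapping are made up for illustration, and the regex only handles word spans that contain plain text with no nested tags.

```python
import html
import re

# Matches a Tesseract-style word span: <span class='ocrx_word' ...>text</span>.
# Only spans whose content is plain text (no nested tags) are matched.
WORD_SPAN = re.compile(
    r"(<span\b[^>]*\bclass=['\"]ocrx_word['\"][^>]*>)([^<]*)(</span>)"
)


def correct_hocr_words(hocr_text, corrections):
    """Return hocr_text with misrecognized words replaced.

    `corrections` is a hypothetical mapping of wrong word -> right word,
    e.g. one built by hand while proofreading.
    """
    def fix(match):
        opening, word, closing = match.groups()
        plain = html.unescape(word)
        if plain not in corrections:
            # Leave untouched words byte-for-byte as they were.
            return match.group(0)
        fixed = html.escape(corrections[plain], quote=False)
        # The opening tag (with its bbox coordinates) is kept unchanged.
        return opening + fixed + closing

    return WORD_SPAN.sub(fix, hocr_text)


# Made-up two-word sample in the shape of Tesseract's hOCR output.
sample = (
    "<span class='ocrx_word' id='word_1_1' "
    "title='bbox 36 92 96 116; x_wconf 61'>Tbe</span> "
    "<span class='ocrx_word' id='word_1_2' "
    "title='bbox 104 92 190 116; x_wconf 93'>scanner</span>"
)
print(correct_hocr_words(sample, {"Tbe": "The"}))
```

Since only the word text changes and the `bbox` coordinates stay as they are, the corrected word should still line up with the same spot on the page image. A correction that is much longer or shorter than the original may not fit its box as neatly.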
