How to save scans capable of being interrogated.

BML
Posts: 15
Joined: 26 Mar 2020, 07:17
Number of books owned: 0
Country: United Kingdom

Re: How to save scans capable of being interrogated.

Post by BML »

First of all, many thanks for your help. Secondly, all I have now is a 35-page collection of image scans of text pages. I know that there are quite a few advertisements for "Free" OCR applications, but experience has shown me that many of these, when opened, turn out to be cut-down OCR applications that are free only for a limited time and offer limited flexibility. So, are there any reasonably effective free OCR applications?
cday
Posts: 447
Joined: 19 Mar 2013, 14:55
Number of books owned: 0
Country: UK

Post by cday »

BML wrote: 23 Oct 2021, 18:53 I have not touched scanning documents for a long time but I did so today intending to add my comments in between the paragraphs using copy and paste.
BML wrote: 24 Oct 2021, 18:02 All I have now is a 35 page collection of image scans of text pages.
If you need to insert text comments between the existing image scans, one way would be to use software that can edit the layout of a PDF file, rather than only the text in it. I don't have any experience of software that does that, but someone else might.

My way of doing what you need would be to use a word processor such as MS Word or LibreOffice Writer to place the scanned text images where you want them, then type the text you wish to enter between the images. That file could then be saved as a word processor file or exported as a PDF. It would then be necessary to OCR that output file so that the scanned text images, and not only the typed text, are searchable.

At a detailed level, if you still have the original scan image files they could be inserted directly into the new word processor file. If you no longer have them, they could be extracted from your existing multipage PDF file using freeware. I created a minimal word processor file to test the above method and tried making the whole file searchable using several programs that I have. Abbyy FineReader 12 and Nitro Pro 8 output a fully searchable PDF file, but Adobe Acrobat Standard XI will not make a file that already contains text searchable.

Those are all rather old versions of the respective programs, and it is a pity that price rises now make the current versions significantly less affordable. I might be able to convert a file for you once if privacy is not a concern, but I imagine that probably wouldn't be a very practical option!
BML wrote: 24 Oct 2021, 18:02 I know that there are quite a few advertisements for "Free" OCR applications but experience has shown me that many of these when opened turn out to be cut down OCR applications only free for a limited time and with a limited level of flexibility. So, are there any reasonably effective free OCR applications?
Any suggestions, taking into account that the file to be made searchable would contain both images and word processor text?

Post by cday »

Are your existing 35 pages of scans images of typical book pages, or possibly scans of, for example, notes you or someone else has typed?

If the existing images are not of book pages, and the content is needed but the formatting doesn't need to be preserved accurately, there may be a very low budget solution. It would be moderately labour intensive initially, but overall possibly less so than one of the more conventional solutions.

Post by BML »

The existing 35 pages are A4 typing paper size with typed words and I scanned them. I tried to join Libre and registered but it then refused to recognise me. Libre makes it impossible to contact them, unless that is, you know how to.

Post by cday »

BML wrote: 25 Oct 2021, 05:40 Libre makes it impossible to contact them, unless that is, you know how to.
That was my experience when I thought there might be a forum where I could suggest that a Libre PDF software might be a useful addition to the Libre suite!
BML wrote: 25 Oct 2021, 05:40 The existing 35 pages are A4 typing paper size with typed words and I scanned them.
Is the text currently searchable? If so, a zero budget solution might be to select the text on each page (or possibly even all the text in the PDF file) and then paste it into whichever word processor you are using. A practical consideration is that each line will be displayed separately and end, if you enable the display of hidden characters, with a paragraph mark. If you try it you will see what I mean.
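
If the pasted text does come out with every line broken separately, the tidying up could be done with a few lines of Python rather than by hand. What follows is only a minimal sketch, not a tested tool: the function name `unwrap_pasted_text` is made up for illustration, and it assumes that a blank line marks a genuine paragraph break while every other line break is just wrapping left over from the PDF. If your pasted text has no blank lines at all, everything will be joined into one paragraph and you would need to split it up yourself.

```python
def unwrap_pasted_text(text: str) -> str:
    """Join hard-wrapped lines into paragraphs.

    Assumption (not from the thread): a blank line separates real
    paragraphs; any other line break is treated as wrapping and is
    replaced by a single space.
    """
    paragraphs = []
    current = []  # lines collected for the paragraph being built

    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            # Blank line: close the current paragraph, if there is one.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(stripped)

    # Do not lose the last paragraph when the text has no trailing blank line.
    if current:
        paragraphs.append(" ".join(current))

    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    sample = (
        "The first line of a\n"
        "paragraph that was\n"
        "wrapped.\n"
        "\n"
        "Second paragraph\n"
        "here."
    )
    print(unwrap_pasted_text(sample))
```

The result can then be pasted into the word processor as ordinary paragraphs. Words that were hyphenated across a line break are deliberately left alone, since the sketch cannot tell them apart from genuinely hyphenated words.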

If the existing 35 pages are not currently searchable, the very low budget solution would be to purchase the Abbyy Screenshot Reader tool (£8.99!), which can be used to convert an image on the screen to text that can be pasted into a word processor. A practical consideration, apart from the time and care required to capture all 35 pages accurately, is that the output would probably need to be proofread to catch any recognition errors.
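
As a rough aid to the proofreading, a short Python script could point out words that look like typical recognition errors, for example a zero read in place of the letter "o". This is a sketch under my own assumptions, not a substitute for reading the text through: the function name `flag_suspect_words` is hypothetical, and the only check it makes is for words that mix letters and digits. It will miss most other errors, and it will also flag legitimate items such as "A4".

```python
import re


def flag_suspect_words(text: str) -> list[tuple[int, str]]:
    """Return (line_number, word) pairs for words mixing letters and digits.

    Assumption (not from the thread): a word containing both a letter
    and a digit, such as "rep0rt", is a likely OCR slip worth a look.
    Genuine mixed tokens such as "A4" are flagged too.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Split the line into runs of ASCII letters and digits.
        for word in re.findall(r"[A-Za-z0-9]+", line):
            has_letter = any(c.isalpha() for c in word)
            has_digit = any(c.isdigit() for c in word)
            if has_letter and has_digit:
                hits.append((number, word))
    return hits


if __name__ == "__main__":
    sample = "The rep0rt was\nsent on 24 Oct 2021\nto the c1erk."
    for line_number, word in flag_suspect_words(sample):
        print(f"line {line_number}: {word}")
```

Plain numbers and dates are not flagged, because they contain no letters.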

Post by BML »

I attempted to use ABBYY FineReader and the following came up: "This is protected by user password." Does that mean I'm stuffed?

Post by cday »

BML wrote: 25 Oct 2021, 07:41 I attempted to use ABBYY Fine Reader...
ABBYY Screenshot Reader?

BML wrote: 25 Oct 2021, 07:41 ... and the following came up, "This is protected by user password." Does that mean I'm stuffed?
You used it on your existing PDF containing the 35 images of typewritten pages?

If so, did you password protect it, and have you lost the password you used?

If that is the case, there is probably a way to remove the password. I can probably help you with that, having done it myself on a PDF file of my medical records, which had its pages in the wrong order and which I was assured had to be locked!

Post by BML »

I'm giving up on it and am going to ask for an open copy of the report that I can add my comments to. Many thanks for your help.