Confirm single camera post-processing workflow, please!


Posts: 24
Joined: 09 Feb 2024, 22:21
E-book readers owned: Nook Glowlight 4, Kindle Fire 5th gen
Number of books owned: 100
Country: USA

Confirm single camera post-processing workflow, please!

Post by nightshift »

Yay, I've finally got some usable scans, wooohoo. Time to get started on post processing, and I think I'm a little lost.
I've got a single-camera scanner. I do all the right-hand pages, take a break to let the camera battery charge, and transfer the images to the computer, then come back and do the left pages. (At some point I'll get the adapter to run the camera plugged in, maybe.) The directory structure is:

 - book-title
  - right
  - left

Post processing steps I think I need to take are:
  • Rename files and move them all to a single directory
  • Run through Scan Tailor Advanced to rotate, split, deskew, crop, etc
  • Run OCR
  • Make pdf
Am I missing anything? Also, I'd like a clarification on one of those steps, if possible.

For the rename, my camera names its files like IMG_####.JPG, with leading zeros. My current workflow and camera settings allow both the right and left pages to start at 0001, but I might be switching to remote shooting, which will probably change that. When I do the rename, what should the file names look like: IMG_####a.JPG and IMG_####b.jpg? Or just ####a.JPG and ####b.JPG? (I'm on Linux, so I can't use Bulk Rename and will have to write my own script, unless one already exists that I haven't found yet.)
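
For what it's worth, a rename script along these lines might look like the sketch below. It is only an illustration under some assumptions: the function name merge_pages, the output folder name and the 4-digit width are made up for this example, files are taken in sorted name order (so the camera's own numbers do not matter), and the pages are copied rather than moved so the originals stay untouched.

```python
import shutil
from pathlib import Path


def merge_pages(first_dir, second_dir, out_dir, width=4):
    """Copy two page sets into out_dir as 0001a.JPG, 0001b.JPG, 0002a.JPG, ...

    first_dir holds the pages that get the "a" suffix, second_dir the "b" pages.
    Which side goes first is an assumption left to the caller.
    """
    def jpgs(folder):
        # Sorted by file name; IMG_0001.JPG style names sort correctly as text.
        return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".jpg")

    first, second = jpgs(first_dir), jpgs(second_dir)
    if len(first) != len(second):
        # Unequal counts usually mean a skipped or doubled page somewhere.
        raise ValueError(f"page counts differ: {len(first)} vs {len(second)}")

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = []
    for i, (a, b) in enumerate(zip(first, second), start=1):
        for src, suffix in ((a, "a"), (b, "b")):
            dest = out / f"{i:0{width}d}{suffix}.JPG"
            shutil.copy2(src, dest)  # copy, keeping timestamps
            names.append(dest.name)
    return names
```

It could be called as merge_pages("book-title/right", "book-title/left", "book-title/merged"), swapping the first two arguments if the left pages should come first in each pair.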

I'm sure I'll have some questions on the OCR and making of the PDF at some point, but, I'm not there yet.
Posts: 63
Joined: 22 Dec 2016, 06:07
E-book readers owned: Tolino, Kindle
Number of books owned: 600
Country: Poland

Re: Confirm single camera post-processing workflow, please!

Post by zbgns »

Sorry for the delay in responding, and I hope it is still of some use.

The post-processing steps you describe are similar to mine. I also use a single camera scanner. The main difference is that you take the left and right pages separately, whereas I take them in sequential order, so there is no need to rename them in my case. Nevertheless, if I had to rename the files, I would approach it as follows.

1. The starting point is that there are two sub-folders in which right and left pages are stored separately.

2. The first step would be to check if there are any missing or duplicated pages in these folders. Since you are on Linux, I would suggest using the gThumb image viewer application, which has a thumbnail preview in the bottom panel. It is possible to adjust the width of the gThumb window to have 10 thumbnails in a row. Then it is very easy to check every tenth (actually twentieth) page as they should be in the same columns. If the numbers are shifted in some places, this is a direct indication that you have omitted or doubled pages and that should be fixed.

3. If all the pages are in place, the file names should be changed. There is no need for a separate application or script, as gThumb has this function built in. Both the IMG_####a.JPG and IMG_####b.jpg pattern and the ####a.JPG and ####b.JPG pattern, for left and right pages respectively, seem appropriate. It is also necessary to have the same number of files in both folders, so do not omit any blank pages when scanning; otherwise the odd and even pages will not end up in the right places when copied into one folder.

4. The files from both subdirectories can then be copied into one folder and should be in the correct order, provided that all pages (including both sides of the front cover and blank pages) have been scanned.

5. The next step would be to check that the numbers printed on the pages match the sequential order of the files. For example, the tenth file in the sequence should show page number 10, the fifteenth should show 15, and so on. In a typical book, you have a front cover, then a series of preliminary pages interspersed with blank pages, and then numbered pages. Usually it is sufficient to delete some blank pages at the beginning so that the numbered pages correspond to the sequential order of the files. This can be checked in gThumb as described in step 2. There is no need to rename the files afterwards, as they will still be in the correct order.
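
As a rough companion to the visual check in step 5, a short Python sketch can list every tenth file in the merged folder, so that only those images need to be opened and compared against their printed page numbers. The function name spot_check and the step of 10 are assumptions made for this illustration; it does not read the page numbers itself, it only tells you which files to look at.

```python
from pathlib import Path


def spot_check(folder, step=10):
    """Return (position, file name) for every step-th image in folder.

    Positions count from 1 in sorted name order, so after the leading
    blank pages are removed, the file at position 10 should show page 10.
    """
    files = sorted(p.name for p in Path(folder).iterdir() if p.suffix.lower() == ".jpg")
    return [(pos, name) for pos, name in enumerate(files, start=1) if pos % step == 0]
```

Printing the result of spot_check("book-title/merged") gives a short list of files to open; if the printed number on one of them is off, the skipped or doubled page lies somewhere before it.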

The files can then be run through Scan Tailor Advanced, OCRed, converted to PDF and so on. Please note that all of the above steps can be done using only one application.