Scan Tailor is an interactive post-processing tool for scanned pages. It performs operations such as page splitting, deskewing, adding/removing borders, and others. You give it raw scans, and you get pages ready to be printed or assembled into a PDF or DJVU file. Scanning, optical character recognition, and assembling multi-page documents are out of scope of this project.
Scan Tailor
Moderator: peterZ
- daniel_reetz
- Posts: 2810
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
- Contact:
Scan Tailor
My friend Mary M. just pointed me to this very interesting software package, Scan Tailor.
-
- Posts: 596
- Joined: 06 Jun 2009, 23:57
Re: Scan Tailor
This looks good. A bit slow, but most of the pages are adequately recognized in automatic mode -- problems with some page numbers being clipped out, a few pages "recognizing" bits of the facing page, and a problem with blank pages that can be made moot by running Scan Tailor after merging the left and right views -- but it's sufficiently robust and sufficiently flexible that I'll probably stop using YAPP. I'll still be using ImageMagick -- Scan Tailor, as far as I can tell, only puts out TIFF files, and I still need to convert them to PDFs. Also, it's possible to bloat the original images by 10-20 times by choosing output parameters poorly -- color and 600 DPI takes a 1.5 MB JPEG and turns it into a 25 MB TIFF -- but the "mixed" mode does a good job of putting out crisp text and still preserving greyscale images.
I need to play with it some more, but I think this is going to become my main post-processing engine, at least until something better comes along. Thanks for the tip.
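For a rough sense of where that kind of bloat comes from, here is a minimal sketch that estimates the uncompressed raster size of one page in different output modes. The 6 x 9 inch page size and the helper name `raw_size_mb` are assumptions for illustration only, and a real TIFF will usually be smaller because of compression, so treat the numbers as upper bounds rather than what Scan Tailor actually writes.

```python
def raw_size_mb(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed raster size in megabytes (10**6 bytes) for one page."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel / 8 / 1_000_000


# Hypothetical page size (not from the thread): 6 x 9 inches.
PAGE_W_IN, PAGE_H_IN = 6, 9

if __name__ == "__main__":
    for mode, bpp in [("color", 24), ("greyscale", 8), ("black and white", 1)]:
        for dpi in (300, 600):
            size = raw_size_mb(PAGE_W_IN, PAGE_H_IN, dpi, bpp)
            print(f"{mode:>15} @ {dpi} DPI: {size:7.2f} MB uncompressed")
```

Doubling the DPI quadruples the pixel count, and going from 1-bit to 24-bit output multiplies the size by 24, which is why color at 600 DPI gets large so quickly.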
Re: Scan Tailor
That's a great find. It splits pages automatically pretty well. Too bad there's no option to use only one feature, like page splitting alone. You have to run your pages through the whole process, which is very time-consuming.
- daniel_reetz
- Posts: 2810
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
- Contact:
Re: Scan Tailor
Mary's pointed me to a few interesting things now. Thanks for taking the time to check this out and come back with your experiences.
You know, if Scan Tailor had a few extra features, and especially if it had a "camera model" -- in other words, taking into account focal length and lens distortion, it could really be a killer processor. You could probably get this done with Fulla from the Hugin suite, or some other panotools prog.
His page says he's looking for developer help. If only I had any worthwhile programming skillz...
- rob
- Posts: 773
- Joined: 03 Jun 2009, 13:50
- E-book readers owned: iRex iLiad, Kindle 2
- Number of books owned: 4000
- Country: United States
- Location: Maryland, United States
- Contact:
Re: Scan Tailor
Fascinating... I'm going to take a look!
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.
-
- Posts: 24
- Joined: 28 Jul 2009, 01:27
- E-book readers owned: lBook V8, lBook V3
- Number of books owned: 0
- Location: Sofia, Bulgaria
Re: Scan Tailor
Maybe you don't know, but Scan Tailor was written as a "reply" to Scan Kromsator, a very powerful but very sophisticated and poorly documented program. Many users in Russia have used SK for post-scan image processing.
- daniel_reetz
- Posts: 2810
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
- Contact:
Re: Scan Tailor
That is super-interesting, Mandor. I just found the abbreviated guide to Kromsator. I speak enough Russian to understand the instructions, but I don't recognize or understand the word "kromsator". Does it sound like anything to you?
-
- Posts: 24
- Joined: 28 Jul 2009, 01:27
- E-book readers owned: lBook V8, lBook V3
- Number of books owned: 0
- Location: Sofia, Bulgaria
Re: Scan Tailor
Well, you can use the Толковый словарь русского языка (Explanatory Dictionary of the Russian Language):
КРОМСАТЬ, аю, аешь; кромсанный; несов., что (разг.). Грубо, неаккуратно резать на части. К. хлеб
It means something like: "to cut into pieces roughly and carelessly".
- daniel_reetz
- Posts: 2810
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
- Contact:
Re: Scan Tailor
Thanks for the link and explanation. I usually use Katzner's dictionary, but it's in a box out in my workshop. I'll use the Толковый словарь from now on... certainly looks more complete than the Promt online engine...
I love Russian for words like "неаккурат()".
- rob
- Posts: 773
- Joined: 03 Jun 2009, 13:50
- E-book readers owned: iRex iLiad, Kindle 2
- Number of books owned: 4000
- Country: United States
- Location: Maryland, United States
- Contact:
Re: Scan Tailor
Ha, the only Russian I know is, божемой!
Anyway, I compiled scantailor on OSX, and it seems pretty interesting, but it does not seem to take care of the two major problems using cameras, which are splitting the page properly (almost always chooses the wrong side for the page [EDIT: I misinterpreted Scan Tailor's output, and found it was actually selecting the proper side]), and keystone correction (as in, there is none). Here's an example of the auto-deskewed version of a page. Notice that there is no fixed skew amount that will correct a keystoned image.
I really should work on PostProcessor again...
--Rob
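The point about keystoning can be checked numerically. Below is a minimal sketch, assuming a made-up trapezoidal page outline (the corner coordinates and function names are hypothetical, not taken from Scan Tailor or PostProcessor). It rotates the outline by a range of deskew angles and measures how far the left and right page edges are from vertical. Because a rotation shifts both edge angles by the same amount, the gap between them never closes.

```python
import math


def rotate(point, angle_deg):
    """Rotate a 2D point about the origin by angle_deg."""
    a = math.radians(angle_deg)
    x, y = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))


def edge_angle(p, q):
    """Angle of the edge p -> q measured from the vertical axis, in degrees."""
    return math.degrees(math.atan2(q[0] - p[0], q[1] - p[1]))


def side_angles(corners, angle_deg):
    """Angles of the left and right page edges after a rotation."""
    tl, tr, br, bl = (rotate(c, angle_deg) for c in corners)
    return edge_angle(tl, bl), edge_angle(tr, br)


def residual_skew(corners, angle_deg):
    """Worst remaining tilt of either side edge after a rotation."""
    left, right = side_angles(corners, angle_deg)
    return max(abs(left), abs(right))


# Hypothetical keystoned page: the top edge is narrower than the bottom.
# Corner order: top-left, top-right, bottom-right, bottom-left (y grows downward).
PAGE = [(100.0, 0.0), (900.0, 0.0), (1000.0, 1400.0), (0.0, 1400.0)]

if __name__ == "__main__":
    for angle in (-6, -4, -2, 0, 2, 4, 6):
        left, right = side_angles(PAGE, angle)
        print(f"rotate {angle:+d} deg: left {left:+.2f}, right {right:+.2f}, "
              f"worst {residual_skew(PAGE, angle):.2f}")
```

Straightening one edge just tilts the other one twice as much. Removing the keystone needs a transform that can move the four corners independently, such as a perspective (projective) warp, rather than a single rotation.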
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.