Scan Tailor

Scan Tailor specific announcements, releases, workflows, tips, etc. NO FEATURE REQUESTS IN THIS FORUM, please.

Re: Scan Tailor

Postby Tulon » 29 Jan 2010, 20:07

I've built a snapshot from the current sources. You can download it from here:
http://www.onlinedisk.ru/file/334746/

It includes code to handle complex non-uniform illumination cases such as Antoha-spb's. It automatically decides which algorithm to use for that purpose.

It also includes Rob's updated dewarping algorithm, though it's not complete yet. First, its parameters are hardcoded and not changeable; second, it only works in B/W output mode at 600 dpi output resolution. It's also slow, though I see a few ways to speed it up. I can fix all of the above limitations myself, and will eventually do so.
When Scan Tailor asks you to enter DPIs manually, never enter arbitrary values. The video tutorial shows how to estimate the real DPI.
Tulon
 
Posts: 536
Joined: 03 Oct 2009, 06:13
Location: London, UK

Re: Scan Tailor

Postby rob » 29 Jan 2010, 23:29

Awesome! I was hoping you had some ideas for speeding up the algorithm...
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.
rob
 
Posts: 770
Joined: 03 Jun 2009, 13:50
Location: Maryland, United States

Re: Scan Tailor

Postby laszlo » 30 Jan 2010, 17:14

This may be a bit off-topic here, but is there an ebook reader on the market that can handle scanned books,
so that they are readable on the screen and also usable (page turning, zooming, etc.)?

There are so many ebook readers already; maybe one of them fulfills the requirements
(Kindle, Barnes and Noble, BeBook and its clones, Sony ebook reader, etc.)?
laszlo
 
Posts: 18
Joined: 16 Dec 2009, 16:35

Re: Scan Tailor

Postby Tulon » 30 Jan 2010, 17:42

I would personally only consider devices you can put open-source firmware on. I googled for such firmware and found the OpenInkpot project. If there is a device friendly to self-assembled ebooks, I would bet on one running this firmware.
Tulon
 
Posts: 536
Joined: 03 Oct 2009, 06:13
Location: London, UK

What to do about keystoning?

Postby StevePoling » 31 Jan 2010, 19:24

I have a question about Scan Tailor. I don't think this is a warping question, or perhaps I'm just ignorant of the right terminology. Is there a way that Scan Tailor (or perhaps some post-process) can remove keystoning from an image? (When I ran a film projector in college, they called such an effect "keystoning" because the projected image took on the trapezoidal shape of the keystone of an arch.)

If the focal plane of my camera is not parallel to the plane of the page being imaged, my picture of the page will be distorted resulting in a skewed image. The lines of text on the page deviate from the original almost-rectangular configuration and take on a trapezoidal arrangement. Scan Tailor does a great job of rotating images. But in such cases the lines at the top of the page run uphill, the lines in the middle are level, and the lines at the bottom of the page run downhill.

Is there any mechanism to remove this form of skewing via Scan Tailor? If not, is there some other program I can run before it or after it?

Thanks in advance,

steve
StevePoling
 
Posts: 290
Joined: 20 Jun 2009, 12:19
Location: Grand Rapids, MI

Re: Scan Tailor

Postby sdati » 31 Jan 2010, 20:41

The dewarping code from Rob does a fine job of removing keystoning, at least in my limited experience. There are probably much simpler, faster algorithms that would remove keystoning, but if you're willing to wait, the existing code works.
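To illustrate what one of those simpler approaches might look like: keystoning of a flat page is a projective distortion, so if the four page corners are known, a single 3x3 homography maps the trapezoid back to a rectangle. The sketch below is only an illustration and is not Rob's code or anything in Scan Tailor; the function names and the corner coordinates are made up for the example, and it assumes the corners have already been found by hand or by some other step.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Solve for the 3x3 projective transform H mapping each src corner to its dst corner.

    src and dst are four (x, y) pairs each. With the bottom-right entry of H
    fixed to 1, four correspondences give an 8x8 linear system.
    """
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1*x + h2*y + h3) / (h7*x + h8*y + 1), rearranged to be linear in h
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        # v = (h4*x + h5*y + h6) / (h7*x + h8*y + 1)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, points):
    """Map an array of (x, y) points through H, including the projective divide."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical corners of a keystoned page in the photo (clockwise from top-left)
page_corners = [(120, 80), (880, 100), (960, 1400), (40, 1380)]
# Where those corners should land: an upright 800 x 1300 rectangle
target_corners = [(0, 0), (800, 0), (800, 1300), (0, 1300)]

H = homography_from_corners(page_corners, target_corners)
corrected = apply_homography(H, page_corners)
```

To correct an actual image, the inverse of H would be used to look up a source pixel for every output pixel, which is what the perspective-warp routine of an image library does. Note that this only removes keystoning of a flat page; it does nothing for the curvature near the spine that the dewarping code addresses.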

There are some serious limitations to Rob's code in certain cases -- essentially it does best when there is solid text across the page. Cases where the margins are uneven, or there are blocks of text with different layout are handled poorly -- I think the simplest solution would be to run some page layout analysis first and run the dewarping on each section independently (this wouldn't work in some extreme cases where layout analysis would fail anyway...).
sdati
 
Posts: 25
Joined: 20 Jan 2010, 14:03

Re: Scan Tailor

Postby rdoug » 01 Feb 2010, 11:30

Hi All,

I wanted to introduce myself as another person tinkering with Scan Tailor, with great interest in the progress on the dewarping algorithm and in general enhancements. I am using Scan Tailor not just on books but on 20,000ish pages of documents gathered at national archives that need to be OCRed.

I have two main priorities I'd like to discuss.

The first is the dewarping algorithm.
Unfortunately Tulon's Jan 29 build crashes on my Win 7 64bit machine just after creating a new project and fixing the dpi. If interested I can forward the debug report.

The second is batch processing with Scan Tailor.
I have dozens of folders that need to be converted while maintaining the file structure. Does anyone know of a simple way to do this with Scan Tailor? Are there plans for a simple command line interface where I can just pass in the input directory, output directory and tell it to do everything else with its best judgement?
On my own I considered automatically generating the XML for a saved project, but I would still have to run it manually.
I settled on an AutoHotkey script that reliably starts a job from scratch but doesn't reliably run one job after another. If anyone is interested, I can share my script.
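For the "maintaining the file structure" part, the bookkeeping can be sketched independently of how each folder actually gets processed. The snippet below is a rough sketch under stated assumptions: it walks an input tree, finds every folder that directly contains images, and pairs it with a mirrored output folder. The names `plan_jobs`, `run_batch` and `process_folder` are hypothetical, the list of image suffixes is a guess, and `process_folder` is a placeholder for whatever drives Scan Tailor for one folder (for example a GUI automation script); it does not refer to any existing Scan Tailor interface.

```python
from pathlib import Path

# Assumed set of scan formats; adjust to whatever the archive actually contains
IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}

def plan_jobs(input_root, output_root):
    """Return (source_dir, target_dir) pairs, one per folder that directly contains images.

    Each target_dir sits at the same relative path under output_root as its
    source_dir does under input_root, so the folder layout is preserved.
    """
    input_root, output_root = Path(input_root), Path(output_root)
    jobs = []
    for folder in sorted([input_root, *input_root.rglob("*")]):
        if not folder.is_dir():
            continue
        has_images = any(
            p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
            for p in folder.iterdir()
        )
        if has_images:
            jobs.append((folder, output_root / folder.relative_to(input_root)))
    return jobs

def run_batch(input_root, output_root, process_folder):
    """Create each mirrored output folder, then hand the pair to process_folder.

    process_folder(source_dir, target_dir) is a caller-supplied placeholder.
    """
    for source_dir, target_dir in plan_jobs(input_root, output_root):
        target_dir.mkdir(parents=True, exist_ok=True)
        process_folder(source_dir, target_dir)
```

Keeping the planning step separate from the processing step means the job list can be printed and checked before anything is run, and a job that fails can be retried on its own.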

If anyone has a build that solves either of these problems and would like to share it, or wants someone to debug or test it on a large group of varied documents, let me know.
rdoug
 
Posts: 4
Joined: 01 Feb 2010, 07:16

Re: Scan Tailor

Postby Tulon » 01 Feb 2010, 12:53

rdoug wrote:Unfortunately Tulon's Jan 29 build crashes on my Win 7 64bit machine just after creating a new project and fixing the dpi. If interested I can forward the debug report.

The latest build was done against Qt 4.6.1. It looks like Qt 4.6.x has a bug that's responsible for this crash. It's triggered by launching batch processing, which means you can still use it for testing, as long as you don't launch batch processing. I'll make a new build when I am back from my vacation.

rdoug wrote:Are there plans for a simple command line interface where I can just pass in the input directory, output directory and tell it to do everything else with its best judgement?

Command-line based project creation seems like a reasonable thing, though it's a very low priority for me.
Tulon
 
Posts: 536
Joined: 03 Oct 2009, 06:13
Location: London, UK

Re: Scan Tailor

Postby kerekes » 09 Feb 2010, 18:23

Hi!

- Help! A new technique for Scan Tailor is shown in the image (for_scan_talor_web.jpg). One function is missing in Scan Tailor.

- Please help me. I work with Scan Tailor a lot, and I need this function in Scan Tailor, because it would let me work faster!

Thanks!
Attachments
for_scan_talor_web_800x500.jpg (38.24 KiB) Viewed 2389 times
kerekes
 
Posts: 1
Joined: 10 Dec 2009, 11:59

Re: Scan Tailor

Postby daniel_reetz » 12 Feb 2010, 10:19

kerekes, as I recall from earlier in this thread, some functions in Scan Tailor cannot be applied to all pages, and that is by design. If you have time, read back through the last few pages to see the discussion with Tulon.

I have the faint impression that it is possible to apply one setting to all pages by editing the actual project file produced by Scan Tailor. Can anyone confirm this?
daniel_reetz
 
Posts: 2490
Joined: 03 Jun 2009, 13:56
