Scan Tailor


Re: Scan Tailor

Postby spamsickle » 21 Dec 2009, 14:34

There is still the issue of pages with no text, or with text along with pictures that confuse things. I've found that pictures with "rectangles in perspective" (buildings, doors, tables, bookshelves, etc.) cause lots of problems with ScanTailor's deskewing step. I expect the same issues would reappear in a dewarping context.

I haven't done any work at all with dewarping, but I've done some work with interpolation (splines and scaling mostly). Would it be completely unworkable to eliminate the line detection step altogether, and simply interpolate a dewarp function from the curve at the top of the page to the curve at the bottom?
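A minimal sketch of that interpolation idea, assuming the top and bottom page edges are already available as functions of x (here two made-up parabolas, not anything Scan Tailor produces). Each output row is a fixed blend between the two curves, so no text-line detection is involved:

```python
def make_dewarp_map(top_curve, bottom_curve, out_height):
    """Return source_row(x, y_out): where to sample the warped image.

    top_curve(x) and bottom_curve(x) give the row of the physical page
    edge in the source image. Rows in between are blended linearly.
    """
    def source_row(x, y_out):
        t = y_out / (out_height - 1)  # 0.0 at the top row, 1.0 at the bottom
        return (1.0 - t) * top_curve(x) + t * bottom_curve(x)
    return source_row


# Hypothetical edge curves: the top edge bulges 10 px at x = 50,
# the bottom edge bulges 4 px. Both are flat at x = 0 and x = 100.
def top_edge(x):
    return 20 + 10 * (1 - ((x - 50) / 50) ** 2)


def bottom_edge(x):
    return 120 + 4 * (1 - ((x - 50) / 50) ** 2)


source_row = make_dewarp_map(top_edge, bottom_edge, out_height=101)
```

To dewarp, each output pixel (x, y_out) would be filled from the source pixel at (x, source_row(x, y_out)), with whatever resampling filter is preferred.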
spamsickle
 
Posts: 572
Joined: 06 Jun 2009, 23:57

Re: Scan Tailor

Postby rob » 21 Dec 2009, 14:45

Interestingly, that's exactly how the de-keystoning algorithm works. It detects the right and left sides of the text, computes a line for each side, and then straightens the line using a bilinear transformation of the image.

Are you saying, though, that you would look at the physical top and bottom of the page rather than the text? Hmm.....!
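For reference, a bilinear transformation of the kind mentioned above can be sketched as follows. This is only an illustration, not the code from the dewarper: the corner coordinates are invented, and the function name `bilinear_map` is hypothetical.

```python
def bilinear_map(corners, u, v):
    """Map (u, v) in the unit square onto a quadrilateral.

    corners = (top_left, top_right, bottom_left, bottom_right), each (x, y).
    u runs left to right, v runs top to bottom, both in [0, 1].
    """
    (x00, y00), (x10, y10), (x01, y01), (x11, y11) = corners
    w00 = (1 - u) * (1 - v)
    w10 = u * (1 - v)
    w01 = (1 - u) * v
    w11 = u * v
    x = w00 * x00 + w10 * x10 + w01 * x01 + w11 * x11
    y = w00 * y00 + w10 * y10 + w01 * y01 + w11 * y11
    return x, y


# Hypothetical keystoned text block: narrower at the top than at the bottom.
keystoned = ((10, 0), (90, 0), (0, 100), (100, 100))
left_margin_midpoint = bilinear_map(keystoned, 0.0, 0.5)
centre = bilinear_map(keystoned, 0.5, 0.5)
```

Sampling the source image at `bilinear_map(keystoned, u, v)` for every (u, v) on a regular grid produces an output in which the slanted left and right margins become vertical.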
The Singularity is Near. ~ http://halfbakedmaker.org ~ Follow me as I build the world's first all-mechanical steam-powered computer.
rob
 
Posts: 770
Joined: 03 Jun 2009, 13:50
Location: Maryland, United States

Re: Scan Tailor

Postby spamsickle » 21 Dec 2009, 16:26

Yes, the physical top and bottom of the page. I know identifying that could get complicated too, with colored book covers that wrap around to the inside, but the suggestion that someone (maybe you) made previously to mask the cover with a black sheet would seem like it would help. At that point, it would be largely a matter of distinguishing the "top" page from the block of pages behind it (or moving the black mask directly under each page as it's being scanned, which might be more trouble than it's worth, since you're shooting two pages at once). I just know that with a lot of the pages I've scanned, trying to figure out page geometry by finding horizontal lines of text will fail -- there's either not enough text, or there are other elements that will make identifying the text difficult.

As I've said, dewarping (and even de-keystoning) isn't really a big concern to me. I use the platen to press the pages pretty flat, which takes care of the warp, and while I undoubtedly get some keystoning, it's so minor that I don't notice it.

Maybe those people for whom it is important wouldn't mind specifying a spline at the top and bottom of each page, or adjusting a spline that the software identifies initially. I've gotten used to adjusting the "select content" box on problem pages, and something similar might work for "problem pages" in a separate dewarp step too. That way, the software could automate most of it, and only the exceptional cases would require manual intervention.

Re: Scan Tailor

Postby daniel_reetz » 21 Dec 2009, 16:36

One of the original plans for Page Builder was to put a spline in every few hundred pages. You could make a reasonable assumption that you could interpolate between them as the book shifted shape left-to-right.
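A rough sketch of interpolating between such key pages, assuming each key page stores its spline as a list of control points with the same count on every key page. The page numbers and coordinates below are invented for illustration:

```python
def interpolate_control_points(page, key_pages):
    """Linearly blend spline control points between the nearest key pages.

    key_pages maps a page number to a list of (x, y) control points.
    Pages outside the keyed range reuse the nearest key page unchanged.
    """
    pages = sorted(key_pages)
    if page <= pages[0]:
        return list(key_pages[pages[0]])
    if page >= pages[-1]:
        return list(key_pages[pages[-1]])
    for lo, hi in zip(pages, pages[1:]):
        if lo <= page <= hi:
            t = (page - lo) / (hi - lo)
            return [((1 - t) * xa + t * xb, (1 - t) * ya + t * yb)
                    for (xa, ya), (xb, yb) in zip(key_pages[lo], key_pages[hi])]


# Hypothetical key pages: the page curve deepens from 10 px to 30 px.
keys = {
    1: [(0, 0), (50, 10), (100, 0)],
    201: [(0, 0), (50, 30), (100, 0)],
}
halfway = interpolate_control_points(101, keys)
```

Linear blending is the simplest choice. Whether a real book changes shape that evenly from front to back is exactly the assumption being made.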
daniel_reetz
 
Posts: 2482
Joined: 03 Jun 2009, 13:56

Re: Scan Tailor

Postby laszlo » 22 Dec 2009, 06:13

rob wrote: The biggest problem with line detection is that you will inevitably end up with a squiggly line, rather than a smooth line. Even the images provided by laszlo: what's the next step? Use the bottoms of each line? Tops? Averages? All of those are valid, but the line still ends up squiggly or jagged.


I would put a Bézier curve at the top and at the bottom of each line. However, some more cleaning may still be necessary.

The line start and endings: I would use the original image to determine this.

If I find time I will put up a mockup of how I imagine it.
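As an illustration of what such a curve looks like in code, here is a cubic Bézier evaluated with de Casteljau's algorithm. The control points are made up. In practice they would have to be fitted to the detected tops or bottoms of a text line, which this sketch does not attempt.

```python
def bezier_point(control_points, t):
    """Evaluate a Bézier curve at parameter t (0..1) by de Casteljau's algorithm."""
    points = [(float(x), float(y)) for x, y in control_points]
    while len(points) > 1:
        # Replace each neighbouring pair by the point a fraction t along it.
        points = [((1 - t) * xa + t * xb, (1 - t) * ya + t * yb)
                  for (xa, ya), (xb, yb) in zip(points, points[1:])]
    return points[0]


# Hypothetical baseline of one warped text line: ends at y = 10,
# bulging to y = 13 in the middle.
baseline = [(0, 10), (30, 14), (70, 14), (100, 10)]
middle = bezier_point(baseline, 0.5)
```

Because the curve is smooth by construction, it sidesteps the squiggly-line problem, at the cost of needing a fitting step.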
laszlo
 
Posts: 18
Joined: 16 Dec 2009, 16:35

Re: Scan Tailor

Postby StevePoling » 23 Dec 2009, 02:44

laszlo wrote: I would put a Bézier curve at the top and at the bottom of each line. However, some more cleaning may still be necessary.


(I'm speaking theoretically, not practically.) Let's assume that the text runs horizontally across the page and that the page is a bent flexible sheet held at an angle to the camera. (Further assume each line is of uniform height.) From the physics of the sheet's bending, I believe one can derive a family of curves to which the lines of text belong. It remains to ascertain the maximum likelihood parameterization of those curves given the "greeked" text coming out of preprocessing.

This could be hard. One can crib from the statistical pattern recognition guys, create an idealized "template" of the page, then match the image against the template, then successively warp the template according to the physics of how pages bend until the template-match score is highest.

To construct the template, imagine a gray bar of fixed height, with the darkness of the gray varying from top to bottom according to a frequency of pixel activation derived from a large sample of text rendered in a reasonable font, i.e. all the white and black pixels of the lines of text of a big sample averaged together. Further imagine a stack of these gray bars spaced with the same leading as the typeset page. A flat page will line up with this template and it should have a nice sharp peak in the autocorrelation function of image against template.

But since the page is bent, the autocorrelation function will be smeared. So warp the template until the autocorrelation function gets a sharp peak, then apply the inverse warp to the page.

I don't know if it'll work. But it's an amusing gedanken experiment.
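A toy version of the thought experiment, heavily simplified: the template uses hard black bars instead of an averaged gray profile, the "physics" is a single made-up parabolic sag parameter, and the page is synthetic. All function names are hypothetical. It only shows that the correlation score peaks at the true warp.

```python
def make_template(height, leading, bar_height):
    """Stack of text-line bars: 1.0 inside a bar, 0.0 in the gap between lines."""
    return [1.0 if row % leading < bar_height else 0.0 for row in range(height)]


def sag(x, width, amplitude):
    """Hypothetical one-parameter warp: vertical offset (px) of column x."""
    centre = (width - 1) / 2
    return round(amplitude * (1 - ((x - centre) / centre) ** 2))


def render_page(template, width, amplitude):
    """Synthetic warped page, stored column by column as page[x][row]."""
    height = len(template)
    return [[template[(row - sag(x, width, amplitude)) % height]
             for row in range(height)]
            for x in range(width)]


def match_score(page, template, amplitude):
    """Correlate the page with the template warped by a candidate amplitude."""
    width, height = len(page), len(template)
    score = 0.0
    for x, column in enumerate(page):
        shift = sag(x, width, amplitude)
        score += sum(column[row] * template[(row - shift) % height]
                     for row in range(height))
    return score


def best_amplitude(page, template, candidates):
    """Brute-force search for the warp that gives the highest match score."""
    return max(candidates, key=lambda a: match_score(page, template, a))


template = make_template(height=48, leading=12, bar_height=5)
page = render_page(template, width=41, amplitude=6)   # true sag: 6 px
estimate = best_amplitude(page, template, range(0, 9))
```

The search range is kept below the leading on purpose: a periodic template matches equally well when shifted by a whole line, which is one of the ambiguities a real implementation would have to handle.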
StevePoling
 
Posts: 290
Joined: 20 Jun 2009, 12:19
Location: Grand Rapids, MI

Re: Scan Tailor

Postby rob » 24 Dec 2009, 18:24

Another attempt at a dewarper. This is a standalone Java program intended to be used with the bilevel TIFF images output from ScanTailor. It's not production quality: it takes only a single file, prints a lot of diagnostics, writes a lot of diagnostic images, and always saves the resulting image as orig-uncurled.png.

You will need:

1. Java 1.6+
2. JAI 1.1.3+ (OSX already has it, otherwise from here: https://jai.dev.java.net/binary-builds.html)
3. The dewarp1.0.jar file, which you get from the dewarp1.0.jar.zip file attached.
4. TIFF files from ScanTailor.

To run it: java -Xmx128M -jar dewarp1.0.jar <tiff-file>
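Since the program takes one file at a time and always writes orig-uncurled.png, a small wrapper can keep the outputs of different pages apart. The sketch below is not part of the dewarper. It assumes the output files land in the current working directory, which the post does not state, and the function names are hypothetical.

```python
import os
import subprocess


def dewarp_command(jar_path, tiff_path):
    """Command line from the post: java -Xmx128M -jar dewarp1.0.jar <tiff-file>."""
    return ["java", "-Xmx128M", "-jar", jar_path, tiff_path]


def dewarp_one(jar_path, tiff_path, work_root):
    """Run the dewarper for one TIFF inside its own working directory.

    Assumption: the program writes orig-uncurled.png and its diagnostic
    images to the current working directory, so a separate directory per
    page stops one page from overwriting another.
    """
    name = os.path.splitext(os.path.basename(tiff_path))[0]
    work_dir = os.path.join(work_root, name)
    os.makedirs(work_dir, exist_ok=True)
    command = dewarp_command(os.path.abspath(jar_path),
                             os.path.abspath(tiff_path))
    subprocess.run(command, cwd=work_dir, check=True)
    return os.path.join(work_dir, "orig-uncurled.png")
```

Calling `dewarp_one("dewarp1.0.jar", "page_001.tif", "dewarped")` would, under that assumption, leave the result in dewarped/page_001/orig-uncurled.png.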

Samples:

laszlo_undistort.gif (284.67 KiB)

robA_undistort.gif (124.04 KiB)

robB_undistort.gif (129.45 KiB)


(Tulon, sadly I'm still having difficulty with your images. But I'm closer!)

The images that are output:

  • orig.png: the original file as a PNG
  • connected.png: the original file as connected components
  • connected-bottoms.png: only the bottoms of the connected components (not used in algorithm)
  • connected-topleft.png: only the top left corners of the connected components (not used in algorithm)
  • connected-mean.png: the mean horizontal component of each vertical slice of each connected component
  • lined-candidates.png: the raw data which needs to be matched against smooth lines. Short lines and lines that exceed 10% of the height of the image are removed.
  • line-margins.png: the raw data showing the left and right margins found.
  • corrected-points.png: the lined-candidates data after dekeystoning (straightening the margins).
  • orig-transformed.png: the original image after dekeystoning.
  • lined-estimate.png: smooth lines fitted to the raw data overlaid on the dekeystoned image.
  • lined-grid.png: interpolated lines every 50 vertical pixels overlaid on the dekeystoned image.
  • orig-uncurled.png: the final dewarped image.
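Two of the steps in that list can be sketched roughly as below. This is a guess at the idea, not the code in the source zip: "mean of each vertical slice" is interpreted here as the mean row of a component's pixels in each column, and the minimum length used for "short lines" is an invented threshold.

```python
def slice_means(component):
    """Mean row of one connected component in each pixel column.

    component is an iterable of (x, y) pixel coordinates.
    Returns {x: mean y}, one entry per column the component touches.
    """
    columns = {}
    for x, y in component:
        columns.setdefault(x, []).append(y)
    return {x: sum(ys) / len(ys) for x, ys in sorted(columns.items())}


def keep_candidate(points, image_height, min_points=20,
                   max_height_fraction=0.10):
    """Drop short candidates and ones taller than 10% of the image height.

    min_points is a hypothetical cutoff. The 10% limit comes from the
    description of lined-candidates.png.
    """
    if len(points) < min_points:
        return False
    ys = [y for _, y in points]
    return (max(ys) - min(ys)) <= max_height_fraction * image_height


# Tiny made-up component covering two pixel columns.
means = slice_means([(0, 10), (0, 12), (1, 11), (1, 13), (1, 15)])

gentle_line = [(x, 100 + x // 10) for x in range(50)]   # spans 4 rows
steep_line = [(x, 5 * x) for x in range(50)]            # spans 245 rows
```

Averaging per column gives one point per slice, which is presumably what then gets fitted with smooth lines in the lined-estimate step.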

dewarp1.0.jar.zip: Dewarp jar file (169.72 KiB)
dewarp1.0-src.zip: Dewarp source code (170.27 KiB)

Re: Scan Tailor

Postby laszlo » 26 Dec 2009, 19:38

rob wrote: Another attempt at a dewarper.


Awesome. I will take some photos and try out your program when I'm back home from holiday.

Thank you very much for your effort. Really.

Re: Scan Tailor

Postby rob » 28 Dec 2009, 11:41

I keep looking at this post, and it disturbs me. It looks like the page images are breathing!

Re: Scan Tailor

Postby rob » 30 Dec 2009, 22:33

New version of Dewarp.

* An improved line finder.
* Corrected a bug in interpolation which caused the image to shift.
* Four options to control how hard the program works!

As usual, unzip the dewarp1.1.jar.zip file to dewarp1.1.jar, and then run as:

java -Xmx128M -jar dewarp1.1.jar

Dewarp v1.1 (build date 20091230A) (c) 2009 Robert Baruch
This program is distributed under the terms of the GNU General Public License.

Usage: Dewarp [b][2][w][a] <tiff image file>

b, 2, w, and a are flags that you may add. You may choose none, or all of them.

b will help smooth out variations in distortion from line to line,
but line estimation will take twice as long. Highly recommended if
the result seems to have squeezed and expanded areas.

2 will also help smooth out variations in distortion from line to line,
and will also double line estimation time.

w will add a sinusoidal component to line estimation. This adds
time to line estimation, but can be useful if the result without
it looks a little wavy. Don't use this if the output is good without it.

a will cause the program to assume that above the first detected line, and
below the last detected line, the distortion increases linearly. Without
this option, the assumption is that the distortion remains the same. Use
this only if there are graphics or short lines at top or bottom, and the
program does not seem to correct the distortion there. This option can
also make things worse, so try first without it.
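The difference between the default behaviour and the a option can be pictured with a small sketch. It is an illustration only: the per-line distortion is reduced to a single number per detected line, and the linear growth is taken from the slope between the two nearest detected lines, which is an assumption about how the program does it.

```python
def distortion_at(y, line_ys, line_offsets, extrapolate=False):
    """Distortion for row y, given distortion measured at detected text lines.

    line_ys must be sorted ascending and hold at least two rows.
    Between detected lines the distortion is interpolated linearly.
    Outside them it is held constant (default) or continued along the
    slope of the two nearest lines (extrapolate=True, like the 'a' flag).
    """
    if y < line_ys[0] or y > line_ys[-1]:
        i, j = (0, 1) if y < line_ys[0] else (-1, -2)
        if not extrapolate:
            return float(line_offsets[i])
        slope = (line_offsets[j] - line_offsets[i]) / (line_ys[j] - line_ys[i])
        return line_offsets[i] + slope * (y - line_ys[i])
    for k in range(len(line_ys) - 1):
        y0, y1 = line_ys[k], line_ys[k + 1]
        if y0 <= y <= y1:
            t = (y - y0) / (y1 - y0)
            return (1 - t) * line_offsets[k] + t * line_offsets[k + 1]


# Hypothetical measurements: three detected lines and their distortion in px.
rows = [100, 200, 300]
offsets = [10, 6, 8]
```

With these numbers, a row above the first detected line keeps the value 10 by default but grows to 12 at row 50 with extrapolation turned on. That also shows why the option can make things worse: any error in the outermost lines is amplified.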

dewarp1.1.jar.zip: Jar file (168.71 KiB)
dewarp1.1-src.zip: Source code (168.77 KiB)


Samples:

Using w option:

laszlo.gif (170.94 KiB)


Using no options:

keystoned.gif (133.47 KiB)


Using ab2w options: (still a bit curled at the bottom due to the conservative assumptions of the line finding algorithm)

tulonB.gif (200.29 KiB)


Using ab2 options: (adding the w option is worse!)

tulonA.gif (158.84 KiB)