I don't have a name for it... I guess "Rob's Book Postprocessing Software" will do for now. The software is written in Java, and makes heavy use of the Leptonica C library.
First, we have the original image straight from the camera. It is 285 ppi, 24 bpp. I measured the size of the book and then measured its image in order to derive the ppi. The following image is a sample page. It has been reduced to fit within the file size restrictions of the forums.
Note that it seems dark. In actuality, there is 400 W of halogen light streaming onto the page. It's bright. When I stop using my scanner, I have to let my eyes adjust for a minute or so before I can see again! The camera is set to ISO 100, and it selects a shutter speed of 1/200 second and an aperture of f/10. Despite that, the image still looks dark. The highest level with any significant number of pixels is 189 (on a scale from 0 to 255). In any case, this is the image we have to work with.
The software requires several important features in the image:
1. A dark area surrounds the top, bottom, and edge of the page.
2. The spine (i.e. the join in the middle of the platen) appears in the image.
3. The other page also appears in the image. Reflections in the other page side are OK.
I have to tell the software that this is a left-hand page so that it knows which end is up. The software then:
1. rotates the page to the proper orientation,
2. converts to gray,
3. generates a histogram,
4. computes a threshold (based on the Otsu method),
5. thresholds the image,
6. deskews the image (based on the Postl method),
7. dekeystones the image (based on making all text lines horizontal).
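As a rough illustration of steps 3 to 5, here is a minimal sketch of Otsu thresholding in plain Java. It is not the software's actual code: the class and method names are made up for the example, and it assumes an 8-bit gray image stored as a flat int array with values from 0 to 255.

```java
public final class OtsuSketch {

    /** Counts how many pixels fall on each of the 256 gray levels. */
    public static int[] histogram(int[] grayPixels) {
        int[] hist = new int[256];
        for (int v : grayPixels) {
            hist[v]++;
        }
        return hist;
    }

    /**
     * Returns the level t that maximizes the between-class variance when
     * pixels with value <= t form the dark class and the rest the light class.
     */
    public static int threshold(int[] hist) {
        long total = 0;
        double sumAll = 0;
        for (int i = 0; i < hist.length; i++) {
            total += hist[i];
            sumAll += (double) i * hist[i];
        }

        long weightDark = 0;
        double sumDark = 0;
        double bestVariance = -1;
        int best = 0;
        for (int t = 0; t < hist.length; t++) {
            weightDark += hist[t];
            if (weightDark == 0) {
                continue;               // no dark pixels yet
            }
            long weightLight = total - weightDark;
            if (weightLight == 0) {
                break;                  // everything is already in the dark class
            }
            sumDark += (double) t * hist[t];
            double meanDark = sumDark / weightDark;
            double meanLight = (sumAll - sumDark) / weightLight;
            double diff = meanDark - meanLight;
            // Proportional to the between-class variance (constant factor dropped).
            double variance = (double) weightDark * weightLight * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                best = t;
            }
        }
        return best;
    }
}
```

Thresholding the image is then a matter of comparing each gray pixel against the returned level.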
Here is the result on the sample page:
The next step is to denoise and "blockify" the text. I do this with three morphological operations (a "sel" is a structuring element):
1. Closing, using a sel of 25x1
2. Erosion, using a sel of 3x1
3. Erosion, using a sel of 3x3
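For readers who have not used binary morphology, the three operations above can be sketched in plain Java roughly like this. This is an illustration only, not the Leptonica calls the software makes. Several things here are assumptions: the image is a boolean array where true means foreground, "25x1" is read as 25 wide by 1 tall, the sel origin is at its center, and pixels outside the image are simply ignored. Which color counts as foreground is also an assumption left to the caller.

```java
public final class BrickMorphology {

    /** A pixel becomes foreground if any pixel under the sel is foreground. */
    public static boolean[][] dilate(boolean[][] src, int selWidth, int selHeight) {
        return apply(src, selWidth, selHeight, true);
    }

    /** A pixel stays foreground only if every in-bounds pixel under the sel is foreground. */
    public static boolean[][] erode(boolean[][] src, int selWidth, int selHeight) {
        return apply(src, selWidth, selHeight, false);
    }

    /** Closing is a dilation followed by an erosion with the same sel. */
    public static boolean[][] close(boolean[][] src, int selWidth, int selHeight) {
        return erode(dilate(src, selWidth, selHeight), selWidth, selHeight);
    }

    /** The denoise-and-blockify sequence: close 25x1, erode 3x1, erode 3x3. */
    public static boolean[][] blockify(boolean[][] src) {
        return erode(erode(close(src, 25, 1), 3, 1), 3, 3);
    }

    private static boolean[][] apply(boolean[][] src, int w, int h, boolean dilate) {
        int rows = src.length;
        int cols = src[0].length;
        boolean[][] dst = new boolean[rows][cols];
        for (int y = 0; y < rows; y++) {
            for (int x = 0; x < cols; x++) {
                // Dilation looks for any foreground pixel, erosion for any background pixel.
                boolean found = hit(src, x, y, w, h, dilate);
                dst[y][x] = found ? dilate : !dilate;
            }
        }
        return dst;
    }

    /** True if any in-bounds pixel under the sel centered at (x, y) equals wanted. */
    private static boolean hit(boolean[][] src, int x, int y, int w, int h, boolean wanted) {
        for (int dy = 0; dy < h; dy++) {
            int yy = y + dy - h / 2;
            if (yy < 0 || yy >= src.length) {
                continue;
            }
            for (int dx = 0; dx < w; dx++) {
                int xx = x + dx - w / 2;
                if (xx < 0 || xx >= src[0].length) {
                    continue;
                }
                if (src[yy][xx] == wanted) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

The wide, flat sel in the closing step bridges the small horizontal gaps between letters, while the two erosions strip away anything too small to survive them.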
Here is the result:
Note that a lot of the white specks in the black background were removed.
The next step is to find the page outline in this image, starting with the spine and the edge. First we find the top and bottom of the white area in each pixel column. Then, using that data and starting from the black area, we find the "cliff" at the edge of the page and march along the flat of the page until we reach the hill, which we declare to be the spine.
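To make the first half of that concrete, here is a very rough Java sketch of the per-column measurement and the cliff search. It is a guess at the general shape, not the software's actual logic: the names are invented, the dark area is assumed to be at column 0, and minHeight is a hypothetical cutoff for how tall the white run must be before a column counts as page. The search for the spine "hill" is not modeled.

```java
public final class PageEdgeSketch {

    /**
     * For each pixel column, returns {top, bottom}: the row indices of the
     * first and last white pixel, or {-1, -1} if the column has none.
     */
    public static int[][] columnExtents(boolean[][] white) {
        int rows = white.length;
        int cols = white[0].length;
        int[][] extents = new int[cols][];
        for (int x = 0; x < cols; x++) {
            int top = -1;
            int bottom = -1;
            for (int y = 0; y < rows; y++) {
                if (white[y][x]) {
                    if (top < 0) {
                        top = y;
                    }
                    bottom = y;
                }
            }
            extents[x] = new int[] {top, bottom};
        }
        return extents;
    }

    /**
     * Marches from column 0 (assumed to be in the dark area) and returns the
     * first column whose white extent spans at least minHeight rows, or -1.
     */
    public static int findCliff(int[][] extents, int minHeight) {
        for (int x = 0; x < extents.length; x++) {
            int top = extents[x][0];
            int bottom = extents[x][1];
            if (top >= 0 && bottom - top + 1 >= minHeight) {
                return x;
            }
        }
        return -1;
    }
}
```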
To determine the top and bottom of the page, we add up the white pixels in each pixel row. Then, from the top and bottom of the image, we march down and up, respectively, until we find a row in which more than 50% of the pixels are white. Those two rows are declared the top and bottom of the page.
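The row search is simple enough to show in a few lines. This is a minimal sketch rather than the software's actual code, assuming the binarized image is a boolean array where true means white. The class and method names are made up.

```java
public final class PageRowLimits {

    /**
     * Returns {top, bottom}: the first row from the top and the first row
     * from the bottom in which more than half of the pixels are white.
     * Either value is -1 if no such row exists.
     */
    public static int[] findTopAndBottom(boolean[][] white) {
        int height = white.length;
        int width = height == 0 ? 0 : white[0].length;

        // Add up the white pixels in each row.
        int[] rowCounts = new int[height];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (white[y][x]) {
                    rowCounts[y]++;
                }
            }
        }

        // March down from the top of the image.
        int top = -1;
        for (int y = 0; y < height; y++) {
            if (rowCounts[y] * 2 > width) {
                top = y;
                break;
            }
        }

        // March up from the bottom of the image.
        int bottom = -1;
        for (int y = height - 1; y >= 0; y--) {
            if (rowCounts[y] * 2 > width) {
                bottom = y;
                break;
            }
        }
        return new int[] {top, bottom};
    }
}
```

Note that a row with exactly half of its pixels white does not count, since the test is "more than 50%".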
Knowing the edges of the page, we can apply the measured deskew and dekeystone to the original image, and clip to the page limits. Here is what a closeup view of this looks like:
It's not too bad. Notice the particularly bad vertical fracturing in the letter 'e' in 'ranked' in the first line, and the awful horizontal fracturing through the third line.
The software then binarizes this by upscaling by a factor of 4 and thresholding via Otsu again. Specks are then removed by morphological closing with a sel of 3x3. The image is kept at this upscaled size because I found it looks much better this way, both on screen and in print. Here is a part of the resulting image:
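Here is a small Java sketch of the upscale-then-threshold idea. It is illustrative only: bilinear interpolation is an assumption (any smooth interpolation would make the same point), the gray image is assumed to be an int array with values from 0 to 255, and the threshold is passed in rather than computed, so the Otsu step and the 3x3 closing are left out.

```java
public final class UpscaleSketch {

    /** Upscales a gray image by an integer factor using bilinear interpolation. */
    public static int[][] upscaleBilinear(int[][] gray, int factor) {
        int h = gray.length;
        int w = gray[0].length;
        int[][] out = new int[h * factor][w * factor];
        for (int y = 0; y < h * factor; y++) {
            // Position in source coordinates, clamped to the last row.
            double sy = Math.min((double) y / factor, h - 1);
            int y0 = (int) sy;
            int y1 = Math.min(y0 + 1, h - 1);
            double fy = sy - y0;
            for (int x = 0; x < w * factor; x++) {
                double sx = Math.min((double) x / factor, w - 1);
                int x0 = (int) sx;
                int x1 = Math.min(x0 + 1, w - 1);
                double fx = sx - x0;
                // Blend horizontally on both rows, then vertically.
                double top = gray[y0][x0] * (1 - fx) + gray[y0][x1] * fx;
                double bottom = gray[y1][x0] * (1 - fx) + gray[y1][x1] * fx;
                out[y][x] = (int) Math.round(top * (1 - fy) + bottom * fy);
            }
        }
        return out;
    }

    /** Marks as white every pixel strictly brighter than the threshold. */
    public static boolean[][] binarize(int[][] gray, int threshold) {
        boolean[][] white = new boolean[gray.length][gray[0].length];
        for (int y = 0; y < gray.length; y++) {
            for (int x = 0; x < gray[0].length; x++) {
                white[y][x] = gray[y][x] > threshold;
            }
        }
        return white;
    }
}
```

The point of upscaling first is that the interpolated gray values between a dark and a light pixel let the threshold place the edge at a finer position than the original pixel grid allows.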
While there are still fractures, they are barely noticeable due to the increased ppi. In addition, this image does the most justice to the original font used in the book. Keeping the original ppi (i.e. not upscaling) results in terrible quality.
