Feel free to comment with new ideas or better resources.

1. Look at the lines of text or borders of images on a page and extract the page curvature from them.
Apps that do this: Scan Tailor
(is it still using coupled snakes?)
I know there are other examples of this technique, does anyone remember?
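To give a feel for the idea, here's a minimal sketch of my own (not Scan Tailor's actual coupled-snakes algorithm): threshold out the ink in a single line of text, take the vertical centroid of the ink in each column, and fit a low-order polynomial to recover the warp. All names here are made up for illustration.

```python
import numpy as np

def textline_curve(gray, degree=2, ink_threshold=128):
    """Fit a polynomial to the warp of a single line of text.

    gray: 2D uint8 array, 0 = black ink, 255 = white paper.
    Returns polyfit coefficients for the ink centroid per column.
    """
    ink = gray < ink_threshold                  # boolean mask of dark pixels
    cols = np.where(ink.any(axis=0))[0]         # columns containing any ink
    rows = np.arange(gray.shape[0])
    # vertical centroid of the ink in each occupied column
    centroids = np.array([rows[ink[:, c]].mean() for c in cols])
    # a low-order polynomial is a crude stand-in for a proper snake model
    return np.polyfit(cols, centroids, degree)

# toy example: a synthetic "text line" that bows downward like a warped page
h, w = 40, 200
img = np.full((h, w), 255, dtype=np.uint8)
x = np.arange(w)
baseline = (20 + 10 * (x / w) ** 2).astype(int)  # quadratic warp
img[baseline, x] = 0
coeffs = textline_curve(img)  # coeffs[0] > 0: the line bows downward
```

A real page would need line segmentation first, and a snake handles broken or sparse text far better than a global polynomial -- this just shows the core measurement.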
For flat documents, there is a similar approach by unpaper.

2. Look at the borders of the book and extract curvature that way.
Apps that do this: Atiz Snapter
As far as I can see, Snapter is currently unavailable. My personal testing found it to be totally unreliable.

3. Project an infrared pattern on the page, photograph it in infrared, in stereo, convert the stereo IR information to 3D, and then dewarp.
Apps that do this: None publicly available. This is the Google Books scanning method. Article about their technique
Examples (all of Google Books)

4. Use two cameras to photograph both pages with overlapping information. Use this stereo pair to determine 3D structure for dewarping.
Apps that do this: Decapod (it's not clear that there is presently any implementation of this in Decapod, but it was the original idea). From their wiki
, it appears that right now the two cameras are treated independently and "calibration" consists of simply rotating each camera into position.

5. Using the Kinect for direct depth sensing of the book surface.
Apps that do this: Not exactly an app, but the libfreenect/OpenKinect
driver gives the depth image. Rob proposed the idea here
and I got the first few depth images of books here
-- there's a long way to go on this project, and we could use a little help to see if the data straight from the device are worthwhile. It may also be possible to get a close-range PrimeSensor. I will be contacting PrimeSense to feel out the possibilities.

6. Using Sharp sensors for extracting the curvature at several lines on a page.

Spamsickle proposed this here
and though at first I didn't like the idea, after discussing it more with Spam and Rob I have come to really like it: it is simple, efficient, and might work (if the Sharp sensors weren't so awfully noisy/messy). I have the Sharp sensors lying around in a box and just need to build a rig for testing. The idea right now is to have a rod extending over the book with two of these sensors. By sweeping them across the surface of the book, you'd get the distance exactly.

7. Using a laser line to get a reliable line to follow for dewarping.
A laser pointer or diode can easily be made into a laser line by using a cylinder lens to expand the beam. The laser line, when projected on the book surface, distorts according to the page curvature. Using this laser line, we should be able to make a good guess at the 3D structure of the page and do dewarping. Or perhaps we could make a modified version of Scan Tailor that searches for bright lines. In any case, it is a promising area of research suggested by many including Rob, myself, and Vitorio.
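To show how the laser-line data would be processed, here's a toy sketch of my own (not code from any app mentioned here): find the brightest row in each column to trace the line, then convert its displacement from the flat-page position into relative height by simple triangulation. The function names and the pixels-per-unit calibration factor are assumptions for illustration.

```python
import numpy as np

def laser_profile(img, min_brightness=200):
    """Trace a horizontal laser line: for each image column, take the row
    where intensity peaks.  Columns with no bright pixel become NaN.

    img: 2D array of the red channel, since a red laser dominates there.
    """
    rows = img.argmax(axis=0).astype(float)          # brightest row per column
    rows[img.max(axis=0) < min_brightness] = np.nan  # no laser in this column
    return rows

def relative_height(profile, baseline_row, px_per_unit):
    """With the laser projected from the side at a known angle, the line's
    vertical displacement from where it would fall on a flat page is
    proportional to page height (simple triangulation)."""
    return (baseline_row - profile) / px_per_unit

# toy example: a flat page would put the line at row 80; a bulge lifts it
img = np.zeros((100, 50), dtype=np.uint8)
cols = np.arange(50)
line_rows = 80 - (10 * np.sin(np.pi * cols / 49)).astype(int)  # page bulge
img[line_rows, cols] = 255
heights = relative_height(laser_profile(img), baseline_row=80, px_per_unit=10)
```

With a real photo you'd work on the difference between the laser shot and the normal shot first, so page content doesn't confuse the peak search.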
I decided to try this out this morning (got up at 1AM, couldn't sleep!) and the results looked very promising.
I didn't have any cylinder lenses lying around (aaghhh!!!), so I took a piece of "turning film" from the back of a cellphone display and put it in front of the laser pointer.
Laser pointer by itself:
Laser pointer plus turning film:
Then, I pointed the laser, from the side, toward the book. From straight down, obviously the laser beam will appear straight. However, if we project it from the side, we get something like this (actually this is two photographs of two projections superimposed on each other):
Laser image by itself (it's noisy because I used the wrong camera settings but didn't care to take the image a second time)
Image of the book:
Laser beams superimposed on book: high res images
OK, the laser beam is not perfect because of the nature of turning film. A brighter laser with a better lens would give much better results. If you had two lasers, you could take just two shots -- a laser beam shot and a normal shot. Using the info from the two, you could obviously dewarp the page. I think this method is a winner: cheap, handy, uses a single camera and a handful of solid-state parts. Books which can lay flat are easy targets -- not so sure about books in a cradle (that's up next).

8. Using depth-from-defocus.
This technique is a bit subtle. Essentially it makes the assumption that what is in focus in a picture with shallow DoF is all in one plane. By shifting the focus through a scene, the depth of each object can be recovered by watching for high-frequency information. Unfortunately this method suffers on compact cameras because they do not have shallow DoF, and it fails in general because not all book pages contain high-frequency content. An additional problem is that it requires many photographs of a page to work. EVEN SO, I was very, very excited to see Gerard try out this technique here, with the help of Spamsickle
. They did some great work, and I hope we end up trying all of these to at least that kind of level.

9. Using a coded aperture camera.
There is a new field called "computational photography" and many of the imaging schemes for CP inherently recover a depth map. Coded aperture imaging is explained here
. I am building a coded aperture camera for other reasons, but honestly expect the depth resolution to be too coarse for book scanning.

10. Using RGB lighting to get the curvature of the book.
This is an idea I had just a week or so ago. If you mix a red, green, and blue light, you get white. White light is nice for scanning books, so we're already +1. Now, if you put your lights at different points in space, when you interrupt them, you will get colored shadows. In this way, you can make colored shadows that reflect the shape of the book edge, and also identify the orientation of the lighting relative to the book. I think pictures show this idea best, so I mocked it up in Maya:

11. Difference-based lighting. Use light control to get better depth information from photographs.
Humans use the direction of light as a cue to depth. Most of our scanning rigs have two or more lights. There's no reason we can't use these lights in a smarter way to get better depth information. In particular, I'm thinking of Anonymous's page splitter idea
. The same idea has been proposed under numerous guises before, but I think it would work a lot better if we made better use of the lights.
So imagine that we have two lights.
Turn the left one on.
Then turn the right one on.
Now take the difference between the two -- the page edges are clearly highlighted:
Now, you can make a virtual third light. Add the left and right images:
Looks pretty good!
Now you can play all kinds of games. Add the difference of each back to the original image, for example -- the edges and the center become highlighted.
Screwing around with contrast and stuff can get you even better data:
etc. etc. The nice thing is that these are all easy to control (it's easy to switch lights on and off), it's only two shots per capture, and the image math is all dead simple to start with -- just addition and subtraction.
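The image math above is simple enough to sketch in a few lines of NumPy. This is a toy illustration with a 1x3 "image" and a function name of my own invention, not code from any scanning app:

```python
import numpy as np

def light_images(left, right):
    """Combine two shots of the same page, one per light.
    All arrays are float grayscale in [0, 1]."""
    diff = np.abs(left - right)            # page edges/shadows pop out
    both = np.clip(left + right, 0, 1)     # virtual third light: both on
    enhanced = np.clip(both + diff, 0, 1)  # add the difference back in
    return diff, both, enhanced

# toy example: a "page edge" shadowed under one light but not the other
left  = np.array([[0.5, 0.1, 0.5]])   # left light on: shadow in the middle
right = np.array([[0.5, 0.5, 0.5]])   # right light on: evenly lit
diff, both, enhanced = light_images(left, right)
```

In the difference image, only the shadowed pixel is nonzero -- exactly the "edges are clearly highlighted" effect described above.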
Here are the original images if you'd like to play with them.