Methods To Sense The 3D Surface/Structure Of A Book
Posted: 05 Jan 2011, 07:35
This thread is about the various ways we could "see" the 3D structure of a book, so we could potentially do perfect dewarping. I'm particularly interested in recovering the 3D shape of a book, not just flat documents, so this list will be heavily biased in that direction. For people unfamiliar with the topic, dewarping is taking an image of a curved page or of an entire book, and using some algorithm to make the image "flat" -- as though it had been scanned on a perfectly flat surface. Right now, DIY Book Scanners can be considered "dewarping in hardware", because the flat platen glass flattens the page.
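To make the idea concrete, here is a minimal sketch of dewarping a single row of pixels once the page's height profile is already known (say, from a depth sensor). It is not the method of any particular program. It assumes the page curves only along its width and that the camera looks straight down from far away, so perspective is ignored. The names `arc_lengths`, `interp`, and `dewarp_row` are made up for illustration.

```python
import bisect
import math


def arc_lengths(xs, zs):
    """Cumulative distance travelled along the paper surface at each sample.

    xs: horizontal positions as seen by the camera (strictly increasing).
    zs: page height at each of those positions (assumed known, e.g. from a depth sensor).
    """
    s = [0.0]
    for i in range(1, len(xs)):
        s.append(s[-1] + math.hypot(xs[i] - xs[i - 1], zs[i] - zs[i - 1]))
    return s


def interp(x, xp, fp):
    """Piecewise-linear interpolation of the samples (xp, fp) at x; xp must be increasing."""
    if x <= xp[0]:
        return fp[0]
    if x >= xp[-1]:
        return fp[-1]
    j = bisect.bisect_right(xp, x) - 1
    t = (x - xp[j]) / (xp[j + 1] - xp[j])
    return fp[j] + t * (fp[j + 1] - fp[j])


def dewarp_row(row, xs, zs, out_width):
    """Resample one row of pixel values so equal steps along the paper
    become equal steps in the output (out_width must be at least 2)."""
    s = arc_lengths(xs, zs)
    total = s[-1]
    out = []
    for k in range(out_width):
        target = total * k / (out_width - 1)  # distance along the paper
        x = interp(target, s, xs)             # where the camera saw that point
        out.append(interp(x, xs, row))        # pixel value observed there
    return out
```

A flat page (all heights equal) comes back unchanged, while a page that rises toward the gutter gets its compressed region stretched back out. A real book would also need perspective correction and a full 2D surface rather than a single profile, which is exactly why the sensing methods in this thread matter.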
I've long maintained that image-based dewarping is a flawed solution, because books are all so different. However, with the advent of improved algorithms in Scan Tailor and direct 3D sensing like the Microsoft Kinect/PrimeSensor, I've changed my mind and think it's time to take a fresh look at dewarping technologies. Eventually, I would like this thread to be a canonical resource on the topic, so I will keep updating this post as I learn new things. This is also a hardcore topic involving math, cameras, computer science, etc., so it has significant academic interest. As a result, some of the best information is locked up in academic journals, and some of the reading will be hard. For an overview of the state of the art of dewarping (as of a few years ago) see this document. To see where things have gone since then, see this Google Search.
Because this post is to be an information resource for the community, I'd prefer that comments in this particular thread be informational. That means that if you know of another technique or program, please post it and I will probably add it to the list. If you are wondering if dewarping is a good idea, please post that somewhere else. To be clear, comments on the specifics of an algorithm, your ideas and so on, are absolutely requested and desired, and if you want to start working on one of these here, go right ahead. BUT if you want to talk about these things in some general sense, let's do that somewhere else.
Edit: Updated Google Scholar link, thanks for the help, Mark Main.