Hello everyone,
I'm new to book scanners, and have built a simplified version of the standard book scanner. No drawers, no platen (I need to build one). I shot a book and tried to get something usable from my "raw" pictures.
I was disappointed by Scan Tailor (it's promising, but there is no handling of keystoning, which I need badly) and bsw did not work for me (I wanted to crop + de-keystone every picture separately). Both projects seem to be dead.
As I wanted to play around with Qt, I started my own software, which I called "Yet Another Scan Wizard" (you have to give the project a name in Qt Creator...). This piece of code is not releasable and I don't know if it ever will be, as I can only work max. 1 hour a day on it.
Now I have read this thread and taken a look at Scan Tailor's code. Scan Tailor is a very good basis and is written in Qt, but the code is not easy to read: there are very few comments, and tons of classes whose purpose I don't even know, like SystemLoadWidget. My idea would have been to replace/complete the "rotation" step with a de-keystoning+rotation (as in bsw), but I did not even find the class corresponding to it... I ought to find it by reading the code precisely, but this is *not* an easy task.
So I think I will continue working on "Yet Another Scan Wizard", and if I get something valuable, I'll share it with the community.
And I've got to shoot better pictures (and build a platen) to ease the post-processing.
Postprocessing: The hardware/software divide
Moderator: peterZ
- daniel_reetz
- Posts: 2812
- Joined: 03 Jun 2009, 13:56
- E-book readers owned: Used to have a PRS-500
- Number of books owned: 600
- Country: United States
- Contact:
Re: Postprocessing: The hardware/software divide
I moved this topic into general discussion because, well, it's no longer ST specific.
tibob, I'm interested in your new package... but just out of curiosity... would you be interested in implementing keystone correction into ST instead? As I mentioned, I'll trade you a scanner frame for your efforts. In any case I'm interested in what you are producing.
BTW, be careful calling things dead around here. We have a real way of bringing up old threads and digging up old software.
Re: Postprocessing: The hardware/software divide
Mmm, actually dekeystoning, being a special case of dewarping, is there. It's a bit discouraging when something you worked on for a whole year goes unnoticed. It's not on the Deskew stage, where it logically should be, as putting it there requires more work both on the architectural and on the image processing sides.
Scan Tailor is not completely dead either. I just spend significantly less time on it than I used to. I am also not accepting any feature requests, but that's hardly news.
Scan Tailor experimental doesn't output 96 DPI images. It's just what your software shows when DPI information is missing. Usually what you get is input DPI times the resolution enhancement factor.
Re: Postprocessing: The hardware/software divide
Sorry about calling Scan Tailor "dead": I misunderstood Tulon, and the devel mailing list has not been active for months.
@Tulon: I know the dewarping functionality of Scan Tailor, and it is great (and I really mean it). The problem is, as you said, that it comes too late in the processing to be useful to me: I need dewarping/dekeystoning before cropping.
@Daniel: for now, I won't code dekeystoning in Scan Tailor because I don't want to change its architecture (too much work). But I'm still playing with "yasw", and will share it as soon as I have something working.
- daniel_reetz
Re: Postprocessing: The hardware/software divide
Well, I for one am very interested in what you come up with!
Re: Postprocessing: The hardware/software divide
Good discussion!
I'm not worried about using several programs. I think it might be better to concentrate on making the transition between them as quick and easy as possible. For that it would be useful to collect and share scripts for renaming, sorting, rotating and moving the images and preprocessing (cropping) for ScanTailor. And also collecting and sharing workflows, like JonEP does in the OP.
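As a possible starting point for such a script collection, here is a minimal Python sketch of one renaming step: merging the shots from two cameras into a single sequence in reading order. The folder layout (one directory per camera), the `page_NNNN` naming scheme, and the function name `interleave_pages` are assumptions made for illustration, not an established convention.

```python
import shutil
from pathlib import Path


def interleave_pages(left_dir, right_dir, out_dir, prefix="page"):
    """Copy left/right camera shots into out_dir in reading order.

    Assumes (hypothetically) that sorting each camera's files by name
    gives the shooting order, and that both cameras shot the same
    number of pages.
    """
    left = sorted(p for p in Path(left_dir).iterdir() if p.is_file())
    right = sorted(p for p in Path(right_dir).iterdir() if p.is_file())
    if len(left) != len(right):
        raise ValueError(
            f"page count mismatch: {len(left)} left, {len(right)} right"
        )

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    created = []
    for i, pair in enumerate(zip(left, right)):
        for offset, src in enumerate(pair):
            # Left page gets the odd number, right page the following even one.
            number = 2 * i + offset + 1
            dst = out / f"{prefix}_{number:04d}{src.suffix.lower()}"
            shutil.copy2(src, dst)  # copy, so the originals stay untouched
            created.append(dst)
    return created
```

Copying instead of moving keeps the camera originals intact, so the step can be rerun if the page order turns out to be wrong.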
JonEP wrote: I have finally figured out that it is very important not to allow the entire 1/2 platen to be included in the image taken by the camera, as this poses problems for Scan Tailor (it can't accurately find the book edges) [note: I noticed Daniel's new machine is taking the entire 1/2 platen image].
I agree that ST not autodetecting the page/content correctly is an issue that needs to be worked around. But more careful monitoring of exact camera position/zoom can also be time consuming, at least when using simple cardboard type cradles that tend to move somewhat. I think this would be a more general solution: put a special color and/or pattern on the cradle that some software can detect and use for autocropping, as a ST preprocessing step.
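The colored-cradle idea can be sketched in a few lines. What follows is only an illustration in plain Python, working on an image that has already been loaded as rows of (r, g, b) tuples. The marker color, the tolerance value, and the function names `page_bounding_box` and `crop` are assumptions; a real tool would probably use an image library and need to be more robust against shadows and noise.

```python
def page_bounding_box(pixels, marker_rgb, tolerance=40):
    """Return (left, top, right, bottom) of everything that is NOT marker color.

    pixels: list of rows, each row a list of (r, g, b) tuples.
    marker_rgb: the (hypothetical) color painted on the cradle.
    Returns None if the whole image looks like cradle.
    """
    def is_marker(px):
        # A pixel counts as cradle if every channel is within the tolerance.
        return all(abs(c - m) <= tolerance for c, m in zip(px, marker_rgb))

    xs, ys = [], []
    for y, row in enumerate(pixels):
        for x, px in enumerate(row):
            if not is_marker(px):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    # right/bottom are exclusive, so the box can be used directly for slicing.
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1


def crop(pixels, box):
    """Cut the box returned by page_bounding_box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]
```

A single stray non-marker pixel (dust, a finger) would enlarge the box, so a practical version would want some minimum count per row or column before accepting an edge.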
Re: Postprocessing: The hardware/software divide
Tangenting back to the discussion of developing individual or platform tools, has anyone looked at the tools used by Project Gutenberg's Distributed Proofreaders?
As I see it, their tools are less image, and more text related (I think they are assuming that the initial OCR is as good as it will get, so they start from the position of working from that text, rather than doing image correction), but many of their processes I think overlap with what we do here, and their tools might as well.
Things like:
* Sorting Images
* OCR
* Swapping out common OCR errors
They also seem to have had a number of discussions over the same topics we are covering (cli vs. gui particularly). I am not sure they have solved all those themselves, but the reading was educational for me, and I thought I would pass the links along.
http://www.pgdp.net/phpBB2/viewforum.php?f=13
(you might have to sign up).
-
- Posts: 56
- Joined: 17 Apr 2011, 21:20
- Number of books owned: 0
- Location: Charlottesville, Virginia
Re: Postprocessing: The hardware/software divide
An interesting, thoughtful, thread. I suspect there will be more in this section of the forum when more of us get here. I've just finished a scanner and have started reading with the idea of beginning scans.
The state of book scanner software reminds me of some adventures I had cnc-ing a small milling machine nearly 20 years ago now. It was quite frustrating, but also rewarding when one finally understood a small part of what was going on.
My sense is that it would be a mistake to try and come up with one large software package. It simply asks too much of too few people.
So, we're going between pieces of software. Consequently, I'm finding the posts describing current workflows to be the most helpful, and I read them as much as I can.
I think it might be most helpful if we had a small "diy book scanner fair" on either coast of the US sometime this year, or maybe piggyback on a Maker Faire? We could bring a couple of scanners together and show each other exactly how we do things and why. We could Skype with folks overseas at such events and start to connect some names with faces.
Actually, get this, I think the biggest impediment to the diy book scanner movement currently is that so many of us do not list our locations. If I knew you lived next door, or even in the next state, it would help. I would know, for example, whether it would even make sense to get together in one area or whether we should all get better at YouTube videos...
Finally, I'd also like to suggest that if any of us find something like ST helpful, ship the guy some cash to say thanks.
Charles
Re: Postprocessing: The hardware/software divide
Okay, you can have a first look at my work at https://github.com/tibob/yasw
You need git and Qt. Here are the (short) instructions for Linux:
git clone https://github.com/tibob/yasw.git
cd yasw/src/
qmake
make
(copy a few scanned pages here in yasw/src)
./yasw
Select an image,
- the first tab (Base Filter) does nothing; it's the base class all filters inherit from.
- the second tab (Rotation) rotates the image.
- the third tab (Dekeystoning) transforms the polygon (drag and drop the edges) into a rectangle.
Use the "preview" check box to see the result
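For readers wondering what mapping a dragged polygon onto a rectangle involves: one common approach is a projective transform (homography) computed from the four corner correspondences. The sketch below is plain Python and is not taken from the yasw source; the function names, the corner ordering, and the sample coordinates are assumptions for illustration.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src, dst):
    """Return the 8 coefficients mapping the 4 src corners onto the 4 dst corners.

    Model: u = (a*x + b*y + c) / (g*x + h*y + 1)
           v = (d*x + e*y + f) / (g*x + h*y + 1)
    """
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        rhs.append(v)
    return solve_linear(rows, rhs)


def apply_homography(coeffs, point):
    """Map one (x, y) point through the transform."""
    a, b, c, d, e, f, g, h = coeffs
    x, y = point
    w = g * x + h * y + 1.0
    return ((a * x + b * y + c) / w, (d * x + e * y + f) / w)


# Hypothetical corners of a keystoned page (clockwise from top-left)
# and the rectangle they should become.
quad = [(120, 80), (900, 40), (960, 1300), (60, 1250)]
rect = [(0, 0), (800, 0), (800, 1200), (0, 1200)]
coeffs = homography(quad, rect)
```

To actually warp an image, one would usually compute the transform in the opposite direction (rectangle to polygon) and sample the source image once per output pixel. Qt also provides `QTransform::quadToQuad` for this kind of mapping; whether yasw uses it is not stated here.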
A new filter can be created by subclassing BaseFilter. See the (incomplete) documentation in documentation/doxygen (run "make" in yasw/documentation; you will need doxygen) or read yasw/src/filter/rotation/* for a simple class.
This is very early work, my next steps are:
- implement cropping
- handling of a project (choose and sort source images; load and save parameters from filters; load and save projects)
- handling of output
- develop/port more filters, like autocalibration (see the checkerboard thread) and color adjustment.
Re: Postprocessing: The hardware/software divide
Any news on any of this?