I've seen lines like that show up in scanner images before, especially from old scanners and modded scanners - see the artifacts in Dario Morelli's DIY scanner camera on Flickr. (And for good measure, a photo of the scanner in the field - looks awesome, doesn't it?) It could be an artifact caused by wear, or a result of the array getting overloaded on that particular line.
It looks quite nice, even if it's a bit noisy and dark with some black crushing. I haven't seen V30 scanned images before, so I don't know whether those are normal qualities out of this scanner or if it has to do with the lighting environment or something.
It looks like it displays great on your Kindle! Do you binarize the pages or process them with Scan Tailor, or is the Kindle using images straight out of the scanner?
The images are processed with Scan Tailor. If a page contains only text or formulas, I just binarize it and use cjb2 to convert it into djvu. If there are also photos on the page, I use mixed mode and check whether Scan Tailor found the photo region correctly; otherwise I mark it myself. Since most of a technical book is text or formulas, cropping out the photos is not too much work. For mixed pages, I use didjvu (http://code.google.com/p/didjvu/downloads/list) to convert them into djvu, because I found that DjVuLibre doesn't really separate text and photos into foreground and background, which is the key feature of djvu. Only some commercial software does it; that is how they make a living. Honestly, I don't think Document Express Desktop by Caminova does a good job: it is slow, and sometimes pages get lost, which drives me crazy. So, how do you know which page is binarized and which is mixed? Very easy! The file size is different.
So, if the file size is smaller than a certain value, the shell script runs cjb2, otherwise didjvu. After all the pages are combined into a djvu book, I have to export it as pdf for the Kindle. Usually the pdf file is much bigger, so I use Adobe Acrobat to run OCR (which includes downsampling) and save again, and the pdf shrinks a lot. Actually I do not really need the OCR; it just forces the pdf to the higher version 1.6 and does the downsampling, which reduces the size. The procedure is a little complicated, but in the end I get a relatively small pdf book (~10 MB for a binarized-only 500-page book, ~30 MB for books with some photos). Before I figured this procedure out, some pdf files converted from djvu were ridiculously big: the djvu file was just 10 MB and the pdf could be 200 MB, which was a disaster.
Alternatively, you can use WinDjView and the Acrobat pdf printer to convert djvu into pdf. The file size is also small, but I don't like the margins it adds to the pages; I cut off all the margins in Scan Tailor to give the maximum reading area on the Kindle. So WinDjView is my last resort. I hope one day the Kindle can read djvu directly, so I don't have to do this any more; it is really painful. I hope this information is useful for someone who wants to make ebooks for the Kindle.
I have to say that building didjvu could be very painful too, but I finally made it. Now I just scan the book, run the shell script, and read the book on the Kindle. Please see the attached shell script file; you might get some ideas about how it works. I know nothing about shell scripting, I just copied and pasted from everywhere to make it work, so maybe it doesn't look very professional. For this script, I put even pages and odd pages into different folders. I only scan one side at a time, so I don't have to flip the book back and forth, which saves time. Does that make sense? I think so.
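For anyone who just wants the idea of the size-based dispatch, here is a minimal sketch. It is not the attached script. The 300000-byte threshold, the `.tif` extension, the output name book.djvu, and the function names pick_encoder and encode_pages are all assumptions made up for illustration; pick the threshold by looking at the sizes of your own binarized and mixed pages.

```shell
#!/bin/sh
# Sketch only: the threshold, file extension and names are assumptions,
# not taken from the attached script.

THRESHOLD_BYTES=${THRESHOLD_BYTES:-300000}   # hypothetical cutoff, tune for your scans

# Print the encoder to use for a page of size $1 bytes given threshold $2.
pick_encoder() {
    if [ "$1" -lt "$2" ]; then
        echo cjb2      # small file: binarized text/formula page
    else
        echo didjvu    # large file: mixed page with photos
    fi
}

# Convert every TIFF in directory $1 into a one-page DjVu file in directory $2.
encode_pages() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for page in "$in_dir"/*.tif; do
        [ -e "$page" ] || continue               # no matching files
        size=$(($(wc -c < "$page")))             # file size in bytes
        out="$out_dir/$(basename "${page%.tif}").djvu"
        case $(pick_encoder "$size" "$THRESHOLD_BYTES") in
            cjb2)   cjb2 "$page" "$out" ;;
            didjvu) didjvu encode -o "$out" "$page" ;;
        esac
    done
}

# Run only when an input and an output directory are given.
if [ "$#" -eq 2 ]; then
    encode_pages "$1" "$2"
    djvm -c book.djvu "$2"/*.djvu                # bundle the pages into one book
fi
```

Saved as, say, pages2djvu.sh (a made-up name), it would be run as `sh pages2djvu.sh out_from_scantailor djvu_pages`. The shell glob sorts pages by file name, so the page files need names that sort in reading order.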