Note: This might not be posted in the right forum, so please move it if that's the case. I had a bit of trouble finding the right place for it.
Reading about efforts to improve bitonal output from scanning in http://www.diybookscanner.org/forum/viewtopic.php?f=19&t=2554 brought me to some interesting info from Cornell University Library: http://www.library.cornell.edu/preservation/tutorial/contents.html
The tutorial is rather old in digital terms, since the original is from 2000 and it hasn't been updated since 2003 (and some external links are dead), but in part 3 under "benchmarking" I found something useful that still seems to be true.
In particular it's the use of the Quality Index (QI) I find useful. For bitonal scanned printed text it's defined as:
QI = (dpi x .039h)/3
Here h = height of the characters in millimeters. If they are measured in inches, the .039 part is omitted. The formula can of course be rearranged to calculate h from a desired QI and a known DPI, or to calculate the DPI needed to get a certain QI with a particular h:
h = 3QI/(.039 x dpi)
dpi = 3QI/(.039 x h)
The scale is such that 3.0 is barely legible quality, 3.6 is marginal, 5.0 is good and 8.0 is excellent. A quick test suggests that this is reasonable. If letters are 2 mm high and we scan at 300 dpi we get:
QI = (300 x .039 x 2)/3 = 7.8 (or between good and excellent, closest to excellent)
In the same way one can calculate what dpi is needed to get "good quality" with characters of the same size. That would be (3 x 5)/(.039 x 2) = 192.3 dpi. This all seems very close to the general recommendations: at least 200 dpi to be able to do OCR, 300 dpi as the desired resolution, and more than 300 dpi only really needed for text with small fonts etc.
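For anyone who wants to play with the numbers, the three formulas above can be sketched in Python. This is only a minimal sketch: the function and constant names are my own and not from the Cornell tutorial, and it uses the same rounded .039 factor as the formula.

```python
# Rounded mm-to-inch factor and divisor, as used in the bitonal QI formula.
MM_TO_INCH = 0.039
BITONAL_DIVISOR = 3


def quality_index(dpi, h_mm):
    """QI for bitonal text with characters h_mm millimeters high."""
    return dpi * MM_TO_INCH * h_mm / BITONAL_DIVISOR


def char_height_mm(qi, dpi):
    """Smallest character height (mm) that reaches qi at the given dpi."""
    return BITONAL_DIVISOR * qi / (MM_TO_INCH * dpi)


def required_dpi(qi, h_mm):
    """DPI needed to reach qi with characters h_mm millimeters high."""
    return BITONAL_DIVISOR * qi / (MM_TO_INCH * h_mm)


# The two examples from the text above.
print(round(quality_index(300, 2), 2))  # 7.8
print(round(required_dpi(5, 2), 1))     # 192.3
```

The three functions are the same formula solved for a different variable, so feeding the output of one into another should give back the starting value.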
A real-world example might be in order. I recently scanned a book with a resulting dpi of 370. The letters in general are 2 mm high, the numbers used in the notes only 1 mm (round figures). Here's a part of the notes on a page (not downsized):
QI on the 2 mm sized letters would be 9.62 (above excellent) and on the 1 mm sized note numbers 4.81 (just under good). That the QI is a bit low on the small note numbers is clear when the page has been through ScanTailor with standard settings (not downsized):
Yes, it's pretty clear that it's a "2" and a "3" but every pixel was needed and a few more wouldn't have hurt.
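The numbers for this example can be checked with a small Python sketch. The names are hypothetical, and turning the four points on the scale into bands (a QI counts as "good" once it reaches 5.0, and so on) is my own assumption; the tutorial only gives the four reference values.

```python
def quality_index(dpi, h_mm):
    """Bitonal QI for characters h_mm millimeters high."""
    return dpi * 0.039 * h_mm / 3


# Reference points from the scale, highest first.
SCALE = [
    (8.0, "excellent"),
    (5.0, "good"),
    (3.6, "marginal"),
    (3.0, "barely legible"),
]


def describe(qi):
    """Return the highest reference label that qi reaches (assumed banding)."""
    for threshold, label in SCALE:
        if qi >= threshold:
            return label
    return "below barely legible"


# My book: 370 dpi, 2 mm body text and 1 mm note numbers.
for h_mm in (2, 1):
    qi = quality_index(370, h_mm)
    print(h_mm, round(qi, 2), describe(qi))
# 2 9.62 excellent
# 1 4.81 marginal

# DPI that would lift the 1 mm note numbers to "good" (QI 5.0).
print(round(3 * 5.0 / (0.039 * 1), 1))  # 384.6
```

So by the formula, a little under 385 dpi would have been needed to get the note numbers up to "good".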
So the conclusion must be that the QI is a useful tool when it comes to figuring out what kind of DPI you need for a particular text. For special texts (fraktur typefaces etc.) you might need a bit more than the QI would suggest, but I haven't tested this.
There's also a QI for greyscale scans (replace 3 with 2) and, a bit confusingly, a QI based on stroke width where the QI scale is different. Stroke width is clearly a more relevant variable when it comes to line art etc. but is also much harder to measure. It's pretty hard to tell whether the finest line is .08 mm, .10 mm or .12 mm, even though the latter is 50% wider than the first.
Inspired by the text from Cornell I would suggest a more hands-on approach to figuring out if a setup is good enough for line art in a particular book:
1. Find the line art with the finest lines in the book
2. Take a test shot of the page with the line art. It doesn't need to be in the actual scanning setup just as long as it's a photo of a whole page just as it would be in the scanning situation so that the dpi is about the same
3. Load the photo in Photoshop, GIMP or whatever you like and measure how many pixels wide the finest line is.
I haven't tested it, but my guess is that if you find that the line is at least 3 pixels wide, the setup is good enough for bitonal output. If it's 2 pixels you might get something useful in grayscale/color, but bitonal is likely to mess it up. At least this lines up nicely with the formulas from Cornell based on stroke width.
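If you do happen to know the stroke width, the expected pixel count can also be estimated before taking the test shot. This is a rough Python sketch with made-up function names; it is a plain unit conversion (25.4 mm per inch, which is where the rounded .039 comes from), not the Cornell stroke formula, and the 3-pixel target is just my untested guess from above.

```python
MM_PER_INCH = 25.4


def stroke_width_px(dpi, stroke_mm):
    """Approximate width in pixels of a line stroke_mm wide at the given dpi."""
    return dpi * stroke_mm / MM_PER_INCH


def dpi_for_stroke(pixels, stroke_mm):
    """DPI at which a line stroke_mm wide covers the given number of pixels."""
    return pixels * MM_PER_INCH / stroke_mm


# The three stroke widths mentioned earlier, at the 370 dpi of my scan.
for stroke_mm in (0.08, 0.10, 0.12):
    print(stroke_mm, round(stroke_width_px(370, stroke_mm), 2))
# 0.08 1.17
# 0.1 1.46
# 0.12 1.75

# DPI at which a .10 mm line would be 3 pixels wide.
print(round(dpi_for_stroke(3, 0.10)))  # 762
```

A real photo will blur the line over more pixels than this ideal figure, so measuring the test shot is still the more reliable check.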