Scan Tailor

Scan Tailor specific announcements, releases, workflows, tips, etc. NO FEATURE REQUESTS IN THIS FORUM, please.

Re: Scan Tailor

Postby postscangrinder » 09 Jan 2010, 20:04

spamsickle wrote:You can effectively get that by choosing color output and just going through the whole Scan Tailor pipeline, though you'll still end up with TIFFs instead of your original JPEGs. You can either choose to have a bit of margin added in the "page formatting" step, or not.


Cheers, Spamsickle

I've been playing around with the app since reading your reply and can happily say certain aspects of this program are awesome. The majority of my scans straighten amazingly well, and the ones that don't quite conform are close enough that I just revert to my keyboard shortcuts in Paintshop to finish the scan off. By necessity I bound scripts to certain keys to speed up the rotation of an image, i.e. Shift-1 rotates an image left 15 degrees, incrementally up to Shift-9 (95 degrees) and Ctrl-1 to the right in similar fashion. I run a batch script to rotate any scans needing 90 degree rotation, or another to flip/mirror the half that usually need it. Any contrast/histo/usm work is also scripted to certain keys to speed things up.

I scan as tiffs, btw, bulk processing them into resized jpg copies when I'm ready to add the IPTC data (with Exifer & Xnview) that I use to add source details before uploading the jpgs to the website. I always give a school the tiff copies as well as the finished pdfs so they can improve on them if they feel the need. I'm under no illusion my work is top quality, my main concerns are the sharpness of the text and trying to get the contrast right on the photographs so the subjects at least have a jawline.

Discovering this app means I can also continue working on the last of a large batch (circa 20,000) of negatives, which were languishing while I scrambled to get a large mag job finished for the end of January.

Kudos to the chaps working on Scan Tailor :-)

Later
Mark

p.s with Exifer (and Xnview), and involving several other steps, I can add the IPTC data to hundreds/thousands of jpgs in a matter of minutes. If anyone is looking to save time adding IPTC data I can write the process down if requested.
postscangrinder
 
Posts: 4
Joined: 07 Jan 2010, 19:36

Re: Scan Tailor

Postby daniel_reetz » 09 Jan 2010, 20:09

postscangrinder wrote:p.s with Exifer (and Xnview), and involving several other steps, I can add the IPTC data to hundreds/thousands of jpgs in a matter of minutes. If anyone is looking to save time adding IPTC data I can write the process down if requested.


I would appreciate such a tutorial, especially if you included a little motivation on why one might want to do that (I know some reasons, but they're probably different than yours). If you decide to go through with it, feel free to make a new thread for your tutorial here in Software.
daniel_reetz
 
Posts: 2485
Joined: 03 Jun 2009, 13:56

Re: Scan Tailor

Postby postscangrinder » 09 Jan 2010, 21:52

daniel_reetz wrote:
postscangrinder wrote:p.s with Exifer (and Xnview), and involving several other steps, I can add the IPTC data to hundreds/thousands of jpgs in a matter of minutes. If anyone is looking to save time adding IPTC data I can write the process down if requested.


I would appreciate such a tutorial, especially if you included a little motivation on why one might want to do that (I know some reasons, but they're probably different than yours). If you decide to go through with it, feel free to make a new thread for your tutorial here in Software.


Hi Daniel

I'll write up the whole process I go through from the time I start scanning until the scans are on the website. I'll be interested to see if anyone thinks the rigmarole I go through to get my scans online is as long and windy as my write-up will be ;-)

There may be nothing new for anyone already registered to this forum but it may save the odd newcomer hours of angst.

In a couple of days
Mark

Edit: ...2 days?!...I've started working on a run-through but it will take a tad longer than I expected, I'm planning on using screen captures, skywriting and zeppelins. A few more days then.
Last edited by postscangrinder on 12 Jan 2010, 00:51, edited 1 time in total.
postscangrinder
 
Posts: 4
Joined: 07 Jan 2010, 19:36

Re: Scan Tailor

Postby Tulon » 10 Jan 2010, 08:48

daniel_reetz wrote:Greets, all... Feature request:

Can we please have an option to propagate the "split pages" line from one page to all other pages?

As long as I'm asking for things, can we get the same feature with the content selection box?

I don't really like features like these, for the following reasons:
1. They can only be useful if you take your scans on specialized hardware. Otherwise, your pages will have slightly varying positions and skew angles.
2. In all other cases, such a feature would actually be harmful. You'll basically have a function that is guaranteed to make things worse than before. A person would activate it out of interest, and then will have to figure out how to revert it.
3. Such a feature would only be justified if the corresponding automated algorithm fails too often and there is no hope to improve it.
The spine detection algorithm actually works fine in most cases. If it fails too often for you, please provide examples.
The content box selection algorithm often cuts off page numbers and sometimes other kinds of headers / footers. These cases are actually quite complex. If a page number is far away from the rest of the page content, how would you tell it apart from a book's edge? Still, I haven't given up on this one. Recently I came across a couple of papers (one, two) on this subject. I haven't had time to look into them yet, but they give me hope.

PS:
Today I've finished porting Rob's new dewarping algorithm to C++. It works much better than the old one, but it's not exactly fast.
In parallel, I kept coming back to my own text line tracing algorithm, and I made some progress with it. If I manage to pull it off, it will allow me to get rid of those 3 switches that control Rob's algorithm.
When Scan Tailor asks you to enter DPIs manually, never enter arbitrary values. The video tutorial shows how to estimate the real DPI.
Tulon
 
Posts: 536
Joined: 03 Oct 2009, 06:13
Location: London, UK

Re: Scan Tailor

Postby StevePoling » 10 Jan 2010, 18:53

Tulon, have you been following my discussion with Dan about adding a green pinstripe to the platen? My thought is that with a green pinstripe going along the book's gutter, software could detect it, and automatically orient the page (left/right) and rotate the green line to vertical.

Moreover, I believe that if the code knows where the gutter is, it can detect the page's margins and then crop the image.

I have only just recently started looking at Scan Tailor, so I apologize if I'm talking about already-solved problems.
StevePoling
 
Posts: 290
Joined: 20 Jun 2009, 12:19
Location: Grand Rapids, MI

Re: Scan Tailor

Postby Tulon » 10 Jan 2010, 19:28

StevePoling wrote:Tulon, have you been following my discussion with Dan about adding a green pinstripe to the platen?

Sorry, I didn't. Was it this topic or another one?

StevePoling wrote:My thought is that with a green pinstripe going along the book's gutter, software could detect it, and automatically orient the page (left/right) and rotate the green line to vertical.

If you are able to insert a pinstripe when scanning, that implies you control the scanning process. If so, why not just scan every page in the same orientation? You know, you can apply a particular orientation to all pages at once, or to a group of pages.
As for deskewing, the current algorithm works really well. If there are at least a few lines of text on a page, it will detect the correct skew angle in 99.9% of cases.
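
For readers wondering how a skew angle can be recovered from a few lines of text, here is a minimal sketch of the classic projection-profile approach. This is an assumption made for illustration, not a description of Scan Tailor's actual deskew code. The names `estimate_skew` and `synthetic_page` and the default angle range and step are all hypothetical. The idea: for each candidate angle, shear the black pixels back by that angle and count how many land in each row; when the candidate matches the real skew, the text lines collapse into a few very full rows and the sum of squared row counts peaks.

```python
import math


def estimate_skew(black_pixels, max_angle=5.0, step=0.25):
    """Return the candidate angle (degrees) whose row profile is the sharpest.

    black_pixels is a list of (x, y) coordinates of dark pixels.
    """
    best_angle, best_score = 0.0, -1.0
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        t = math.tan(math.radians(angle))
        rows = {}
        for x, y in black_pixels:
            # Shear the pixel back by the candidate angle and bin it by row.
            r = int(round(y - x * t))
            rows[r] = rows.get(r, 0) + 1
        # Few, very full rows give a high score; smeared rows give a low one.
        score = sum(c * c for c in rows.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def synthetic_page(skew_deg, lines=8, width=400, spacing=30):
    """Build fake 'text lines': straight rows of pixels sloped by skew_deg."""
    t = math.tan(math.radians(skew_deg))
    return [(x, int(round(20 + k * spacing + x * t)))
            for k in range(lines) for x in range(width)]
```

Calling `estimate_skew(synthetic_page(2.0))` on the synthetic page should land on or next to the 2 degree candidate. A real implementation would work on a binarized scan and use a finer angle step.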

StevePoling wrote:Moreover, I believe that if the code knows where the gutter is, it can detect the page's margins and then crop the image.

Well, the spine is detected anyway, as long as it's visible. It helps of course with content box detection, but it's not a silver bullet.
When Scan Tailor asks you to enter DPIs manually, never enter arbitrary values. The video tutorial shows how to estimate the real DPI.
Tulon
 
Posts: 536
Joined: 03 Oct 2009, 06:13
Location: London, UK

Re: Scan Tailor

Postby daniel_reetz » 10 Jan 2010, 20:05

Tulon wrote:Sorry, I didn't. Was it this topic or another one?


It was elsewhere. It's an idea we've thrown around the site a few times -- inserting some kind of marker into the scan process to automate post-processing. One possibility that we've discussed is to have some kind of bright green indicator at the apex of the platen. It would implicitly give you page orientation and spine location.

Maybe a better/more general way to broach this question might be: given the spine-detection algorithm you are currently using, is there a way to modify our V-platens to increase the likelihood of correct spine detection? I agree with you, right now things work pretty well, but if we could, for example, have a small strip of black tape along the apex of the platen which would make things even more reliable, that would be a smart modification to make to our scanners. I clearly see the value in keeping Scan Tailor very general (after all, we're all using it!), and very simple, but I also see a potential argument for co-developing some aspects of it with our V-platen scanners for maximal awesomeness.

And with respect to your previous answer, I am committed to contributing back images that don't work, to help improve Scan Tailor. I *really* appreciate your continued monitoring of this thread and all the work you've done, thanks.

Even so, I'd still like a super-secret option to turn on "split-pages" propagation for advanced users. :)
daniel_reetz
 
Posts: 2485
Joined: 03 Jun 2009, 13:56

Re: Scan Tailor

Postby StevePoling » 10 Jan 2010, 23:19

Model car makers use 1/64" pinstripe tape to decorate their projects. I've bought it in blue from a local hobby shop. So, I figure with a little looking I could get some in green.

I propose putting a green pinstripe near the edge of the glass where the camera will see it close to the spine of the book. I believe it should provide a sharp, thin line that software can easily detect, then use as I've suggested.
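
To make the proposal concrete, here is a rough sketch of how software might find such a stripe. Everything in it is an assumption for illustration: the function name `find_green_stripe`, the color thresholds, and the representation of the image as rows of `(r, g, b)` tuples are made up, and none of it comes from Scan Tailor. It collects pixels that are clearly greener than they are red or blue, fits a straight line x = a*y + b through them by least squares, and reports the stripe's tilt from vertical plus which half of the frame it sits in.

```python
import math


def find_green_stripe(image):
    """Locate a thin green stripe in an image given as rows of (r, g, b) tuples.

    Returns (tilt_degrees_from_vertical, side), where side is "left" or
    "right", or None if too few green pixels were found.
    """
    # Hypothetical thresholds: bright green that dominates red and blue.
    pts = [(x, y)
           for y, row in enumerate(image)
           for x, (r, g, b) in enumerate(row)
           if g > 150 and g > 1.5 * r and g > 1.5 * b]
    if len(pts) < 2:
        return None
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pts)
    if var_y == 0:
        return None  # all green pixels on one row, no direction to fit
    # Fit x as a function of y because the stripe is close to vertical.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = cov / var_y
    tilt = math.degrees(math.atan(slope))
    side = "left" if mean_x < len(image[0]) / 2 else "right"
    return tilt, side


def synthetic_scan(width=60, height=100, slope=0.05):
    """Gray frame with a one-pixel green stripe near the left edge."""
    image = [[(200, 200, 200)] * width for _ in range(height)]
    for y in range(height):
        image[y][int(round(10 + slope * y))] = (30, 200, 40)
    return image
```

Rotating the scan by the negative of the reported tilt would bring the stripe to vertical, and the side tells you whether the gutter is on the left or right of the frame. Real photos would need more forgiving color tests than these fixed thresholds.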
StevePoling
 
Posts: 290
Joined: 20 Jun 2009, 12:19
Location: Grand Rapids, MI

Re: Scan Tailor

Postby Antoha-spb » 12 Jan 2010, 06:43

Hi Tulon!

Due to insufficiently uniform lighting on my scanner, I have a problem with post-processing of the scans in ST. While I'm working on improving the lights, I am also looking for a software solution to the issue.

The pix that my scanner (based on Canon EOS) produces under both ambient light and scanner bulbs are good for supplying them into ScanTailor, where they're converted into B&W images that I bind into PDF.

Where ambient light does not contribute enough to the uniformity of the page lightness, the average pixel brightness varies a lot across the scanned page.

Image

Being processed automatically it gives the results as follows:

Image

Similar page being processed.

Image

In such cases I fall back to manual drudgery in Photoshop, selecting regions with similar lightness, turning midtones into lights and raising contrast to extract black letters and eliminate background. Each page may contain up to 4-5 regions to process separately...

Unfortunately my math skills aren't good enough to turn that into an algorithm, and even if I managed, it would take an eternity for a Visual Basic program to process a few hundred pages. Could you please suggest a way to handle the pix where lightness and contrast between letters and background vary across the page? What actions need to be taken before running ST with B&W output settings?

Thanks.
A.
PS. just in case - scanner build thread is here
PPS. just in case for curious ones - the page is an extract from Lighthouses Office 1915 yearbook ;)
Antoha-spb
 
Posts: 86
Joined: 21 Nov 2009, 09:54
Location: Saint Petersburg

Re: Scan Tailor

Postby aku » 12 Jan 2010, 13:01

Antoha-spb wrote:Hi Tulon!

Due to insufficiently uniform lighting on my scanner, I have a problem with post-processing of the scans in ST. While I'm working on improving the lights, I am also looking for a software solution to the issue.

The pix that my scanner (based on Canon EOS) produces under both ambient light and scanner bulbs are good for supplying them into ScanTailor, where they're converted into B&W images that I bind into PDF.

Where ambient light does not contribute enough to the uniformity of the page lightness, the average pixel brightness varies a lot across the scanned page.

In such cases I fall back to manual drudgery in Photoshop, selecting regions with similar lightness, turning midtones into lights and raising contrast to extract black letters and eliminate background. Each page may contain up to 4-5 regions to process separately...

Unfortunately my math skills aren't good enough to turn that into an algorithm, and even if I managed, it would take an eternity for a Visual Basic program to process a few hundred pages. Could you please suggest a way to handle the pix where lightness and contrast between letters and background vary across the page? What actions need to be taken before running ST with B&W output settings?

Thanks.
A.
PS. just in case - scanner build thread is here
PPS. just in case for curious ones - the page is an extract from Lighthouses Office 1915 yearbook ;)


I.e. you want something like
http://www.comp.nus.edu.sg/~tancl/Papers/DocEng2007/fp03-lu.pdf
and the follow-up paper
http://en.scientificcommons.org/42257334

as a stage of ST.

Shijian Lu has some other interesting papers as well which could be of interest to us, see
http://en.scientificcommons.org/shijian_lu for a list.
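
For anyone who wants to experiment before such a stage exists, a much simpler idea than the one in those papers is local (adaptive) thresholding: compare each pixel with the average brightness of its own neighbourhood instead of with one global cut-off. The sketch below is only that, a sketch under assumptions. It is not the method from the papers above and not anything Scan Tailor does; the name `adaptive_threshold`, the window radius and the 15% margin are arbitrary choices for illustration. It uses an integral image so the neighbourhood sum costs the same regardless of window size.

```python
def adaptive_threshold(gray, radius=7, t=0.15):
    """Mark a pixel as ink (True) if it is darker than its local mean by
    more than the fraction t. gray is a list of rows of 0..255 values."""
    h, w = len(gray), len(gray[0])
    # integral[y][x] holds the sum of gray[0..y-1][0..x-1].
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            area = (y1 - y0) * (x1 - x0)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Same as gray[y][x] < local_mean * (1 - t), without dividing.
            out[y][x] = gray[y][x] * area < total * (1.0 - t)
    return out


def synthetic_page(w=80, h=40):
    """Page whose background fades from bright (left) to dim (right),
    with sparse 'ink' dots at half the local background brightness."""
    page, ink = [], []
    for y in range(h):
        row, ink_row = [], []
        for x in range(w):
            background = 220 - 1.5 * x
            is_ink = (y % 8 == 4) and (x % 4 == 0)
            row.append(int(background * 0.5) if is_ink else int(background))
            ink_row.append(is_ink)
        page.append(row)
        ink.append(ink_row)
    return page, ink
```

On the synthetic page, a single global cut-off at 128 turns the dim right-hand background black, while the local comparison follows the lighting gradient. Real scans with large dark areas (photos, heavy shadows near the gutter) would need a larger window or a proper background estimate.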
aku
 
Posts: 27
Joined: 02 Jan 2010, 08:38
Location: Vancouver, BC
