HELP - Scan Tailor Project --> .pdf

Postby clemd973 » 14 Oct 2010, 00:19

I've just finished building my scanner, and I've even got the cameras up and running with SDM; everything is working beautifully. Now I'm ready to test the post-processing, so I loaded Scan Tailor, watched the video tutorials, and even processed a test project of about 10 pages. But I'm at a loss as to how to get the Scan Tailor project into a .pdf so I can view it on my computer and mobile device. What's the most common procedure? I'm using a MacBook Pro, with Windows XP installed as well. Thanks.
clemd973
 
Posts: 121
Joined: 22 Aug 2010, 21:20

Re: HELP - Scan Tailor Project --> .pdf

Postby univurshul » 14 Oct 2010, 00:38

You need PDF binding/building software. There is freeware, shareware and flagship software that does this.

First, locate the output TIFFs that Scan Tailor produced.

OS X already has the app "Preview" on your Mac, which can convert a series of TIFFs from Scan Tailor into a PDF. Simply open the cover image TIFF in Preview and drag more images onto the opened TIFF. It should combine the TIFFs, and you can then save the combined images as a single PDF. In the ColorSync Utility app (built into OS X), you can make custom compression settings for your PDFs too.

When you want to OCR your TIFFs, I personally recommend OmniPage Pro X. It performs OCR before it compresses and converts the image to PDF. There are several other apps, such as ABBYY FineReader Express, Adobe Acrobat, and Readiris.

There are also plans for some interesting PDF & DjVu software for the community, being written by DIY members here, so you should also explore djvubind (viewtopic.php?f=3&t=521) and look for an upcoming PDF builder app as well.

But try to stay with PDFs for a while to ensure application compatibility, and don't delete your master processed images; you need to determine the best compression settings and which format will suit you. This takes time, trial and error.
univurshul
 
Posts: 496
Joined: 31 Mar 2010, 18:00
Location: NORTH AMERICA

Re: HELP - Scan Tailor Project --> .pdf

Postby spamsickle » 14 Oct 2010, 06:28

Since you say you have Windows XP installed, here's what I'm doing.

A product called ImageMagick converts from Scan Tailor's TIF images to pdf, with one console command:

mogrify -format pdf *.tif

Once I have PDF versions of all the pages, I use a second tool, pdftk, to put them together with the command

pdftk p*.pdf cat output mybook.pdf

The only thing to be careful of here is not to get into an infinite loop by accidentally mixing your output with your input. All my separate pages are named either p0001.pdf or simply 0001.pdf, so my input specification is either p*.pdf or 0*.pdf, and I make sure my output name doesn't begin with either "p" or "0".
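A minimal sketch of that precaution, assuming the page files are named as described above (p0001.pdf or 0001.pdf) and that ImageMagick and pdftk are on the PATH. The helper names check_output and build_book are made up for illustration; they are not part of either tool.

```shell
#!/bin/sh
# Sketch only: check_output and build_book are hypothetical helper names.

# Succeed only when the merged file's name cannot match the page patterns.
check_output() {
    case "$(basename "$1")" in
        p*.pdf|0*.pdf) echo "unsafe: $1 matches the input pattern" >&2; return 1 ;;
        *)             echo "ok" ;;
    esac
}

# Convert every TIF in the current directory, then join the pages.
# Assumes the pages are named p*.tif; adjust the pattern for 0*.tif.
build_book() {
    out=${1:-mybook.pdf}
    check_output "$out" >/dev/null || return 1
    mogrify -format pdf ./*.tif || return 1
    pdftk p*.pdf cat output "$out"
}
```

Calling build_book with a name like p-final.pdf is refused before any conversion starts, while mybook.pdf goes through.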

For most books now, I'm also going through a third step, using Adobe Acrobat to OCR and output a ClearScan version of the PDF. This is a commercial product, though, unlike ImageMagick and pdftk.
spamsickle
 
Posts: 572
Joined: 06 Jun 2009, 23:57

Re: HELP - Scan Tailor Project --> .pdf

Postby dingodog » 14 Oct 2010, 08:32

spamsickle wrote:Since you say you have Windows XP installed, here's what I'm doing.

mogrify -format pdf *.tif

I use
*sam2p*
- http://pts.szit.bme.hu/sam2p/

with this script:
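Code:
#!/bin/bash
# Convert every Scan Tailor TIFF in the current directory to a one-page PDF with sam2p.

shopt -s nullglob   # skip the loop body when a pattern matches nothing

for file in ./*.tif ./*.tiff
do
   filename="${file%.*}"            # strip the .tif/.tiff extension
   sam2p "$file" "$filename.pdf"    # quoted so names with spaces survive
done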

Then I also use pdftk.

spamsickle wrote:Once I have PDF versions of all the pages, I use a second tool, pdftk, to put them together with the command

pdftk p*.pdf cat output mybook.pdf

One further refinement is important: after joining the single-page PDFs, the XREF table must be rebuilt.

Code:
pdftk *.pdf cat output mybook.pdf && pdftk mybook.pdf output fixed.pdf && mv fixed.pdf mybook.pdf


This is because when pdftk (and other software too) joins the single PDFs, the internal XREF table gets corrupted. This does not make the file unreadable, but the PDF no longer conforms to the standard, and some apps (like Ghostscript) refuse to operate on a non-standard PDF, showing the error message INVALID XREF TABLE.
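After the join-and-rewrite step, one cheap sanity check is to confirm that no page was lost. The sketch below assumes pdftk is on the PATH and that the single pages are named p*.pdf; page_count and verify_book are made-up helper names. It relies on pdftk's dump_data operation, which prints a NumberOfPages line.

```shell
#!/bin/sh
# Sketch only: page_count and verify_book are hypothetical helper names.

# Read `pdftk FILE dump_data` output on stdin and print the page count.
page_count() {
    sed -n 's/^NumberOfPages: *//p'
}

# Compare the number of single-page PDFs with the pages in the merged book.
verify_book() {
    book=$1
    set -- p*.pdf          # count the per-page input files
    expected=$#
    actual=$(pdftk "$book" dump_data | page_count)
    [ "$expected" -eq "$actual" ]
}
```

Running verify_book mybook.pdf in the page directory returns success only when the counts agree. It says nothing about the XREF table itself, so it complements the rewrite rather than replacing it.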
dingodog
 
Posts: 81
Joined: 22 Jul 2010, 18:19
Location: on the net

Re: HELP - Scan Tailor Project --> .pdf

Postby Misty » 14 Oct 2010, 15:02

Apologies I've been taking so long with my PDF maker. I've been busy on non-scanning projects for the past few months, which has kept me away from it, and I originally left it off when I ran into a problem with ImageMagick. I'm still aiming to get it finished in the relatively near future, and I have most of the technical issues sorted through now.
The opinions expressed in this post are my own and do not necessarily represent those of the Canadian Museum for Human Rights.
Misty
 
Posts: 473
Joined: 06 Nov 2009, 12:20
Location: Frozen Wasteland

Re: HELP - Scan Tailor Project --> .pdf

Postby clemd973 » 14 Oct 2010, 15:17

univurshul wrote:You need PDF binding/building software. There is freeware, shareware and flagship software that does this.

First, locate the output TIFFs that Scan Tailor produced.

OS X already has the app "Preview" on your Mac, which can convert a series of TIFFs from Scan Tailor into a PDF. Simply open the cover image TIFF in Preview and drag more images onto the opened TIFF. It should combine the TIFFs, and you can then save the combined images as a single PDF. In the ColorSync Utility app (built into OS X), you can make custom compression settings for your PDFs too.

When you want to OCR your TIFFs, I personally recommend OmniPage Pro X. It performs OCR before it compresses and converts the image to PDF. There are several other apps, such as ABBYY FineReader Express, Adobe Acrobat, and Readiris.

There are also plans for some interesting PDF & DjVu software for the community, being written by DIY members here, so you should also explore djvubind (http://www.diybookscanner.org/forum/vie ... ?f=3&t=521) and look for an upcoming PDF builder app as well.

But try to stay with PDFs for a while to ensure application compatibility, and don't delete your master processed images; you need to determine the best compression settings and which format will suit you. This takes time, trial and error.


Thanks for the information. I'm trying to shorten the learning curve with Scan Tailor, and as soon as I become adept at formatting the images/pages, I'll use this information and some of the things from the other replies to put everything in a .pdf. It would be great if Scan Tailor had this built in, as a sort of all-in-one post-processing program. Thanks again.
clemd973
 
Posts: 121
Joined: 22 Aug 2010, 21:20

Re: HELP - Scan Tailor Project --> .pdf

Postby univurshul » 14 Oct 2010, 15:33

clemd973 wrote:It would be great if Scan Tailor would have this built in - sort of an all-in-one post processing program. Thanks again.


Scan Tailor just works with the images and cleans them for later ebook construction. That alone makes it worth running other apps before and after it. However, we do have a member spearheading the all-in-one route: viewtopic.php?f=3&t=302

I haven't had a chance to test it myself.

I'm actually busy testing software tools that focus on preparing images pre-Scan Tailor. I'll have a discussion posted about Adobe Lightroom 3 soon.
univurshul
 
Posts: 496
Joined: 31 Mar 2010, 18:00
Location: NORTH AMERICA

Re: HELP - Scan Tailor Project --> .pdf

Postby clemd973 » 14 Oct 2010, 16:06

Misty wrote:Apologies I've been taking so long with my PDF maker. I've been busy on non-scanning projects for the past few months, which has kept me away from it, and I originally left it off when I ran into a problem with ImageMagick. I'm still aiming to get it finished in the relatively near future, and I have most of the technical issues sorted through now.


Can't wait to see it. How will we know when it's up and running??? As for me, please PM me when it's ready...I'd love to beta-test it if you're planning on going that route! Philip
clemd973
 
Posts: 121
Joined: 22 Aug 2010, 21:20

Re: HELP - Scan Tailor Project --> .pdf

Postby clemd973 » 14 Oct 2010, 16:13

univurshul wrote: I'm actually busy testing software tools that focus on preparing images pre-Scan Tailor. I'll have a discussion posted about Adobe Lightroom 3 soon.


For Mac users, here's a good alternative route for pre-ScanTailor processing: http://www.diybookscanner.org/forum/vie ... ?f=3&t=527. Please let us know about the Adobe Lightroom 3 discussion.
clemd973
 
Posts: 121
Joined: 22 Aug 2010, 21:20

Re: HELP - Scan Tailor Project --> .pdf

Postby spamsickle » 14 Oct 2010, 16:33

dingodog wrote:I use
*sam2p*
- http://pts.szit.bme.hu/sam2p/

I see that sam2p has Windows binaries as well as Linux. The author claims that it's better than ImageMagick for creating PDFs, and the reasons he gives seem reasonable.

I'll give it a try. Just doing a straight no-fiddling conversion of a single TIF file from an old scan, the sam2p version was quite a bit smaller (308K vs 465K), and I can't see any difference between them. That's not necessarily a big deal if I'm going to use Acrobat's ClearScan option after the PDF has been built, but it does appear to confirm the author's claim of smaller files. He also claims faster creation and finer control. I still need to learn more about the PDF format.
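For anyone who wants to repeat that comparison on their own scans, here is a rough sketch. The helper name compare_sizes and the file names in the commented usage are made up; the conversions themselves assume sam2p and ImageMagick's convert are installed.

```shell
#!/bin/sh
# Sketch only: compare_sizes is a hypothetical helper name.

# Print both file sizes in bytes, then the first as a percentage of the second.
compare_sizes() {
    a=$(wc -c < "$1")
    b=$(wc -c < "$2")
    # $(( )) also strips the padding some wc implementations add
    echo "$((a)) $((b)) $((a * 100 / b))%"
}

# Example usage (file names are placeholders):
#   sam2p page.tif page-sam2p.pdf
#   convert page.tif page-im.pdf
#   compare_sizes page-sam2p.pdf page-im.pdf
```

With the sizes quoted above, the sam2p file comes out at roughly two thirds of the ImageMagick one. A single page is a small sample, so it is worth running this over a handful of pages before settling on a tool.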
spamsickle
 
Posts: 572
Joined: 06 Jun 2009, 23:57
