HOW TO COMPRESS PDF FILES

Postby LMB » 07 Jan 2011, 21:32

Hi,

My sincere congratulations on this wonderful site. I discovered your work just a few days ago and was astonished by the marvelous photographic scanners you build.

As I saw on the Internet Archive, PDF files are compressed in such a way that when you open them with OCR software like ABBYY FineReader 10 Professional and rebuild them, they become huge. My question is: what do you do, or what software do you use, to compress those digitized files so efficiently?

As I said, I use ABBYY FineReader 10, and the only way I have found to compress a scanned book as a PDF (the most used format) is to convert it to black and white, or to grayscale with medium image quality. This is better than nothing, but I need a more efficient way to do it.

I'd appreciate any help you can give me on this topic.
Many thanks, and all the luck in the world to you and the DIY Book Scanner project.
LMB
LMB
 
Posts: 9
Joined: 02 Jan 2011, 00:46

Re: HOW TO COMPRESS PDF FILES

Postby spamsickle » 08 Jan 2011, 10:46

I don't know what Internet Archive is doing, but my guess would be that they're building a PDF from an existing text file. Since you wouldn't need to open such a file in OCR software (the text would already be there), my guess is probably incorrect.

I use Adobe Acrobat version 9 or higher when I want compression. It has a "ClearScan" option, which creates a custom font and vectorizes the text with it while it does the OCR. For my purposes, the OCR is acceptable -- I'm not using a text-to-voice application to read to me, just doing an occasional search on the generated text. ClearScan also produces smoother-looking characters than the original Scan Tailor output, and works better with the Scan Tailor output than it does with the original JPEGs.
spamsickle
 
Posts: 572
Joined: 06 Jun 2009, 23:57

Re: HOW TO COMPRESS PDF FILES

Postby LMB » 08 Jan 2011, 18:50

Hi
Thank you very much for your answer. The "ClearScan" function you are talking about is in the full version of Acrobat, isn't it? I ask because I can't find this function in the Adobe 9 version I'm using.

Many thanks once again.
LMB
 
Posts: 9
Joined: 02 Jan 2011, 00:46

Re: HOW TO COMPRESS PDF FILES

Postby spamsickle » 08 Jan 2011, 19:17

I think it's in all versions, but I might be wrong. In the version I'm using, you click on the "Document" menu, then "OCR Text Recognition -> Recognize Text Using OCR". That should display a popup with settings. If the PDF Output Style is not ClearScan, click "Edit" and select that as the output style.

If that doesn't get it for you, I probably can't help further. It may be that you don't have it, but you should direct your question to Adobe before giving up.
spamsickle
 
Posts: 572
Joined: 06 Jun 2009, 23:57

Re: HOW TO COMPRESS PDF FILES

Postby Mandor » 10 Jan 2011, 03:08

@LMB
The size of the PDF files produced by ABBYY FineReader depends on the export settings: text and pictures only, full page image with text over it, and so on. At least two of the options produce a PDF in which every page is a graphic image of the whole page, plus the OCRed text placed over or under that image.
Mandor
 
Posts: 24
Joined: 28 Jul 2009, 01:27
Location: Sofia, Bulgaria

Re: HOW TO COMPRESS PDF FILES

Postby emmerick » 17 Jan 2011, 06:37

What is the average size of the PDF files of the books you scan? A 700-page book I scanned comes to around 90 to 100 MB. Can that be reduced? Thanks.
Sorry for my English, I am using Google Translate. :)
emmerick
 
Posts: 30
Joined: 06 Jan 2011, 14:35
Location: Rio de Janeiro/Brazil

Re: HOW TO COMPRESS PDF FILES

Postby Gerard » 17 Jan 2011, 08:42

Gerard
 
Posts: 153
Joined: 17 Oct 2010, 07:15
Location: Berlin (Germany)

Re: HOW TO COMPRESS PDF FILES

Postby emmerick » 17 Jan 2011, 08:50




This is for Linux. I use Windows. :( Thanks.
Sorry for my English, I am using Google Translate. :)
emmerick
 
Posts: 30
Joined: 06 Jan 2011, 14:35
Location: Rio de Janeiro/Brazil

Re: HOW TO COMPRESS PDF FILES

Postby emmerick » 17 Jan 2011, 13:24

I did some testing here and concluded that really compressing the PDF leaves the quality too poor. The only way to get a perfect result is to run OCR; I used ABBYY 10. A file that was previously 100 MB came out at 2 MB after OCR. The only drawback is that it is a little more work, because of the headers and footers and a few words that are not recognized.
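
For a rough sense of scale, the figures above can be put through a little arithmetic. This is only a sketch: it assumes the 2 MB result is the same 700-page book mentioned earlier in the thread, takes the upper figure of 100 MB, and counts 1 MB as 1000 KB. The function name is made up for illustration.

```python
# Back-of-the-envelope per-page sizes for the figures quoted in this thread.
# Assumptions: same 700-page book before and after, 1 MB = 1000 KB.

PAGES = 700
SIZE_BEFORE_MB = 100   # scanned PDF (upper end of the 90 to 100 MB range)
SIZE_AFTER_MB = 2      # PDF after OCR


def kb_per_page(total_mb, pages):
    """Average size of one page in kilobytes."""
    return total_mb * 1000 / pages


before_kb = kb_per_page(SIZE_BEFORE_MB, PAGES)
after_kb = kb_per_page(SIZE_AFTER_MB, PAGES)
ratio = SIZE_BEFORE_MB / SIZE_AFTER_MB

print(f"before OCR: {before_kb:.0f} KB per page")
print(f"after OCR:  {after_kb:.1f} KB per page")
print(f"reduction:  {ratio:.0f}x")
```

A few KB per page is roughly what plain text needs, which suggests the page images are probably no longer stored in the 2 MB file. If so, any recognition errors can no longer be checked against the scan.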
Sorry for my English, I am using Google Translate. :)
emmerick
 
Posts: 30
Joined: 06 Jan 2011, 14:35
Location: Rio de Janeiro/Brazil

Re: HOW TO COMPRESS PDF FILES

Postby mellow-yellow » 17 Jan 2011, 15:49

A few options:

1. Adobe Acrobat (excluding the free Reader): http://www.websiteoptimization.com/spee ... mizer.html
2. OmniPage or ABBYY: Export or save your PDF without images (text only). Of course, OCR errors reduce legibility.
3. Source images: Reduce the source file resolution, convert color to B/W or grayscale, reduce the number of color/grayscale values
4. Print to a PDF with PDFCreator or equivalent
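
To see why option 3 matters so much, here is a small Python sketch that estimates the uncompressed size of one page image at different resolutions and bit depths. The 6 x 9 inch page size, the DPI values, and the function name are assumptions chosen for illustration, not figures from any particular scanner or program.

```python
# Rough estimate of the raw (uncompressed) size of one scanned page.
# Assumed for illustration: a 6 x 9 inch page and the DPI values below.

def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one page image."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each pixel row is padded up to a whole number of bytes.
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


PAGE_W_IN, PAGE_H_IN = 6.0, 9.0

MODES = {
    "color (24 bit)": 24,
    "grayscale (8 bit)": 8,
    "16 gray levels (4 bit)": 4,
    "black and white (1 bit)": 1,
}

for dpi in (600, 300, 150):
    for name, bits in MODES.items():
        size = raw_page_bytes(PAGE_W_IN, PAGE_H_IN, dpi, bits)
        print(f"{dpi:>3} dpi  {name:<24} {size / 1_000_000:8.2f} MB")
```

Halving the resolution cuts the raw size to a quarter, and going from 24-bit color to 1-bit black and white cuts it by a factor of 24. These are sizes before any compression inside the PDF, so real files will be smaller, but the relative savings tend to point the same way.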
mellow-yellow
 
Posts: 46
Joined: 28 Jun 2010, 13:33
Location: Portland, OR, USA
