HOW TO COMPRESS PDF FILES


Re: HOW TO COMPRESS PDF FILES

Postby reggilbert » 18 Jan 2011, 22:31

emmerick wrote: What is the average size of the PDF files of the books you have scanned? My scan of a 700-page book is around 90 to 100 MB. Can that be reduced?

The following draws on flatbed experience but might help with camera-based scanning. The software that comes with my scanner permits a choice of several output formats. The default is JPEG. Unfortunately, Acrobat compiles JPEGs very inefficiently -- the resulting file sizes are like the one you mention, emmerick -- up to 100MB for, say, only 500 images (an image often contains two pages on a flatbed) -- and that is just for b&w images. TIFF images aggregated poorly as well.

But for some reason the BMP format works far better in Acrobat, with no apparent reduction in resolution, either before, as a scan output choice, or after, as an Acrobat input choice. I think BMP is an uncompressed or optionally minimally compressed format, so maybe Acrobat has a lot to work with. I just scanned an 850-page book (425 images) in 300-dpi greyscale (greyscale seems to work better for Acrobat OCR) and Acrobat brought them all in, plus added its OCR, for a total output size of 40MB. Keep in mind that the average source image is 8MB and you can enlarge the resulting PDF pages to 400 percent with virtually no loss of sharpness. I find that very impressive. B&w would have been 7MB or so.
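As a rough sanity check on those figures, here is a minimal Python sketch of the arithmetic. The 425 images, the 8MB average and the 40MB output come from the post; the US Letter scan area is only an assumption for illustration, since the post does not state the scan dimensions.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw pixel data size for one scan, ignoring file headers and row padding."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8

# Assumed scan area: US Letter (8.5 x 11 in). Not stated in the post.
page = uncompressed_bytes(8.5, 11, 300, 8)  # 8-bit greyscale at 300 dpi
print(f"one greyscale image: about {page / 1e6:.1f} MB")

images = 425            # from the post
source_mb = images * 8  # 8MB average source image, from the post
pdf_mb = 40             # Acrobat output size reported in the post
print(f"source total: {source_mb} MB, roughly {source_mb / pdf_mb:.0f}:1 against the PDF")
```

Under that assumption an uncompressed 300-dpi greyscale image comes out close to the 8MB average mentioned, which fits the guess that the BMP files are stored uncompressed.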

This information may or may not be of any use to camera scanners. I don't believe cameras have a BMP option, and as far as I can tell in tests just now, Scan Tailor does not accept BMP images, so I assume it cannot output them either. (But I could swear Scan Tailor was able to do so a couple of months ago, when I tested its page-splitting power - it did a great job. And those had to be BMPs, but I can't find the test anymore.)

So if cameras don't put out BMP and in any case Scan Tailor does not (again, that may be incorrect), that leaves conversion of another format, camera source files or Scan Tailor output, to BMP. But that could lose some resolution and may result in huge files anyway.

On the other hand, if you have RAW source files, which have to be converted to something in any case, and do not need to use Scan Tailor (if cropping and OCR are all you need, Acrobat can handle that), then maybe conversion to BMP and the aggregation of the resulting images in Acrobat could be an option for creating smaller Acrobat books.

Re: HOW TO COMPRESS PDF FILES

Postby emmerick » 19 Jan 2011, 06:14

Good morning, friend, and thanks for the tips. My camera only outputs JPG, though, so I will do some testing here. Would a program that converts JPG to BMP give the same result without loss? It is worth a try as a test.
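On the conversion question: saving a decoded JPG as BMP adds no further loss, because BMP stores the pixels uncompressed, but it cannot bring back detail the JPG already discarded, and the file becomes much larger. Whether Acrobat then builds a smaller PDF from those BMPs is untested here. A minimal sketch, assuming Python with the third-party Pillow library is available (neither is mentioned in this thread; the function names are made up for illustration):

```python
from pathlib import Path

def bmp_size_24bit(width_px, height_px):
    """Approximate size of an uncompressed 24-bit BMP: 54-byte header plus padded rows."""
    row_bytes = (width_px * 24 + 31) // 32 * 4  # each row is padded to a multiple of 4 bytes
    return 54 + row_bytes * height_px

def jpg_to_bmp(src, dst=None):
    """Decode a JPEG and save its pixels as BMP. Requires Pillow (pip install Pillow)."""
    from PIL import Image  # imported here so the size helper works without Pillow
    src = Path(src)
    dst = Path(dst) if dst else src.with_suffix(".bmp")
    with Image.open(src) as im:
        im.save(dst, format="BMP")
    return dst

# Example (hypothetical file name):
#   jpg_to_bmp("page_001.jpg")   # writes page_001.bmp next to the source
print(f"2550 x 3300 colour BMP: about {bmp_size_24bit(2550, 3300) / 1e6:.0f} MB")
```

The size helper is only there to show how quickly uncompressed colour pages grow; for a whole book the intermediate BMP files can easily reach several gigabytes.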
Iam Sorry for MY English, i am use GOOGLE translate. :)))))))

Re: HOW TO COMPRESS PDF FILES

Postby emmerick » 19 Jan 2011, 08:35

I reduced a 100 MB PDF file to 40 MB in Adobe Acrobat X: go to File, open the PDF, then choose Save As and pick the optimized PDF option, leaving the settings as they are. Here, at least, the file went from 100 to 40 MB and the quality was still very good. Try changing the options to see if you get something better; I do not quite understand those options myself.
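For readers without Acrobat, a comparable re-compression can be tried with Ghostscript. This is a different tool from the one used above and is not mentioned elsewhere in this thread, so treat the following as a sketch under that assumption: it needs Ghostscript installed, and the `/ebook` preset downsamples images (to about 150 dpi), so results will not match Acrobat's.

```python
import shutil
import subprocess

def gs_command(src, dst, preset="/ebook", exe="gs"):
    """Build a Ghostscript command line that rewrites src as a recompressed PDF."""
    return [
        exe,
        "-sDEVICE=pdfwrite",          # write a new PDF
        f"-dPDFSETTINGS={preset}",    # /screen, /ebook, /printer or /prepress
        "-dNOPAUSE", "-dBATCH", "-dQUIET",
        f"-sOutputFile={dst}",
        str(src),
    ]

def compress_pdf(src, dst, preset="/ebook"):
    """Run Ghostscript if it can be found on PATH (gs, or gswin64c on Windows)."""
    exe = shutil.which("gs") or shutil.which("gswin64c")
    if exe is None:
        raise RuntimeError("Ghostscript not found on PATH")
    subprocess.run(gs_command(src, dst, preset, exe), check=True)

# Example (hypothetical file names):
#   compress_pdf("book.pdf", "book_small.pdf")
print(" ".join(gs_command("book.pdf", "book_small.pdf")))
```

It is worth checking the output at high zoom and confirming the OCR text is still searchable before deleting the original.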


Re: HOW TO COMPRESS PDF FILES

Postby seasalt » 02 Jun 2011, 07:47

On a Mac I've been trying different things. The best so far:

Example:
original scans were JPEG, 300 dpi
front and back cover: full colour, 48-bit
rest of the pages: b&w
360 pages

The OCR tool was ABBYY Express (layered images/text). Starting from either a 20MB OCR PDF or a 78MB OCR PDF, both gave:

Acrobat X, Save As with the reduce file size option: 7.6MB
Acrobat X, Save As optimized: 9.9MB

Then, if I can get Homebrew installed so that I can install PDFbeads, I am hoping to get under 5MB.
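To put those numbers side by side, here is a small Python sketch that turns them into percentage reductions. It assumes the 20MB and 78MB figures are the sizes before Acrobat's reduction, which is one reading of the post.

```python
def reduction_percent(before_mb, after_mb):
    """Percentage by which a file shrank, given sizes before and after."""
    return (before_mb - after_mb) / before_mb * 100

# Sizes in MB as reported in the post above.
results = {"reduce file size": 7.6, "optimize": 9.9}
for start in (20, 78):
    for name, size in results.items():
        pct = reduction_percent(start, size)
        print(f"{start} MB -> {size} MB ({name}): {pct:.0f}% smaller")
```

On these figures the plain reduce file size option beats the default optimize settings in both cases.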

Re: HOW TO COMPRESS PDF FILES

Postby rubypdf » 04 Oct 2011, 12:33

emmerick wrote: This is for Linux. I use Windows :( Thanks

I have put some effort into a Windows version of pdfsizeopt.
