BookScanWizard 2.0 – Memory Management Issues – Leak?

Discussion about Steve DeVore's Book Scan Wizard, a power-user package to automate scan processing.

Moderator: peterZ

John_Latta
Posts: 11
Joined: 25 Sep 2013, 16:54
E-book readers owned: iPad
Number of books owned: 10000
Country: US

Post by John_Latta »

BookScanWizard is running on a 64-bit computer with Windows 7 Pro and 48GB of memory. The Java versions used were the latest available downloads.

The software was run from a bat file which had this command for the 32-bit version of Java.

java -Xmx1280m -jar BookScanWizard.jar

When attempting to run from this command

java -Xmx1536m -jar BookScanWizard.jar

or with any larger -Xmx value, the run fails with this message:
Attachment: XMX1536m - crop.jpg
In an attempt to get a larger heap, the 64-bit version of Java was downloaded. This bat command was executed.

java -Xmx2G -jar -d64 BookScanWizard.jar

and it allowed BookScanWizard to run.

The task here is to create an electronic book which has 353 pages. The input image size is 4896 X 3672.

The processing included only Crop and Perspective and AutoLevels.

Batch processing of 10 pages completes; however, with any more than 21 pages BookScanWizard halts with this error:
Attachment: Out of Memory Error GC Overhead Limit Exceeded.JPG
Since the memory demand grows with the number of pages processed, this may indicate a memory leak.

When running the 32-bit version of Java with -Xmx1280m, it was not possible to get any output from BookScanWizard because of heap errors such as this:
Attachment: Java Heap Error.JPG
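
When comparing 32-bit and 64-bit runs like the ones above, it can help to confirm what the JVM actually received. The following is a minimal sketch, not part of BookScanWizard: the class name JvmInfo is made up for illustration, and the sun.arch.data.model property is specific to Sun/Oracle-derived JVMs, so it may be absent elsewhere.

```java
/** Prints the heap limit, processor count and architecture the running JVM reports. */
final class JvmInfo {

    /** Converts a byte count to whole mebibytes (rounding down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() reflects the -Xmx setting, although the reported value
        // can be somewhat lower than the number passed on the command line.
        System.out.println("Max heap (MiB): " + toMiB(rt.maxMemory()));
        System.out.println("Processors:     " + rt.availableProcessors());
        System.out.println("os.arch:        " + System.getProperty("os.arch"));
        // Assumption: this property exists only on Sun/Oracle-derived JVMs.
        System.out.println("Data model:     "
                + System.getProperty("sun.arch.data.model", "unknown"));
    }
}
```

Running it with the same java executable and -Xmx value as the bat file (for example java -Xmx1280m JvmInfo after compiling) should show whether the 32-bit or the 64-bit JVM is the one being launched.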
steve1066d
Posts: 296
Joined: 27 Nov 2010, 02:26
E-book readers owned: PRS-505
Number of books owned: 1250
Location: Minneapolis, MN

Re: BookScanWizard 2.0 – Memory Management Issues – Leak?

Post by steve1066d »

Book Scan Wizard is pretty memory intensive. It processes multiple pages at a time up to the number of processors you have on your computer. What is the size of your output images? What processor do you have? If possible, can you post or send me the configuration file you are using?

18 megapixel images are bigger than I've tested with. It's possible that there's a memory leak, but I think it's more likely that your configuration is just requiring more memory. Also, there is some memory usage for each image (like for the thumbnails).
Steve Devore
BookScanWizard, a flexible book post-processor.
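
A back-of-the-envelope estimate shows why processing one page per processor can be memory hungry at this image size. This is only a sketch under stated assumptions: 4 bytes per decoded pixel and 2 full-size buffers per page in flight (source plus result) are guesses, not figures from BookScanWizard, and the class and method names are hypothetical.

```java
/** Rough heap estimate for decoding several large images concurrently. */
final class HeapEstimate {

    /** Bytes needed for one decoded image of the given size. */
    static long decodedBytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    /** Bytes needed when every worker holds several full-size buffers at once. */
    static long concurrentBytes(int workers, int buffersPerPage,
                                int width, int height, int bytesPerPixel) {
        return (long) workers * buffersPerPage
                * decodedBytes(width, height, bytesPerPixel);
    }

    public static void main(String[] args) {
        int width = 4896, height = 3672; // input size reported in the thread
        int workers = 12;                // processor count reported in the thread
        int bytesPerPixel = 4;           // assumption: 32-bit pixels after decoding
        int buffersPerPage = 2;          // assumption: source image plus result

        long perImage = decodedBytes(width, height, bytesPerPixel);
        long total = concurrentBytes(workers, buffersPerPage,
                width, height, bytesPerPixel);
        System.out.println("One decoded image (MiB): " + perImage / (1024 * 1024));
        System.out.println("All workers (MiB):       " + total / (1024 * 1024));
    }
}
```

Under these assumptions the pixel buffers alone come to roughly 1.6 GiB for 12 workers, before thumbnails or any other per-page overhead, which would make a 2 GB heap tight. The real figure depends on how the program stores and copies images.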

Post by John_Latta »

Here is the system I am using.
Attachment: System.JPG
This results in 12 processors; 6 each for the CPUs.

I have not changed the size of the output images other than the size which results from the cropping. I want to keep the pixel density as high as possible: some books have many pictures, and this will allow for the best image quality. Also, I have seen that the higher the pixel density, the lower the OCR error count.

Here is the configuration list. Note that BookScanWizard can only process up to page 21.

# Book Scan Wizard Script
# http://bookscanwizard.sourceforge.net
# C:\Users\john.fourthwave.000\Desktop\BookWorkingDirectory


# *** Load Files ***
# the source directory
LoadImages = C:\Users\john.fourthwave.000\Desktop\Storms of my Grandchiildren 9-13\Left+Right

# The Destination directory
SetDestination = C:\Users\john.fourthwave.000\Desktop\Storms of my Grandchiildren 9-13\Output

# *** Page Rotations ***
Pages = left
Pages = right

# *** Remove Pages ***
# *** Perspective ***
Pages = left
PerspectiveAndCrop = 378,256, 3181,217, 3208,4637, 411,4721
Pages = right
PerspectiveAndCrop = 83,278, 2814,384, 2869,4738, 50,4704
# *** Crops ***
# *** Filters ***
Pages = all
AutoLevels = 1, 99
# *** Scaling ***
Pages = all
# This will ensure the left and right pages are exactly the same size.
ScaleToFirst=

# *** Output ***
Pages=0012-0032
SaveImages = JPEG

Post by steve1066d »

Try running the 64 bit version of Java with 8 gig or so.

I think I should add some configuration parameters that will limit the number of threads it will attempt to use, but for now I think that will get you running.
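
One common way to cap the number of pages processed at once in Java is a fixed-size thread pool. The sketch below illustrates that general technique only; it is not BookScanWizard's code or its planned option, and the class name, method name and maxThreads parameter are hypothetical. A short sleep stands in for the real image work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Processes pages on a pool whose size bounds how many are in memory at once. */
final class BoundedPageProcessor {

    /** Runs pageCount dummy tasks and returns the peak number running at once. */
    static int processAll(int pageCount, int maxThreads) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        // The pool never runs more than maxThreads tasks concurrently.
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int page = 0; page < pageCount; page++) {
                futures.add(pool.submit(() -> {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(10); // placeholder for loading and filtering a page
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for each page and surface any failure
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("page processing failed", e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        int peak = processAll(21, 4);
        System.out.println("Peak concurrent pages: " + peak);
    }
}
```

A smaller pool trades speed for a lower peak heap requirement, since fewer full-size images are held at the same time.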

Post by John_Latta »

This worked.

Thank you.

Please post the configuration parameters you are considering and I will run them.

I have 500 books to scan and this was only the test case.

Post by steve1066d »

In your case, the new options would just slow you down... With the 64 bit Java you've got enough memory to take advantage of all those cores, so you might as well use them.