Scanning sheetmusic workflow with ScanTailor

Postby w_m0zart » 13 Sep 2010, 18:01

I would like to share my experience of scanning sheet music quickly and accurately into compact pdf files.

In the beginning, when I created pdf files, I noticed that the embedded bitmaps had accuracy problems. These problems occurred due to rasterization, resulting in slightly smaller or larger bitmaps. After long and extensive testing, I wrote software (with AutoIT3, see links below for source code) which uses ImageMagick and Ghostscript to accurately process bitmaps in pdf or tiff files. From the supplied files, the program creates efficiently coded, resized and small pdf files, which fit exactly on an a4 page and are ready to distribute or print.

When pdf files are supplied instead of tiff files, a special algorithm in this program helps to get higher accuracy when converting them to tiff format (an intermediate format, necessary for resizing and colorspace conversion). For example, take a pdf page with the exact size of an a4 page (21.0x29.7 cm), holding a bitmap of 2480x3508 pixels at a resolution of 300 dpi. In theory the bitmap should fit exactly, but with some pdf files standard tools like Acrobat or Ghostscript produce 2479 or 2481 pixels horizontally instead of 2480.

The solution I took in my program is to compare the BoundingBox value reported by Ghostscript for the pdf with the HiResBoundingBox value, and to compare both with what a conversion to tiff with ImageMagick produces. If the page dimensions come out 1 pixel more or 1 pixel less than what is expected for a standard a4 page, a special routine is triggered to force the pdf to tiff conversion again, this time with a specified number of pixels. If the pdf page originally had 2480 pixels, this last step may now actually reproduce the original page with 2480 pixels. If, however, a page with 2479 or 2481 pixels still comes out, the bitmap embedded in the pdf very likely had 2479 or 2481 pixels horizontally to begin with.
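The arithmetic behind the 2480x3508 figure, and the off-by-one check described above, can be sketched as follows. This is only an illustration in Python, not the AutoIT3 code of the program; the function names `expected_pixels` and `off_by_rounding` are made up for this example, and it assumes the check means "differs from the nominal a4 size by exactly one pixel".

```python
CM_PER_INCH = 2.54


def expected_pixels(size_cm, dpi):
    """Nominal pixel count for a physical length at a given resolution."""
    return round(size_cm / CM_PER_INCH * dpi)


def off_by_rounding(measured_px, size_cm, dpi, tolerance=1):
    """True when a rendered dimension misses the nominal size by 1..tolerance pixels.

    Assumption: such a small miss is treated as a rasterization rounding
    error, so the conversion would be repeated with a forced pixel size.
    """
    diff = abs(measured_px - expected_pixels(size_cm, dpi))
    return 0 < diff <= tolerance


if __name__ == "__main__":
    # a4 at 300 dpi: 21.0 cm wide, 29.7 cm high
    print(expected_pixels(21.0, 300), expected_pixels(29.7, 300))
    for width in (2479, 2480, 2481):
        print(width, off_by_rounding(width, 21.0, 300))
```

Note that 21.0 cm at 300 dpi is about 2480.3 pixels and 29.7 cm is about 3507.9 pixels, so neither dimension is a whole number before rounding, which is one plausible reason why different tools end up one pixel apart.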

Please see the links below for my typical workflow, the software and source code I use for it, and where ScanTailor fits into it.

Explaining scanning workflow: http://www.auditeon.com/software:pdfprocessing

A screencast, showing the steps can be seen here: http://www.auditeon.com/xyz/webcast/ScanSheetmusicDemo.htm

Software to autonomously resize and/or convert an arbitrary pdf or tif file to an exactly a4-sized pdf file: http://www.auditeon.com/software:pdfprocessing:makepdf#installation

Software to extract tiff files from a pdf file which can be directly imported by ScanTailor: http://www.auditeon.com/software:pdfprocessing:pdf2tif_300dpi

To reorder pdf pages, if necessary, one can use the application pdfsam.

I hope this helps someone. Maybe the code may inspire others to work further on it.

====== Update information ======
* 22-07-2011: The program has been renamed to MakePDF
* 11-07-2011: The program has been updated to v0.8c with new features:
    * MakePDF now helps to correct wrong placement of odd and even pages. When the -q option is specified (by renaming MakePDF.exe to MakePDF -q.exe), it adds as many pages as necessary at the end of the document to create a document with a multiple of 4 pages. This is particularly practical if your target format is a booklet/brochure (a multiple of 4 a4 pages printed on double-sided A3). As a last manual step, the empty pages only need to be moved to the right location.
    * The program now also accepts an option to specify the output resolution (default 300 dpi). Please see the link above for more information.
* 10-07-2011: The program has been updated to v0.8b with new features:
    * Direct processing of tiff files into pdf has been added.
    * Cleaned up the source code.
    * Fixed an error which could lead to slightly wrong bitmap dimensions (+/- 1 pixel).
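
As a rough illustration of the -q padding mentioned in the v0.8c notes above, the number of blank pages to append can be computed like this. It is a Python sketch under the assumption that padding always goes at the end of the document; `blank_pages_needed` is a hypothetical name and not taken from the MakePDF source.

```python
def blank_pages_needed(page_count, multiple=4):
    """Blank pages to append so that page_count becomes a multiple of `multiple`."""
    # Python's % always returns a non-negative result for a positive divisor,
    # so negating the count gives the distance to the next multiple.
    return (-page_count) % multiple


if __name__ == "__main__":
    for pages in (8, 9, 10, 11):
        print(pages, "pages ->", blank_pages_needed(pages), "blank page(s) added")
```

A 9-page score, for example, would get 3 blank pages to reach 12, which then fits on three double-sided A3 sheets.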
Last edited by w_m0zart on 25 Aug 2011, 06:33, edited 13 times in total.
w_m0zart
 
Posts: 7
Joined: 21 Oct 2009, 16:25

Re: Scanning sheetmusic workflow with ScanTailor

Postby daniel_reetz » 14 Sep 2010, 01:35

I'm excited about your work with sheet music. Do you have any before/after or sample images that are copyright free, so I can do a blog post about your work?
daniel_reetz
 
Posts: 2485
Joined: 03 Jun 2009, 13:56

Re: Scanning sheetmusic workflow with ScanTailor

Postby w_m0zart » 14 Sep 2010, 13:30

I have created two files:

The original file was scanned at 300 dpi in 8-bit grey. To make the file smaller for downloading, I used lossy jpg encoding, but the file can still be processed fine:
Original scans: http://www.auditeon.com/xyz/projects/Hubay_Der_Zephir_grey.pdf

After converting the pdf above to separate tiff images with pdf2tif, I opened and processed the files with ScanTailor. The result was combined into a pdf file again, which exceeded a4 page size. Then I dragged and dropped the file onto the MakePDF program, which produced the following file:
Output result: http://www.auditeon.com/xyz/projects/Hubay_Der_Zephir_Scantailor_Processed.pdf
Last edited by w_m0zart on 22 Aug 2011, 16:35, edited 1 time in total.
w_m0zart
 
Posts: 7
Joined: 21 Oct 2009, 16:25

Re: Scanning sheetmusic workflow with ScanTailor

Postby daniel_reetz » 15 Sep 2010, 08:59

Outstanding, thank you! (and what beautiful looking sheet music)
daniel_reetz
 
Posts: 2485
Joined: 03 Jun 2009, 13:56

Re: Scanning sheetmusic workflow with ScanTailor

Postby w_m0zart » 27 Jul 2011, 22:42

I updated the program MakePDF.exe. It now accepts pdf and tiff files directly. The advantage of accepting tiff files is that after processing your images with ScanTailor, the resulting tiff images can be converted directly to an a4-sized pdf.

Check the new simplified workflow here: http://www.auditeon.com/software:pdfprocessing
Last edited by w_m0zart on 22 Aug 2011, 16:35, edited 1 time in total.
w_m0zart
 
Posts: 7
Joined: 21 Oct 2009, 16:25

Re: Scanning sheetmusic workflow with ScanTailor

Postby daniel_reetz » 28 Jul 2011, 10:08

w_mozart! Good to see you again, and thanks for the update. It's really great to see a project like yours come along - well documented, good software, and so on.
daniel_reetz
 
Posts: 2485
Joined: 03 Jun 2009, 13:56

Re: Scanning sheetmusic workflow with ScanTailor

Postby w_m0zart » 31 May 2012, 00:35

Today another update has been released with the following improvements:

    * Cleaned up -partly- messy code.
    * Added sequential file sorting, as opposed to alphabetical file sorting, which is useful for ranges of tif files whose filenames have no leading zeros. If a range of tif files is detected, sequential file sorting is enabled automatically.
    * Workaround for the command line length limitation: if there are too many files as arguments(1), drag and drop only the first and last file of a range of tif files. If the software recognizes a sequence between these selected files (including a possible ScanTailor 1L/2R format), it will take that sequence. There may be gaps in the sequence, as long as the base filenames match. Filenames should be in either the format NAME_nnn.tif or NAME_nnn_(1L|2R).tif, where nnn can be any number and NAME any name.
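
The last two items in the list above can be sketched in Python as follows. This is an illustration of the idea only, not the AutoIT3 implementation: the names `PATTERN`, `sequential_sort` and `select_range` are hypothetical, and the sketch assumes the candidate filenames have already been read from the folder of the two dropped files.

```python
import re

# NAME_nnn.tif or NAME_nnn_1L.tif / NAME_nnn_2R.tif, as described in the post.
PATTERN = re.compile(
    r"^(?P<name>.+)_(?P<num>\d+)(?:_(?P<side>1L|2R))?\.tif$", re.IGNORECASE
)


def sort_key(filename):
    """Sort by NAME, then by the number as an integer, then by page side."""
    m = PATTERN.match(filename)
    if m is None:
        # Files that do not follow the naming scheme sort alphabetically, last.
        return (1, filename, 0, "")
    return (0, m["name"], int(m["num"]), m["side"] or "")


def sequential_sort(filenames):
    """Order page_2.tif before page_10.tif, unlike a plain alphabetical sort."""
    return sorted(filenames, key=sort_key)


def select_range(first, last, available):
    """Pick every file in `available` that lies between `first` and `last`.

    Gaps in the numbering are allowed; only NAME has to match.
    """
    a, b = PATTERN.match(first), PATTERN.match(last)
    if a is None or b is None or a["name"] != b["name"]:
        return [first, last]  # no recognizable sequence, keep the two files
    low, high = sorted((int(a["num"]), int(b["num"])))
    picked = []
    for name in available:
        m = PATTERN.match(name)
        if m and m["name"] == a["name"] and low <= int(m["num"]) <= high:
            picked.append(name)
    return sequential_sort(picked)


if __name__ == "__main__":
    folder = ["p_10.tif", "p_3.tif", "p_1.tif", "q_2.tif", "p_11.tif"]
    print(select_range("p_1.tif", "p_10.tif", folder))
```

Comparing the number as an integer rather than as text is what makes leading zeros unnecessary.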

Click on MakePDF to download the program.

(1) Windows may throw the following error: “Windows cannot access the specified device, path, or file. You may not have the appropriate permissions to access the item”
w_m0zart
 
Posts: 7
Joined: 21 Oct 2009, 16:25

Re: Scanning sheetmusic workflow with ScanTailor

Postby Tulon » 31 May 2012, 14:19

Note that if "Right-to-left writing system" is checked when creating a project, it's going to be 1R and 2L rather than 1L and 2R.
When Scan Tailor asks you to enter DPIs manually, never enter arbitrary values. The video tutorial shows how to estimate the real DPI.
Tulon
 
Posts: 536
Joined: 03 Oct 2009, 06:13
Location: London, UK

Re: Scanning sheetmusic workflow with ScanTailor

Postby w_m0zart » 01 Jun 2012, 22:31

For sheet music I don't think a "Right-to-left writing system" exists, except maybe in some sonatas by P.D.Q. Bach. However, in a next update I can change the check to cover all four options (1L|2R|1R|2L). Then it should always just work.
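
A check that covers all four options could look roughly like this. It is a Python sketch with a hypothetical helper name, `page_suffix`, and is not the actual change made to MakePDF.

```python
import re

# Accept both left-to-right (1L, 2R) and right-to-left (1R, 2L) split pages.
SUFFIX = re.compile(r"_(1L|2R|1R|2L)\.tif$", re.IGNORECASE)


def page_suffix(filename):
    """Return the split-page suffix in upper case, or None if there is none."""
    m = SUFFIX.search(filename)
    return m.group(1).upper() if m else None


if __name__ == "__main__":
    for name in ("score_001_1L.tif", "score_001_1R.tif", "score_001.tif"):
        print(name, page_suffix(name))
```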
w_m0zart
 
Posts: 7
Joined: 21 Oct 2009, 16:25

