Introducing spreads: command-line workflow tool

General discussion about software packages and releases, new software you've found, and threads by programmers and script writers.

Moderator: peterZ

jbaiter
Posts: 98
Joined: 17 Jun 2013, 16:42
E-book readers owned: 2
Number of books owned: 0
Country: Germany
Location: Munich, Germany

Introducing spreads: command-line workflow tool

Post by jbaiter »

Ever since I finished my build of the Hackerspace Kit (thanks to @markvdb for making them available in Europe!), I've grown more and more frustrated with the lack of an application that ties together the many incredible programs available.

So I set out to write my own, titled spreads (github)

What it offers:
- Handles the whole workflow: triggering of the cameras, downloading the images, post-processing the images, assembling output formats
- Plugin-based, you can hook into any of the workflow steps and add your own functionality
- Fully parallelized, i.e. ScanTailor will run on all available CPU cores, cameras will be configured and triggered simultaneously
- Fully customizable with a configuration file (documentation coming soon, promised!)
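
To illustrate the "fully parallelized" point, here is a minimal sketch of running one action on several devices at once. It is not taken from the spreads source: `run_in_parallel` and `trigger` are hypothetical names, and it is written for Python 3's standard library rather than the Python 2.7 listed below.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(func, items, max_workers=None):
    """Call func(item) for every item concurrently; results keep input order."""
    items = list(items)
    if not items:
        return []
    # One worker per item by default, so two cameras are triggered together.
    with ThreadPoolExecutor(max_workers=max_workers or len(items)) as pool:
        return list(pool.map(func, items))


def trigger(camera_name):
    # Hypothetical stand-in for a real camera driver call.
    return "captured by " + camera_name


if __name__ == "__main__":
    print(run_in_parallel(trigger, ["left", "right"]))
```

Threads are enough here because waiting on a camera (or on an external program such as ScanTailor started as a subprocess) is mostly idle time for the Python process.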

What you need:
- Two Canon A2200 cameras with CHDK installed (adding support for more cameras should be a breeze, see the documentation)
- A *nix system (developed on Debian, but should run on most others)
- Python 2.7 with distribute and pip installed
- gphoto2 and a version of ptpcam modified for CHDK
- A recent version of ScanTailor-enhanced, if you want to use it for post-processing (you probably do ;-) )
- pdfbeads, if you want to generate a PDF from your scanned pages
- djvubind, if you want to generate a DJVU

Some caveats:
- This tool has only been in development for about a week, so expect bugs, missing documentation, weird edge cases and the like. Just open an issue on GitHub and I'll try to take care of it.
- As has been mentioned above, I've only been able to test it with Canon A2200 cameras (the ones I own). Theoretically, most CHDK-based cameras should work with that driver, but I can't make any promises until someone has tested it :-)
- No GUI yet, but a command-line interface that tries to be sane
- No Windows and OS X support at the moment, might change in the future
- At the moment, installation from PyPI doesn't seem to work yet (I just submitted the project a few minutes ago), so you'll have to install it from GitHub (instructions are in the README).

Try it out, trash it, hack on it and direct feedback of any kind to me, I'm looking forward to improving it :-)
spreads: Command-line workflow assistant
markvdb
Posts: 90
Joined: 28 Dec 2010, 18:45
Number of books owned: 0
Country: Belgium

Re: Introducing spreads: command-line workflow tool

Post by markvdb »

Hello Johannes,

Thank you, thank you, thank you!

Looking forward to testing this very thoroughly, very soon.

Mark
http://diybookscanner.eu - official EU diybookscanner kits - subscribe to our newsletter
jbaiter

Re: Introducing spreads: command-line workflow tool

Post by jbaiter »

Since I can't seem to edit the OP, I'll just update this here:
I released v0.2 yesterday, these are the highlights:
  • New plugin system based on Doug Hellmann's `stevedore` package, allows packages to extend spreads without being included in the core distribution
  • The driver for CHDK cameras no longer relies on gphoto2 and ptpcam, but on a modified version of Abel Deuring's `pyptpchdk` package to communicate with the cameras.
  • `Wand` is now used to deal with image data instead of `Pillow`
  • New 'colorcorrection' plugin allows users to automatically correct white balance.
  • Improved tutorial
What this means:
  • The tutorial now walks you through the installation, configuration and usage of the program, assuming zero previous knowledge and pre-installed packages. It is based on my setup, though, so your cameras should run CHDK and your system should be Debian-based.
  • The installation is less of a pain, now that all the camera-related functionality is only dependent on a single Python package that is automatically installed
  • It is now easier to extend the program with external packages (I'll update the documentation soon!)
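
To give a rough idea of what "hooking into a workflow step" can look like, here is a minimal, self-contained sketch. It deliberately does not use `stevedore` (which discovers plugins through setuptools entry points); the `PluginRegistry` class, the step names and the `fix_white_balance` function are hypothetical and only illustrate the idea, not spreads' actual plugin interface.

```python
# Hypothetical step names; spreads' real plugin interface may differ.
WORKFLOW_STEPS = ("capture", "download", "postprocess", "output")


class PluginRegistry(object):
    """Keeps an ordered list of plugin callables per workflow step."""

    def __init__(self):
        self._hooks = {step: [] for step in WORKFLOW_STEPS}

    def register(self, step, func):
        if step not in self._hooks:
            raise ValueError("unknown workflow step: %s" % step)
        self._hooks[step].append(func)

    def run(self, step, data):
        # Each plugin receives the previous plugin's result.
        for func in self._hooks[step]:
            data = func(data)
        return data


def fix_white_balance(pages):
    # Stand-in for a 'colorcorrection'-style plugin; it only tags page names.
    return [page + " (color-corrected)" for page in pages]


registry = PluginRegistry()
registry.register("postprocess", fix_white_balance)
result = registry.run("postprocess", ["page-001", "page-002"])
```

A step without registered plugins simply passes its input through unchanged, so external packages can add behaviour without the core having to know about them.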
Some things I'll be implementing in the next few days/weeks:
  • OCR postprocessing plugin (using Tesseract and/or Cuneiform)
  • ePub, HTML and CBR output plugins
  • Metadata plugin (I'll see if I can automatically recognize ISBN-Barcodes from a given page...)
I've also been playing around with the idea of assembling a Raspberry Pi image that you would just have to install: plug in the Pi, connect your cameras and triggering device, and start scanning immediately. This would probably also involve developing a client/server plugin, where the Raspberry Pi boxes just deal with the scanning and downloading part (1 GHz and 512 MB of RAM are probably a bit too weak for ScanTailor, and I don't know if it would compile on ARM anyway...) and leave the heavy postprocessing to a more powerful machine on the network. Would there be any interest in something like this?
dtic
Posts: 464
Joined: 06 Mar 2010, 18:03

Re: Introducing spreads: command-line workflow tool

Post by dtic »

Very nice project!

What prevents this from working in Windows too? Are any of the tools/libraries used Linux only? Or is it something else?
jbaiter wrote: Metadata plugin (I'll see if I can automatically recognize ISBN-Barcodes from a given page...)
For bookmark creation I suggest JPdfBookmarks. Easy to use for plain bookmarks. I made a script for it here - Windows only, but the code is short and you'll easily see how it works. I also have a newer experimental version of the script (not online) that tries to do these steps:
(1) OCR the first 10 pages of the scantailor output,
(2) do a regular expression search for the first occurrence of a string of digits/hyphens/spaces right after "isbn" in the text,
(3) search online for metadata, including table of contents, for that ISBN.
The result can then be shown to the user for a quick edit and then added to the pdf using a script and JPdfBookmarks.
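
Step (2) above could be sketched in Python roughly as follows. This is not dtic's script: the pattern, the `find_isbn` name and the length check are assumptions, and the check digit is not verified.

```python
import re

# "isbn", up to five separator characters, then a run of digits, hyphens
# and spaces that ends in a digit or an X (the ISBN-10 check character).
ISBN_PATTERN = re.compile(
    r"isbn[^0-9]{0,5}([0-9][0-9\- ]{8,16}[0-9Xx])", re.IGNORECASE
)


def find_isbn(ocr_text):
    """Return the first ISBN-like number after 'isbn' in ocr_text, or None."""
    match = ISBN_PATTERN.search(ocr_text)
    if match is None:
        return None
    # Drop hyphens and spaces so only the bare number remains.
    digits = re.sub(r"[\s\-]", "", match.group(1)).upper()
    # Only accept the two standard lengths (ISBN-10 and ISBN-13).
    if len(digits) in (10, 13):
        return digits
    return None


if __name__ == "__main__":
    print(find_isbn("Printed in Germany. ISBN: 978-3-16-148410-0 First edition"))
```

OCR noise (a letter O instead of a zero, or digits directly following the ISBN) will defeat this simple pattern, which is one reason to show the result to the user for a quick edit.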
jbaiter

Re: Introducing spreads: command-line workflow tool

Post by jbaiter »

dtic wrote: What prevents this from working in Windows too? Are any of the tools/libraries used Linux only? Or is it something else?
At the moment just the lack of time and the prejudice that Windows users have problems with command-line applications :-)
On the more technical side, though, I don't have a lot of experience with how the whole Python ecosystem (pip/setuptools/pkg_resources) works on Windows, and since the plugin system relies a lot on that, I would first have to look into it. I think that all of the Python packages used should compile just fine (as they're either written in Python themselves or in rather platform-agnostic C).
There's also the inconvenience of having to install a custom USB driver for every device you want to use, but I guess there's no way around that at the moment.
I'll see if I can find some time to get it working in a Windows XP VM, maybe it's not as problematic as my gut tells me .)

JPdfBookmarks looks really nifty, should be fairly straightforward to create a plugin for it, thanks for the tip!
The approach to obtaining metadata is interesting, too, I was originally thinking of looking for barcode images with Tesseract, but this might work even better :-)
dtic

Re: Introducing spreads: command-line workflow tool

Post by dtic »

Ok, good news. I will try to get it to run on Windows then, when I have some time to spare. I've used pip for Python on Windows before, but there might of course be hiccups related to some particular components here. We will see. To appease those with command-line aversion, a small GUI frontend would likely be pretty easy to make.
jbaiter

Re: Introducing spreads: command-line workflow tool

Post by jbaiter »

Great! Keep me posted, I'd be glad to help :-)
Concerning the UI, yes, it shouldn't be too difficult. I'm currently playing around with PySide/PyQt, the plan is to create a very simple wizard that mirrors the CLI 'wizard' mode, i.e. very few settings, very simple to use.
With a GUI, spreads could also show a preview of the images during capture (as we can grab the camera's viewport with the module I'm using), which could help a lot with some setups.
CaptOn
Posts: 17
Joined: 16 Jan 2013, 22:08
E-book readers owned: Kindle, windows tablet
Number of books owned: 20
Country: Australia

Re: Introducing spreads: command-line workflow tool

Post by CaptOn »

What is the prejudice that Windows users have against the command line?

I'm working on a similar complete workflow setup using Windows PowerShell scripts. I figure most of the people I want to "convert" to book scanning are using Windows 7, so I want to hand them a fairly complete solution. Most Windows users I've met can't debug command-line tools, whereas most Linux users I've met are happy to translate something from PowerShell to bash.

I'm cataloguing my dependencies at the moment, and trying to cut everything down to gphoto, ImageMagick and Tesseract. I'm still not sure how I will compile everything in the end into PDF/HTML/EPUB/Doc/DJVU.

Do you guys know any good places to start? I'm pretty handy with HTML and PHP, so I've been hoping that if I can get things into HTML, I can just parse that into the other formats, but I suspect I'll run into pagination issues.

So yeah, long story short, I'm probably going to have to create my own, but am hoping you can help me. I'm looking for a tool that lets me draw boxes or polygons overlaid on the images, so that I can then copy the points to my ISE, kind of in the manner of Book Scan Wizard. Anyone know any tools that do this?
dtic

Re: Introducing spreads: command-line workflow tool

Post by dtic »

CaptOn: I haven't gotten around to testing spreads on Windows yet, but the chances of it working there look good. Since jbaiter's plans for spreads seem to cover the same things you want, why not join him in working on spreads?
CaptOn wrote: I'm looking for a tool that can help me just create boxes or polygons overlaid on the images that I can then copy the points to my ISE, kind of in the manner of Book Scan Wizard.
Not sure what you mean here. Are you looking for tools for constructing a GUI in an application? Or tools for assembling OCR'ed text and page images into a djvu/pdf file?
jbaiter

Re: Introducing spreads: command-line workflow tool

Post by jbaiter »

CaptOn wrote: What is the prejudice that Windows users have against the command line?
Oh, I was talking about my own prejudice against Windows users. In my experience, the concept of a shell is somewhat foreign to most people who use Windows as their primary operating system. But that argument against spreads is kind of passé now, as I was able to put together a small GUI wizard last week that does most of the things the command-line version does.
CaptOn wrote: I'm cataloguing my dependencies at the moment, and trying to cut everything down to gphoto, ImageMagick and Tesseract. I'm still not sure how I will compile everything in the end into PDF/HTML/EPUB/Doc/DJVU.
That's basically the stack that spreads offers, too, except that gphoto isn't required, and you can plug ScanTailor in there, too. The Tesseract plugin has also made significant progress this week; I'll commit it later this weekend. Concerning output, that's not so difficult: there are some great tools available, like pdfbeads and djvubind. For HTML, you can just pipe the hOCR output from Tesseract through a little XSLT stylesheet that removes the bounding box information. You can then use that HTML to convert to ePub (working on that, too).
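
As a rough illustration of removing the bounding box information from hOCR, here is a small sketch that uses Python's standard XML parser instead of the XSLT stylesheet mentioned above. hOCR keeps its layout data (`bbox ...` and similar) in `title` attributes, so the sketch simply drops those. The function name and the sample markup are made up for the example, and real Tesseract output carries a doctype and an XHTML namespace that this sketch does not treat specially.

```python
import xml.etree.ElementTree as ET


def strip_hocr_layout(hocr_markup):
    """Return the markup with every 'title' attribute (bbox etc.) removed."""
    root = ET.fromstring(hocr_markup)
    for element in root.iter():
        # hOCR stores bounding boxes and confidences in the title attribute.
        element.attrib.pop("title", None)
    return ET.tostring(root, encoding="unicode")


# Tiny hand-written sample, not actual Tesseract output.
SAMPLE = (
    "<div class='ocr_page' title='bbox 0 0 800 1200'>"
    "<span class='ocr_line' title='bbox 50 60 750 100'>Hello world</span>"
    "</div>"
)

cleaned = strip_hocr_layout(SAMPLE)
```

The `class` attributes are left in place, so a later conversion step can still tell pages, paragraphs and lines apart.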
CaptOn wrote: Do you guys know any good places to start? I'm pretty handy with HTML and PHP, so I've been hoping that if I can get things into HTML, I can just parse that into the other formats, but I suspect I'll run into pagination issues.
So yeah, long story short, I'm probably going to have to create my own, but am hoping you can help me. I'm looking for a tool that lets me draw boxes or polygons overlaid on the images, so that I can then copy the points to my ISE, kind of in the manner of Book Scan Wizard. Anyone know any tools that do this?
What do you mean by "if you can get things into HTML"? There's little to no HTML involved in getting a physical book into digital form, except for the last step, when you convert the OCR output to HTML. I'm also not sure what you mean by 'pagination issues'. And what is ISE? I'm afraid you'll have to be a bit more specific.
I'd be glad to help, if you're really intent on creating something of your own, whatever that is, but maybe we could join efforts and you help me get spreads running on Windows 7?