It occurred to me that the processing of colored text is a constrained problem. Most books are b/w. Some coffee table books have lots of photos. But red-letter Bibles have a restricted palette of two ink colors with which the book was printed. And even coffee-table books are often printed with a 4-color (CMYK) process. Except for the odd coffee-table book of art and the like, every imaged point starts out in one of a discrete number of states (e.g. black or red on white), and then things like optics, lighting, and quantization error cause the sensor to output a pixel of a different color.
This may seem like hunting a fly with an elephant gun, but this problem seems to fit a Hidden Markov Model (the kind the speech recognition guys use). Printing N-1 inks onto an Nth-colored page defines an N-state machine. The HMM would map each pixel to the most likely color (drawn from that palette) on the paper at that point. I have in mind an algorithm that would accept as input an image, a palette, and a background color, then output the maximum-likelihood image generated by the model. Has anyone thought of using something like this in image processing? (Not OCR, binarization)
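To make the idea concrete, here is a minimal sketch of what such a decoder might look like, under assumptions I'm inventing for illustration: the palette colors are the HMM states, pixel noise is modeled as an isotropic Gaussian around each palette color (the emission model), and adjacent pixels along a scanline tend to stay the same color (the transition model, with a `stay_prob` parameter). Viterbi decoding then picks the maximum-likelihood palette color per pixel. The function name and parameters are hypothetical, not from any existing library.

```python
import numpy as np

def viterbi_row(pixels, palette, stay_prob=0.9, sigma=30.0):
    """Decode one scanline of noisy RGB pixels to palette indices.

    pixels  : (W, 3) array of observed RGB values
    palette : (K, 3) array of ink/background RGB values (the HMM states)
    stay_prob : prior probability that the next pixel keeps the same color
    sigma   : assumed std. dev. of Gaussian sensor noise per channel
    """
    K, W = len(palette), len(pixels)

    # Transition log-probabilities: favor staying in the same state.
    trans = np.full((K, K), (1.0 - stay_prob) / (K - 1))
    np.fill_diagonal(trans, stay_prob)
    log_trans = np.log(trans)

    # Emission log-likelihoods: isotropic Gaussian around each palette color
    # (constant terms dropped, since they don't affect the argmax).
    diff = pixels[:, None, :] - palette[None, :, :]          # (W, K, 3)
    log_em = -np.sum(diff ** 2, axis=2) / (2.0 * sigma ** 2)  # (W, K)

    # Viterbi forward pass.
    dp = log_em[0].copy()
    back = np.zeros((W, K), dtype=int)
    for t in range(1, W):
        scores = dp[:, None] + log_trans        # (prev state, current state)
        back[t] = np.argmax(scores, axis=0)
        dp = scores[back[t], np.arange(K)] + log_em[t]

    # Backtrack the most likely state sequence.
    states = np.zeros(W, dtype=int)
    states[-1] = np.argmax(dp)
    for t in range(W - 2, -1, -1):
        states[t] = back[t + 1, states[t + 1]]
    return states

# Example: white paper, black ink, red ink.
palette = np.array([[255, 255, 255], [0, 0, 0], [200, 0, 0]], dtype=float)
noisy = np.array([[250, 250, 250], [252, 248, 251], [10, 5, 8],
                  [12, 12, 12], [190, 10, 5]], dtype=float)
print(viterbi_row(noisy, palette))  # palette index per pixel
```

The cleaned image is then just `palette[states]` for each row. A real version would want a 2-D neighborhood model rather than independent scanlines, but even this 1-D form captures the "ink runs tend to continue" prior the post is gesturing at.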
Obviously, I don't know what I'm talking about. Maybe one of you boffins will find this interesting.