I have a JPG file of a single page from a fiction book. FineReader
performs well in recognition, and very well in formatting the document,
but sometimes the formatting shown in the Text window is not carried
over to the exported document: the Text window shows the correct
paragraph indentation, but the exported file does not. Nothing in the
export format itself should prevent this, which is why I think there
could be a bug in FineReader's export. Could you please investigate the
issue? I love FineReader, and I think it's better than any other OCR
package out there, but this export issue makes me cry.

I'm using HTML export with Retain Layout set to Exact copy and Save mode
set to Full (use CSS). Most of the time the export is correct, but
sometimes, annoyingly, it is not, even though HTML with CSS is perfectly
capable of representing the indentation.

I have included the original JPG image, along with the exported HTML and
PDF output. I draw your attention to these paragraphs:
(indented properly) "Eighty-five is the best I can do."
(indented properly) "Okay, I'll talk him into eighty-five. But just for
you. I wouldn't do it for anybody else."
(NOT indented properly) "You're a sweetheart."
...
(indented properly) "Not earned out yet? Are you sure?"
(indented properly) "Sad but true."
(indented properly) "Hmm. Well, I guess Sheldon can live with a million
until the next royalty checks come in. In his tax bracket, it isn't so
bad."
(NOT indented properly) "The self-discipline will be good for him."
(NOT indented properly) "But how about making it a two-book deal?"
(rest of page NOT indented properly, except for last paragraph)
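
For anyone who wants to reproduce the list above without eyeballing the
page, here is a rough Python sketch that reports which paragraphs in an
exported HTML file carry a text-indent. It is only a sketch under
assumptions I have not confirmed against FineReader's actual output:
that each paragraph is a <p> element, and that indentation appears as a
CSS text-indent either in an inline style attribute or in a simple
class rule inside a <style> block. The function name paragraph_indents
is my own.

```python
import re
from html.parser import HTMLParser

# Assumption: indentation is expressed with the CSS text-indent property.
INDENT_RE = re.compile(r"text-indent\s*:\s*([^;}]+)", re.IGNORECASE)
# Matches simple class rules such as ".p1 { ... }" or "p.p1 { ... }".
RULE_RE = re.compile(r"\.([\w-]+)\s*\{([^}]*)\}")


class ParagraphIndentParser(HTMLParser):
    """Collects (text-indent value or None, paragraph text) for each <p>."""

    def __init__(self):
        super().__init__()
        self.class_indent = {}   # class name -> text-indent value
        self.paragraphs = []     # list of (indent, text)
        self._in_style = False
        self._current = None     # [indent, list of text pieces]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "style":
            self._in_style = True
        elif tag == "p":
            indent = None
            # Class rules first, then let an inline style override them.
            for cls in (attrs.get("class") or "").split():
                indent = self.class_indent.get(cls, indent)
            match = INDENT_RE.search(attrs.get("style") or "")
            if match:
                indent = match.group(1).strip()
            self._current = [indent, []]

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False
        elif tag == "p" and self._current is not None:
            indent, parts = self._current
            self.paragraphs.append((indent, "".join(parts).strip()))
            self._current = None

    def handle_data(self, data):
        if self._in_style:
            for name, body in RULE_RE.findall(data):
                match = INDENT_RE.search(body)
                if match:
                    self.class_indent[name] = match.group(1).strip()
        elif self._current is not None:
            self._current[1].append(data)


def paragraph_indents(html):
    """Return a list of (indent, text) pairs, one per <p> element."""
    parser = ParagraphIndentParser()
    parser.feed(html)
    parser.close()
    return parser.paragraphs
```

Paragraphs that come back with None have no text-indent at all under
these assumptions. A value such as 0 is still reported as a value, so
it needs to be read by eye.
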

I checked the other exports, and found the following (each with Exact
Copy selected):
HTML: Not formatted properly
PDF: Formatted properly
RTF: Not formatted properly
DOC: Not formatted properly
XML: Not formatted properly

Here are my settings:

Document:
  Document languages: English
  Document print type: Autodetect
  (all other options not selected)
Scan/Open:
  Automatically read acquired page images
  Image Processing
    X Correct image skew
    X Detect page orientation
    (all other options not selected)
Read:
  Thorough reading
  Table processing
    (all options not selected)
  Training
    Do not use user patterns
Save:
  HTML:
    Retain Layout: Exact copy
    Save mode: Full (use CSS)
    Text Settings:
      X Use solid line as page break
      X Keep headings and footers
      (all other options not selected)
    Picture Settings: Medium (for screen)
    Character encoding:
      Code page: (Automatic)
      Code page type: Windows