PDF from scanner + OCR

I create PDFs from a scanner and use OCR to make the PDF files searchable.
It seems that the OCR does not simply add the recognised text as metadata, but changes the scanned document itself, replacing the recognised text with characters in a similar-looking typeface.
So in the middle of a line of text in the scanned document, I see some words that look too perfect for a scan, while other words that were not recognised are still rendered as an image. The scanned document looks as if it has been artificially modified, like a "counterfeit". This is unacceptable.
Is it possible to have the PDF file accurately reproduce the scanned document and keep the recognised text as metadata, as Adobe Acrobat does?
Thanks, Claudio.
