Views
336
Replies
4
Status
Closed
I have a batch of TIFF images that were scanned in grayscale mode at 600 dpi. Each one takes about 32 MB of disk space. The images are typical office documents (mostly text with a few logos) that are being processed by OCR.
My main concern is: what is the best way to get as much text recognized as possible? I chose 600 dpi in order to capture even the smallest type. The grayscale scan leaves a lot of "gray dust" in areas where the original paper page was pure white. Is there a Photoshop filter that will leave the white background really white? If such a filter exists and I apply it, will it affect the OCR recognition, positively or negatively?
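To make the "gray dust" question concrete, here is a minimal sketch of what such a filter would do to the pixel data: every value at or above a cutoff is forced to pure white, similar in spirit to pulling in the white input slider of a Levels adjustment. The function name `whiten_background` and the cutoff of 200 are assumptions chosen for illustration, not settings taken from Photoshop or from any OCR package, and the sketch says nothing about whether recognition improves.

```python
def whiten_background(pixels, white_point=200):
    """Return a copy of a grayscale image with near-white pixels forced to 255.

    pixels: list of rows, each a list of ints in 0..255 (0 = black, 255 = white).
    white_point: hypothetical cutoff; values at or above it become pure white.
    """
    return [[255 if p >= white_point else p for p in row] for row in pixels]


# Tiny made-up "page": light gray dust (238..251) around two dark text pixels.
page = [
    [250, 243, 12],
    [238, 90, 251],
]
cleaned = whiten_background(page)
print(cleaned)  # [[255, 255, 12], [255, 90, 255]]
```

Dark and mid-gray pixels are left untouched, so the edges of the type keep their anti-aliasing; only the background changes.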
Since I won’t have access to the documents forever, I am trying to get the most complete file at scan time, but I may be overdoing it.
Should I reduce the sampling to 300 dpi? Or perhaps I should stick with 600 dpi but scan in black and white?
Finally, how do I change a 600 dpi TIFF to 300 dpi?
And how do I change a grayscale image to B&W? (Both with Acrobat.)
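For clarity about what the two conversions mean at the pixel level, here is a rough sketch on a toy image: halving the resolution by averaging each 2x2 block (600 dpi to 300 dpi), then mapping each gray value to black or white with a fixed threshold. The function names and the threshold of 128 are assumptions for illustration; real tools may use other resampling and thresholding methods.

```python
def downsample_by_half(pixels):
    """Average each 2x2 block, halving the resolution (e.g. 600 dpi -> 300 dpi)."""
    height = len(pixels) - len(pixels) % 2      # ignore a trailing odd row
    width = len(pixels[0]) - len(pixels[0]) % 2  # ignore a trailing odd column
    return [
        [
            (pixels[y][x] + pixels[y][x + 1]
             + pixels[y + 1][x] + pixels[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def to_black_and_white(pixels, threshold=128):
    """Map each gray value to 0 (black) or 1 (white) using a fixed threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in pixels]


# Made-up 4x4 grayscale scan (0 = black, 255 = white).
scan = [
    [255, 255, 0, 0],
    [255, 251, 0, 40],
    [10, 20, 240, 250],
    [30, 20, 255, 255],
]
half = downsample_by_half(scan)
bilevel = to_black_and_white(half)
print(half)     # [[254, 10], [20, 250]]
print(bilevel)  # [[1, 0], [0, 1]]
```

The order matters: downsampling the grayscale data first and thresholding afterwards keeps more edge detail than thresholding at 600 dpi and then shrinking a 1-bit image.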
My OCR software (ABBYY FineReader) takes the original file that I provide and makes a working copy, which is the one that actually gets OCR’d. The file that I provide is 32 MB and the working copy is 100 KB. They achieve that by (1) converting from grayscale to B&W and (2) applying some compression (lossy or lossless? I don’t know).
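The sizes above can be sanity-checked with some quick arithmetic. The sketch below assumes a US Letter page (8.5 x 11 inches), which is a guess and not something stated in this post, and it computes raw uncompressed sizes only.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw image size in bytes for a page scanned at the given resolution."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed page size: US Letter, 8.5 x 11 inches.
gray_600 = uncompressed_bytes(8.5, 11, 600, 8)  # 8-bit grayscale at 600 dpi
gray_300 = uncompressed_bytes(8.5, 11, 300, 8)  # 8-bit grayscale at 300 dpi
bw_600 = uncompressed_bytes(8.5, 11, 600, 1)    # 1-bit black and white at 600 dpi

print(gray_600)  # 33660000 bytes, about 32 MiB
print(gray_300)  # 8415000 bytes, about 8 MiB
print(bw_600)    # 4207500 bytes, about 4 MiB
```

Under that assumption the 600 dpi grayscale figure matches the 32 MB files. Going to 1-bit at the same resolution only gets down to about 4 MB, so a 100 KB working copy implies roughly another 40-fold reduction from compression (or from a lower working resolution); the arithmetic alone cannot tell whether that compression is lossy.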
Thanks in advance,
-Ramon F. Herrera