Automated straightening of skewed text scans?

Mark
Jan 17, 2005
Hello,

I have a large number of high-resolution, full-color scans of a (primarily) text document. Unfortunately, many of the scans are visibly skewed at different angles (i.e., where the text is not horizontal).

I’d like to straighten out these skewed images while preserving as much of the image quality as possible, and to do it quickly.

Of course, I can de-skew these images manually in Paint Shop Pro using the "free rotate" feature, and by trial and error (by setting the correct angle of rotation), can get the text lines reasonably horizontal.

Unfortunately, this is laborious when one has almost 500 scans to do (and possibly many more in the future.)

As a possible solution, OCR applications autodetect when text lines are not perfectly horizontal and do the necessary rotation before they perform OCR. However, the three I tried (ABBYY Reader, OmniPage and TextBridge), do other pre-processing on the images, so the saved de-skewed images have been altered much more than what PSP-by-hand gives (which is pretty faithful to the original source.) OmniPage and TextBridge lower the resolution and filter the images (entirely unacceptable), while ABBYY preserves the input resolution and color-depth, but does some odd subtle filtering (also unacceptable — I could not turn this off in ABBYY.)

So what are my options? (Any other OCR to try?)

If I have to do this manually, I’d certainly like the process to be faster and more fool-proof, without need to do any trial and error.

What sayest the experts? Please reply in this newsgroup if you can (the return email address does not work.) Are there any other forums I can post this request to?

Thanks.

Mark

JoeB
Jan 17, 2005
Have you tried the Straighten tool in PSP? It’s one of the Deform tools, and you just move the line and drag it underneath the off-kilter line of text, hit the checkmark in the tool options palette, and it straightens the document. That’s at least faster than trying to guess how much to rotate with the Free Rotate option.

As to other forums, you can sign up for the Jasc user forum through the Corel/Jasc site. It has had both a web interface and a newsgroup (NNTP) interface. However, the NNTP service has been down since last Wednesday. Many people find the web interface too clunky, so they aren’t frequenting it much while waiting for the NNTP service to come back (if it does). Also, you can’t post attachments with the web forum (even though it says that you can).

HTH

Regards,

JoeB
Trev
Jan 17, 2005
PSP 9 has a Straighten tool. Find it by clicking the flyout arrow on the Deform tools.
If you don’t have it, you can download the try-and-buy version.

To use it, just move one of the nodes at the end of the line to a straight line in your image, say the top of a line of text. Then move the other node to the opposite end of the line or text and click the Apply icon. It works for horizontal or vertical lines.
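
The two-node idea behind the Straighten tool comes down to simple trigonometry: the two points picked along a line of text define its slope, and the arctangent of that slope is the skew angle. Below is a minimal Python sketch of that calculation. It is not PSP's actual implementation (which is not described in this thread); the function name `skew_angle_degrees` and the sample coordinates are made up for illustration.

```python
import math


def skew_angle_degrees(x1, y1, x2, y2):
    """Angle of the line through two picked points, relative to horizontal.

    Image coordinates are assumed: x grows to the right, y grows downward.
    A positive result means the line drops toward the right.
    The result lies in the range [-90, 90] degrees.
    """
    if (x1, y1) == (x2, y2):
        raise ValueError("the two points must differ")
    # Order the points left to right so the answer does not depend on
    # which node was placed first.
    if x2 < x1:
        x1, y1, x2, y2 = x2, y2, x1, y1
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


if __name__ == "__main__":
    # Hypothetical nodes placed under the two ends of one line of text.
    angle = skew_angle_degrees(100, 500, 2100, 535)
    print("skew: %.2f degrees" % angle)
```

With nodes at (100, 500) and (2100, 535) this comes out at roughly 1.0 degree. The correction is a rotation of the same size in the opposite direction; which sign an editor treats as clockwise varies, so it is worth checking on one page first.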
jjs
Jan 17, 2005
Do NOT confuse the straighten tool with File-Automate-Straighten and Crop. It is so nice and automagic, but with text it will make a new picture of every single _word_ in the page. 🙁
jjs
Jan 17, 2005
"Trev" <trevbowdenATdsl.pipexDOTnet> wrote in message

PSP 9 has a Straighten tool, Find It by clicking the flyout arrow on the deform tools.

Deform tools? Where is it, please? Or do you mean the measure tool, followed by rotate canvas?
Rick
Jan 17, 2005
Precise text autorotate/deskew is available in high-end OCR packages that cost several thousand dollars (e.g., Prime Recognition). Have you considered letting your OCR software do its preprocessing and, if the results are reasonably accurate, simply pasting the text into a PDF file or any other format you prefer?

As long as you’re treating these documents as simple graphics instead of text with embedded graphics, you’ll have these kinds of issues.
Fred Hiltz
Jan 17, 2005
jjs wrote:
"Trev" <trevbowdenATdsl.pipexDOTnet> wrote in message
PSP 9 has a Straighten tool, Find It by clicking the flyout arrow on the deform tools.

Deform tools? Where is it, please? Or do you mean the measure tool, followed by rotate canvas?

PSP has no Measure tool. It is not a clone of Photoshop, but has different names for many of the tools and filters. The Deform tool is exactly where Trev says it is. Try Help > Help Topics > Contents > Getting to Know the Program > Exploring the User Interface. After you have seen what things are called and where they are located, the posts here will make more sense.

Fred Hiltz, fhiltz at yahoo dot com
Mark
Jan 17, 2005
Tom wrote:
Mark wrote:

I’d like to straighten out these skewed images while preserving as much of the image quality as possible, and to do it quickly.

As long as you’re treating these documents as simple graphics instead of text with embedded graphics you’ll have these kinds of issues.

Let me reiterate: I wish to auto-deskew these text images while maintaining the original image quality as far as possible. This is for both preservation and readability. These images are not just fodder for conversion to digital text, to be thrown away when that is complete; they will be used side-by-side with the digital text.

(I’m now investigating commercial products which may do this auto-deskew, such as Imagenation. Any others out there to consider?)

And thanks to everyone for the feedback on Paint Shop Pro’s Straighten tool. It is quicker than the rotate-by-trial-and-error method, but it is still a lot of manual work. Done one already — 499 left to go in this lot, and maybe thousands more in the near future. 🙂

In the meanwhile, still looking for a push-button solution which won’t cost thousands.
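
One common way automatic deskew is done is the projection-profile method: try a range of candidate angles, and for each one count how many dark pixels land in each row after undoing that tilt. The angle at which the rows of text line up gives the sharpest profile. The following is a minimal pure-Python sketch of that technique under stated assumptions, not a description of how any product named in this thread works. It assumes the page has already been reduced to a list of (x, y) coordinates of dark pixels (loading and thresholding a real scan needs an imaging library and is not shown). The names `estimate_skew` and `synthetic_page`, the 5-degree search range and the 0.1-degree step are all hypothetical choices.

```python
import math
from collections import Counter


def estimate_skew(ink_pixels, max_angle=5.0, step=0.1):
    """Estimate text skew in degrees from (x, y) coordinates of dark pixels.

    Image coordinates are assumed: x grows to the right, y grows downward.
    A positive result means text lines drop toward the right.
    """
    pixels = list(ink_pixels)
    if not pixels:
        raise ValueError("no dark pixels given")
    steps = int(round(max_angle / step))
    best_angle, best_score = 0.0, -1
    for i in range(-steps, steps + 1):
        angle = i * step
        slope = math.tan(math.radians(angle))
        # Undo the candidate tilt: pixels on one text line share a row
        # when the candidate matches the true skew.
        rows = Counter(int(round(y - x * slope)) for x, y in pixels)
        # Sum of squares rewards a few tall peaks over many low ones.
        score = sum(count * count for count in rows.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def synthetic_page(angle_deg, width=600, baselines=range(50, 400, 40)):
    """Hypothetical test data: thin 'text lines' tilted by angle_deg."""
    slope = math.tan(math.radians(angle_deg))
    return [(x, int(round(y0 + x * slope)))
            for y0 in baselines for x in range(width)]


if __name__ == "__main__":
    page = synthetic_page(2.0)
    print("estimated skew:", estimate_skew(page), "degrees")
```

Only the angle is estimated from the thresholded copy. The rotation itself would then be applied once to the untouched full-color original, so the estimation step adds no quality loss of its own.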

Mark
jjs
Jan 17, 2005
"Fred Hiltz" wrote in message

PSP has no Measure tool.

Ooooh, PSP. Sorry, I thought I had cross-posts killed.
jjs
Jan 17, 2005
"Mark" wrote in message

In the meanwhile, still looking for a push-button solution which won’t cost thousands.

Okay, how much will you pay?
Trev
Jan 17, 2005
"jjs" wrote in message
"Fred Hiltz" wrote in message

PSP has no Measure tool.

Ooooh, PSP. Sorry, I thought I had cross-posts killed.
I missed it too


Trev.
West Riding of Yorkshire
The one with the white rose
"I’ve done the calculation and your chances of winning the lottery are identical whether you play or not."
Mark
Jan 18, 2005
I thank everyone who has replied so far to my inquiry about finding automatic deskewing software for straightening out skewed page scans (of primarily text.)

I did research today of available software applications (for Windows), and ran across three applications with demos which I was able to test:

1) TechSoft’s PixEdit 7.0.11

2) Mystik Media’s AutoImager 3.03

3) Spicer’s Imagenation 7.50

PixEdit is a *very* expensive commercial software (like $2000 or so if I read some information correctly!) whose demo version worked *excellently* on some test images. Of course, the demo cannot be used for production since it writes a banner to the center of each output page — and it will also expire in a few days.

AutoImager also worked acceptably well (though it did not give the essentially perfect results which PixEdit gave.) And it is a lot less expensive than PixEdit. In addition, the demo does not overwrite the output image. It will expire in 15 days, so I will have time to experiment more with it. At about $60, it appears to be a steal for deskewing of scanned text images.

Imagenation, however, did not work very well. For images which had a visible skew, albeit slight, it did not detect any skewing, while PixEdit and AutoImager did detect skew and adjusted accordingly. I could not find any adjustments to the Imagenation settings (such as minimum skew before it adjusts). So barring some adjustments which will make it work, I don’t recommend it (it is also fairly expensive, but nowhere near as expensive as PixEdit.)
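
A rough calculation shows why even a slight skew is visible on a high-resolution scan, and what a "minimum skew" setting of the kind guessed at above might look like. This is only an illustrative Python sketch: the letter-width page at 300 dpi, the 0.1-degree threshold and both function names are assumptions, not settings taken from Imagenation or the other products.

```python
import math


def line_drift_pixels(width_px, skew_deg):
    """Vertical drift of a text line across the page width, in pixels."""
    return width_px * math.tan(math.radians(abs(skew_deg)))


def needs_deskew(skew_deg, min_skew_deg=0.1):
    """Hypothetical threshold rule: correct only skews at or above the minimum."""
    return abs(skew_deg) >= min_skew_deg


if __name__ == "__main__":
    width = int(8.5 * 300)  # assumed: letter-width page scanned at 300 dpi
    for skew in (0.05, 0.3, 1.0):
        drift = round(line_drift_pixels(width, skew), 1)
        print(skew, "degrees:", drift, "px drift, deskew:", needs_deskew(skew))
```

At these assumed dimensions a 0.3-degree skew moves a line of text by about 13 pixels from one edge of the page to the other, so a detector whose threshold sits above that angle would leave such pages untouched.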

So, the next obvious question: Are there any other automatic deskewing software applications (which run under Windows) that I should also try?

Thanks.

Mark
JoeB
Jan 18, 2005
While no expert at all in this area, it seems to me that if you’ve found a product that produces what you consider acceptable results for only $60.00, that would be my choice. You can spend more than $60.00 worth of time finding something else, but it’s up to you whether your time is worth it or if you can charge it to your client if they need better results.

I use ABBYY FineReader 6 Corporate (the best OCR solution I’ve found so far), and haven’t noticed the subtle filtering you mentioned in your first post, which seems to make the text work but alters the image in the scanned output. However, that could be simply because my output didn’t have to be a perfect match of the input (in that the output differences were not perceptible enough to matter to me or my clients).

Other than the comments above, I’m afraid I don’t have suggestions for a better way to achieve your goals. If you felt like visiting the Jasc web forum, you could post some sample images and perhaps others could help based on being able to see the problems in posted images.

Regards,

JoeB
Marvin
Jan 18, 2005
I have Readiris Pro 9. It does what I think you want. You can download the trial version 10 at http://www.irisusa.com/products/readiris/pc/index.html.
mitch
Jan 20, 2005
You could try ExperVision’s Typereader
http://www.expervision.com/tr6.htm

I’ve used an old version (3) quite successfully for OCR. Their trial V6 allows 15 saves, but any number of scans.

Don’t know the imaging capabilities of the newer version. The older version would be no good as it autoscans a 1-bit image, although the deskew seems quite sensitive.
