Transym Computer Services

Top OCR Tips
Use a Good Quality Scanner

The better the scanner you use, the better the images it produces.  Accurate images lead to fewer errors and therefore faster, more accurate results.


Always check the images for scanning problems

If you're processing a small number of documents, it's always worth having a quick look at them to check for anything that might cause a problem, such as badly distorted images or correction fluid.  If you're processing large batches, it's essential that you check the scanner too.  A small amount of correction fluid on the glass will cause an error on every single page that you process.


Use 300 or 400 DPI

This is the optimum resolution for representing a normal-sized character.  It provides just the right amount of detail for accuracy and efficiency.  If the resolution is too low, the characters will be difficult to recognize.  If it is too high, the image is slower to process and uses more storage.
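
As a rough illustration of the trade-off, the sketch below converts a type size to pixel height (1 point = 1/72 inch) and estimates the uncompressed size of a black and white page at several resolutions.  The function names, the 10 point type size and the A4 page dimensions are assumptions chosen for illustration; they are not part of any OCR engine.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def char_height_px(point_size, dpi):
    """Approximate pixel height of the type body at a given scan resolution."""
    return point_size * dpi / POINTS_PER_INCH


def page_bytes_bilevel(width_in, height_in, dpi):
    """Uncompressed size in bytes of a page scanned at 1 bit per pixel."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels // 8


# Assumed example: 10 point text on an A4 page (8.27 x 11.69 inches).
for dpi in (150, 300, 400, 600):
    height = char_height_px(10, dpi)
    size_mb = page_bytes_bilevel(8.27, 11.69, dpi) / 1_000_000
    print(f"{dpi} DPI: 10pt text ~{height:.0f} px tall, A4 page ~{size_mb:.1f} MB")
```

Doubling the resolution quadruples the number of pixels, which is why scanning well above 400 DPI costs storage and processing time.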


If the scanner has a scan-for-OCR feature, use it

Some scanners have built-in filters that handle photos and text differently.  Using these filters will help produce a more accurate and readable image.


Scan in black and white

Using colour or grey scale can increase the image file size by a factor of 10 to 50.  To keep the amount of data being processed and stored to a minimum, always scan in black and white where possible.
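
The arithmetic behind the size difference can be sketched as follows.  This assumes the common depths of 1 bit per pixel for black and white, 8 for grey scale and 24 for colour, and it compares uncompressed sizes only.  The ratio between real files also depends on the file format and compression used, so treat these numbers as a lower-level illustration rather than a prediction.

```python
# Assumed bit depths; real scanners and file formats may differ.
BITS_PER_PIXEL = {"black and white": 1, "grey scale": 8, "colour": 24}


def raw_image_bytes(width_px, height_px, mode):
    """Uncompressed image size in bytes for the given colour mode."""
    return width_px * height_px * BITS_PER_PIXEL[mode] // 8


def size_ratio(mode, baseline="black and white"):
    """How many times larger `mode` is than `baseline`, before compression."""
    return BITS_PER_PIXEL[mode] / BITS_PER_PIXEL[baseline]


# Assumed example: an 8 x 10 inch area scanned at 300 DPI (2400 x 3000 pixels).
for mode in BITS_PER_PIXEL:
    size_mb = raw_image_bytes(2400, 3000, mode) / 1_000_000
    print(f"{mode}: {size_mb:.1f} MB uncompressed, {size_ratio(mode):.0f}x black and white")
```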


Keep sectioning turned off unless you need it

Sectioning allows columns in the text to be recognized and read as columns.  If, for example, you have three columns next to each other, then rather than treating the top line of each column as a single sentence broken into three parts, the engine treats it as the top line of a column and reads down accordingly.  Tables, however, need to be read left to right, and with sectioning turned on a table can sometimes be mistaken for columns.
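
The difference in reading order can be shown with a toy model.  In this sketch the column boundaries are supplied by hand and every column has the same number of lines; it only illustrates the two reading orders and is not how TOCR or any other engine detects columns.

```python
def read_without_sectioning(columns):
    """Read straight across the page: line 1 of every column, then line 2, and so on."""
    return [" ".join(row) for row in zip(*columns)]


def read_with_sectioning(columns):
    """Read each column from top to bottom before moving on to the next column."""
    return [line for column in columns for line in column]


# A two-column article: reading across the page mixes the two stories together.
article = [
    ["The scanner was", "set to 300 DPI."],   # left column
    ["Invoices must be", "paid in 30 days."],  # right column
]

# A small table: each row only makes sense when read left to right.
table = [
    ["Item", "Pen", "Ink"],      # first table column
    ["Price", "1.20", "3.50"],   # second table column
]

print(read_without_sectioning(article))  # columns interleaved, hard to read
print(read_with_sectioning(article))     # each column read in full
print(read_without_sectioning(table))    # rows kept together
print(read_with_sectioning(table))       # items separated from their prices
```

With the article, reading down the columns keeps each story intact.  With the table, the same behaviour separates each item from its price, which is the case for leaving sectioning off.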


Post-process the results

Some OCR engines will suggest alternatives for each error discovered (TOCR will return up to 4 alternatives for each character found).  If you know that certain areas of the page can only contain, say, digits, then post-process the output to correct it.
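
A minimal sketch of this kind of correction for a digits-only field is shown below.  The data layout (one list of single-character candidates per position, best guess first) and the function names are hypothetical; they do not reflect the TOCR API, so adapt them to whatever your engine actually returns.

```python
def pick_digit(alternatives):
    """Return the first candidate that is a digit, or the top candidate if none is."""
    for candidate in alternatives:
        if candidate.isascii() and candidate.isdigit():
            return candidate
    return alternatives[0]


def correct_digit_field(char_alternatives):
    """Rebuild a digits-only field from per-character candidate lists."""
    return "".join(pick_digit(alts) for alts in char_alternatives)


# Hypothetical engine output for a field that should read "2015".
field = [["Z", "2", "z"], ["O", "0", "o", "Q"], ["l", "1", "I"], ["5", "S"]]
print(correct_digit_field(field))  # prints 2015
```

If no candidate at a position is a digit, the sketch keeps the engine's top guess so that the position can be flagged for manual review instead of being silently altered.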

Happy Scanning!

Copyright © Transym, 2007. All Rights Reserved