Optimum accuracy and reliability are the driving forces behind Transym, and we believe our software to be the most accurate available. Thousands of companies worldwide have chosen our engine and SDK to add these qualities to their own solutions and to give them the competitive edge.
Features common to all of our products are:
- Designed for Integration
We have produced both versions of TOCR in a format designed for easy integration. Example routines are provided in C, C#, Visual Basic, VB.NET and Delphi to enable fast integration and to provide working solutions.
- Font Independent
Because recognized text is primarily used for searching or formatting, we don't focus on the font that has been used. Instead, we have optimized TOCR to recognize the characters and to let you set the font in which you wish to view or export the text.
- Reliability
Failure to recognize characters and words results in a high number of errors reported on a page and can often cause OCR software to crash or hang. During unattended or batch processing of large amounts of data, this is a significant problem and seriously impairs productivity and therefore profitability.
Transym's intensive testing and training process utilizes over 21,000 image files and has been designed to produce an engine that is not only highly accurate but extremely robust, minimizing the number of errors that are encountered and maximizing the efficiency of the handling of those that do occur.
- Character Accuracy
This is a measure of how many characters are read correctly by the software. Factors that can affect this include creative typefaces, shading, broken or touching characters, skewed and curved baselines, insert errors, space errors and underlined text, all of which can slow down performance and increase the requirement for human intervention in the OCR process.
Through our testing process, TOCR is trained against real-life, multilingual examples of documents and images that have been subject to such distortions. Its recognition ability is optimized for maximum character accuracy.
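As an illustration of what a character accuracy figure can mean, the sketch below uses one common definition: one minus the edit distance between the recognized text and a known-correct reference, divided by the reference length. This is an assumption made for illustration, not a description of how Transym measures accuracy, and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, recognized: str) -> float:
    """Assumed definition: 1 - (edit distance / reference length),
    clamped so the result never drops below 0."""
    if not reference:
        return 1.0 if not recognized else 0.0
    errors = levenshtein(reference, recognized)
    return max(0.0, 1.0 - errors / len(reference))
```

Under this definition, reading "Zone 20" as "2one 2O" counts two substitution errors in seven characters, giving an accuracy of 5/7. Insert errors and space errors are counted in the same way, as insertions or deletions.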
- Word Accuracy
In some circumstances a character can be recognized correctly in isolation but be incorrect in the context of the word in which it is used. In many fonts the images for o and 0, or 2 and Z, can be not only similar but in some cases identical. Recognizing the character alone is therefore not enough; some reference must be made to a collection of commonly used words to help determine which option is correct.
Most other OCR solutions use libraries or dictionaries to perform this function. These are static, language-specific repositories that can struggle to cater for the adoption of new words or the inclusion of quotations and phrases from other languages, such as French or Latin, which are commonly used, especially in legal and medical documents.
At Transym, we use a lexicon which includes words and phrases from many languages, living or dead, to provide a single source of reference offering outstanding word accuracy and consequently outstanding reliability.
In addition, on very poor-quality documents or where characters have been badly reproduced, TOCR will provide up to 4 suggested alternatives during word accuracy checking and carry on processing, so that the document or batch can be completed and the checking process can be performed in the quickest time possible.
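The general idea of resolving look-alike characters against a word list can be sketched as follows. This is a toy illustration only, not TOCR's algorithm or API: the confusion table, the sample lexicon and the function name are all hypothetical, and the limit of 4 simply mirrors the figure quoted above.

```python
from itertools import product

# Hypothetical table of look-alike characters. Each entry starts with
# the character itself so the word as read is always tried first.
CONFUSABLE = {"0": "0o", "o": "o0", "2": "2z", "z": "z2"}


def suggest_alternatives(word, lexicon, max_alternatives=4):
    """Return up to max_alternatives readings of word that appear in
    lexicon, trying every combination of look-alike characters."""
    options = [CONFUSABLE.get(ch, ch) for ch in word]
    matches = []
    for combo in product(*options):
        candidate = "".join(combo)
        if candidate in lexicon and candidate not in matches:
            matches.append(candidate)
            if len(matches) == max_alternatives:
                break
    return matches


# Hypothetical lexicon; a real one would be far larger.
sample_lexicon = {"zoo", "zone", "200"}
print(suggest_alternatives("z0ne", sample_lexicon))  # ['zone']
print(suggest_alternatives("2oo", sample_lexicon))   # ['200', 'zoo']
```

When more than one reading is valid, as with "2oo", the sketch returns every match rather than guessing, which is the role the suggested alternatives play during checking. Trying every combination grows quickly with the number of ambiguous characters, so a production engine would need a more selective search.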
- Optimization for poor backgrounds
The quality of the background of a document can also have an impact on the recognition of characters. Photocopying, faxing and crumpling can deform and distort character images, rendering them difficult to recognize.
TOCR is tested and enhanced using extremes of light and dark backgrounds, deformation and speckle. It is hardened using a vast source of imperfect samples to train the software to identify text as opposed to background defects.
- Automatic Orientation detection
TOCR automatically detects which way up the image or page has been scanned and delivers the recognized text the right way up.
- Scalable performance
Although TOCR only requires a single-processor PC running Windows 95, for large-scale solutions it can be scaled to run on up to 255 processors on a single machine.