Optical Character Recognition (OCR) refers to reading text from paper and translating the resulting images into a form that the computer can manipulate (for example, ASCII codes).
An OCR system enables you to take a scanned document, such as an invoice, bill of lading, form, or printed article, feed it directly into a computer file, and then edit that file using a word processor. OCR systems can include an optical scanner for reading text and sophisticated software for analyzing images. Most OCR systems use software to recognize characters, although some expensive systems use a combination of software and hardware. Advanced OCR systems can read text in a large variety of fonts, but they still have difficulty with handwritten text.
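As a rough illustration of what "software to recognize characters" can mean, the following Python sketch compares a small scanned bitmap against stored templates and returns the closest character along with its ASCII code. The 3x5 templates, the `mismatches` and `recognize` helpers, and the sample glyph are all hypothetical, and real OCR engines rely on far more sophisticated image analysis than this pixel-by-pixel comparison.

```python
# Toy illustration only: hypothetical 3x5 glyph templates, not a real OCR engine.
# '#' is a dark pixel, '.' is a light pixel.
TEMPLATES = {
    "I": ["###",
          ".#.",
          ".#.",
          ".#.",
          "###"],
    "L": ["#..",
          "#..",
          "#..",
          "#..",
          "###"],
    "T": ["###",
          ".#.",
          ".#.",
          ".#.",
          ".#."],
}


def mismatches(glyph, template):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda ch: mismatches(glyph, TEMPLATES[ch]))


# A "scanned" T with one noisy pixel in the second row.
scanned = ["###",
           "##.",
           ".#.",
           ".#.",
           ".#."]

char = recognize(scanned)
ascii_code = ord(char)  # the form a computer can manipulate
print(char, ascii_code)
```

The noisy pixel does not change the result here because the scanned glyph still differs from the "T" template by only one pixel, while it differs from the other templates by more. Handwriting varies far more than a single stray pixel, which is one way to see why it remains difficult.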
The potential of OCR systems is enormous. They enable users to harness the power of computers to access printed documents. OCR is already being used widely in the legal profession, where searches that once required hours or days can now be accomplished in a few seconds.
As labor-intensive as paper forms can be, they will not be disappearing any time soon. The paperless office is not yet a reality, and paper’s portability, with no batteries required, is difficult to replace.
That’s where automated data capture comes in. Using a recognition engine to convert text or handwriting from the printed page into computer-readable characters saves up to 90 percent of the time it would take to enter the information manually.