Welcome to the ABBYY solutions centre

What is OCR?

Suppose you wanted to digitise a magazine article or a printed contract. You could spend hours retyping it and then correcting the typing errors. Or you could convert all the required materials into digital format in a few minutes using a scanner (or a digital camera) and Optical Character Recognition software.

What exactly is meant by OCR?

Optical Character Recognition, or OCR, is a technology that enables you to convert different types of documents, such as scanned paper documents, PDF files or images captured by a digital camera, into editable and searchable data.

Imagine you have a paper document, for example a magazine article or a brochure, or a PDF contract that your partner sent to you by email. A scanner alone is not enough to make this information available for editing, say in Microsoft Word. All a scanner can do is create an image, or snapshot, of the document that is nothing more than a collection of black and white or colour dots, known as a raster image. To extract and repurpose data from scanned documents, camera images or image-only PDFs, you need OCR software that singles out the letters in the image, assembles them into words and then assembles the words into sentences, so that you can access and edit the content of the original document.

ABBYY FineReader Diagram

What technology lies behind OCR?

Let's take a look at how FineReader OCR recognises text. First, the program analyses the structure of the document image. It divides the page into elements such as blocks of text, tables and images. The lines of text are divided into words and then into characters. Once the characters have been singled out, the program compares them with a set of pattern images and advances numerous hypotheses about what each character is. Based on these hypotheses, the program analyses different ways of breaking lines into words and words into characters. After processing a huge number of such probabilistic hypotheses, the program makes its final decision and presents you with the recognised text.
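To make the idea of comparing characters with pattern images more concrete, here is a minimal Python sketch. It is a toy illustration only and is not ABBYY FineReader's actual algorithm or API: the tiny 5x3 bitmaps, the PATTERNS table and the functions similarity, hypotheses and recognise are all invented for this example. It scores one already-isolated character against each pattern and ranks the resulting hypotheses; the earlier steps (page analysis and the breaking of lines into words and characters) are left out.

```python
# Toy illustration only: not ABBYY FineReader's actual algorithm or API.
# Each pattern is a hypothetical 5x3 bitmap: '#' is ink, '.' is background.
PATTERNS = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
}


def similarity(glyph, pattern):
    """Return the fraction of pixels on which two equally sized bitmaps agree."""
    total = 0
    same = 0
    for glyph_row, pattern_row in zip(glyph, pattern):
        for g, p in zip(glyph_row, pattern_row):
            total += 1
            if g == p:
                same += 1
    return same / total


def hypotheses(glyph):
    """Return (character, score) pairs for one glyph, best match first."""
    scored = [(ch, similarity(glyph, pat)) for ch, pat in PATTERNS.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def recognise(glyphs):
    """Pick the top-ranked hypothesis for each glyph and join the characters."""
    return "".join(hypotheses(glyph)[0][0] for glyph in glyphs)


# A 'T' whose bottom pixel was lost in scanning still ranks 'T' first.
noisy_t = ["###", ".#.", ".#.", ".#.", "..."]
for character, score in hypotheses(noisy_t):
    print(character, round(score, 2))
print(recognise([PATTERNS["L"], PATTERNS["O"], noisy_t]))
```

A real OCR engine works with far richer patterns and, as described above, also weighs competing ways of splitting lines into words and characters before it settles on a result. This sketch only shows the ranking of hypotheses for a single character.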

How do you use OCR software?

Using ABBYY FineReader OCR is easy. The process generally consists of three stages: Open (Scan) the document, Recognise it and then Save it in a convenient format (DOC, RTF, XLS, PDF, HTML, TXT, etc.) or export the data directly to an application such as Microsoft Word, Excel or Adobe Acrobat.

In addition, the latest version of ABBYY FineReader supports an Automated Tasks mode, which is essential when you deal with routine tasks regularly. With this feature, recognition tasks run automatically, so you do not have to carry out each of the above steps manually.

What benefits does OCR bring to you?

With FineReader OCR, the recognised document looks just like the original. Advanced, powerful OCR software saves you a lot of time and effort when creating, processing and repurposing various documents. With ABBYY FineReader OCR, you can scan paper documents for further editing and sharing with your colleagues and partners. You can extract quotes from books and magazines and use them in your coursework and papers without retyping them. With a digital camera and FineReader OCR, you can capture text outdoors from banners, posters and timetables and then use the captured information for your own purposes. In the same way, you can capture information from paper documents and books, for example if there is no scanner close at hand or you cannot use one. In addition, you can use OCR software to create searchable PDF archives.

The entire process of converting data from the original paper document, image or PDF takes less than a minute, and the final recognised document looks just like the original!