What is Intelligent Character Recognition?

Jesse Spencer-Davenport
4 min read · Feb 10, 2021


Image provided by author

Intelligent character recognition recognizes letters and numbers by analyzing features like lines, line intersections, and closed loops. It combines this feature analysis with traditional pixel-based processing to achieve high-accuracy character recognition.

For example, an “O” is a closed loop, but a “C” is an open loop. These features are compared to vector-like representations of a character, rather than pixel-based representations. Because intelligent character recognition looks at features instead of pixels, it works well on multiple fonts and with handprinted characters.

Intelligent character recognition is an advance in a technology known as optical character recognition (OCR).

How Traditional OCR Works

Traditional OCR uses a “matrix matching” algorithm to identify characters using pattern recognition. The character on the document’s image may look like this:

Image provided by author

It is compared to a stored example that looks like this:

Image provided by author

By comparing a matrix of pixels between the character on the image and the stored example, the software determines the character is a “G”.

Image provided by author

Seems like a good approach, but beware of the pitfalls! Because the software compares text to stored examples pixel by pixel, the text must look very similar to those examples. Even if hundreds of examples are stored for a single character, problems often arise when matching text on poor-quality images or in uncommon fonts.
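
As a rough illustration of matrix matching, here is a minimal sketch. It assumes tiny 5x5 black-and-white bitmaps, and the names TEMPLATES, match_score, and matrix_match are made up for this example rather than taken from any real OCR engine.

```python
# Hypothetical 5x5 stored examples (1 = ink, 0 = background).
TEMPLATES = {
    "G": ["01110",
          "10000",
          "10111",
          "10001",
          "01110"],
    "C": ["01110",
          "10000",
          "10000",
          "10000",
          "01110"],
    "O": ["01110",
          "10001",
          "10001",
          "10001",
          "01110"],
}


def match_score(glyph, template):
    """Fraction of pixels that agree between two equal-sized bitmaps."""
    total = sum(len(row) for row in template)
    same = sum(
        g == t
        for g_row, t_row in zip(glyph, template)
        for g, t in zip(g_row, t_row)
    )
    return same / total


def matrix_match(glyph):
    """Return (best_character, score) across all stored examples."""
    scored = ((ch, match_score(glyph, tpl)) for ch, tpl in TEMPLATES.items())
    return max(scored, key=lambda pair: pair[1])


clean_g = TEMPLATES["G"]
noisy_g = ["01110",
           "10000",
           "10011",  # one ink pixel lost to a poor scan
           "10001",
           "01110"]

print(matrix_match(clean_g))
print(matrix_match(noisy_g))
```

In this toy case a single lost pixel lowers the score for “G” and raises the score for “O”, which is the kind of drift that causes trouble on poor-quality images.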

How Intelligent Character Recognition Uses Features

Intelligent character recognition decomposes characters into their component features rather than comparing pixels to known examples.

Instead of pixels, features:

Image provided by author

Features that match how the character is drawn are often easier for software to understand because the margin of error is smaller. Feature detection is also less susceptible to errors caused by random pixelization.

Image provided by author
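
To make one such feature concrete, the sketch below counts closed loops in a small bitmap. It is only an illustration under simple assumptions (a clean black-and-white bitmap, and a hypothetical function named count_closed_loops), not how any particular engine extracts features. The idea: any background area that cannot be reached from the edge of the image must be enclosed by ink.

```python
from collections import deque


def count_closed_loops(bitmap):
    """Count background regions fully enclosed by ink (1 = ink, 0 = background)."""
    rows, cols = len(bitmap), len(bitmap[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(r0, c0):
        # Mark every background pixel connected to (r0, c0) as visited.
        queue = deque([(r0, c0)])
        seen[r0][c0] = True
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not seen[nr][nc] and bitmap[nr][nc] == "0":
                    seen[nr][nc] = True
                    queue.append((nr, nc))

    # Step 1: mark the outside background, starting from the image border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and bitmap[r][c] == "0" and not seen[r][c]:
                flood(r, c)

    # Step 2: each background region still unvisited is a closed loop.
    loops = 0
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] == "0" and not seen[r][c]:
                flood(r, c)
                loops += 1
    return loops


letter_o = ["01110", "10001", "10001", "10001", "01110"]
letter_c = ["01111", "10000", "10000", "10000", "01111"]
thick_o = ["01110", "11001", "10001", "10011", "01110"]  # extra ink pixels

print(count_closed_loops(letter_o), count_closed_loops(letter_c), count_closed_loops(thick_o))
```

The “O” with extra ink pixels still has exactly one closed loop, even though a pixel-by-pixel comparison against the clean “O” would no longer be a perfect match.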

Now you know why intelligent character recognition is an improvement over standard OCR.

How Does Intelligent Character Recognition Work?

Intelligent character recognition engines combine traditional and feature-based OCR techniques, and the results of both algorithms are merged to produce the best match. Each character is given a “confidence score,” which reflects how closely its pixels, its features, or a combination of the two match a known character.
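
How the two results are combined varies by engine. One simple possibility is a weighted average, sketched below. The function name blend_scores, the 0.6 feature weight, and the sample scores are all assumptions made for illustration.

```python
def blend_scores(pixel_scores, feature_scores, feature_weight=0.6):
    """Blend two per-character score tables with a weighted average.

    Returns (best_character, confidence). The weighting scheme is an
    assumption for this sketch, not a documented engine behavior.
    """
    candidates = set(pixel_scores) | set(feature_scores)
    blended = {
        ch: (1 - feature_weight) * pixel_scores.get(ch, 0.0)
        + feature_weight * feature_scores.get(ch, 0.0)
        for ch in candidates
    }
    best = max(blended, key=blended.get)
    return best, blended[best]


# Made-up scores: pixel matching slightly prefers "C",
# while feature analysis clearly prefers "G".
pixel_scores = {"G": 0.80, "C": 0.84, "0": 0.70}
feature_scores = {"G": 0.90, "C": 0.60, "0": 0.40}

char, confidence = blend_scores(pixel_scores, feature_scores)
print(char, round(confidence, 2))
```

With these sample numbers the blended result is “G”, because the strong feature score outweighs the small pixel-matching lead that “C” had.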

Even with this blended approach, the typical OCR villains are on the attack: poor document quality, multiple font types, and different font sizes.

What is this character? Is it a “G”, a “C”, a “0”, or is it even a character at all?

Image provided by author

Intelligent character recognition must still pick a result, and that result may not make sense within the context of the word or sentence. If a human can’t read the character, then OCR will certainly have trouble.

OCR Post-Processing to the Rescue

Without additional context, character recognition errors are understandable. Even if a character isn’t discernible, a human knows “ballboy” is an indie band from Scotland and “bollboy” is just gibberish:

Image provided by author

The most common post-processing done by OCR engines is basic spell correction. Often, errors from poor recognition result in small spelling mistakes. All commercial OCR engines compare results with a lexicon of common words and attempt to make logical replacements.
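
A basic version of this spell correction can be sketched with Python's standard difflib module. The tiny COMMON_WORDS list, the spell_correct helper, and the 0.8 cutoff are assumptions for this example; a real engine would use a far larger lexicon and its own matching rules.

```python
import difflib

# Stand-in for a lexicon of common words (hypothetical and far too small).
COMMON_WORDS = ["character", "recognition", "document", "invoice"]


def spell_correct(word, lexicon, cutoff=0.8):
    """Replace word with its closest lexicon entry, if one is close enough."""
    if word in lexicon:
        return word
    matches = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


print(spell_correct("recogniton", COMMON_WORDS))
print(spell_correct("bollboy", COMMON_WORDS))
```

The first word is corrected to “recognition”. The second is returned unchanged, because nothing in this small lexicon is close to it.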

But what about proper nouns and other important words that aren’t in the lexicon of common words?

Here’s where intelligent character recognition really shines. There are two easy ways to identify incorrect characters:

  1. The easiest way is to import custom lexicons for words related to your organization or industry. You may have medical terms or even customer or company information that you need to match against. Using a custom lexicon improves the chances of finding the right match.
  2. Another method for improving OCR character accuracy is something called “fuzzy matching.” Fuzzy matching is a method of providing weighted thresholds to characters and allowing the software to substitute characters based on likely good replacements. For example, the software would be allowed to try an “o” when a “0” provides a bad result. Same for an “l” instead of a “1”, etc.
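
The two methods above can work together. The sketch below tries likely character substitutions and keeps the first spelling found in a custom lexicon. The CONFUSABLE table, the function names, and the sample lexicon are assumptions for illustration, and the weighted thresholds described above are left out to keep the example short.

```python
from itertools import product

# Assumed look-alike characters. Each entry lists the original character
# first, then the replacements the software is allowed to try.
CONFUSABLE = {
    "0": "0oO", "o": "o0", "O": "O0",
    "1": "1lI", "l": "l1", "I": "I1",
}


def candidate_spellings(token):
    """Yield every spelling reachable by swapping look-alike characters."""
    options = [CONFUSABLE.get(ch, ch) for ch in token]
    return ("".join(chars) for chars in product(*options))


def fuzzy_lookup(token, custom_lexicon):
    """Return the first candidate spelling found in the lexicon, else None."""
    for candidate in candidate_spellings(token):
        if candidate in custom_lexicon:
            return candidate
    return None


# Hypothetical custom lexicon containing a proper noun.
custom_lexicon = {"ballboy"}

print(fuzzy_lookup("ba11b0y", custom_lexicon))
```

The number of candidates grows quickly with the number of swappable characters, so this brute-force approach only suits short tokens.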

Out-of-the-box OCR engines like Transym, Tesseract, Azure, ABBYY, and Prime all provide more accurate results when combined with an intelligent character recognition solution, like Grooper.

Originally published at https://blog.bisok.com.
