Intelligent Document Processing Buyer’s Guide
No matter where you are on your automation journey, at some point difficult data throws a wrench in the gears. Even very large automation projects with seemingly straightforward requirements can be halted by a single pesky data source that seems untouchable.
The good news is that no data is out of reach, and if you are looking into intelligent document processing, this buyer’s guide will provide the resources you need to make the best decision possible.
This guide includes example prices to help you ballpark what to expect.
Disclaimer: Before I answer the question “What is intelligent document processing,” it would be a disservice to you not to talk about the technology’s history.
With a better understanding of its history, you’ll be able to frame the range of possible solutions more accurately.
The History of Intelligent Document Processing
Understanding the technology’s roots will help you spot the fakes and technology over-promisers. Nobody, neither you nor the vendor, wants a failed project because of mismatched expectations.
Intelligent document processing has two main predecessors: document imaging, and document capture.
Document Imaging Software
Document imaging software simply converts physical documents such as paper, microfilm, and microfiche into digital images. As imaging technology evolved, it provided increasingly better digital copies of documents by using advanced image processing technology to “clean up” document images. This was especially useful for older documents and poor-quality film / fiche.
The technology was first explored in the 1980s and was mainstream by the mid-’90s. Document imaging technology flatlined by 2010 as it was replaced by the next step in document processing technology: document capture.
Document Capture Software
The document capture market built on the successes and shortcomings of document imaging as organizations demanded increased automation from their digitized documents.
Most notably, these solutions incorporated optical character recognition (OCR) software, the ability to create basic templates for data extraction, and point solutions like accounts payable automation and handwriting recognition for forms.
Document capture solutions have been around for two decades and include a few notable legacy vendors like:
But to talk about these vendors with any degree of accuracy is tricky because virtually all of them have changed names and hands multiple times (PSIGEN tops our list for the longest running capture vendor until their recent acquisition in 2021).
It is safe to say that almost all enterprise document capture software offerings are the result of acquisitions combined into newly-branded products. Whether this is an advantage or disadvantage, we leave you to decide.
What is Intelligent Document Processing?
Intelligent document processing (IDP) augments human understanding of unstructured and difficult data through data science tools like:
- Computer vision
- Optical character recognition
- Machine learning
- Natural language processing
IDP offers all the capabilities of document imaging and capture, but with modern technological advances that make working with difficult document-based (and digital!) data more possible than ever before.
The best IDP platforms are configurable to process documents the same way that human operators would. They provide multiple methods to understand and access document-based data, so you get the flexibility to tackle a diverse range of document types.
An IDP system, once deployed, should expand beyond the initial requirement to every department that needs to process documents. IDP is industry-agnostic, and will process any kind of document and deliver data to virtually any workflow or downstream application.
Because the nature of document-based data integration hasn’t changed much over the years, IDP operates in three broad phases:
3 Phases of Intelligent Document Processing
No IDP software can skip phases 1 or 2. Data processing requires accurate text recognition (document capture), and document capture requires robust document imaging.
- Phase 1 — Document imaging
- Phase 2 — Document capture
- Phase 3 — Intelligent data integration
What happens in phase 3 is just as important as the technology powering the first two phases of intelligent document processing.
Buyer’s Tip: Any IDP technology that doesn’t have a core competency in document imaging or capture is only a good fit for digitally born documents.
Document Imaging Technology
Because intelligent automation and robotic process automation projects rely on document images such as invoices, financial statements, and medical claims, it is important to be able to handle both digitally-born and scanned documents, which may or may not be text searchable (not all PDFs are text searchable!).
Digitally-born documents are those that have been produced directly by a software application. These could be from:
- A print function where a document has been printed to PDF
- Direct output in PDF format
- Microsoft Office files
- XML or similarly formatted file
If you expect the IDP software to deliver data to your automation tools, it must be able to “read” the text on the document accurately and consistently. Reliability is crucial.
In general, document imaging technology involves the following:
- Ability to control physical scanning hardware
- Image processing / post processing to improve image quality
- Computer vision algorithms to identify data “trapped” inside non-text elements
Document Capture Technology
A reinvention of document capture is at the heart of intelligent document processing. As an IDP buyer or researcher, you need to understand the difference between legacy capture and the new approaches in IDP:
Legacy Document Capture
This is a solution to identify and extract document data from highly structured documents and forms that have extremely predictable layouts. No “A.I.” is needed or used in these solutions. A good example is a utility bill or internally created form. These are called “structured documents.”
If you control scanning quality, and the layouts of the documents you are working with almost never change, legacy document capture is still a viable option today. It works because all you must do is create a template for each document type that looks in the same spot on every document for the field you need to capture.
The template will be programmed where to look for information and what that information represents, e.g., invoice date, invoice number, supplier name, etc. But if the information you need is more complex, or if the layouts change for reasons outside your control, this method becomes extremely inefficient and hardly worth the time and effort.
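To make the template idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works. It assumes OCR has already produced words with page coordinates, and the `Word` record, the `Zone` rectangles, and the field names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    x: float  # left edge of the word on the page (hypothetical units)
    y: float  # top edge of the word on the page


@dataclass(frozen=True)
class Zone:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, word: Word) -> bool:
        return self.left <= word.x <= self.right and self.top <= word.y <= self.bottom


# Hypothetical template: one fixed zone per field for one document type.
UTILITY_BILL_TEMPLATE = {
    "invoice_number": Zone(400, 40, 560, 60),
    "invoice_date": Zone(400, 65, 560, 85),
}


def extract_fields(words, template):
    """Return the text found inside each field's zone, in reading order."""
    result = {}
    for name, zone in template.items():
        hits = sorted((w for w in words if zone.contains(w)), key=lambda w: (w.y, w.x))
        result[name] = " ".join(w.text for w in hits)
    return result


page = [
    Word("INV-1042", 410, 45),
    Word("2021-03-15", 410, 70),
    Word("Total", 50, 700),
]
print(extract_fields(page, UTILITY_BILL_TEMPLATE))

# The same content shifted down the page falls outside every zone,
# which is why fixed templates struggle when layouts change.
shifted = [Word(w.text, w.x, w.y + 200) for w in page]
print(extract_fields(shifted, UTILITY_BILL_TEMPLATE))
```

The second call returns empty values for both fields: nothing about the data changed, only its position, and the template has no way to know that.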
The Technology Powering Intelligent Document Processing
Don’t Fall for the Hype!
Guard yourself from marketing hype! If it sounds too good to be true, it probably is.
Any time artificial intelligence is brought up, condition yourself to call bullshit until it’s been proven to you.
Harsh, perhaps, but this will save you from failed projects or mismatched expectations.
As artificial intelligence, machine learning, and natural language processing continue to gain momentum, it is easy to fall into a fallacy: “Can’t A.I. just understand what’s on my documents?”
A.I. should only be viewed from the context of using computers (machines) to do something really difficult. With this idea in mind, you must understand that you’ve been bombarded with a lot of expectation-shaping when it comes to A.I., and especially deep learning and neural networks.
How Much Should A.I. Be Like Human Intelligence?
Steer clear of the implication that A.I.-based “intelligence” should, will, or can work like human intelligence. For example, neural networks do not, in fact, “copy” the structure of the human brain, as is often claimed.
A.I. doesn’t, and won’t ever, work like that. There is absolutely zero evidence that a trained neural network, or any other A.I. system, has or can have anything like generalized intelligence. Absolutely none.
A.I.-based tools are incredibly helpful, but they’re only ever going to be part of a full solution. Intelligent document processing isn’t a narrow problem, and it’s never going to be. That’s why A.I. won’t save you!
Optical Character Recognition
OCR is responsible for transforming pixels on a page that represent the text we read with our eyes to text that software will work with to understand what is written.
In addition to document imaging and legacy capture, OCR is a crucial part of intelligent document processing. New advances in technology make OCR extremely powerful in IDP.
OCR can also be a point of weakness in a solution. If the only OCR solution available belongs to a third party, consider what happens if that solution is discontinued or sold to a competitor (this has already happened!).
The same goes for proprietary OCR — will it remain supported, and is there a roadmap for making improvements?
Here are the critical things to find out about OCR:
- How many OCR engines are used?
- Are they proprietary, or are they someone else’s technology?
- Get a list of the engines and do your own research on known limitations.
- What unique processes / approaches make the OCR better?
- Find out how important OCR is to the overall solution (it should be very important!).
- How are OCR errors handled (there should be multiple solutions to correct “bad” OCR)?
- How does the system handle multiple font types on the same document?
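One common way to correct “bad” OCR is to compare each recognized token against a list of expected words and snap near-misses to the closest entry. The sketch below uses Python’s standard `difflib` module to show the idea. The lexicon, the sample tokens, and the 0.75 cutoff are assumptions made for illustration; a real system would combine several correction methods.

```python
from difflib import get_close_matches

# Hypothetical lexicon of words we expect to see on this document type.
LEXICON = ["Invoice", "Supplier", "Total"]


def correct_token(token, lexicon, cutoff=0.75):
    """Return the closest lexicon entry, or the token unchanged if nothing is close."""
    matches = get_close_matches(token, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else token


# Typical OCR confusions: lowercase "l" read for "I", zero read for "o".
for raw in ["lnvoice", "T0tal", "Banana"]:
    print(raw, "->", correct_token(raw, LEXICON))
```

A cutoff that is too low will “correct” valid words into lexicon entries, so it is worth asking a vendor how thresholds like this are tuned and whether corrections are visible to a reviewer.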
Machine Learning and Document Classification
Machine learning (ML) is a way to build systems that display artificial intelligence (remember, solving hard problems). Generally speaking, ML is a strategy that builds systems that:
- Are not specifically programmed to perform their task until given training data
- Get better (to a point) at performing their task when exposed to more data or more inputs
ML algorithms are generally divided into two types: supervised and unsupervised.
Unsupervised machine learning should not be used in document-based A.I. projects because it:
- Requires a very large data set to work
- Doesn’t work very well in general
If people ask about or promote unsupervised ML, they are talking about one of two different things:
- Will the A.I. get better at classifying (or extracting) on its own?
- Will A.I. save me from having to understand my own documents or data?
The answer to both of these questions is “No!” Document A.I. doesn’t get better on its own, because a human-centric design of document processing systems achieves better outcomes than unsupervised ML in almost every use-case.
You Still Have to Understand Your Own Documents
Nothing — not even Amazon, or Azure, or Watson — will save you from having to understand your own documents and data. If someone tells you otherwise, they’re lying.
And if someone you’re talking to believes that Grooper (or any other document processing system) will save you from your own document or data problems, they’re setting you up for failure.
If the IDP solution is using supervised machine learning, here’s what you need to know:
- What ML algorithm(s) are being used?
- What data / inputs can you feed the ML?
- Can the ML work on features like the presence of addresses, phone numbers, etc.?
- Is the learning part of the ML transparent for humans to “see” how the machine is learning?
- Can ML be used in classification and data extraction? If so, how?
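To show what supervised learning means in practice, here is a small sketch of a document classifier trained on labeled examples. It uses a naive Bayes model written in plain Python. This is an assumption chosen for brevity, not a claim about which algorithm any IDP product uses, and the training sentences and labels are invented.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of training documents
        self.vocabulary = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocabulary.update(tokens)

    def scores(self, text):
        """Log-probability score per label, so a human can see why a label won."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for token in tokens:
                # Add-one smoothing so unseen words do not zero out a label.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            result[label] = score
        return result

    def classify(self, text):
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Hypothetical labeled examples supplied by a subject matter expert.
classifier = NaiveBayesClassifier()
classifier.train("Invoice number 1042 amount due payment terms net 30", "invoice")
classifier.train("Invoice date supplier name total amount due", "invoice")
classifier.train("This agreement is entered into by the parties and the term of the contract", "contract")
classifier.train("The parties agree to the following clauses and termination terms", "contract")

print(classifier.classify("Payment due for invoice 2077 total amount"))
print(classifier.scores("Payment due for invoice 2077 total amount"))
```

Nothing here learns on its own. The model only knows what the labeled examples taught it, and the `scores` method is a simple example of the transparency the questions above ask about.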
Natural Language Processing
Natural language processing (NLP) finds paragraphs, sentences, or other language elements that convey specific meaning. NLP is responsible for context-based data capture which is important in creating an understanding of what specific dates, names, etc. on a document represent.
NLP is also responsible for distinguishing between commonly used descriptions that differ, but mean the same thing, like “NE,” “Northeast,” and “NE ¼ of the SW ¼,” and “NE ¼ and the SW ¼.” These are difficult problems to solve without built-in NLP.
NLP is typically either “baked into” the solution and available at multiple points during processing or is essentially an add-on from an external library (not as useful).
Here are the types of questions to ask while evaluating natural language processing:
- How does the NLP help during document processing?
- How are paragraphs isolated and identified (does the software understand when paragraphs span multiple physical pages?)?
- Can the NLP join paragraphs together that span multiple pages?
- Does the NLP have an awareness of features within paragraphs like entries from a pre-defined list (lexicon), or non-value features like addresses, phone numbers, given names, etc.?
- How does the software handle data recorded on a document with a label that isn’t in a predictable place (i.e., a value has a corresponding label somewhere on the page)?
Structured, Semi-Structured, and Unstructured Data
IDP solutions providers should have multiple use-cases with all types of documents. What you need to understand is what these different document types “look like,” and the significance of processing each one.
How a vendor replies to questions about these different document types will tell a great story about their overall document processing journey, and the power of their technology.
Structured documents are the easiest document types to process. As mentioned previously, they are documents with layouts that don’t change — even when new documents are received.
Structured documents can be painful to process when:
- You don’t have control of the original scanning
- When they are printed / scanned the page image comes in at different sizes
For structured document processing, you only need to ask how the system works with known documents where the formatting is sometimes off by a little bit. They should be able to explain how their data extraction models work when labels and values shift around a bit.
PRICE: Structured document extraction is the least expensive of all options. If this is all you need, expect to pay as little as $10,000 per year.
Semi-structured documents are some of the most difficult documents to process. Think of an invoice. It usually has a single-line header that shows where quantity, description, price, etc. values are to be found.
A good IDP solution will have no trouble finding each line-item, and all details regardless of layout and page breaks.
Now, think of that same invoice. But then add multiple headers, each with new data that changes how the information below is to be interpreted. This presents a very difficult problem because an understanding of many hierarchies of data must be built within the software to extract the correct data.
Common Examples of Semi-Structured Documents
The most common example of difficult semi-structured documents is explanation of benefits forms for multiple patients used in Healthcare. Other examples include mortgage documents and financial statements.
The vendor’s approach to processing complex semi-structured forms should be different from the one used to process simpler structured forms. Ask if they process explanation of benefits forms and how that works.
You’ll quickly gain an understanding of whether their tool is purpose-built for a few use-cases, or if it is built to handle multiple (any) document types.
PRICE: Semi-structured documents represent some of the most difficult document extraction use-cases imaginable. To build out robust explanation of benefits extraction, corresponding workflows, and receive training, expect solutions to start out in the $100,000 price range.
For easier semi-structured documents like invoices and mortgage documents, expect to pay the vendor around $50,000 — $90,000 to get the solution into production.
Unstructured documents rely mainly on machine learning, accurate OCR, and natural language processing. Unstructured documents are typically legal documents in the form of contracts. They likely contain:
- Many paragraphs of clauses
- Legal terms
- Durations of time
- Addendums that modify language in the original contract
Data extraction from these documents relies heavily on human validation. Good IDP software will provide a review module that shows operators where on the page extracted data was found and gives them a chance to review / modify the data.
In instances where the wrong information was extracted, a workflow to retrain the model should be provided.
PRICE: Success with unstructured document solutions depends on the availability of subject matter experts (SMEs). Because these documents are often somewhat universal within particular industries, it is not rare to find a solutions provider already using IDP who will offer a very low per-page price for data extraction. These can be as little as $0.10 per page or as much as $0.25 per page.
If you choose to start an unstructured document data integration project yourself, plan for licensing costs in the $40,000 range and training and services in the $60,000 to $90,000 range.
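A quick way to compare the two options is to work out how many pages you would need to process before the in-house project costs the same as paying per page. The sketch below uses only the ballpark figures quoted above. It assumes a one-time comparison and ignores recurring license fees and internal staff time, so treat the output as a rough guide.

```python
def break_even_pages(in_house_cost_dollars, price_per_page_cents):
    """Pages at which per-page fees add up to the in-house project cost."""
    return in_house_cost_dollars * 100 // price_per_page_cents


LICENSE = 40_000                      # ballpark licensing cost from this guide
SERVICES = (60_000, 90_000)           # ballpark training and services range
PER_PAGE_CENTS = (10, 25)             # $0.10 and $0.25 per page

for services in SERVICES:
    total = LICENSE + services
    for cents in PER_PAGE_CENTS:
        pages = break_even_pages(total, cents)
        print(f"${total:,} in-house vs ${cents / 100:.2f}/page: {pages:,} pages")
```

With these numbers, break-even falls somewhere between 400,000 and 1,300,000 pages. Below that volume, a per-page provider is likely the cheaper route.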
Digital Text Files
Because IDP systems excel at processing text-based data, do not rule out pure text-only digital files! These files are often used in business-to-business (B2B) data exchange workflows and are ripe for IDP use-cases.
In most situations, these text files contain a massive amount of information, and not all of it is needed. It will likely be in a format that is foreign to your back-end systems, so using IDP to extract the most important information, and then format it creates even more value for IDP in your organization.
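As a simple illustration, the sketch below pulls a handful of fields out of a text file and reformats them as JSON. The pipe-delimited layout, record types, and field names are invented for this example. Real B2B files vary widely, and an IDP platform would normally handle this through configuration, not hand-written code.

```python
import json

# Hypothetical B2B export: header, order lines, trailer. Only orders matter here.
RAW = """\
HDR|ACME-SUPPLY|2021-03-15|BATCH-0007
ORD|PO-1001|WIDGET-A|25|4.50|WAREHOUSE-3|NET30
ORD|PO-1002|WIDGET-B|10|12.00|WAREHOUSE-1|NET45
TRL|2
"""


def extract_orders(raw):
    """Keep only order records, and only the fields the back-end system needs."""
    orders = []
    for line in raw.splitlines():
        fields = line.split("|")
        if fields[0] != "ORD":
            continue  # skip header, trailer, and anything unrecognized
        orders.append({
            "po_number": fields[1],
            "sku": fields[2],
            "quantity": int(fields[3]),
            "unit_price": fields[4],  # kept as text to avoid rounding surprises
        })
    return orders


print(json.dumps(extract_orders(RAW), indent=2))
```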
As mentioned previously in the unstructured documents section, IDP platforms should provide a method for human review of all extracted data.
Business rules should be enforceable so that certain data elements are either:
- Always presented to a human for review, or
- Presented for review only when the software doesn’t meet a certain confidence threshold for predicted accuracy
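These two rules can be sketched as a small routing function. The field names, the 0.90 threshold, and the `ExtractedField` record are hypothetical. The sketch assumes the extraction step reports a confidence score between 0 and 1 for each field.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extraction step


ALWAYS_REVIEW = {"total_amount"}  # hypothetical rule: a human always checks these
CONFIDENCE_THRESHOLD = 0.90       # hypothetical cut-off for everything else


def needs_review(field, always_review=ALWAYS_REVIEW, threshold=CONFIDENCE_THRESHOLD):
    """True when a business rule or a low confidence score requires a human look."""
    return field.name in always_review or field.confidence < threshold


fields = [
    ExtractedField("total_amount", "118.40", 0.99),
    ExtractedField("invoice_date", "2021-03-15", 0.97),
    ExtractedField("supplier_name", "ACME Supply", 0.62),
]
for f in fields:
    route = "human review" if needs_review(f) else "straight through"
    print(f"{f.name}: {route}")
```

Here the total is reviewed despite its high confidence because a rule demands it, the date passes straight through, and the supplier name is reviewed because its score falls below the threshold.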
Another critical element of human-in-the-loop architecture is the way the software is trained to work with your documents and data. The best approach to bringing an IDP system into production from proof-of-concept is to have subject matter experts intimately involved in the system’s initial training and configuration.
They know the documents better than anyone and will quickly point out what data is important and how they find it on the documents. The software should be configured to mirror the way they think about the information contained on the documents.
Don’t neglect the critical step of how data will be moved from the IDP system to your workflows and downstream software applications. You may require real-time integration to robotic process automation bots, or you may need to move extracted data into a staging database for final review before integrating with line of business systems.
The IDP platform should provide multiple paths for data integration to meet the diverse needs of your organization.
Part of the IDP selection process must include your cloud strategy. What is your cloud strategy? Do you have one? If not, it’s OK, but if you do have one, be prepared to explain it and how new solutions need to fit into it.
It is not in the scope of this intelligent document processing buyer’s guide to advise on your cloud strategy.
IDP platforms are delivered in many different models:
- Do you simply need a web front-end?
- Or the ability to run a service within a Docker container?
- Or is running a server-based solution in AWS or your private cloud enough?
With a concisely stated cloud strategy you will be able to short list vendors much more quickly (and don’t be surprised if they challenge you to gain a better understanding of how their offerings might fit into your technology ecosystem).
IDP Pricing, License Models, and Delivery
Vendors have many ways of generating revenue from IDP software. Getting a clear picture of how pricing works, both initially and in the future, is important. Your goal is to gain an understanding of what present and future activities will affect pricing.
Pay particular attention to solutions which require multiple SKUs to complete a solution. If, for example, the core components of your solution all require a line-item price on your quote, you are likely getting into a solution which has grown through acquisitions.
This isn’t necessarily good or bad; it just means the vendor must recoup costs on each piece of technology it has purchased. Oftentimes these solutions will be more expensive and complicated to price.
Perpetual vs. Subscription?
Perpetual licenses are becoming much less common as vendors shift to subscription-based pricing models. Perpetual licenses cost more upfront than subscription.
Perpetual Pros: You own a license for the software and can decide to continue using it even if you no longer pay maintenance. The software cost was paid once, and never again.
Perpetual Cons: While you may own the license, you typically own only a single version. You won’t gain access to new features and functionality provided by new software versions.
Subscription Pros: You pay only a flat fixed monthly fee for full access to the software. Upgrades and different support tiers are included, so there are no surprises to deal with down the road.
Subscription Cons: If you don’t take advantage of newly released features and functionality, you will eventually be paying more for the solution than if you were able to purchase a perpetual license.
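To put the trade-off in numbers, the sketch below finds the first month in which cumulative subscription fees exceed the cost of a perpetual license. Every figure is hypothetical and chosen only to show the arithmetic. Optional maintenance on the perpetual license is modeled as a flat monthly amount.

```python
def break_even_month(upfront, monthly_fee, monthly_maintenance=0, horizon=240):
    """First month where subscription spend exceeds perpetual spend, else None."""
    for month in range(1, horizon + 1):
        subscription_total = monthly_fee * month
        perpetual_total = upfront + monthly_maintenance * month
        if subscription_total > perpetual_total:
            return month
    return None  # subscription never overtakes perpetual within the horizon


# Hypothetical prices: $120,000 perpetual license vs a $4,000 monthly subscription.
print(break_even_month(120_000, 4_000))          # no maintenance paid
print(break_even_month(120_000, 4_000, 1_500))   # with $1,500/month maintenance
```

The longer you expect to run the same version without upgrades, the more the perpetual option favors you. If you pay maintenance to stay current, the break-even point moves further out.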
Page Volume Pricing
Page volume pricing is typically offered in tiers, or buckets of pages. Because of the flexibility in IDP platforms, get a clear idea of what a “page” is, and all scenarios that will affect volume.
Page volume pricing has the benefit of allowing faster processing. These systems typically let you throw as much compute as you can afford at the IDP solution and it will scale to nearly endless speed and overall throughput.
Core-based pricing is essentially the opposite of page volume pricing. With page volume pricing, you could consume your entire allotment of pages in any amount of time. Given enough compute resources, you could burn through a million pages in a month.
Core-based pricing may provide unlimited page volume, but it limits throughput, which increases the time it takes to complete larger projects. By licensing additional CPU cores, you will increase the total throughput of the IDP solution.
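The throughput limit is easy to estimate. The sketch below assumes a hypothetical rate of 500 pages per core per hour and round-the-clock processing. Real rates depend heavily on the documents and on how much OCR and extraction work each page needs.

```python
def days_to_finish(pages, cores, pages_per_core_per_hour, hours_per_day=24):
    """Whole days needed to process a backlog with a fixed number of licensed cores."""
    daily_capacity = cores * pages_per_core_per_hour * hours_per_day
    return -(-pages // daily_capacity)  # integer ceiling division


BACKLOG = 1_000_000  # pages
RATE = 500           # hypothetical pages per core per hour

for cores in (4, 8, 16):
    print(f"{cores} cores: {days_to_finish(BACKLOG, cores, RATE)} days")
```

Doubling the licensed cores roughly halves the time, so the question to ask is what each additional core costs compared with the value of finishing sooner.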
Per-feature pricing may be an option with both page volume and core-based license models. This pricing model is seen more often on IDP systems which have grown by acquisition.
When a new type of tool or advanced functionality is needed, buying teams have the option of adding it on, which incurs additional charges and fees. Ensure the demonstration or POC you were involved with is accompanied by accurate and transparent pricing for all required features / capabilities.
Delivery of the Solution
Because IDP solutions vary in scope and capability, delivery methods will vary significantly.
Platform solutions provide the most capability but they are more difficult to set up. Even IDP platforms that offer pre-built solutions will require a considerable amount of time to configure because of the 80 / 20 rule.
If an IDP platform solution has built-in invoice processing, for example, it will only really work on about 80% of your invoices. Because you are going to need more automation than just 80 out of 100 invoices, you will spend configuration time getting the platform tooled up to get the remaining 20% of your invoices.
This process may break part of what was working for the 80% and will require technical resources to get fully up and running.
What Solution Do We Recommend?
It is our recommendation that for most use-cases, an IDP platform should be configured from the ground up on your documents. You most likely have enough specific extraction and workflow requirements that it makes sense to tailor the solution to your needs rather than re-work your processes due to software inflexibility.
Delivering a production-ready IDP solution can cost more than the software itself depending on how challenging your documents and workflows are. Do not let this deter you or put you back in the situation of manually extracting data!
Once the IDP system is fully up and running, you will reap the benefits of:
- Decreased ongoing costs
- Faster access to accurate information
- The ability to look at workflows from a new perspective to increase innovation and digital transformation
Keep in mind that your organization, processes, documents, expectations, and requirements are different from any other organization. The IDP vendor will need your input and your expertise to get the solution configured. Nobody knows your documents like you do, so your subject matter experts will be intimately involved with the IDP delivery team.
Intelligent Document Processing Training
The more you invest in training, the more strategic you will be. All IDP vendors offer implementation, either through a partner, or through their own professional services teams.
Seek out an IDP vendor with a robust training program who ultimately wants you to be self-sufficient and not dependent on endless services.
Even if your initial use-case is something simple, like invoice processing, you will eventually discover new use-cases in other departments. If you have trained technical resources on how to configure and deliver the IDP platform, you’ll achieve:
- Faster return on investment
- Faster implementation times for new use-cases without a big lift from external (expensive!) professional services
Training resources should include an online wiki / user guides, tiered training programs (instructor-led and self-paced), a support community where users interact with each other, and regular demonstrations on how to use advanced functionality.
Intelligent document processing is primarily a machine that takes data in, and outputs structured data.
As such, it should be viewed as a temporary custodian of your data. When you think about getting data from documents, you must also consider where the data will go (or live), and where the documents will go (or live).
Because IDP solutions will create a pristine copy of original physical or scanned documents, you’ll want to store them somewhere for humans to look at later on. For this, you will need a content management system like:
- OnBase, etc.
These content management systems will all have different options for storing document metadata, approval workflows, storage limits, etc. You’ll need to factor in the cost of these solutions, or choose an IDP vendor who provides an all-in-one solution.
Robotic Process Automation Tools
Outside of content management and workflow solutions, another popular complementary technology to IDP is robotic process automation. If you’ve made it this far in your search for information, then you’ve doubtless run across vendors like:
- Automation Anywhere
- Blue Prism, etc.
These are all great vendors offering solutions to automate tedious “stare and compare” work between software systems that aren’t yet talking directly to each other.
IDP is a great fit to deliver structured data to RPA solutions. Buyer beware — just because a vendor’s portfolio includes “intelligent document processing” doesn’t mean that it is more than just document imaging or document capture. True IDP is incredibly complex and not typically offered within other automation solutions.
The Right Mindset for Success
Above all, you must clearly define the value that automation will bring to your organization. If you are uncertain how to prove out the value, but know “there’s got to be a better way,” using a business analyst (BA) to uncover the value is a great place to start.
Most IDP vendors will have BAs available to help plot a successful course. Engaging the services of a BA may cost less overall than paying for a proof of concept.
If your leadership demands a proof-of-concept (POC) before purchase, consider structuring the POC in a way that maximizes your investment. Obviously the POC will involve your documents.
Start Out Simple, And Build Up
Because IDP is not magic, and is in fact, quite difficult, choosing your most difficult documents for the POC may not be a good idea. Instead, you could start with a simpler automation that frees up staff time, while planning for a much more impactful project for the future.
Because IDP is all about intelligent automation, it needs to fit in with overall automation strategies and goals. One way we’ve seen organizations successfully transform business operations is by creating a series of goals. It may look like this:
- ACME 1.0 — this is ACME’s present state. There may be little automation in place, but low-hanging fruit for automation quick wins have been identified.
- ACME 2.0 — this is ACME after initial quick automation wins have been accomplished. These may be in the realm of invoice processing, or HR document onboarding.
- ACME 3.0 — this is ACME’s goal of automating 80% of manual labor. Bear in mind, multiple technologies will be involved by this time: robotic process automation, the development of APIs, business process automation, etc.
- ACME 4.0 — ACME is automated to the point of digital transformation, and pivoting to offering new services, or industry-changing innovations that position ACME as a leader in profits and market share.
Common Terms and Abbreviations
- A.I. — Artificial Intelligence
- BA — Business analyst
- CMIS — Content Management Interoperability Services
- CV — Computer vision
- ECM — Enterprise Content Management [System]
- IA — Intelligent automation
- IDP — Intelligent document processing (cognitive automation, capture 2.0; all the same)
- IP — Image processing
- ML — Machine learning
- NLP — Natural language processing
- OCR — Optical character recognition
- POC — Proof of concept
- SME — Subject matter expert
Originally published at https://blog.bisok.com.