Document Understanding User Guide
To automate document processing, four fundamental capabilities are required: digitization, classification, extraction, and validation.
Digitization converts a physical document into machine-readable text, which can then be processed digitally. Optical Character Recognition (OCR) is a significant part of digitization, but it is only one of several steps in the process.
For example, when dealing with PDF documents, the digitization algorithm can distinguish between scanned and native PDFs or hybrid ones that contain scanned images and native text. Most of the text can be extracted directly from a native PDF document, but in some cases, a few logos may need to be read using OCR. The digitization process can handle all of these situations to ensure maximum accuracy in text detection while running quickly and efficiently.
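The distinction between native, scanned, and hybrid PDFs can be sketched roughly as follows. This is only an illustration, not the product's digitization algorithm: the `Page` type, its fields, and the `pdf_kind` and `pages_needing_ocr` functions are hypothetical names, and the sketch assumes a PDF parser has already reported the embedded text and the number of raster images for each page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical per-page summary produced by some PDF parser."""
    native_text: str   # text stored in the PDF text layer ("" if none)
    image_count: int   # number of raster images (scans, logos) on the page


def pdf_kind(pages: List[Page]) -> str:
    """Label a document as native, scanned, hybrid, or empty."""
    has_native = any(p.native_text.strip() for p in pages)
    has_images = any(p.image_count > 0 for p in pages)
    if has_native and has_images:
        return "hybrid"
    if has_native:
        return "native"
    if has_images:
        return "scanned"
    return "empty"


def pages_needing_ocr(pages: List[Page]) -> List[int]:
    """Return indexes of pages with images, where OCR may find extra text."""
    return [i for i, p in enumerate(pages) if p.image_count > 0]
```

In this sketch, text from the native layer is used as-is and OCR is reserved for pages that contain images, which mirrors the idea of reading only a few logos with OCR in an otherwise native PDF.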
You can change the OCR used in your project from Project settings. For more information, check the Configure project settings page. You can check the available OCR engines and the supported languages from the Supported languages section of the user guide.
You can check the Known limitations page for more information on the supported files, image size limits, and more specifications.
In most use cases, documents need to be sorted into logical categories so different processing methods can be applied to them.
The objective of classification is to examine a document and decide which document type it belongs to. Knowing the type of a document is important, as different document types require different processing techniques. For example, an invoice needs to be processed by an invoice extraction model to ensure all relevant fields get extracted.
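To make the idea of routing by document type concrete, here is a minimal sketch. It assumes a toy keyword-count classifier, which is far simpler than a trained classification model; the `KEYWORDS` and `EXTRACTORS` tables, the extractor names, and both functions are hypothetical and exist only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical keyword lists per document type (illustration only).
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "invoice": ("invoice", "due date"),
    "receipt": ("receipt", "cashier"),
}

# Hypothetical mapping from document type to the extractor that handles it.
EXTRACTORS: Dict[str, str] = {
    "invoice": "invoice_extractor",
    "receipt": "receipt_extractor",
}


def classify(text: str) -> str:
    """Pick the type whose keywords occur most often, or "unknown"."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(word) for word in words)
        for doc_type, words in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def route(text: str) -> str:
    """Return the extractor name for a document, or "manual_review"."""
    return EXTRACTORS.get(classify(text), "manual_review")
```

Documents that match no known type fall through to a manual path in this sketch, so no extractor is applied to a document it was not built for.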
Data extraction is the process of selecting and retrieving only the relevant information from a document. Extracting specific data from a lengthy document using string manipulation can be challenging. However, Document Understanding™ provides various extraction methodologies for different document types and formats. For example, from an invoice you might want to extract only the Vendor Name, Billing Name, Due Date, and Total fields.
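As a rough illustration of what an extraction result for those four invoice fields could look like, the sketch below uses simple regular expressions. This is a rule-based stand-in chosen for brevity, not one of the product's extraction methodologies; the label formats in `FIELD_PATTERNS`, the sample text, and the `extract_fields` function are all assumptions.

```python
import re
from typing import Dict, Optional

# Hypothetical label formats; real invoices vary widely in layout.
FIELD_PATTERNS: Dict[str, str] = {
    "Vendor Name": r"^Vendor:\s*(.+)$",
    "Billing Name": r"^Bill To:\s*(.+)$",
    "Due Date": r"^Due Date:\s*(\d{4}-\d{2}-\d{2})$",
    "Total": r"^Total:\s*\$?([\d,]+\.\d{2})$",
}

SAMPLE_INVOICE = """Vendor: Acme Supplies
Bill To: Jane Doe
Invoice No: 1001
Due Date: 2024-03-15
Total: $1,250.00"""


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    """Return each target field's value, or None when it is not found."""
    result: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        result[name] = match.group(1).strip() if match else None
    return result
```

Patterns like these break as soon as a label or layout changes, which is the kind of fragility that makes plain string manipulation hard to maintain on real documents.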
In classification and extraction, software robots use the concept of confidence, which measures the level of certainty that a particular task was performed well. The task can be recognizing a document type, identifying a field, or reading the data in it. When the robot is not certain enough about its output, the Document Understanding framework allows you to engage a human user to review and validate it. In the best scenario, the human input is then used to improve the robot's accuracy through machine learning.
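One common way to act on confidence is a threshold check that sends uncertain results to a person. The sketch below assumes confidence is a number between 0.0 and 1.0 and that 0.8 is an acceptable cutoff; the `ExtractedField` type, the `fields_to_review` function, and the threshold value are hypothetical and not taken from the framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExtractedField:
    """Hypothetical extraction result for a single field."""
    name: str
    value: str
    confidence: float  # assumed range: 0.0 (no certainty) to 1.0 (certain)


def fields_to_review(
    fields: List[ExtractedField], threshold: float = 0.8
) -> List[str]:
    """Return names of fields whose confidence is below the threshold."""
    return [f.name for f in fields if f.confidence < threshold]


def needs_human_validation(
    fields: List[ExtractedField], threshold: float = 0.8
) -> bool:
    """A document goes to a human reviewer if any field is uncertain."""
    return len(fields_to_review(fields, threshold)) > 0
```

The right threshold is a trade-off: a higher value sends more documents to human review, while a lower value lets more unverified values through.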