Document Understanding User Guide
After automatic data extraction, an optional but highly recommended step is validating the extracted data.
This is a human review step in which knowledge workers review the automatically extracted results and correct them where necessary.
Using Data Extraction Validation lets you confirm that the resulting structured data is correct before it is consumed downstream.
When Data Extraction Validation Should Be Used
It is strongly recommended to use the Data Extraction Validation components when:
- you need 100% accuracy on the data,
- you have no other way to double-check the automatically extracted information against other sources of truth
  - e.g., checking that an extracted Name or Address matches a Name or Address already confirmed and stored in a database,
- you do not have sufficient synthetic checks you can run on data consistency
  - e.g., checking that line items add up to a total, or that an ID number checksum is correct.
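The two synthetic checks mentioned above can be sketched in a few lines. This is only an illustration, not part of the Document Understanding framework: the function names are hypothetical, the amounts are assumed to arrive as strings, and the Luhn algorithm is used purely as one common example of an ID checksum (the real rule depends on the ID type you are processing).

```python
from decimal import Decimal


def line_items_match_total(line_amounts, total, tolerance=Decimal("0.01")):
    """Return True when the line amounts add up to the stated total.

    Amounts are passed as strings (e.g. "10.50") so that Decimal can
    represent them exactly. The tolerance is an assumed rounding allowance.
    """
    computed = sum((Decimal(a) for a in line_amounts), Decimal("0"))
    return abs(computed - Decimal(total)) <= tolerance


def luhn_checksum_ok(number):
    """Return True when the digits of `number` pass the Luhn check.

    Luhn is an illustrative choice only; substitute the checksum rule
    that applies to your own ID numbers.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A document that fails either check is a natural candidate for human validation, while a document that passes may still need review if other fields cannot be cross-checked.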
Note: If you need 100% accuracy, we strongly recommend adding the Validation step whenever possible.
If this is not an option for all documents, then:
- try to double-check as much of the information as possible,
- decide on specific confidence thresholds that the business use case can accept for certain fields,
- always check both the Extraction Confidence and the OCR Confidence for a given value before making your decision.
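As a rough sketch of the threshold guidance above, the following shows one way to decide which fields to send for human validation. It is not the UiPath API: the `ExtractedField` class, the field names, and the threshold values are all hypothetical, and confidences are assumed to be on a 0.0 to 1.0 scale.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """Hypothetical container for one extracted value and its confidences."""
    name: str
    value: str
    extraction_confidence: float  # assumed scale: 0.0 to 1.0
    ocr_confidence: float         # assumed scale: 0.0 to 1.0


# Assumed, business-specific thresholds; fields not listed use the default.
DEFAULT_THRESHOLD = 0.90
FIELD_THRESHOLDS = {"total": 0.95, "invoice_number": 0.90}


def fields_needing_validation(fields, thresholds=FIELD_THRESHOLDS,
                              default=DEFAULT_THRESHOLD):
    """Return the names of fields whose confidence is below the threshold.

    Both confidences are checked: the lower of the two must meet the
    threshold, so a well-extracted value with poor OCR is still flagged.
    """
    flagged = []
    for field in fields:
        threshold = thresholds.get(field.name, default)
        if min(field.extraction_confidence, field.ocr_confidence) < threshold:
            flagged.append(field.name)
    return flagged
```

Taking the minimum of the two confidences is a deliberately conservative design choice; a use case that tolerates more risk could apply a separate threshold to each confidence instead.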
How to Use the Data Extraction Validation Components
A human reviewer can validate the automatically extracted data by using Validation Station.
Validation Station is available both:
- as an attended activity, through the use of the Present Validation Station activity, or
- as Action Center tasks, through the use of the Create Document Validation Action and Wait for Document Validation Action and Resume activities.