UiPath Documentation
document-understanding
2021.10
OUT OF SUPPORT

Document Understanding User Guide

Last updated Feb 4, 2025

Document Classification Overview

What Is Document Classification

Document Classification is a component of the Document Understanding Framework that identifies the document types of the files the robot is processing.

A file can be classified into one or more document types, depending on its content and the classification methods used:

  • if a file contains a single logical document type (e.g., it is an Invoice or a Medical Record in its entirety), then the classification component should be configured accordingly and return a single classification result;
  • if a file contains multiple logical document types (e.g., it contains an Invoice from page 1 to page 5, and a Medical Record for the next 10 pages, and an Insurance Agreement from page 16 to the end), then the classification component should return multiple classification results, each corresponding to the right page range from the input file.
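The two cases above can be pictured as a list of results, each tied to a page range. The following is a minimal sketch only: the `ClassificationResult` class, its field names, the confidence values, and the 20-page file length are assumptions made for illustration and are not the framework's actual result type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassificationResult:
    """Hypothetical result record; not the framework's real type."""
    document_type: str
    first_page: int   # 1-based, inclusive
    last_page: int    # 1-based, inclusive
    confidence: float  # illustrative value between 0 and 1


def covers_all_pages(results, page_count):
    """Return True if the results cover pages 1..page_count with no gaps or overlaps."""
    expected = 1
    for r in sorted(results, key=lambda r: r.first_page):
        if r.first_page != expected or r.last_page < r.first_page:
            return False
        expected = r.last_page + 1
    return expected == page_count + 1


# A file that is a single Invoice in its entirety: one result.
single = [ClassificationResult("Invoice", 1, 5, 0.97)]

# A file with three logical documents (assumed to be 20 pages long): three results.
multi = [
    ClassificationResult("Invoice", 1, 5, 0.95),
    ClassificationResult("Medical Record", 6, 15, 0.90),
    ClassificationResult("Insurance Agreement", 16, 20, 0.88),
]
```

Here `covers_all_pages(multi, 20)` checks that every page of the input file belongs to exactly one result.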

Classification is attempted on the document types defined in the project Taxonomy.

When Document Classification Should Be Used

On the one hand, if a project needs to process files that are all of the same document type and always contain exactly one instance per file (e.g., one invoice in one file), then classification is not necessary and can be skipped entirely.

On the other hand, if the project is dealing with two or more document types (e.g., the workflow must process Invoices and Medical Records which cannot be distinguished before processing), or files are sometimes expected to contain two or more distinct documents (e.g., one file contains 3 Invoices), then classification is strongly recommended.
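The rule of thumb from this section can be written as a one-line check. This is only a sketch of the guidance; the function and parameter names are hypothetical.

```python
def classification_needed(document_type_count: int,
                          multiple_documents_per_file: bool) -> bool:
    """Classification is recommended when there are two or more document
    types, or when one file may hold several logical documents."""
    return document_type_count >= 2 or multiple_documents_per_file
```

For example, a project that only ever receives one invoice per file gives `classification_needed(1, False)`, so the classification step can be skipped.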

How to Use the Document Classification Component

Classification is done through the Classify Document Scope activity. You can use one or more classifiers to classify documents: the scope activity configures and executes one or more classification algorithms and offers a single, unified place to configure them.

In short, this is what the Classify Document Scope does:

  • Provides all Classifiers (classification algorithms) the necessary configurations for them to run.
  • Accepts one or more classifiers.
  • Allows for document type filtering, taxonomy mapping, and minimum confidence threshold settings at classifier level.
  • Reports classification information in a unified manner, irrespective of the source of classification.

The Classify Document Scope is configured through the Configure Classifiers wizard, where you can customize:

  • which document types are accepted from which classifier,
  • the minimum confidence a result must have to be accepted from each classifier,
  • the taxonomy mapping, at document type level, between the project taxonomy and the classifier's internal taxonomy (if any).
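The three settings above can be thought of as a small per-classifier configuration record. The sketch below is illustrative only: `ClassifierConfig`, `map_and_filter`, the internal type names, and the threshold value are all assumptions, and in practice the settings are made in the wizard rather than in code.

```python
from dataclasses import dataclass


@dataclass
class ClassifierConfig:
    """Hypothetical per-classifier settings mirroring the wizard options."""
    name: str
    accepted_types: set      # project taxonomy types accepted from this classifier
    min_confidence: float    # minimum confidence for a result to be accepted
    taxonomy_mapping: dict   # classifier-internal type name -> project taxonomy type


def map_and_filter(config, internal_type, confidence):
    """Translate a classifier-internal type to the project taxonomy and
    return it if it passes the filters, otherwise return None."""
    # Classifiers without an internal taxonomy report project types directly.
    project_type = config.taxonomy_mapping.get(internal_type, internal_type)
    if project_type not in config.accepted_types:
        return None
    if confidence < config.min_confidence:
        return None
    return project_type


# Example settings for an imaginary classifier with its own type names.
cfg = ClassifierConfig(
    name="example-classifier",
    accepted_types={"Invoice"},
    min_confidence=0.7,
    taxonomy_mapping={"inv": "Invoice", "med_rec": "Medical Record"},
)
```

With these settings, a result reported as `"inv"` is kept only if its confidence is at least 0.7, while `"med_rec"` results are always dropped because Medical Record is not accepted from this classifier.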

Please note that the order of the classifiers in the Classify Document Scope is important:

  • classifiers are executed with priority, from left to right;
  • a classification result returned by a classifier is accepted if it reports one of the acceptable document types and has a confidence equal to or above the minimum confidence threshold set for that classifier;
  • a classifier is executed only on the page ranges left unclassified by the previous classifiers (so it may be called multiple times in one execution).
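The ordering rules above can be sketched as a simple loop. This is a simplified model under stated assumptions, not the activity's actual implementation: each classifier is modeled as a plain function taking a page range and returning `(document_type, first_page, last_page, confidence)` tuples, and all names are hypothetical.

```python
def contiguous_ranges(pages):
    """Group page numbers into sorted, inclusive (first, last) ranges."""
    ranges = []
    for p in sorted(pages):
        if ranges and p == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], p)
        else:
            ranges.append((p, p))
    return ranges


def classify(page_count, classifiers):
    """Run classifiers in priority order on the pages still unclassified.

    classifiers: ordered list of (classify_fn, accepted_types, min_confidence).
    Returns (accepted results, remaining unclassified page ranges).
    """
    unclassified = set(range(1, page_count + 1))
    accepted = []
    for classify_fn, accepted_types, min_confidence in classifiers:
        # One call per unclassified range, so a classifier may run several times.
        for first, last in contiguous_ranges(unclassified):
            for doc_type, r_first, r_last, conf in classify_fn(first, last):
                pages = set(range(r_first, r_last + 1))
                if (doc_type in accepted_types
                        and conf >= min_confidence
                        and pages <= unclassified):
                    accepted.append((doc_type, r_first, r_last, conf))
                    unclassified -= pages
    return sorted(accepted, key=lambda r: r[1]), contiguous_ranges(unclassified)


def keyword_classifier(first, last):
    # Toy classifier: recognizes an Invoice on pages 1-5 only.
    return [("Invoice", 1, 5, 0.9)] if first <= 1 and last >= 5 else []


calls = []


def fallback_classifier(first, last):
    # Toy classifier: labels whatever range it is given; records each call.
    calls.append((first, last))
    return [("Medical Record", first, last, 0.6)]


results, remaining = classify(15, [
    (keyword_classifier, {"Invoice"}, 0.8),
    (fallback_classifier, {"Medical Record"}, 0.5),
])
```

In this run the first classifier claims pages 1 to 5, so the second classifier is called only with the leftover range, pages 6 to 15.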

Available Classifiers

Based on the requirements of the use case, you can choose from several classification methods, called classifiers.

Classifiers can be found in the UiPath.IntelligentOCR.Activities package, as well as in other UiPath packages (UiPath.DocumentUnderstanding.ML.Activities) or third-party packages (UiPath.Abbyy.Activities).

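The available classifiers are:

  • Keyword Based Classifier
  • Intelligent Keyword Classifier
  • FlexiCapture Classifier
  • Machine Learning Classifier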

You can always build your own Classifier, by using the public Document Processing Contracts, thus being able to implement any algorithm that fits your use case.
