UiPath Documentation
document-understanding
2022.4

Document Understanding User Guide

Last updated Apr 4, 2025

Intelligent Keyword Classifier

What Is Intelligent Keyword Classifier

The Intelligent Keyword Classifier classifies documents by using the word vectors it learns from sample files of each document type.

The algorithm relies on content that repeats within a document type. It starts from the premise that each document type has a set of words that usually occur in documents of that type, which makes a vector similarity computation possible.

When classifying a file into a document type, the Intelligent Keyword Classifier:

  • finds the learned word vector that the file is most similar to,
  • reports the highest-scoring document type, along with the main words that matched.
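
The page does not document the classifier's internals beyond "vector similarity", so the following is only an illustrative sketch of the two steps above. It assumes plain bag-of-words counts and cosine similarity, which may differ from what the activity actually does. The names `word_vector`, `cosine`, `classify`, and the sample vocabularies are hypothetical and are not part of any UiPath API.

```python
import math
import re
from collections import Counter


def word_vector(text):
    """Build a bag-of-words vector (word -> count) from lowercase alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(text, learned):
    """Return (best document type, its score, matching words) for a text."""
    vec = word_vector(text)
    scores = {doc_type: cosine(vec, ref) for doc_type, ref in learned.items()}
    best = max(scores, key=scores.get)
    matched = sorted(w for w in vec if w in learned[best])
    return best, scores[best], matched


# Hypothetical "learned" vectors, one per document type (assumed training output).
learned = {
    "Invoice": word_vector("invoice number total amount due vendor tax"),
    "Receipt": word_vector("receipt cashier change paid store thank you"),
}

doc_type, score, matched = classify("Invoice number 17, total amount due 40", learned)
print(doc_type, round(score, 3), matched)
```

In this sketch the reported "matching main words" are simply the words the file shares with the winning vector; a real implementation would likely weight or rank them.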

The Intelligent Keyword Classifier also has file splitting capabilities, meaning that it can report more than one class for a given file, for separate page ranges.
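
To make "more than one class for separate page ranges" concrete, here is a minimal sketch under a strong assumption: that each page has already been given a document type label, and that consecutive pages with the same label form one range. The page does not state how the classifier actually decides split points, and `page_ranges` is a hypothetical helper, not a UiPath function.

```python
from itertools import groupby


def page_ranges(page_labels):
    """Merge consecutive pages with the same label into (label, first_page, last_page).

    Page indexes are zero-based and inclusive.
    """
    ranges = []
    page = 0
    for label, run in groupby(page_labels):
        length = len(list(run))
        ranges.append((label, page, page + length - 1))
        page += length
    return ranges


# Assumed per-page labels for a four-page file.
labels = ["Invoice", "Invoice", "Receipt", "Invoice"]
result = page_ranges(labels)
print(result)
```

Each tuple corresponds to one reported class and the page range it covers within the file.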

When To Use

You should consider using this classifier if:

  • your files contain one or more document types within a single file
  • your document types are relatively easy to differentiate as far as content goes.

How To Train

Place the Intelligent Keyword Classifier Trainer activity in a Train Classifiers Scope, and configure it accordingly.

The activity itself cannot keep the training file consistent when several trainings run in parallel. The Document Understanding Process provides two possible solutions for this issue, both of which rely on traffic control:

  1. lock files (implemented by default in the process): rename the file using the .lock extension, modify and save the file, then rename the file again, removing the .lock extension
  2. manual setup of a special queue: create an empty queue in Orchestrator and integrate your two activities from the project.
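
The lock-file option (rename, modify and save, rename back) can be sketched outside of Studio as follows. This is not the code used by the Document Understanding Process. It is a minimal Python illustration that assumes the rename is atomic on the file system in use and that a waiting trainer simply retries. The function name `update_with_lock`, the retry parameters, and the file name `learning.json` are all hypothetical.

```python
import os
import tempfile
import time


def update_with_lock(path, modify, retries=50, delay=0.1):
    """Rename path to path + '.lock', rewrite its content, then rename it back."""
    lock_path = path + ".lock"
    for _ in range(retries):
        try:
            os.rename(path, lock_path)  # claim the file; other trainers no longer see it
            break
        except FileNotFoundError:
            time.sleep(delay)  # assumed: another trainer currently holds the lock
    else:
        raise TimeoutError(f"could not lock {path}")
    try:
        with open(lock_path, "r+", encoding="utf-8") as f:
            content = f.read()
            f.seek(0)
            f.write(modify(content))
            f.truncate()
    finally:
        os.rename(lock_path, path)  # release: remove the .lock extension


# Demo on a temporary file (hypothetical training file name).
with tempfile.TemporaryDirectory() as tmp:
    learning_file = os.path.join(tmp, "learning.json")
    with open(learning_file, "w", encoding="utf-8") as f:
        f.write("{}")
    update_with_lock(learning_file, lambda old: '{"trained": true}')
    with open(learning_file, encoding="utf-8") as f:
        final_content = f.read()
    lock_left_behind = os.path.exists(learning_file + ".lock")

print(final_content, lock_left_behind)
```

Because the file only exists under one name at a time, a second trainer that tries to claim it while it is locked fails the rename and waits, which serializes the updates.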

For more information on how to train a classifier, see the page that describes how to use the Manage Learning wizard.

Learn More

Learn more about the Intelligent Keyword Classifier by following this link.
