
Unstructured and complex documents user guide

Last updated Dec 22, 2025

Building and consuming a workflow

You can consume the predictions of a published Unstructured and Complex Documents model version by building a workflow in UiPath Studio.

Overview

Building an IXP Unstructured and Complex Documents workflow generally involves the following steps:
  1. Installing packages
  2. Defining a taxonomy
  3. Digitizing a document
  4. Classifying a document
  5. Extracting a document
  6. Validating a document
Note: The taxonomy definition step only applies to Windows projects, not cross-platform ones. To find out the differences between project types, check the sections that follow.
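
The per-document order of steps 3 to 6 can be pictured with a short, language-neutral sketch. It is written in Python purely for illustration and is not UiPath code: the function run_pipeline and the four callables passed to it (digitize, classify, extract, validate) are hypothetical stand-ins for the corresponding Studio activities, and their signatures are assumptions made for this example.

```python
from typing import Callable, Iterable, List, Tuple


def run_pipeline(
    paths: Iterable[str],
    digitize: Callable[[str], Tuple[str, dict]],
    classify: Callable[[str, str, dict], List[str]],
    extract: Callable[[str, str, dict, str], dict],
    validate: Callable[[str, dict], dict],
) -> List[dict]:
    """Run the digitize, classify, extract, validate steps for each document.

    Every callable is a hypothetical placeholder for a Studio activity.
    """
    validated = []
    for path in paths:
        # Digitize: obtain the document text and the Document Object Model (DOM).
        text, dom = digitize(path)
        # Classify: one document can yield one or more classification results.
        for doc_type in classify(path, text, dom):
            # Extract: run the extractor for the classified document type.
            results = extract(path, text, dom, doc_type)
            # Validate: confirm or correct the extraction results.
            validated.append(validate(path, results))
    return validated
```

The nested loop mirrors the structure used in the Windows workflow: an outer loop over documents and an inner loop over the classification results of each document.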

Prerequisites

When you start building your Studio workflow, you must decide what type of project you want to run: Windows or Cross-platform. Each project type requires different packages.

Depending on the project type you choose, install the following packages:

Windows
  • UiPath.DocumentUnderstanding.ML.Activities 1.31.1
  • UiPath.IntelligentOCR.Activities 6.22.0
  • UiPath.System.Activities 24.10.6
Cross-platform
  • UiPath.DocumentUnderstanding.Activities 2.12.0
  • UiPath.System.Activities 24.10.6
Note:
  • The IntelligentOCR package is supported only in Windows projects. It is not compatible with cross-platform projects.
  • You can build cross-platform workflows and use other templates in Studio Web.

Building an IXP workflow for Windows projects

This section contains the steps to follow if you choose not to use one of the Studio templates and start from scratch.

To build an IXP workflow for Windows projects, proceed as follows:

  1. In Studio Desktop, create a basic process.
  2. When configuring your process, in the Compatibility field, select the type of workflow you want to build. For this procedure, select Windows. For more details, check About automation projects.
  3. Open Taxonomy Manager from the Design tab and set up your table fields.
    Note:

    Taxonomy Manager:

    • supports creating tables and fields. When you create IXP Unstructured and Complex Documents workflows, it is recommended to create table fields instead of just fields. If you use multiple document types, use table fields that map to IXP field groups in Taxonomy Manager.
    • is available only when the IntelligentOCR package is installed. This means that it is available only in Windows projects, not in Cross-platform projects.
  4. In the Sequence, add an Assign activity to specify where you want to read documents from. In the Save to field, create and add a variable of type System.String[], for example docs. In the Value to save field, add Directory.GetFiles("./documents"), and replace ./documents with the folder that contains your documents.
    The Assign activity.

  5. Add a Load Taxonomy activity to store the configured taxonomy in a variable that you can reference in the rest of the automation. Create and add a variable of type DocumentTaxonomy, for example taxo.
    Note: You need to map the variable to the output of the activity.

    The Load Taxonomy activity.

  6. Add a For Each activity to go through each document. For the input, add the docs variable you previously created.
    The For Each activity.

  7. Drag and drop the following activities within For Each:
    Note: The following is the specific order in which you need to add the activities in the For Each activity.
    • Digitize Document to read the documents you provided and obtain the Document Object Model (DOM) output. For the input, add the doc variable, which holds the file path of the document you want to digitize.
      The Digitize Document activity.

    • Classify Document Scope to classify the document being processed into one of the defined document types in your taxonomy.

      For the inputs, add the following:

      • Document Path - Add the doc variable.
      • Document Text - Create and add the text variable.
      • Document Object Model (DOM) - Create and add the dom variable.
      • Taxonomy - Add the taxo variable.

      For the outputs, add the following:

      • Classification Results - Create and add a new variable ClassificationResults.

      The Classify Document Scope activity.

      Add the following activities inside Classify Document Scope:
    Note: A classification activity is optional if you only have one document type in your taxonomy. You can copy the document type ID and use that as an input to the Data Extraction Scope activity.
  8. In the previous For Each activity, add another For Each to go through each classification result. For the input, add the ClassificationResults variable.
  9. Drag and drop the following activities within For Each:
    • Data Extraction Scope to configure extractor activities. Add the following activities inside Data Extraction Scope:
      • Document Understanding Project Extractor to extract the document data. Make sure you configure the extractor for each document type.
      • Generative Extractor to extract documents using generative models. Make sure you add this activity inside the Data Extraction Scope activity.

      For the inputs, add the following:

      • Document Path – Add the doc variable.
      • Document Text – Add the text variable.
      • Document Object Model (DOM) – Add the dom variable.
      • Taxonomy – Add the taxo variable.
      • Classification Result – Add the ClassificationResults variable.

      For the output, add the following:

      • Extraction Results – Create and add a new variable ExtractionResults.

      The Data Extraction Scope activity.

  10. Optionally, you can configure decision criteria to determine whether human validation is required for the classification output.

    You can do this using custom business rules or post-processing logic. You can also use custom decision criteria in a workflow to trigger validation, or set up field-level confidence thresholds. These decision criteria depend on the business process requirements and on your use case's tolerance for false positives, that is, results that skip human validation but have been extracted incorrectly.

    Based on these rules, you can control whether a document is automatically validated or is routed to human validation.

  11. Add one of the following activities:
    • Create Document Validation Artifacts to validate in Apps.
    • Present Validation Station to validate in Validation Station. The ExtractionResults output of the Data Extraction Scope activity is the input of the Present Validation Station activity.

      For the inputs, add the following:

      • Document Path – Add the doc variable.
      • Document Text – Add the text variable.
      • Document Object Model (DOM) – Add the dom variable.
      • Taxonomy – Add the taxo variable.
      • Automatic Extraction Results – Add the ExtractionResults variable.

      For the output, add the following:

      • Validated Extraction Results – Create and add a new variable ValidatedExtractionResults.

      The Present Validation Station activity.

    In this validation step, you can also use activities other than the ones presented here. For more details, check the following resources:

    Validation Station

    Action Center

    Apps
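
The field-level confidence thresholds mentioned in step 10 can be sketched as follows. This is a minimal illustration in Python, not UiPath code: the function names, the dictionary shapes, the field names in the usage example, and the default threshold of 0.8 are all assumptions chosen for this example. In a Studio workflow, the equivalent logic would typically live in a condition that reads the confidence values from the extraction results.

```python
from typing import Dict, List


def fields_below_threshold(
    confidences: Dict[str, float],
    thresholds: Dict[str, float],
    default_threshold: float = 0.8,  # assumed default, tune per use case
) -> List[str]:
    """Return the names of fields whose confidence is below their threshold.

    Fields without an explicit entry in `thresholds` use `default_threshold`.
    """
    return sorted(
        name
        for name, score in confidences.items()
        if score < thresholds.get(name, default_threshold)
    )


def route(
    confidences: Dict[str, float],
    thresholds: Dict[str, float],
    default_threshold: float = 0.8,
) -> str:
    """Route to human validation if any field falls below its threshold."""
    if fields_below_threshold(confidences, thresholds, default_threshold):
        return "human_validation"
    return "auto_accept"
```

For example, with confidences of 0.95 for a hypothetical total field, 0.6 for vendor, and 0.85 for date, and a stricter threshold of 0.99 for total only, the total and vendor fields fall below their thresholds, so the document is routed to human validation. Raising a threshold lowers the number of incorrect results that skip validation, at the cost of more manual review.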

Building an IXP workflow for Cross-platform projects

This section contains the steps to follow if you choose not to use one of the Studio templates and start from scratch.

To build an IXP Unstructured and complex documents workflow for Cross-platform projects, proceed as follows:

Triggering human validation

Human validation of the classification output is triggered by applying decision logic after the classification step, before the workflow proceeds to extraction. The decision is not automatic by default. Instead, it is explicitly controlled through confidence thresholds and business rules defined in the workflow.

The following list shows how human validation can be triggered:

  1. Classification confidence evaluation

    Each classification result includes confidence scores that indicate how certain the model is about the predicted document type. These scores are evaluated in the workflow to determine whether the classification is reliable.

  2. Confidence thresholds

    You can define a minimum confidence threshold for classification. If the confidence score for the predicted document type falls below this threshold, the classification is considered uncertain and the document is flagged for human validation.

  3. Business rules and conditional logic

    In addition to confidence thresholds, you can apply custom business rules, such as:

    • Specific document types that always require manual review.
    • Mismatches between expected and predicted document types.
    • Rules based on how the document will be processed later. For example, documents that must be verified before extraction or approval.
  4. Triggering the validation step

    When the defined criteria are met, the workflow routes the document to a human validation step by invoking one of the validation mechanisms:

    • Present Validation Station for in-robot validation.
    • Create Validation Task for Action Center-based validation.
    • Create Document Validation Artifacts for validation in Apps.
  5. Human confirmation or correction

    During validation, the human reviewer confirms or corrects the document type. The validated classification result is then used by subsequent steps, such as data extraction, ensuring that downstream processing is based on an approved document type.

To conclude, human validation for classification is triggered by workflow-controlled rules, typically based on confidence scores and business logic, which determine when a classification result requires manual review before the process continues.
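
The decision logic described above can be sketched as a single predicate. The following Python example is illustrative only and is not UiPath code: the ClassificationRules class, the needs_human_validation function, the 0.85 threshold, and the document type names are hypothetical. It covers the confidence threshold and two of the business rules (document types that always require review, and a mismatch with an expected type). Rules that depend on later processing would be added in the same way.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ClassificationRules:
    """Hypothetical container for workflow-defined validation criteria."""
    min_confidence: float = 0.85  # assumed threshold, tune per use case
    always_review: FrozenSet[str] = frozenset()


def needs_human_validation(
    predicted_type: str,
    confidence: float,
    rules: ClassificationRules,
    expected_type: Optional[str] = None,
) -> bool:
    """Return True if the classification result should be manually reviewed."""
    # Confidence threshold: low confidence means the result is uncertain.
    if confidence < rules.min_confidence:
        return True
    # Business rule: some document types always require manual review.
    if predicted_type in rules.always_review:
        return True
    # Business rule: the prediction does not match the expected type.
    if expected_type is not None and predicted_type != expected_type:
        return True
    return False
```

When the predicate returns True, the workflow would route the document to one of the validation mechanisms listed above. Otherwise, it continues directly to extraction.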

Interpreting Validation Station results from IXP models

When using workflows that leverage models for IXP Unstructured and complex documents, the Validation Station serves as a crucial interface for reviewing, confirming, and refining the extracted data. Validation Station shows how the model interpreted the document, allowing you to understand the extraction accuracy, identify uncertain areas, and make corrections where needed.

In Validation Station, the document type and its corresponding fields are displayed alongside the extracted values and confidence indicators.

For more details on the validation process, check the following resources:

Comparing Windows and Cross-platform project workflows

The following comparison summarizes the differences between the IXP workflows for Windows and Cross-platform projects:

Packages required
  • Windows: Intelligent OCR, Document Understanding ML
  • Cross-platform: Document Understanding
Defining the taxonomy
  • Windows: The Taxonomy Manager option allows you to define the list of fields that are shown in Validation Station or included in the extraction results object.
    Note: Taxonomy Manager is available only when the Intelligent OCR package is installed.
  • Cross-platform: The Document Understanding package automatically reads and displays the fields defined in the IXP model schema. These fields are not configured through the workflow.
