
Unstructured and complex documents user guide
You can consume the predictions of a published Unstructured and Complex Documents model version by building a workflow in UiPath Studio.
- Installing packages
- Defining a taxonomy
- Digitizing a document
- Classifying a document
- Extracting a document
- Validating a document
When you start building your Studio workflow, you must decide what type of project you want to run: Windows or Cross-platform. Each project type requires different packages.
Regardless of the project type you choose, you can install the packages:
- Automatically - Use the Document Understanding Process template. For more details on how to search and install templates in Studio, check Project templates.
- Manually - For more details, check Installing packages. If you choose to manually install the packages, make sure you install the following versions or newer, based on the project type:
- Windows projects:
  - UiPath.DocumentUnderstanding.ML.Activities 1.31.1
  - UiPath.IntelligentOCR.Activities 6.22.0
  - UiPath.System.Activities 24.10.6
- Cross-platform projects:
  - UiPath.DocumentUnderstanding.Activities 2.12.0
  - UiPath.System.Activities 24.10.6
- Studio supports the IntelligentOCR package only in Windows projects. The package is not compatible with Cross-platform projects.
- You can build cross-platform workflows and use other templates in Studio Web.
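If you install the packages manually, a quick check against the minimum versions can catch an outdated dependency early. The following Python sketch is only an illustration and is not part of any UiPath tooling: the `MINIMUM_VERSIONS` table, the helper names, and the `installed` dictionary are hypothetical, and the grouping of packages by project type is an assumption based on the IntelligentOCR restriction noted above.

```python
# Minimum package versions listed in this guide. The grouping by project
# type is an assumption (IntelligentOCR is Windows-only).
MINIMUM_VERSIONS = {
    "windows": {
        "UiPath.DocumentUnderstanding.ML.Activities": "1.31.1",
        "UiPath.IntelligentOCR.Activities": "6.22.0",
        "UiPath.System.Activities": "24.10.6",
    },
    "cross-platform": {
        "UiPath.DocumentUnderstanding.Activities": "2.12.0",
        "UiPath.System.Activities": "24.10.6",
    },
}


def parse_version(version: str) -> tuple:
    """Turn '24.10.6' into (24, 10, 6) so versions compare numerically.

    Only plain dotted numeric versions are handled; pre-release suffixes
    would need extra parsing.
    """
    return tuple(int(part) for part in version.split("."))


def outdated_packages(project_type: str, installed: dict) -> list:
    """Return required packages that are missing or below the minimum."""
    problems = []
    for name, minimum in MINIMUM_VERSIONS[project_type].items():
        current = installed.get(name)
        if current is None or parse_version(current) < parse_version(minimum):
            problems.append(name)
    return problems
```

Comparing tuples of integers rather than raw strings matters here: as strings, "1.4.0" would sort after "1.31.1", even though 1.4.0 is the older version.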
This section contains the steps to follow if you choose not to use one of the Studio templates and start from scratch.
To build an IXP workflow for Windows projects, proceed as follows:
- In Studio Desktop, create a basic process.
- When configuring your process, in the Compatibility field, select what type of workflow you want to build: Windows or Cross-platform. For more details, check About automation projects.
- Open Taxonomy Manager from the Design tab and set up your table fields.
Note:
Taxonomy Manager:
- supports creating tables and fields. When you create IXP Unstructured and Complex Documents workflows, it is recommended to create table fields instead of just fields. If you use multiple document types, use table fields that map to IXP field groups in Taxonomy Manager.
- is available only when the IntelligentOCR package is installed. This means that it is only available on Windows projects, not Cross-platform.
- In the Sequence, add an Assign activity to specify where you want to read documents from. In the Save to field, create and add a variable of type System.String[], for example docs. In the Value to save field, add Directory.GetFiles("./documents"), and replace ./documents with your location.
- Add a Load Taxonomy activity to store the configured taxonomy in a variable that you can reference in the rest of the automation. Create and add a variable of type DocumentTaxonomy, for example taxo.
Note: You need to map the variable to the output of the activity.
- Add a For Each activity to go through each document. For the input, add the docs variable you previously created.
- Drag and drop the following activities within For Each:
Note: Add the activities to the For Each activity in the specific order listed below.
- Digitize document to read the documents you provided, and obtain the Document Object Model (DOM) output. For the input, add a variable doc for the file path of the document you want to digitize.
- Classify Document Scope to classify the document being processed into one of the defined document types in your taxonomy.
For the inputs, add the following:
- Document Path - Add the doc variable.
- Document Text - Create and add the text variable.
- Document Object Model (DOM) - Create and add the dom variable.
- Taxonomy - Add the taxo variable.
For the outputs, add the following:
- Classification Results - Create and add a new variable ClassificationResults.
Add the following activities inside Classify Document Scope:
- Generative Classifier to classify documents using generative models.
Note: A classification activity is optional if you only have one document type in your taxonomy. In that case, you can copy the document type ID and use it as an input to the Data Extraction Scope activity.
- In the previous For Each activity, add another For Each to go through each classification result. For the input, add the ClassificationResults variable.
- Drag and drop the following activities within For Each:
- Data Extraction Scope to configure extractor activities. Add the following activities inside Data Extraction Scope:
- Document Understanding Project Extractor to extract the document data. Make sure you configure the extractor for each document type.
- Generative Extractor to extract documents using generative models. Make sure you add this activity inside the Data Extraction Scope activity.
For the inputs, add the following:
- Document Path – Add the doc variable.
- Document Text – Add the text variable.
- Document Object Model (DOM) – Add the dom variable.
- Taxonomy – Add the taxo variable.
- Classification Result – Add the ClassificationResults variable.
For the output, add the following:
- Extraction Results – Create and add a new variable ExtractionResults.
- Optionally, you can configure decision criteria to determine whether human validation is required for the classification output.
This can be done using custom business rules or post-processing logic, for example custom decision criteria in the workflow or field-level confidence thresholds. The decision criteria depend on your business process requirements and on your use case's tolerance for false positives, that is, results that skip human validation but were extracted incorrectly.
Based on these rules, you can control whether a document is automatically validated or is routed to human validation.
- Add one of the following activities:
- Create Document Validation Artifacts
- Present Validation Station to validate in Validation Station. The output ExtractionResults of the Data Extraction Scope activity will be the input of the Present Validation Station activity. For the input, add the ExtractionResults variable. For the output, create and add a new variable ValidatedExtractionResults.
For the inputs, add the following:
- Document Path – Add the doc variable.
- Document Text – Add the text variable.
- Document Object Model (DOM) – Add the dom variable.
- Taxonomy – Add the taxo variable.
- Automatic Extraction Results – Add the ExtractionResults variable.
For the output, add the following:
- Validated Extraction Results – Create and add a new variable ValidatedExtractionResults.
In this validation step, you can also use activities other than the ones presented here. For more details, check the following resources:
Validation Station
- Validation Station
- Classic Validation Station
- Compact Validation Station
- Manual validation for digitize documents
Action Center
Apps
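The control flow built in the Windows workflow steps above (an outer loop over documents, an inner loop over classification results, and an optional validation step) can be sketched in Python. This is a minimal illustration, not UiPath code: `digitize`, `classify`, `extract`, and `validate` are hypothetical callables standing in for the Digitize Document, Classify Document Scope, Data Extraction Scope, and validation activities, and the 0.8 default threshold and the 0.0 to 1.0 confidence scale are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed to be on a 0.0 to 1.0 scale


def needs_human_validation(
    fields: List[ExtractedField],
    default_threshold: float = 0.8,  # hypothetical value, tune per use case
    per_field_thresholds: Optional[Dict[str, float]] = None,
) -> bool:
    """True if any field falls below its field-level confidence threshold."""
    thresholds = per_field_thresholds or {}
    return any(
        f.confidence < thresholds.get(f.name, default_threshold) for f in fields
    )


def process_documents(
    paths: Iterable[str],
    digitize: Callable,
    classify: Callable,
    extract: Callable,
    validate: Callable,
) -> Dict[str, list]:
    """Mirror the nested For Each structure of the workflow."""
    results: Dict[str, list] = {}
    for doc in paths:  # outer For Each: one pass per document
        text, dom = digitize(doc)
        for classification in classify(doc, text, dom):  # inner For Each
            fields = extract(doc, text, dom, classification)
            if needs_human_validation(fields):
                # Route to a human only when the decision criteria say so.
                fields = validate(doc, text, dom, fields)
            results.setdefault(doc, []).append(fields)
    return results
```

Keeping the decision in a separate function such as `needs_human_validation` makes it easier to adjust thresholds per field without touching the rest of the flow.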
This section contains the steps to follow if you choose not to use one of the Studio templates and start from scratch.
To build an IXP Unstructured and complex documents workflow for Cross-platform projects, proceed as follows:
Human validation of the classification output is triggered by applying decision logic after the classification step, before the workflow proceeds to extraction. The decision is not automatic by default; it is explicitly controlled through confidence thresholds and business rules defined in the workflow.
The following list shows how human validation can be triggered:
- Classification confidence evaluation
Each classification result includes confidence scores that indicate how certain the model is about the predicted document type. These scores are evaluated in the workflow to determine whether the classification is reliable.
- Confidence thresholds
You can define a minimum confidence threshold for classification. If the confidence score for the predicted document type falls below this threshold, the classification is considered uncertain and the document is flagged for human validation.
- Business rules and conditional logic
In addition to confidence thresholds, you can apply custom business rules, such as:
- Specific document types that always require manual review.
- Mismatches between expected and predicted document types.
- Rules based on how the document will be processed later. For example, documents that must be verified before extraction or approval.
- Triggering the validation step
When the defined criteria are met, the workflow routes the document to a human validation step by invoking one of the validation mechanisms:
- Present Validation Station for in-robot validation.
- Create Validation Task for Action Center-based validation.
- Create Document Validation Artifacts for validation in Apps.
- Human confirmation or correction
During validation, the human reviewer confirms or corrects the document type. The validated classification result is then used by subsequent steps, such as data extraction, ensuring that downstream processing is based on an approved document type.
To conclude, human validation for classification is triggered by workflow-controlled rules, typically based on confidence scores and business logic, which determine when a classification result requires manual review before the process continues.
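The decision logic described above can be sketched as a single function that returns the reason a classification result needs review, or nothing if it can proceed to extraction. This is an illustrative Python sketch under stated assumptions, not a UiPath API: `ClassificationResult`, `review_reason`, the 0.85 threshold, and the 0.0 to 1.0 confidence scale are all hypothetical.

```python
from dataclasses import dataclass
from typing import AbstractSet, Optional


@dataclass
class ClassificationResult:
    document_type: str
    confidence: float  # assumed to be on a 0.0 to 1.0 scale


def review_reason(
    result: ClassificationResult,
    min_confidence: float = 0.85,  # hypothetical threshold
    always_review: AbstractSet[str] = frozenset(),
    expected_type: Optional[str] = None,
) -> Optional[str]:
    """Return why the result needs human validation, or None if it does not."""
    # Confidence threshold: an uncertain prediction is flagged.
    if result.confidence < min_confidence:
        return "low_confidence"
    # Business rule: some document types always require manual review.
    if result.document_type in always_review:
        return "always_review_type"
    # Business rule: predicted type differs from the expected type.
    if expected_type is not None and result.document_type != expected_type:
        return "type_mismatch"
    return None
```

In a workflow, a non-empty reason would correspond to the branch that invokes one of the validation mechanisms listed above, and returning the reason rather than a plain boolean keeps the routing decision auditable.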
When using workflows that leverage models for IXP Unstructured and complex documents, the Validation Station serves as a crucial interface for reviewing, confirming, and refining the extracted data. Validation Station shows how the model interpreted the document, allowing you to understand the extraction accuracy, identify uncertain areas, and make corrections where needed.
In Validation Station, the document type and its corresponding fields are displayed alongside the extracted values and confidence indicators.
For more details on the validation process, check the following resources:
The following table shows a comparison between the IXP workflows for Windows and Cross-platform projects:
| | Windows | Cross-platform |
|---|---|---|
| Packages required | UiPath.DocumentUnderstanding.ML.Activities, UiPath.IntelligentOCR.Activities, UiPath.System.Activities | UiPath.DocumentUnderstanding.Activities, UiPath.System.Activities |
| Defining the taxonomy | Taxonomy Manager lets you define the list of fields that are shown in Validation Station or included in the extraction results object. Note: Taxonomy Manager is available only when the IntelligentOCR package is installed. | The Document Understanding package automatically reads and displays the fields defined in the IXP model schema. These fields are not configured through the workflow. |