Document Understanding Activities
Extract Document Data
UiPath.IntelligentOCR.StudioWeb.Activities.ExtractDocumentDataWithDocumentData<UiPath.IntelligentOCR.StudioWeb.Activities.DataExtraction.ExtendedExtractionResultForDocumentData>
Extracts data from an input file or Document Data object, and stores the results into a Document Data object.
Prerequisites
The Extract Document Data activity requires an input object of type Document Data or File. A typical use case is to precede it with a Classify Document activity, which generates an object of type Document Data.
Input options
- Document Data - from the Classify Document activity
- File - from Get File/Folder or Get Newest Email activities
Supported languages for generative models
The generative models support the same languages as the OCR engine in use, which depends on the project. For the Predefined and Generative Predefined projects, the OCR engine is UiPath Document OCR. For more information, visit the OCR Supported languages page.
Models used by the activity
The Extract Document Data activity uses the following:
- Pre-trained specialized models available out of the box, based on DocPath.
- Custom pre-trained models deployed in Document Understanding modern and classic projects.
- Generative extraction models.
The Generative Predefined project type and the corresponding extractors are not available in Automation Suite.
Designer panel
- Input - Requires you to specify the file itself, or Document Data if you used other Document Understanding activities earlier in your workflow (for example, Classify Document).
  Important: The maximum number of pages a file can have is 500. Files exceeding this limit fail to extract.
- Project - Requires you to select your Document Understanding project from the dropdown list. The available options are:
  - Predefined – Classic project type that uses pre-trained specialized models recommended for standard scenarios.
    For more information on the charging logic for classic projects, visit Metering and charging logic.
  - Generative Predefined – Modern project type that uses pre-trained generative models, which accept instructions as input for extracting document data.
    For more information on the charging logic for modern projects, visit Metering and charging logic.
  - Existing projects from the tenant and folder you are connected to.
  - You can create a custom project by going to Document Understanding. For more information, visit Introduction for building models.
  Note: If you have created more than 500 projects on your tenant and use the Extract Document Data activity, UiPath Studio or Studio Web displays only the first 500 projects. Projects beyond that limit cannot be used.
- Extractor - After you select a project, you can also select the extractor that you want to use.
  - For the Predefined project, you have two choices:
    - Select a pre-trained model. Visit Out-of-the-box models for a list of pre-trained models that you can use.
      Note: The Extract Document Data activity extracts the information for the fields available on the document type for the selected extractor (regardless of the actual type of the document). This is not applicable for generative models.
    - Select the Generative extractor.
      Note: The information sent to the Generative Extractor goes to an LLM Model instance. This instance isn't publicly available, doesn't store the data sent, and doesn't use it for training purposes.
      Important: This feature is currently part of an audit process and is not to be considered part of the FedRAMP Authorization until the review is finalized. See the full list of features currently under review.
  - For the Generative Predefined project, you have three choices for extraction, each tailored to a specific document layout:
    - Long Document Simple Layout Extractor – Recommended for long form documents with mostly text and headings. For example, you can use the Long Document Simple Layout Extractor on documents such as lease agreements, master service agreements, or other similar documents.
    - Long Document Complex Layout Extractor – Recommended for long form documents that include elements such as images, handwriting, form controls, floating callout boxes, or other complex layout types. For example, you can use the Long Document Complex Layout Extractor on documents such as insurance policies, or other similar documents.
    - Short Document Complex Layout Extractor – Recommended for short documents that include elements such as images, handwriting, form controls, floating callout boxes, or other complex layout types. For example, you can use the Short Document Complex Layout Extractor on documents such as government IDs, healthcare intake forms, or other similar documents.
  - Use Classification Result: If the Generate Data Type property is set to false, you can opt for the Use Classification Result option. This option automatically uses a recommended extractor based on the document type returned by the Classify Document activity.
    If multiple extractors can work with that document type, the activity returns an error. In this scenario, you must manually select your preferred extractor.
- Document Type details - This field appears if you choose the Generative option. It holds the prompt that identifies the fields to be extracted, provided as key-value pairs, where the key is the name of the field and the value is a description that helps the extractor identify the corresponding value. Select the field to open a prompt with the following options, provided as pairs:
  - Field name - Requires you to input the name of the field to be extracted (for example, Due date). The maximum number of characters allowed is 30.
  - Instruction - Requires you to provide instructions about what information should be extracted for the corresponding field. The maximum number of characters allowed is 1000. The response (the extraction result, also called the Completion) is limited to 700 words, which means that you can't extract more than 700 words from a single prompt. If your extraction requirements exceed this limit, you can divide the document into multiple pages, process them individually, and then merge the results afterwards.
  Tip: For good practices on how to use generative prompts, check the Generative extractor - Good practices page.
- Version or Tag - Use this property when using an existing Document Understanding modern project. Select the tag that corresponds to the project version from which you want to process data. For instance, if you choose the Production tag assigned to Version 3, the activity processes data from Version 3 of your project in the production environment.
  The default value for Version is Staging. If the Staging tag doesn't exist in your selected project, then the default value is Production.
  For more information about versions, visit Publishing models.
- Document Type - When you choose a tag from the Version field, the activity automatically selects the first deployed document type from the relevant version of your chosen project. Moreover, the activity shows the extraction fields related to your chosen document type.
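The limits described for Document Type details (30 characters for a field name, 1000 characters for an instruction) can be checked before the key-value pairs are pasted into the activity. The following is a minimal Python sketch of such a check, not part of the product: the function name `check_prompt` and the sample fields are hypothetical, and the activity itself may enforce the limits differently.

```python
# Limits taken from the Document Type details description above.
FIELD_NAME_MAX_CHARS = 30
INSTRUCTION_MAX_CHARS = 1000


def check_prompt(fields: dict) -> list:
    """Return a list of problems; an empty list means every pair fits the limits.

    `fields` maps a field name to its instruction (hypothetical helper,
    not a UiPath API).
    """
    problems = []
    for name, instruction in fields.items():
        if not name.strip():
            problems.append("empty field name")
        elif len(name) > FIELD_NAME_MAX_CHARS:
            problems.append(f"field name too long ({len(name)} chars): {name!r}")
        if len(instruction) > INSTRUCTION_MAX_CHARS:
            problems.append(
                f"instruction for {name!r} too long ({len(instruction)} chars)"
            )
    return problems


# Sample key-value pairs (illustrative only).
prompt = {
    "Due date": "What is the payment due date?",
    "Name": "What is the name of the candidate?",
}
print(check_prompt(prompt))
```

The 700-word limit applies to the response rather than to the prompt, so it cannot be checked up front in this way.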
Properties panel
Input
- Timeout (seconds) - Maximum execution time (in seconds) for the call to the generative model. If the operation exceeds this timeout, it is automatically terminated to prevent delays or hangs. This property is only displayed if the Generative Extractor is selected as an extractor.
- Auto-validation - Use this option to enable automatic validation, a capability that checks the data extraction results against a Generative model. The default value for the Auto-validation field is False.
  - Confidence threshold - This field becomes visible once you enable Auto-validation. Extraction results falling below the threshold are compared to the results of the generative extraction model. If they match, the system raises the extraction confidence to the threshold value. Possible threshold values range from 0 to 100.
    If the value is set to 0, no validation is applied. If you set a nonzero value (up to 100), the system checks all extraction results below this value. For example, if you set a confidence threshold of 80%, the system applies the generative validation to fields with confidence below 80%.
    Note: Auto-validation is available only for specialized extraction models.
- Generate Data Type - If set to True, indicates that the output should be generated based on the selected extractor, resulting in an IDocumentData<ExtractorType> object. Alternatively, if set to False, indicates that the data generation should be skipped, resulting in a generic IDocumentData<DictionaryData> object.
  Visit Document Data for additional details and limitations available for the two object types.
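The Auto-validation behavior described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the product's implementation: `FieldResult` and `auto_validate` are hypothetical names, and matching is modeled as exact string equality, which the real service may not use.

```python
from dataclasses import dataclass


@dataclass
class FieldResult:
    name: str
    value: str
    confidence: float  # assumed to be on a 0-100 scale


def auto_validate(results, generative_values, threshold):
    """Raise confidence to the threshold when a low-confidence value
    matches the value returned by the generative model (assumption:
    exact string match)."""
    if threshold <= 0:
        return list(results)  # a threshold of 0 means no validation
    validated = []
    for r in results:
        below = r.confidence < threshold
        matches = generative_values.get(r.name) == r.value
        if below and matches:
            validated.append(FieldResult(r.name, r.value, float(threshold)))
        else:
            validated.append(r)
    return validated


# Illustrative data only.
results = [
    FieldResult("Total", "100", 60.0),
    FieldResult("Date", "2024-01-01", 60.0),
    FieldResult("Vendor", "Acme", 90.0),
]
generative = {"Total": "100", "Date": "2024-02-02", "Vendor": "Acme"}
for r in auto_validate(results, generative, 80):
    print(r.name, r.confidence)
```

In this sketch only Total is adjusted: Date is below the threshold but does not match, and Vendor is already above the threshold, so it is never compared.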
Output
- Document Data - All the extracted field data from the file. Information can also be received from Classify Document.
  Visit Document data to learn how Document Data works and how to consume the extracted results for single and multi-value fields.
Design-time external connection
The design-time external connection allows you to use the activity with Document Understanding resources from other projects or tenants. Before configuring these properties, ensure you have fulfilled the prerequisites mentioned in the Configuring runtime external connection page. Once these steps are completed, configure the following properties:
- App ID: Enter the App ID of the external application you previously created.
- App secret: Enter the App secret of the external application you previously created.
- Tenant URL: Enter the URL of the tenant where you created the external application. This is the tenant from which you will use resources at design-time.
  The URL should be in the following format: https://<baseURL>/<OrganizationName>/<TenantName>.
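As an illustration of the tenant URL format above, the following Python sketch builds a URL from its three parts and checks that a given URL has the expected shape. The helper names and the sample host `cloud.example.com` are hypothetical; the activity may validate the value differently.

```python
from urllib.parse import urlparse


def build_tenant_url(base_url: str, organization: str, tenant: str) -> str:
    """Assemble https://<baseURL>/<OrganizationName>/<TenantName>."""
    return f"https://{base_url.strip('/')}/{organization}/{tenant}"


def is_tenant_url(url: str) -> bool:
    """Check for an https URL with exactly two path segments
    (organization and tenant). This is a shape check only."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    return parsed.scheme == "https" and bool(parsed.netloc) and len(segments) == 2


# Hypothetical values for illustration.
url = build_tenant_url("cloud.example.com", "acme", "DefaultTenant")
print(url, is_tenant_url(url))
```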
Runtime external connection
The runtime external connection allows you to execute the activity via on-premises robots. Before configuring these properties, ensure you have fulfilled the prerequisites mentioned in the Configuring runtime external connection page. Once these steps are completed, you can then proceed to configure the runtime external connection.
- Runtime Credentials Asset - Use this field when you need to access Document Understanding resources while the robot is connected to a local Orchestrator, or from a different tenant. You can enter a Credential Asset, for authentication purposes, in one of the following ways:
  - From the dropdown list, select the desired Credential Asset from the Orchestrator to which the UiPath® Robot is connected.
  - Manually enter the path to the Orchestrator Credential Asset where you store the external application credentials for accessing the project.
    The format of the path should be: <OrchestratorFolderName>/<AssetName>.
- Runtime Tenant Url - Use this field alongside the Runtime Credentials Asset field. Enter the URL of the tenant that the robot will connect to in order to execute the extraction. The URL should be in the following format: https://<baseURL>/<OrganizationName>/<TenantName>.
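The asset path format above, <OrchestratorFolderName>/<AssetName>, can be split into its two parts with a short Python sketch. The function name and sample values are hypothetical, and treating everything before the last slash as the folder (so that nested folder paths would also parse) is an assumption, not documented behavior.

```python
def split_asset_path(path: str):
    """Split '<OrchestratorFolderName>/<AssetName>' into (folder, asset).

    Assumption: the asset name is the part after the last '/', and
    anything before it is the folder name or path.
    """
    folder, sep, asset = path.rpartition("/")
    if not sep or not folder or not asset:
        raise ValueError(
            f"expected <OrchestratorFolderName>/<AssetName>, got {path!r}"
        )
    return folder, asset


# Hypothetical folder and asset names.
print(split_asset_path("Shared/DuCredentials"))
```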
Extractor | Recommended scenario | Provider | Region availability | Multi-modal support (1) |
---|---|---|---|---|
Long Document Simple Layout Extractor | Recommended for long form documents with mostly text and headings. For example, you can use the Long Document Simple Layout Extractor on documents such as lease agreements, master service agreements, or other similar documents. | Azure OpenAI | United Kingdom, Australia, India, Canada | |
Long Document Complex Layout Extractor | Recommended for long-form documents with complex layouts, such as images, handwritten text, form elements, or distinctive layouts such as floating callout boxes. You can use this extractor on long-form documents like insurance policies, which usually have complex layouts. | Azure OpenAI | United States, European Union, Japan, Singapore | |
Short Document Complex Layout Extractor | Recommended for shorter documents (20 pages maximum) featuring images, handwritten text, form elements, or complex layouts, such as floating callout boxes. You can use this extractor on documents like government IDs or healthcare intake forms that typically have shorter but more complex layouts. | Azure OpenAI | United States, European Union, Japan, Singapore | |
(1) Multi-modal support refers to the ability to extract different types of data inputs, such as text, images, and handwritten text.
To quickly get started with the generative capabilities of the Extract Document Data activity, perform the following steps:
- Add an Extract Document Data activity.
- From the Project dropdown list, select Generative Predefined.
- For Extractor, select one of the following extractors: Long Document Simple Layout Extractor, Long Document Complex Layout Extractor, or Short Document Complex Layout Extractor.
  The Document Type details property appears in the body of the activity.
- For Dictionary, provide your instructions as key-value pairs, where:
  - Field name represents the name of the field that you want to extract from the document. For example, email address.
  - Instruction represents the instruction about what information you want to give the extractor for extracting the field. It is the description used by the generative extractor to identify the corresponding value.
  For example, check the following table for a sample of key-value pairs:
  Table 2. Examples of key-value pairs for the generative extractor prompt
  Field name | Instruction |
  ---|---|
  Name | "What is the name of the candidate?" |
  Current Job | "What is the current job of the candidate?" |
  Employer | "What is the current employer of the candidate?" |
  Figure 1. Key-value pairs details for the generative extractor