Document Understanding activities
Extract Document Data
UiPath.IntelligentOCR.StudioWeb.Activities.ExtractDocumentDataWithDocumentData<UiPath.IntelligentOCR.StudioWeb.Activities.DataExtraction.ExtendedExtractionResultForDocumentData>
Description
Extracts data from an input file or a Document Data object and stores the results in a Document Data object.
Before you begin
Prerequisites
The Extract Document Data activity requires an input object of type Document Data or File. A common pattern is to place a Classify Document activity before it, which generates an object of type Document Data.
Input options
The Extract Document Data activity accepts one of the following options as input:
- Document Data - from the Classify Document activity
- File - from the Get File/Folder or Get Newest Email activity
Supported languages for generative models
The generative models support the same languages as the OCR engine used, which depends on the project. For the Predefined and Generative Predefined projects, the OCR engine used is UiPath Document OCR. For more information, visit the OCR Supported languages page.
Models used by the activity
The Extract Document Data activity uses the following:
- Pre-trained specialized models available out of the box, based on the Helix Extractor.
- Custom pre-trained models deployed in Document Understanding modern and classic projects.
- Generative extraction models.
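Known limitations
The Generative Predefined project type and the corresponding extractors are not available in Automation Suite.
When you use the Extract Document Data activity, classification fields are supported for modern project extractors and out-of-the-box models, but not for classic project extractors.
Providing Document Data that contains sub-documents to the Extract Document Data activity triggers a runtime error. This behavior is by design. To extract data from a split document, iterate over each sub-document.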
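The workaround for split documents can be sketched as follows. This is a minimal Python illustration and not UiPath code: the `DocumentData` class, its `sub_documents` attribute, and the `extract` callback are hypothetical stand-ins for the Document Data object and the activity call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DocumentData:
    """Hypothetical stand-in for a Document Data object."""
    name: str
    sub_documents: List["DocumentData"] = field(default_factory=list)


def extract_from_split(doc: DocumentData,
                       extract: Callable[[DocumentData], dict]) -> List[dict]:
    """Run extraction once per sub-document instead of on the parent."""
    # A document without sub-documents can be passed to extraction directly.
    if not doc.sub_documents:
        return [extract(doc)]
    # Passing the parent would fail, so process each sub-document in turn.
    return [extract(sub) for sub in doc.sub_documents]
```

In a workflow, the same idea would typically be expressed as a loop over the sub-documents with one Extract Document Data activity inside the loop body.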
Project compatibility
Windows | Cross-platform
Configuration
Designer panel
- Input - Requires you to specify either the file itself or a Document Data object, if you have used other Document Understanding activities earlier in your workflow (for example, Classify Document).
  Important: The maximum number of pages a file can have is 500. Extraction fails for files that exceed this limit.
- Project - Requires you to select your Document Understanding project from the dropdown list. The available options are:
  - Predefined – Classic project type that uses pre-trained specialized models recommended for standard scenarios. For more information on the charging logic for classic projects, visit Metering and charging logic.
  - Generative Predefined – Modern project type that uses pre-trained generative models, which accept instructions as input for extracting document data. For more information on the charging logic for modern projects, visit Metering and charging logic.
  - Predefined Non-Latin Languages – Modern project type that uses pre-trained models for non-Latin document processing scenarios. For more information on the charging logic for modern projects, visit Metering and charging logic.
  - Existing projects in the tenant and folder you are connected to.
  - You can create a custom project by going to Document Understanding. For more information, visit Introduction for building models.
  Note: If you have created more than 500 projects on your tenant, UiPath Studio and Studio Web display only the first 500 in the Extract Document Data activity. Projects beyond the first 500 cannot be used.
- Extractor - After you select a project, you can also select the extractor that you want to use.
  - For the Predefined project, you have two choices:
    - Select a pre-trained model. Visit Out-of-the-box models for a list of pre-trained models that you can use.
      Note: The Extract Document Data activity extracts the information for the fields available on the document type for the selected extractor (regardless of the actual type of the document). This is not applicable for generative models.
    - Select the Generative extractor.
      Note: The information sent to the Generative Extractor goes to an LLM instance. This instance isn't publicly available, doesn't store the data sent, and doesn't use it for training purposes.
  - For the Generative Predefined project, you have three choices for extraction, each tailored to a specific document layout:
    - Long Document Simple Layout Extractor – Recommended for long-form documents with mostly text and headings, such as lease agreements, master service agreements, or other similar documents.
    - Long Document Complex Layout Extractor (Preview) – Recommended for long-form documents that include elements such as images, handwriting, form controls, floating callout boxes, or other complex layout types, such as insurance policies or other similar documents.
    - Short Document Complex Layout Extractor (Preview) – Recommended for short documents that include elements such as images, handwriting, form controls, floating callout boxes, or other complex layout types, such as government IDs, healthcare intake forms, or other similar documents.
  - For the Predefined Non-Latin Languages project, you have three choices for extraction, each tailored to a specific non-Latin document layout:
    - Invoices Japan – Recommended for Japanese invoice documents. The extractor can handle common Japanese invoice layouts, and can identify and extract key invoice fields such as supplier information, invoice number, and currency.
    - Invoices China – Recommended for Chinese invoice documents. The extractor can handle common Chinese invoice layouts, and can identify and extract key invoice fields such as supplier information, invoice number, and currency.
    - Receipts Japan – Recommended for Japanese receipt documents. You can use the extractor to identify and extract fields such as merchant name, transaction date, total amount, tax, and currency from Japanese-language receipts.
  - Use Classification Result: If the Generate Data Type property is set to False, you can opt for the Use Classification Result option. This option automatically uses a recommended extractor based on the document type returned by the Classify Document activity. If multiple extractors can work with that document type, the activity returns an error. In this scenario, you must manually select your preferred extractor.
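The selection rule behind Use Classification Result can be illustrated with a short Python sketch. It is not the activity's implementation: the `extractors_by_type` mapping and the extractor names are hypothetical, and raising an error when no extractor is registered is an assumption (the text above only describes the case of multiple extractors).

```python
def pick_extractor(document_type, extractors_by_type):
    """Return the single extractor registered for a document type.

    extractors_by_type is a hypothetical mapping of
    document type -> list of extractor names.
    """
    candidates = extractors_by_type.get(document_type, [])
    if len(candidates) > 1:
        # Several extractors fit: the choice has to be made manually.
        raise LookupError(
            f"Multiple extractors for {document_type!r}: {candidates}")
    if not candidates:
        # Assumption: treat a missing extractor as an error as well.
        raise LookupError(f"No extractor for {document_type!r}")
    return candidates[0]
```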
- Document Type details - This field appears if you select the Generative extractor. It holds the prompt that identifies the fields to be extracted, provided as key-value pairs, where the key is the name of the field and the value is a description that helps the extractor identify the corresponding value. Select the field to open a prompt with the following options, provided as pairs:
  - Field name - Requires you to enter the name of the field to be extracted (for example, Due date). The maximum number of characters allowed is 30.
  - Instruction - Requires you to provide instructions about what information should be extracted for the corresponding field. The maximum number of characters allowed is 1000. The response (the extraction result, also called Completion) is limited to 700 words, so you can't extract more than 700 words from a single prompt. If your extraction requirements exceed this limit, you can split the document into multiple pages, process them individually, and then merge the results.
  Tip: For good practices on how to use generative prompts, check the Generative extractor - Good practices page.
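The character limits for Field name and Instruction can be checked before a prompt is entered into the activity. The following Python sketch assumes the prompt is held as a plain dictionary of field name to instruction; the function name `check_prompt` is illustrative and not part of any UiPath API.

```python
MAX_FIELD_NAME_CHARS = 30     # field name limit stated above
MAX_INSTRUCTION_CHARS = 1000  # instruction limit stated above


def check_prompt(fields: dict) -> list:
    """Return a list of problems found in a field-name -> instruction mapping."""
    problems = []
    for name, instruction in fields.items():
        if len(name) > MAX_FIELD_NAME_CHARS:
            problems.append(f"Field name too long ({len(name)} chars): {name!r}")
        if len(instruction) > MAX_INSTRUCTION_CHARS:
            problems.append(f"Instruction too long for field {name!r}")
    return problems
```

An empty list means that every pair is within the limits. The 700-word limit on the response cannot be checked this way, because it applies to the extraction result and not to the prompt.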
- Version or Tag - Use this property when using an existing Document Understanding modern project. Select the tag that corresponds to the project version from which you want to process data. For instance, if you choose the Production tag assigned to Version 3, the activity processes data from Version 3 of your project in the production environment. The default value for Version is Staging. If the Staging tag doesn't exist in your selected project, the default value is Production. For more information about versions, visit Publishing models.
- Document Type - When you choose a tag from the Version field, the activity automatically selects the first deployed document type from the relevant version of your chosen project. The activity also shows the extraction fields related to your chosen document type.
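The default described for Version or Tag amounts to a simple fallback rule. The sketch below expresses it in Python, assuming the tags available in a project are known as a list of names; `default_version_tag` is an illustrative helper, not a UiPath function.

```python
def default_version_tag(available_tags):
    """Pick the default tag: Staging when present, otherwise Production."""
    return "Staging" if "Staging" in available_tags else "Production"
```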
Properties panel
Input
- Timeout (seconds) - Maximum execution time (in seconds) for the call to the generative model. If the operation exceeds this timeout, it is automatically terminated to prevent delays or hangs. This property is displayed only if the Generative Extractor is selected as the extractor.
- Auto-validation - Use this option to enable automatic validation, a capability that checks data extraction results against a generative model. The default value for the Auto-validation field is False.
  - Confidence threshold - This field becomes visible once you enable Auto-validation. Extraction results that fall below the threshold are compared with the results of the generative extraction model. If they match, the system raises the extraction confidence to the threshold value. Possible threshold values range from 0 to 100. If the value is set to 0, no validation is applied. For any other value, the system checks all extraction results below that value. For example, if you set a confidence threshold of 80, the system applies generative validation to fields with a confidence below 80%.
  Note: Auto-validation is available only for specialized extraction models.
- Generate Data Type - If set to True, the output is generated based on the selected extractor, resulting in an IDocumentData<ExtractorType> object. If set to False, data type generation is skipped, resulting in a generic IDocumentData<DictionaryData> object. Visit Document Data for additional details and limitations of the two object types.
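The Confidence threshold behavior described for Auto-validation can be sketched in Python as follows. This is an illustration of the rule as stated, not the product's implementation: the data shapes are hypothetical, and it assumes confidence is expressed on the same 0 to 100 scale as the threshold.

```python
def auto_validate(fields, generative_values, threshold):
    """Raise the confidence of low-confidence fields confirmed by a generative result.

    fields: dict of field name -> (value, confidence on a 0-100 scale)
    generative_values: dict of field name -> value from the generative model
    threshold: confidence threshold from 0 to 100; 0 disables validation
    """
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    validated = {}
    for name, (value, confidence) in fields.items():
        # Only results below a non-zero threshold are checked.
        below = threshold > 0 and confidence < threshold
        confirmed = name in generative_values and generative_values[name] == value
        if below and confirmed:
            # A matching generative result lifts the confidence to the threshold.
            confidence = threshold
        validated[name] = (value, confidence)
    return validated
```

With a threshold of 80, a field extracted at confidence 55 is raised to 80 only if the generative value matches; a field that does not match keeps its original confidence.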
Output
- Document Data - All the extracted field data from the file. Information can also be received from Classify Document. Visit Document Data to learn how Document Data works and how to consume the extracted results for single and multi-value fields.
Design-time external connection
Runtime credentials can be retrieved from Orchestrator credential assets (via the Runtime Credentials Asset field). Design-time credentials must be entered manually and are not pulled from an Orchestrator asset.
The design-time external connection allows you to use the activity with Document Understanding resources from other projects or tenants. Before configuring these properties, ensure you have fulfilled the prerequisites mentioned in the Configuring runtime external connection page. Once these steps are completed, you can proceed to configure the design-time external connection.
- App ID: Enter the App ID of the external application you previously created.
- App secret: Enter the App secret of the external application you previously created.
- Tenant URL: Enter the URL of the tenant where you created the external application. This is the tenant whose resources you use at design time.
  The URL should be in the following format:
  https://<baseURL>/<OrganizationName>/<TenantName>
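A tenant URL can be checked against this format before it is entered. The Python sketch below uses only the standard library; `is_tenant_url` is an illustrative helper, and the rule that the path must contain exactly two segments is an assumption derived from the format shown above.

```python
from urllib.parse import urlparse


def is_tenant_url(url: str) -> bool:
    """Check that a URL looks like https://<baseURL>/<OrganizationName>/<TenantName>."""
    parsed = urlparse(url)
    # Keep only non-empty path segments so a trailing slash is tolerated.
    segments = [s for s in parsed.path.split("/") if s]
    return parsed.scheme == "https" and bool(parsed.netloc) and len(segments) == 2
```

For example, `is_tenant_url("https://cloud.example.com/MyOrg/MyTenant")` returns True, where the host, organization, and tenant names are placeholders.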
Runtime external connection
The runtime external connection allows you to execute the activity via on-premises robots. Before configuring these properties, ensure you have fulfilled the prerequisites mentioned in the Configuring runtime external connection page. Once these steps are completed, you can proceed to configure the runtime external connection.
- Runtime Credentials Asset - Use this field when you need to access Document Understanding resources while the robot is connected to a local Orchestrator, or from a different tenant. You can enter a Credential Asset for authentication in one of the following ways:
  - From the dropdown list, select the desired Credential Asset from the Orchestrator that the UiPath® Robot is connected to.
  - If you stored the external application credentials used to access the project in an Orchestrator credential asset, manually enter the path to that asset. The path should be in the following format: <OrchestratorFolderName>/<AssetName>.
- Runtime Tenant Url - Use this field together with the Runtime Credentials Asset field. Enter the URL of the tenant that the robot connects to in order to execute the extraction. The URL should be in the following format: https://<baseURL>/<OrganizationName>/<TenantName>.
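The credential asset path format described for Runtime Credentials Asset can be validated with a small helper. This Python sketch is illustrative only; splitting on the last slash, so that a nested folder name stays intact, is an assumption and is not stated in the text above.

```python
def split_asset_path(path: str):
    """Split '<OrchestratorFolderName>/<AssetName>' into (folder, asset)."""
    # Assumption: split on the last slash so a nested folder such as
    # 'Finance/Invoices' is kept as a single folder name.
    folder, sep, asset = path.rpartition("/")
    if not sep or not folder or not asset:
        raise ValueError(
            f"Expected '<OrchestratorFolderName>/<AssetName>', got {path!r}")
    return folder, asset
```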
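Supported models
The generative extractors available under the Generative Predefined project can be used for the documents described in the following table.
The Long Document Complex Layout and Short Document Complex Layout extractors are not currently available in Automation Cloud™ for Public Sector environments (FedRAMP).
Table 1. Supported scenarios for generative extractors
| Extractor | Recommended scenario | Provider | Region availability | Multi-modal support¹ |
|---|---|---|---|---|
| Long Document Simple Layout Extractor | Recommended for long documents with mostly text and headings. For example, you can use the Long Document Simple Layout Extractor on lease agreements, master service agreements, or other similar documents. | Azure OpenAI | Australia, European Union, India, Japan, Singapore, United Kingdom, United States, Canada | ❌ |
| Long Document Complex Layout Extractor (Preview) | Recommended for long documents with complex layouts (such as images, handwriting, and form elements) or unusual layouts (such as floating callout boxes). You can use this extractor on long documents with complex layouts, such as insurance policies. | Azure OpenAI | United States, European Union, Japan, Singapore | ✅ |
| Short Document Complex Layout Extractor (Preview) | Recommended for shorter documents (up to 20 pages) that include images, handwriting, form elements, or complex layouts (such as floating callout boxes). You can use this extractor on documents that are usually short but have more complex layouts, such as government IDs or healthcare intake forms. | Azure OpenAI | United States, European Union, Japan, Singapore | ✅ |
¹ Multi-modal support refers to the ability to extract data from different types of input, such as text, images, and handwritten text.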
Using the generative extractor
To quickly get started with the generative capabilities of the Extract Document Data activity, perform the following steps:
- Add an Extract Document Data activity.
- From the Project dropdown list, select Generative Predefined.
- For Extractor, select one of the following extractors: Long Document Simple Layout Extractor, Long Document Complex Layout Extractor, or Short Document Complex Layout Extractor. The Document Type details property appears in the body of the activity.
- In Document Type details, provide your instructions as dictionary key-value pairs, where:
  - Field name represents the name of the field that you want to extract from the document. For example, email address.
  - Instruction represents the description that the generative extractor uses to identify the corresponding value.
For example, check the following table for a sample of key-value pairs:
Table 2. Examples of key-value pairs for the generative extractor prompt
| Field name | Instruction |
|---|---|
| Name | "What is the name of the candidate?" |
| Current job | "What is the candidate's current job?" |
| Employer | "Who is the candidate's current employer?" |
Figure 1. Key-value pairs details for the generative extractor