Document Understanding classic user guide

Last updated April 23, 2026

Digitization overview

What is digitization

Digitization is the process of obtaining machine-readable text from an incoming file so that a robot can understand its contents and act on them. It is the first step applied to files that are processed through the Document Understanding™ framework.

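The digitization step has two outputs:

  • The text of the processed file, stored in a string variable; and
  • The Document Object Model of the file: a JSON object containing basic information such as name, content type, text length, and number of pages, as well as details such as page rotation, detected language, and the content and coordinates of every word in the file.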
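
To make the two outputs more concrete, here is a minimal Python sketch of how the text and the Document Object Model might relate to each other. The class and field names (`Word`, `Page`, `DocumentObjectModel`, `box`, and so on) and the coordinate convention are illustrative assumptions, not the actual UiPath schema. Only the kinds of information they hold come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Word:
    text: str
    # (left, top, width, height); this coordinate convention is an assumption
    box: Tuple[float, float, float, float]


@dataclass
class Page:
    rotation: int = 0  # page rotation in degrees
    words: List[Word] = field(default_factory=list)


@dataclass
class DocumentObjectModel:
    name: str
    content_type: str
    language: str  # detected language
    pages: List[Page] = field(default_factory=list)

    @property
    def page_count(self) -> int:
        return len(self.pages)


def document_text(dom: DocumentObjectModel) -> str:
    """Build the plain-text output: words joined by spaces, one line per page."""
    return "\n".join(" ".join(w.text for w in p.words) for p in dom.pages)


# A hypothetical one-page document with two words.
dom = DocumentObjectModel(
    name="invoice.pdf",
    content_type="application/pdf",
    language="en",
    pages=[
        Page(
            rotation=0,
            words=[
                Word("Invoice", (10, 10, 60, 12)),
                Word("42", (75, 10, 15, 12)),
            ],
        )
    ],
)
text = document_text(dom)
```

In this sketch the string output is derived from the same word list that carries the coordinates, which is why downstream steps can map any piece of text back to its position on a page.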

In the Document Processing Framework, digitization is performed using the Digitize Document activity.

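What digitization is not

Although related, the digitization step is not OCR.

In many cases, the files to be processed are native PDF files (not scanned), which a robot can read programmatically without using OCR.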

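When to use OCR in digitization

The Digitize Document activity requires an OCR engine to be selected as part of its configuration so that the engine is available when needed. However, OCR is only executed on:

  • Image files
    • Supported image formats are .png, .jpe, .jpg, .jpeg, .tiff, .tif, and .bmp
    • For multi-page TIFF files, OCR is applied to each page
  • PDF pages that:
    • Do not expose any machine-readable content
    • Contain an image that covers a large part of the page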
Note:

The following digitization limitations apply:

  • The file size is limited to 160 MB.
  • Each document can contain at most 500 pages.
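
A pre-check along these lines could reject oversized files before they reach the digitization step. This is a minimal Python sketch, not part of the product: the function name `check_digitization_limits` is hypothetical, and reading 160 MB as 160 × 1024 × 1024 bytes is an assumption.

```python
from typing import List

# Assumption: 1 MB is taken as 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 160 * 1024 * 1024
MAX_PAGES = 500


def check_digitization_limits(size_bytes: int, page_count: int) -> List[str]:
    """Return a list of limit violations; an empty list means the file fits."""
    problems = []
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes, above the 160 MB limit")
    if page_count > MAX_PAGES:
        problems.append(f"document has {page_count} pages, above the 500-page limit")
    return problems
```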

OCR is also always applied if the Digitize Document activity is configured with the ForceApplyOCR flag set to True. This option is usually recommended for use cases in which a significant percentage of files appear to contain native content, but the natively read content does not match what a user sees in those files.
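
The rules in this section can be summarized as a per-page decision. The Python sketch below is an illustration only, not how the activity is implemented: `PageInfo`, `page_needs_ocr`, and the 0.5 coverage threshold are assumptions, since the documentation only says "a large part of the page".

```python
from dataclasses import dataclass

IMAGE_EXTENSIONS = {".png", ".jpe", ".jpg", ".jpeg", ".tiff", ".tif", ".bmp"}

# Assumption: the real threshold for "a large part of the page" is not documented.
LARGE_IMAGE_THRESHOLD = 0.5


@dataclass
class PageInfo:
    has_readable_text: bool  # does the page expose machine-readable content?
    image_coverage: float    # fraction of the page covered by an image, 0.0 to 1.0


def page_needs_ocr(extension: str, page: PageInfo, force_apply_ocr: bool = False) -> bool:
    """Decide whether OCR would run on a page under the rules described above."""
    if force_apply_ocr:
        return True  # ForceApplyOCR set to True: OCR always runs
    ext = extension.lower()
    if ext in IMAGE_EXTENSIONS:
        return True  # image files, including each page of a multi-page TIFF
    if ext == ".pdf":
        return (not page.has_readable_text) or page.image_coverage >= LARGE_IMAGE_THRESHOLD
    return False
```

For example, a native PDF page with readable text and no large image returns False unless `force_apply_ocr` is set, which mirrors the point that digitization does not always involve OCR.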

How to choose an OCR engine

Because each use case has its own particularities, it is strongly recommended that you test all available OCR engines with different settings to determine which one works best for your project. Also pay particular attention to the OCR engine arguments, such as Profile, Scale, and Language (these may vary from one engine to another), so that you can identify the best settings for each use case.
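
One way to structure such a comparison is to run every engine and setting combination against a sample whose correct text is known and keep the best-scoring one. The Python sketch below shows the idea only: `fake_ocr`, the engine names, and the canned outputs are placeholders standing in for real OCR calls, and the similarity score from the standard library's `difflib` is just one possible metric.

```python
from difflib import SequenceMatcher
from itertools import product
from typing import Callable, Iterable, Tuple


def similarity(expected: str, actual: str) -> float:
    """Score OCR output against known-correct text (1.0 means identical)."""
    return SequenceMatcher(None, expected, actual).ratio()


def best_configuration(
    run_ocr: Callable[[str, int], str],
    engines: Iterable[str],
    scales: Iterable[int],
    expected: str,
) -> Tuple[float, str, int]:
    """Try every (engine, scale) pair and return the highest-scoring one."""
    scored = (
        (similarity(expected, run_ocr(engine, scale)), engine, scale)
        for engine, scale in product(engines, scales)
    )
    return max(scored)


def fake_ocr(engine: str, scale: int) -> str:
    """Placeholder for a real OCR call; returns canned, made-up outputs."""
    canned = {
        ("EngineA", 1): "Inv0ice 42",
        ("EngineA", 2): "Invoice 42",
        ("EngineB", 1): "lnvoice 4Z",
        ("EngineB", 2): "Invoice 4Z",
    }
    return canned[(engine, scale)]


score, engine, scale = best_configuration(
    fake_ocr, ["EngineA", "EngineB"], [1, 2], expected="Invoice 42"
)
```

In practice the comparison would cover a representative set of documents rather than a single sample, and would vary the other engine arguments as well.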
