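Document Understanding API Guide
Last updated: December 18, 2024
Depending on your use case, you can make two types of calls to the Document Understanding™ API: synchronous (sync) and asynchronous (async).
This page lists common issues with the Document Understanding™ API.
For general failures, check the following list of error codes and their corresponding messages: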
OperationIdNotFoundError - Operation Id not found.
DatabaseConcurrencyError - Unable to update job status for the given input.
RequestAbortedByClient - Request aborted by client
TooManyRequests - Rate limit exceeded. Try again later. For details around the applicable limitations, check out the official documentation.
DeploymentUnavailableError - The project version this resource is part of is not available. Please check the project deployment version status and try again.
DeploymentTagNotFoundError - There is no project version tag that matches your request.
DeploymentDocumentTypeNotFoundError - The requested project version deployment does not contain an extraction model for the requested document type.
UnexpectedInternalServerError - Internal Server Error. Please contact the UiPath support team.
ServiceUnavailableError - Service Unavailable. Please retry in a few moments.
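Two of the messages above (TooManyRequests and ServiceUnavailableError) ask the caller to try again later. A minimal sketch of how a client might retry on those codes follows. The `ApiError` class, the `call_with_retry` helper, and the choice of which codes count as retryable are assumptions made for illustration; they are not part of the Document Understanding API, and the source does not define an official retry policy.

```python
import random
import time

# Assumption: these codes are treated as transient because their messages
# suggest retrying. Adjust the set to match your own policy.
RETRYABLE_CODES = {"TooManyRequests", "ServiceUnavailableError"}


class ApiError(Exception):
    """Hypothetical exception carrying an error code from the lists above."""

    def __init__(self, code, message=""):
        super().__init__(f"{code} - {message}")
        self.code = code


def call_with_retry(operation, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run operation(), retrying with exponential backoff on transient codes."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ApiError as err:
            # Give up on non-transient codes or when attempts are exhausted.
            if err.code not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # Exponential backoff with a little jitter to avoid bursts.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Codes such as OperationIdNotFoundError are re-raised immediately, since repeating the same request is unlikely to change the outcome.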
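Document Understanding project error codes
In some cases, failures related to Document Understanding projects can result in an error message. The error code and corresponding message can have one of the following values: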
DiscoveryResourceNotFoundError - Resource not found.
DuCenterProjectNotFound - Du Center project not found.
DocumentIdNotFound - Cannot perform the operation for the given documentId: Ensure it is correct, the digitization is successful (retrieving the digitization result), and not more than 7 days since the digitization call passed (case in which, it expired).
DocumentIdInvalid - Required input DocumentId is missing or invalid.
DocumentTypeIdNotFound - Document Type Id not found in the given project.
ProjectVersionNotSupportedError - Project Version is not supported for classic projects.
ModernProjectExtractorRequestTooLargeError - Maximum number of pages per document exceeded for the given custom trained extractor.
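Digitization failed due to client error
In some cases, a digitization failure caused by a client error can result in an error message. This is a 400 error, displayed as follows: Code: [DigitizationErrorCode], Message: "DigitizationErrorMessage". The error code and corresponding message can have one of the following values: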
UnsupportedContentTypeError - Content type of the input document is not supported.
UnexpectedPdfStructureError - Invalid or corrupt PDF structure.
InvalidImageSizeError - Image size of the input document is not supported.
UnableToProcessContentError - Unable to process document contents.
ContentTypeMismatchError - Declared content-type of the input document does not match the binary content type.
PasswordProtectedPdfError - Password protected PDFs are not supported.
MaximumNumberOfPagesPerDocumentLimitExceededError - Maximum number of pages for digitization exceeded.
InvalidRequestData - The form data in the request is invalid. Expected is a multi-part form data, with either one part consisting of the document to be digitized, or two named parts: File - the document, DigitizationResult - the digitization result, with content type application/json.
UnexpectedDigitizationResultStructure - The digitization result object is invalid and non-serializable.
InvalidDom - The provided DOM is invalid. Make sure the DOM is correctly built, including valid non-overlapping indices, well formed boxes and polygons and valid values for all properties.
MismatchingDomAndContent - The provided DOM and content do not match. Make sure the DOM was generated on the provided document.
MismatchingDomAndText - The provided DOM and text do not match. Make sure the text and DOM were generated on the same document.
PreprocessingOptionIncompatibleWithDigitizationResult - Using the preprocessing option while also providing a digitization result input is not supported.
InvalidOcrApiKeyError - OCR Api key is invalid.
OcrTooManyRequestsError - OCR request quota exceeded.
ExternalOcrTooManyRequestsError - OCR request quota exceeded.
GoogleBillingNotEnabled - Google OCR billing is not enabled. Please enable billing in your Google Cloud Platform account.
GoogleApiKeyExpired - Google OCR Api Key Expired.
InvalidOcrUrlError - The provided OCR URL is invalid or malformed.
InvalidResponseFromOcrEngineError - Invalid response received from the OCR engine. Please set another OCR engine for the project you are using.
DigitizationFileRequired - Required input file(s) are missing or not packaged as a valid multipart/form-data.
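When logging or branching on these failures, it can help to split the displayed string into its code and message. The following is a minimal sketch that parses the `Code: [...], Message: "..."` form shown on this page. The `parse_error` function is a hypothetical helper, and the pattern assumes the error reaches you as that literal string; the actual response body may be structured differently.

```python
import re

# Matches the displayed form: Code: [SomeCode], Message: "Some message"
# The quotes around the message are optional, since this page shows both forms.
_ERROR_PATTERN = re.compile(
    r'Code:\s*\[(?P<code>\w+)\],\s*Message:\s*"?(?P<message>.*?)"?\s*$',
    re.DOTALL,
)


def parse_error(text):
    """Return (code, message) from a displayed error string, or None."""
    match = _ERROR_PATTERN.search(text.strip())
    if match is None:
        return None
    return match.group("code"), match.group("message")
```

For example, `parse_error('Code: [PasswordProtectedPdfError], Message: "Password protected PDFs are not supported."')` yields the code `PasswordProtectedPdfError` together with its message, which can then be looked up in the list above.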
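Digitization failed due to exceeding the page limit
If a synchronous call fails because the document has more than five pages, an error message is generated. This is a 400 error, displayed as follows: Code: [SyncMaximumNumberOfPagesExceeded], Message: Maximum Number Of Pages Exceeded.
If you encounter this error, use the asynchronous APIs. We always recommend the asynchronous APIs for production use cases. The synchronous APIs are recommended only in the following cases:
- When you are sure that the page count will never exceed five.
- When you only have single-page images, with no PDFs or TIFFs.
- When preparing a proof of concept (POC) or a demo.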
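The guidance above can be expressed as a small decision helper. This is only a sketch: `choose_call_mode` and its `production` flag are hypothetical names, and the only value taken from this page is the five-page limit for synchronous calls.

```python
SYNC_MAX_PAGES = 5  # page limit for synchronous calls, as stated on this page


def choose_call_mode(page_count, production=True):
    """Return "sync" or "async" following the recommendations above."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    # Async is recommended for all production use, and it is required
    # whenever the document exceeds the synchronous page limit.
    if production or page_count > SYNC_MAX_PAGES:
        return "async"
    return "sync"
```

A demo that processes single-page images would call `choose_call_mode(1, production=False)` and get `"sync"`, while any document above five pages is routed to the asynchronous APIs regardless of the flag.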
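Digitization failed due to server error
If digitization fails due to a server error, the operation can generate an error message. This is a 500 error, displayed as follows: Code: [DigitizationFailedServerError], Message: Internal Server Error.
If you encounter this error, contact the UiPath™ support team.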
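Digitization job or document ID not found
If the digitization job or document ID is not found, an error message is generated:
- If you use the synchronous APIs for classification or extraction, a 404, Code:[DocumentIdNotFound] error is displayed.
- If you use the asynchronous APIs for classification or extraction, retrieving the result returns a 200, Code:[DocumentIdNotFound] error.
In both cases, the following error message is generated:
Cannot perform the operation for the given documentId: Ensure it is correct, the digitization is successful (retrieving the digitization result), and not more than 7 days since the digitization call passed (case in which, it expired).
Resolution steps
- Call /digitization/result/{documentId} to check the result of the digitization.
- Retry the digitization process.
- After the retry generates a new document ID, use it to classify and extract data.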
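The detection rule and the resolution steps above can be sketched as follows. Because this page does not describe request or response shapes, the sketch takes the API calls as injected functions: `get_digitization_result` and `start_digitization` are hypothetical wrappers you would write around your own HTTP client, not functions provided by UiPath.

```python
def is_document_id_not_found(status_code, error_code):
    """True for both the sync (404) and async (200) variants of the error."""
    return status_code in (404, 200) and error_code == "DocumentIdNotFound"


def recover_document_id(document_id, get_digitization_result, start_digitization):
    """Follow the resolution steps and return a usable document ID.

    get_digitization_result(document_id): hypothetical wrapper around
        /digitization/result/{documentId}; assumed to return None when the
        result is missing, failed, or expired.
    start_digitization(): hypothetical wrapper that digitizes the document
        again and returns the new document ID.
    """
    # Step 1: check the result of the original digitization.
    if get_digitization_result(document_id) is not None:
        return document_id
    # Steps 2 and 3: digitize again and use the new ID from now on.
    return start_digitization()
```

Checking the error code as well as the status code matters for the asynchronous case, where the HTTP status is 200 even though the operation failed.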
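Classification error codes
In some cases, failures related to classification can result in an error message. The error is displayed as follows: Code: [ClassifierErrorCode], Message: "ClassifierErrorMessage". The error code and corresponding message can have one of the following values: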
ClassifierRequestTooLargeError - Maximum number of pages per document exceeded for the given classifier. Please use a custom trained model to process the input document.
ClassifierSyncMaximumNumberOfPagesExceeded - Maximum number of pages (5) for synchronous classification exceeded for this document. Please use the asynchronous APIs.
InvalidClassifierIdInput - The provided classifier id from the URL is different from the one in the classification result.
ClassifierIdNotFound - Classifier Id not found in the given project.
DeploymentClassifierNotFoundError - The requested project version deployment does not contain a classifier for the requested tag.
PrivateSkillClassifierUnavailable - The private skill classifier is not available at the moment. Check classifier status and try again.
Extraction error codes
In some cases, failures related to extraction can result in an error message. The error is displayed as follows: Code: [ExtractorErrorCode], Message: "ExtractorErrorMessage". The error code and corresponding message can have one of the following values:
ExtractorRequestTooLargeError - Maximum number of pages per document exceeded for the given extractor. Please use a custom trained model to process the input document.
ExtractorSyncMaximumNumberOfPagesExceeded - Maximum number of pages (5) for synchronous extraction exceeded for this document. Please use the asynchronous APIs.
ExtractionAutoValidationConfidenceInvalidError - The parameter value for autovalidation Confidence is invalid. Please provide a confidence value in the 0-100 range.
ExtractionAutoValidationConfidenceInvalidGenerativeError - The autovalidation extraction confidence option is invalid for the generative extractor. Please select a different extractor and try again.
ExtractorIdNotFound - Extractor Id not found in the given project.
PrivateSkillExtractorUnavailable - The private skill extractor is not available at the moment. Check extractor status and try again.
PrivateSkillExtractorGetInfoModelCallFailedError - The private skill extractor is not available at the moment. Check extractor status and try again.
PrivateMLSkillUnavailable - The private ML Skill is not available at the moment. Check the skill configuration. Possible configuration issues might include: insufficient replica count, insufficient memory or cpu, using cpu when a gpu is more appropriate.
PublicSkillUnavailable - The public skill is not available at the moment. Check the status and try again.
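Orchestrator error codes
In some cases, failures related to Orchestrator can result in an error message. The error code and corresponding message can have one of the following values: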
OrchestratorDisabledError - The orchestrator service is not enabled for this tenant.
OrchestratorTaskCatalogNotFound - Orchestrator catalog does not exist.
OrchestratorFolderNotFoundError - Orchestrator folder does not exist.
OrchestratorBucketNotFoundError - Orchestrator bucket does not exist.
OrchestratorAssetNotFoundError - Orchestrator asset does not exist.
OrchestratorAlreadyExistsError - Orchestrator resource already exists.
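Validation error codes
In some cases, failures related to validation can result in an error message. The error is displayed as follows: Code: [ValidationErrorCode], Message: "ValidationErrorMessage". The error code and corresponding message can have one of the following values:
ValidationTaskNotFoundError - Validation task not found in Action Center.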
ValidationInvalidActionTitle - Required input ActionTitle is missing or invalid.
ValidationMaxLengthActionCatalog - Validation action catalog maximum length exceeded: 50 characters.
ValidationMaxLengthActionFolder - Validation action folder maximum length exceeded: 200 characters.
ValidationActionCatalogNotFoundError - Action Catalog not found.
InvalidOrMalformedClassificationResultInput - The provided classification result is invalid or malformed.
InvalidFieldsValidationConfidenceInput - The value set for Extracted Fields Validation Confidence is not valid. It should be an integer value between 0 and 100.
InvalidOrMalformedExtractionResultInput - The provided extraction result is invalid or malformed.
InvalidExtractorIdMismatchInput - The provided extractor id from the URL is different from the one in the extraction result.
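Several of the validation codes above describe limits that can be checked on the client before a request is sent. The following sketch mirrors the limits stated in the messages (50 characters for the action catalog, 200 for the action folder, an integer confidence between 0 and 100). The function and parameter names are hypothetical, and the server may apply additional rules that this page does not list.

```python
def check_validation_inputs(action_title, action_catalog, action_folder, confidence):
    """Return the error codes a request would likely trigger (empty if none)."""
    codes = []
    # Assumption: an empty or whitespace-only title counts as missing.
    if not isinstance(action_title, str) or not action_title.strip():
        codes.append("ValidationInvalidActionTitle")
    if len(action_catalog) > 50:
        codes.append("ValidationMaxLengthActionCatalog")
    if len(action_folder) > 200:
        codes.append("ValidationMaxLengthActionFolder")
    # bool is a subclass of int in Python, so reject it explicitly.
    is_int = isinstance(confidence, int) and not isinstance(confidence, bool)
    if not is_int or not 0 <= confidence <= 100:
        codes.append("InvalidFieldsValidationConfidenceInput")
    return codes
```

An empty list means none of the checked limits are violated; it does not guarantee that the request succeeds, since codes such as ValidationActionCatalogNotFoundError can only be determined by the service.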