IXP - Consuming models via Document Understanding API

Unstructured and complex documents user guide

Consuming models via Document Understanding API

Access IXP Unstructured and Complex Documents projects through the Document Understanding framework API using tag-based or extractorId-based extraction endpoints.

IXP Unstructured and Complex Documents projects are accessible through the same Document Understanding framework API that serves Classic and Modern projects. IXP projects appear as ProjectType: "IXP" in Discovery and support both tag-based and extractorId-based endpoints for extraction.

Prerequisites

Before you call any Document Understanding or IXP API, you need an external application registered in Automation Cloud. This provides the AppID and AppSecret used for OAuth authentication.

Creating an external application

1. Navigate to Orchestrator at tenant level.
2. Select Manage Access, then Manage Accounts and Groups.
3. From the UiPath Administration header, select External Applications.
4. Select Add Application.
5. Fill in Application Name, for example, DU API Client.
6. Select Confidential application, which is required to get an app secret.
7. Under Resources, select Add Scopes:
   a. Select Document Understanding from the Resource dropdown.
   b. Switch to the Application Scope(s) tab.
   c. Check the scopes you need:
      - Du.Digitization.Api: digitize documents
      - Du.Classification.Api: classify documents
      - Du.Extraction.Api: extract data
      - Du.Validation.Api: create validation tasks
      - Du.DataDeletion.Api: delete document data
   d. Select Save.
8. Select Add to create the registration.

Note:

Copy the App Secret immediately. The pop-up that shows it is displayed only once, and the secret cannot be recovered afterwards. You can generate a new one later from the edit screen.

The Application ID is visible on the External Applications page at any time.

Getting an Access Token

Use the App ID and App Secret to request an OAuth token through the client credentials flow:

curl -X POST 'https://cloud.uipath.com/identity_/connect/token' \
  -d 'grant_type=client_credentials' \
  -d 'client_id=<APP_ID>' \
  -d 'client_secret=<APP_SECRET>' \
  -d 'scope=Du.Digitization.Api Du.Extraction.Api'

Response:

{
  "access_token": "eyJh...CRaKrg",
  "expires_in": 3600,
  "token_type": "Bearer",
  "scope": "Du.Digitization.Api Du.Extraction.Api"
}

The token expires after 1 hour. Use it as Authorization: Bearer <token> on all subsequent API calls.
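The same request can be prepared from Python. The following is a minimal sketch using only the standard library; it builds the request without sending it. The helper names and the 60-second safety margin are illustrative assumptions, not part of the API.

```python
import urllib.parse
import urllib.request

TOKEN_URL = "https://cloud.uipath.com/identity_/connect/token"


def build_token_request(app_id: str, app_secret: str, scopes: list) -> urllib.request.Request:
    """Build (but do not send) the client credentials token request."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": app_id,
        "client_secret": app_secret,
        "scope": " ".join(scopes),  # scopes are space-separated, as in the curl call
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def refresh_deadline(token_response: dict, issued_at: float, safety_margin: float = 60.0) -> float:
    """Timestamp after which a new token should be requested (margin is an assumption)."""
    return issued_at + token_response["expires_in"] - safety_margin


def auth_header(token_response: dict) -> dict:
    """Header to attach to subsequent API calls."""
    return {"Authorization": f"Bearer {token_response['access_token']}"}
```

To send the request, pass it to urllib.request.urlopen and parse the JSON body; requesting a new token shortly before expires_in elapses avoids failed calls with an expired token.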

Note:

If you lose the App Secret, go to Admin, then External Applications, edit the app, and select Generate New under App Secret. Update all integrations with the new secret.

Key differences

The following table shows the key differences between Document Understanding and IXP projects:

	Document Understanding (Classic or Modern)	IXP
ProjectType	`Classic` or `Modern`	`IXP`
Classification	Yes	No (extraction only)
Extraction routing	By `tag` + `documentTypeId` (recommended) or `extractorId`	By `tag` + `documentTypeId` or by `extractorId` (`gpt_ixp_[version]`)
Versioning	Extractors/classifiers	Tags (Staging, Production)
Extraction model	Specialized or Generative	Generative only (GPT-4o, Gemini)
Schema definition	In-project or via prompts	Defined in IXP UI (taxonomy)

The IXP workflow

Discover project and tags.
Digitize and extract (in parallel).
Validate (optional).

Note:

There is no classification step, as IXP only handles extractions.

Parallel digitization and extraction (IXP only)

For IXP projects, you do not need to poll for the digitization result: you can start extraction immediately after you submit digitization. The backend runs digitization and IXP extraction concurrently and returns the final extraction result only after both complete.

This is an IXP-specific optimization that does not work with Document Understanding Classic or Modern projects, where you must wait for digitization to finish before calling extraction.

The optimized flow:

# 1. Start digitization (fire and forget — do not poll for result).
POST /projects/{projectId}/digitization/start
# → returns { "documentId": "..." }
# 2. Immediately start extraction with the documentId (no need to wait).
POST /projects/{projectId}/{tag}/document-types/{documentTypeId}/extraction/start
# → returns { "operationId": "..." }
# 3. Poll extraction result only — it waits for both digitization and extraction.
GET /projects/{projectId}/{tag}/document-types/{documentTypeId}/extraction/result/{operationId}

This flow eliminates the idle time between digitization and extraction, reducing total latency.
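The ordering of the calls in this flow can be sketched in Python as follows. The HTTP layer is injected as two callables (upload_file and post_json, both hypothetical names) so that the sketch stays independent of any HTTP client; the paths are relative to the framework API base URL.

```python
from typing import Callable, Tuple


def start_parallel_extraction(
    upload_file: Callable[[str], dict],
    post_json: Callable[[str, dict], dict],
    project_id: str,
    tag: str,
    document_type_id: str,
) -> Tuple[str, str, str]:
    """Start digitization and extraction back to back.

    Returns (documentId, operationId, result path to poll).
    """
    # 1. Start digitization and keep only the documentId (no polling here).
    document_id = upload_file(f"/projects/{project_id}/digitization/start")["documentId"]

    # 2. Start extraction right away with that documentId.
    base = f"/projects/{project_id}/{tag}/document-types/{document_type_id}/extraction"
    operation_id = post_json(f"{base}/start", {"documentId": document_id})["operationId"]

    # 3. The caller polls only this path for the final result.
    return document_id, operation_id, f"{base}/result/{operation_id}"
```

Injecting the transport keeps the example testable with stub functions; in a real client the two callables would wrap authenticated HTTP requests.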

Step 1: Discover the IXP project

# List all projects — filter for type "IXP"
curl -X GET \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>'

From the response, note the IXP project's id.
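Filtering the project list for IXP projects could look like the sketch below. It assumes the list of project objects has already been taken out of the response body (the envelope around the list is not shown on this page) and that each object carries the id, name, and type fields shown in the discovery response structure.

```python
def ixp_project_ids(projects: list) -> dict:
    """Map project name to project id for every project whose type is "IXP"."""
    return {p["name"]: p["id"] for p in projects if p.get("type") == "IXP"}
```

For example, calling it on a list that mixes Modern and IXP projects returns only the IXP entries, keyed by name.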

Get tags (published versions)

Tags correspond to published model versions marked as Staging or Production in the IXP user interface. Each tag includes its associated extractors and document types. To get tags, run the following:

curl -X GET \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/tags?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>'

Get document types

To get document types, run the following:

curl -X GET \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/document-types?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>'

Step 2: Digitize the document

Similar to Document Understanding, upload the file to get a documentId:

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/digitization/start?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: multipart/form-data' \
  -F 'File=@<FilePath>;type=application/pdf'

Returns { "documentId": "..." }.

Step 3: Extract

IXP extraction supports the following routing approaches:

Tag-based - Route by tag and documentTypeId. This is recommended for Production or Staging workflows.
ExtractorId-based - Route by extractorId using the format gpt_ixp_[version] (for example, gpt_ixp_67), the same way as for Document Understanding Classic or Modern projects.

Tag-based extraction

Uses the tag-based path with the documentTypeId from Discovery.

Synchronous (up to 5 pages)

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/<Tag>/document-types/<DocumentTypeId>/extraction?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{ "documentId": "<documentId>" }'

Asynchronous (multi-page)

Start:

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/<Tag>/document-types/<DocumentTypeId>/extraction/start?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{ "documentId": "<documentId>" }'

Returns { "operationId": "..." }. Poll for result:

curl -X GET \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/<Tag>/document-types/<DocumentTypeId>/extraction/result/<operationId>?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>'

Poll until status is Succeeded or Failed.
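A polling loop for the result endpoint might be sketched as follows. The fetch_result callable stands in for the authenticated GET request; the polling interval, the attempt limit, and the assumption that status is a top-level field of the response are illustrative choices, not documented values.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"Succeeded", "Failed"}


def poll_extraction(
    fetch_result: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_result until the status is Succeeded or Failed."""
    for attempt in range(max_attempts):
        response = fetch_result()
        if response.get("status") in TERMINAL_STATUSES:
            return response
        # Any other status is treated as "still running"; wait before retrying.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")
```

The caller still has to check whether the returned status is Succeeded or Failed; the loop only guarantees that a terminal state was reached or that a TimeoutError was raised.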

ExtractorId-based extraction

Uses the same extractor-based endpoints as Document Understanding Classic or Modern. The ExtractorId for IXP follows the format gpt_ixp_[version], which is visible in the discovery response.

Synchronous (up to 5 pages):

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/extractors/<ExtractorId>/extraction?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{ "documentId": "<documentId>" }'

Asynchronous (multi-page):

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/extractors/<ExtractorId>/extraction/start?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{ "documentId": "<documentId>" }'
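The tag-based and extractorId-based routes differ only in one path segment, and the synchronous and asynchronous variants differ only in the trailing /start. A small helper can make that explicit. This is a sketch: the function name and keyword arguments are hypothetical, and the URL layout is copied from the curl examples in this section.

```python
from typing import Optional


def extraction_url(
    org: str,
    tenant: str,
    project_id: str,
    *,
    tag: Optional[str] = None,
    document_type_id: Optional[str] = None,
    extractor_id: Optional[str] = None,
    asynchronous: bool = False,
    api_version: str = "1",
) -> str:
    """Build the extraction URL for either routing approach."""
    by_tag = tag is not None and document_type_id is not None
    by_extractor = extractor_id is not None
    if by_tag == by_extractor:
        raise ValueError("pass either extractor_id, or tag plus document_type_id")

    root = f"https://cloud.uipath.com/{org}/{tenant}/du_/api/framework/projects/{project_id}"
    route = f"extractors/{extractor_id}" if by_extractor else f"{tag}/document-types/{document_type_id}"
    action = "extraction/start" if asynchronous else "extraction"
    return f"{root}/{route}/{action}?api-version={api_version}"
```

Rejecting calls that supply both or neither routing option keeps an ambiguous request from silently hitting the wrong endpoint.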

Step 4: Validate (optional)

curl -X POST \
  'https://cloud.uipath.com/<Org>/<Tenant>/du_/api/framework/projects/<ProjectID>/<Tag>/document-types/<DocumentTypeId>/validation/start?api-version=1' \
  -H 'Authorization: Bearer <TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{
    "documentId": "<documentId>",
    "actionTitle": "Review IXP extraction",
    "actionPriority": "Medium",
    "actionCatalog": "default_du_actions",
    "actionFolder": "Shared",
    "storageBucketName": "du_storage_bucket",
    "storageBucketDirectoryPath": "du_storage_bucket",
    "extractionResult": { }
  }'

IXP extraction response structure

API v1 or v1.1

In v1 and v1.1, IXP field groups map to FieldType: "Table" in the response, with individual fields as table columns. All values are represented as text (string), regardless of their original IXP data type:

{
  "extractionResult": {
    "DocumentId": "...",
    "ResultsDocument": {
      "DocumentTypeId": "00000000-0000-0000-0000-000000000000",
      "DocumentTypeName": "Default",
      "Fields": [
        {
          "FieldId": "Fleet member transaction details",
          "FieldName": "Fleet member transaction details",
          "FieldType": "Table",
          "Values": []
        }
      ],
      "Tables": [
        {
          "FieldId": "Fleet member transaction details",
          "FieldName": "Fleet member transaction details",
          "Values": [
            {
              "Cells": [
                { "FieldId": "Fleet Code", "Value": "FL-7892", "Confidence": 0.95 },
                { "FieldId": "Fuel type", "Value": "Diesel", "Confidence": 0.97 }
              ]
            }
          ]
        }
      ]
    }
  }
}

Key structural differences from Document Understanding (v1 or v1.1):

All fields belong to field groups, which appear as Table type in the response.
Even single-value fields are wrapped in a table row structure.
The Tables array contains the actual cell values.
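Because the values live in the Tables array, client code usually has to flatten it before use. The sketch below turns one table into a list of row dictionaries; the function name is hypothetical, and the key names follow the sample response above.

```python
def table_rows(extraction_response: dict, table_id: str) -> list:
    """Return one {column FieldId: Value} dict per row of the named table."""
    document = extraction_response["extractionResult"]["ResultsDocument"]
    for table in document.get("Tables", []):
        if table["FieldId"] == table_id:
            return [
                {cell["FieldId"]: cell["Value"] for cell in row["Cells"]}
                for row in table["Values"]
            ]
    # No table with that FieldId in the response.
    return []
```

Applied to the sample response, table_rows(response, "Fleet member transaction details") yields a single row with the keys Fleet Code and Fuel type. All values remain strings, as noted for v1 and v1.1.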

API v2

In v2, IXP field groups map to FieldType: "FieldGroup" instead of Table. This is an exact mapping of the IXP field group concept. Each field preserves its actual IXP data type, such as Text, Number, Date, MonetaryQuantity, rather than representing everything as strings.

For more details, see Migrating from API v1 to v2.

{
  "extractionResult": {
    "ResultsDocument": {
      "Fields": [
        {
          "FieldId": "Default.Seller",
          "FieldName": "Seller",
          "FieldType": "FieldGroup",
          "IsMissing": false,
          "DataSource": "Automatic",
          "Values": [
            {
              "Components": [
                {
                  "FieldId": "Default.Seller.Name",
                  "FieldName": "Name",
                  "FieldType": "Text",
                  "Values": [
                    {
                      "Value": "John Doe",
                      "Confidence": 0.9999834
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    }
  }
}

Key differences from v1:

FieldType: "FieldGroup" replaces FieldType: "Table".
The Tables array is removed. Field groups are returned directly in Fields.
Individual fields preserve their IXP data types instead of all being strings.
FieldIds use dot-notation (for example, Default.Seller.Name).
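Reading a v2 field group means walking Fields, then Values, then Components. The following sketch collects the first value of each component per occurrence of a group; the function name is hypothetical, and recording None for a component without values is an assumption about how missing fields appear.

```python
def field_group_rows(extraction_response: dict, group_name: str) -> list:
    """Return one {FieldName: first Value} dict per occurrence of the named field group."""
    fields = extraction_response["extractionResult"]["ResultsDocument"]["Fields"]
    rows = []
    for field in fields:
        if field.get("FieldType") != "FieldGroup" or field.get("FieldName") != group_name:
            continue
        for occurrence in field.get("Values", []):
            row = {}
            for component in occurrence.get("Components", []):
                values = component.get("Values", [])
                # Assumption: a component with no values is reported as None.
                row[component["FieldName"]] = values[0]["Value"] if values else None
            rows.append(row)
    return rows
```

Applied to the sample response, field_group_rows(response, "Seller") returns one row whose Name entry is "John Doe".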

IXP discovery response structure

IXP projects expose versioning through tags and projectVersions:

{
  "id": "044fedbc-40a6-8078-8f06-02a0d362ab44",
  "name": "Transcom Invoices - Andras",
  "type": "IXP",
  "properties": ["SupportsTags", "SupportsVersions"],
  "extractors": [
    {
      "id": "gpt_ixp_67",
      "documentTypeId": "00000000-0000-0000-0000-000000000000",
      "projectVersion": 67
    }
  ],
  "projectVersions": [
    { "version": 67, "tag": "live", "deployed": true }
  ],
  "classifiers": []
}

The tag name (for example, live) maps to the Production or Staging label in the IXP user interface.
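To go from a tag to the matching extractor, the two arrays in the discovery response can be joined on the version number. The sketch below assumes that an extractor's projectVersion corresponds to the version value in projectVersions, which the sample suggests (both are 67) but this page does not state explicitly; the function name is hypothetical.

```python
from typing import Optional


def extractor_for_tag(project: dict, tag: str) -> Optional[dict]:
    """Return the extractor entry whose project version carries the given deployed tag."""
    versions = {
        v["version"]
        for v in project.get("projectVersions", [])
        if v.get("tag") == tag and v.get("deployed")
    }
    for extractor in project.get("extractors", []):
        if extractor.get("projectVersion") in versions:
            return extractor
    return None
```

The returned entry provides both the extractor id for extractorId-based routing and the documentTypeId for tag-based routing.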

Consider the following when calling the IXP extraction endpoints:

No prompts needed: Unlike the Document Understanding generative extractor or classifier, IXP extraction schema is pre-defined in the IXP project taxonomy. You do not pass prompts in the API call.
Tag = model version: Use the tag that corresponds to the Production or Staging version you want to call.
DocumentTypeId: IXP projects typically use a single default document type (00000000-0000-0000-0000-000000000000).
Page limits: GPT-4o up to 50 pages, Gemini up to 500 pages per call.
Metering: IXP extraction is charged as follows depending on the pricing plan you have:
- Flex Plan: 1 AI Unit per page, or 0.8 AI Units per page when the page is already classified upstream, for example, in a Document Understanding modern project.
- Unified Pricing: 0.2 Platform Units per page. Failed requests do not consume units.
Data retention: Digitization 7 days, extraction 24 hours.
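The page limits and per-page rates above lend themselves to a quick pre-flight check. The sketch below encodes the listed figures; the plan labels "flex" and "unified", the model keys, and the function names are hypothetical identifiers chosen for the example.

```python
from decimal import Decimal

# Page limits per call, as listed above.
PAGE_LIMITS = {"GPT-4o": 50, "Gemini": 500}


def fits_in_one_call(pages: int, model: str) -> bool:
    """True if a document of this length is within the model's per-call page limit."""
    return pages <= PAGE_LIMITS[model]


def estimate_units(pages: int, plan: str, classified_upstream: bool = False) -> Decimal:
    """Estimate AI Units (Flex Plan) or Platform Units (Unified Pricing) for one document."""
    if pages < 0:
        raise ValueError("pages must not be negative")
    if plan == "flex":
        rate = Decimal("0.8") if classified_upstream else Decimal("1")
    elif plan == "unified":
        rate = Decimal("0.2")
    else:
        raise ValueError(f"unknown plan: {plan}")
    return rate * pages
```

Decimal is used instead of float so that fractional rates such as 0.8 and 0.2 multiply without rounding noise.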

Note:

Document Understanding and IXP licenses can be used together. For more details, refer to Metering and charging logic (Flex Plan) and IXP Flex Pricing Plan.
