Unstructured and complex documents user guide
You can configure the underlying LLM as well as its settings in the Model configuration option on the Build tab.
- Intelligent pre-processing:
  - None
  - Table model - mini
  - Table model
- Extraction model:
  - GPT-4o
  - Gemini
- Advanced options:
  - Attribution
  - Temperature
  - Top P
  - Seed
  - Frequency penalty
  - Prompt override
Adjust these settings to improve the accuracy of the model's predictions and enhance its performance.
Intelligent pre-processing options improve prediction performance when documents are difficult for models to interpret due to complex formatting.
- None - This default option is suitable for most documents that do not have tabular content.
- Table model - mini - Optimized for tabular content and low latency. This option is best suited for documents with simple tables or multiple tables.
- Table model - Optimized for more complex tabular content. This option is best suited for documents with complex nested tables, tables with merged cells, bullet points, or tables spanning multiple pages.
Note:
- While this performs best on complex tables, it increases the latency of predictions.
- This feature relies on Gemini models through the AI Trust Layer.
Example of Intelligent pre-processing
Without Intelligent pre-processing, values in the this period column are confused for the ones in the year to date column. With Intelligent pre-processing enabled, both columns, this period and year to date, are extracted correctly.

The Extraction model option represents the underlying LLM used for extraction.
- GPT-4o
- Gemini
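As a purely illustrative aid, the sketch below shows how the pre-processing and extraction-model choices could be expressed as a single configuration object. The key names and values are assumptions for illustration, not a documented schema.

```python
# Illustrative only: the keys and values below are assumptions, not the
# product's actual configuration schema.
model_configuration = {
    # Intelligent pre-processing: "none" (default), "table-model-mini",
    # or "table-model" (complex nested or multi-page tables).
    "intelligent_preprocessing": "table-model",
    # Extraction model: the underlying LLM used for extraction.
    "extraction_model": "gemini",  # or "gpt-4o"
}
```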
Choosing the most suitable model
Different models perform differently across use cases, but we recommend using Gemini where possible. Several other pre- and post-processing features that help optimize performance and user experience are also Gemini-based.
GPT-4o has an input limit of 50 pages and can process longer documents only through the iterative calling feature, which is currently in preview. Gemini does not have the same restriction and can process documents in IXP of up to 200 pages in a single call; this limit may vary slightly based on the density of field values within the document.
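As a rough sketch of what these limits imply, the snippet below estimates how many extraction calls a document would need under each model. The constants reflect the limits described above; the function name and structure are hypothetical, not part of the product API.

```python
# Hypothetical helper illustrating the page limits described above.
GPT4O_PAGE_LIMIT = 50     # pages per single GPT-4o call
GEMINI_PAGE_LIMIT = 200   # approximate; varies with field-value density

def plan_extraction_calls(page_count: int, model: str) -> int:
    """Estimate how many extraction calls a document would need."""
    limits = {"gpt-4o": GPT4O_PAGE_LIMIT, "gemini": GEMINI_PAGE_LIMIT}
    limit = limits[model]
    # Beyond its limit, GPT-4o relies on the iterative calling feature
    # (currently in preview); Gemini covers up to ~200 pages in one call.
    return -(-page_count // limit)  # ceiling division

print(plan_extraction_calls(120, "gpt-4o"))  # -> 3 iterative calls
print(plan_extraction_calls(120, "gemini"))  # -> 1 call
```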
Switching from one model to another
To switch from one model to another, use the dropdown list of the Extraction model option and select Save. This will trigger a new project version to be created and new predictions to be generated automatically.
If you need to switch models for performance reasons, first check whether the alternative model can solve the core problem that the current model cannot. If it can, optimize the new model to improve the performance metrics in Measure.
Advanced options allow you to customize the settings for your models, select which attribution method to use, and use the prompt override.
Expand the setting to view all available options:
- Attribution - The method used for attributing predictions to the relevant part of text in the document. Select one of the following options (a sketch of the rules-based approach follows this list):
  - Rules-based - Uses an extensive set of rules and heuristics to match the correct spans on a page to the predicted values from the model. This is a low-latency option, but it sacrifices performance in terms of successful attributions compared to the model-based option.
  - Model-based - Uses an additional LLM call to match the predicted values to the correct spans on the page, as these values can often be repeated in different parts of the page. This is the most performant option in terms of successful attributions, but it adds some latency to predictions. This option relies on Gemini models.
- Temperature - The sampling temperature to use. Select a number between 0.0 and 2.0. Higher values make the output more random. Temperature and the next three settings map onto standard LLM sampling parameters; a request sketch follows this list.
- Top P - Samples only from tokens with the top_p probability mass. Select a number between 0.0 and 1.0.
- Seed - If specified, repeated requests with the same seed and parameters should return the same result.
- Frequency penalty - Select a number between -2.0 and 2.0. Positive values reduce the probability of the model repeating tokens that have already appeared in the text.
- Prompt override - Overrides the default system prompt with a new value. This option is disabled by default. Once enabled, the Append task instructions prompt and the Append field instructions prompt options are enabled for configuration.
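Below is a minimal sketch of the rules-based attribution idea referenced above: matching a predicted value back to a span of page text with a single normalization heuristic. It is deliberately far simpler than the product's extensive rule set, and the function names are hypothetical.

```python
import re

def normalize(text: str) -> str:
    # Collapse whitespace and lowercase so formatting differences
    # (line breaks, double spaces, casing) do not block a match.
    return re.sub(r"\s+", " ", text).strip().lower()

def attribute(value: str, page_text: str) -> tuple[int, int] | None:
    """Return the (start, end) span of `value` within the normalized page text."""
    norm_page = normalize(page_text)
    norm_value = normalize(value)
    start = norm_page.find(norm_value)
    if start == -1:
        # The model-based option would fall back to an extra LLM call here.
        return None
    return (start, start + len(norm_value))

page = "Invoice total:  1,250.00 EUR\nDue date: 2024-06-30"
print(attribute("1,250.00  eur", page))  # -> (15, 27)
```

A production implementation would also map these normalized offsets back to positions on the original page and disambiguate values that appear more than once, which is exactly where the model-based option spends its extra latency.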
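For readers familiar with LLM APIs, the remaining advanced settings map onto standard sampling parameters. The sketch below uses a generic OpenAI-style chat-completions payload; the payload shape is an assumption, not the product's internal request format.

```python
# Hedged sketch: a generic OpenAI-style request showing where each
# advanced setting lands. Not the product's internal request format.
request = {
    "model": "gpt-4o",
    "messages": [
        # Prompt override replaces the default system prompt; appended
        # task and field instructions are concatenated onto it.
        {"role": "system",
         "content": "<prompt override> <task instructions> <field instructions>"},
        {"role": "user", "content": "<document text>"},
    ],
    "temperature": 0.0,        # 0.0 to 2.0; higher values = more random output
    "top_p": 1.0,              # sample only from the top_p probability mass
    "seed": 42,                # same seed + parameters -> same result (best effort)
    "frequency_penalty": 0.0,  # -2.0 to 2.0; positive values discourage repetition
}
```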