Unstructured and complex documents user guide
Managing fields
This section describes how to create and configure field groups, fields, and field types, as well as how to add prompt instructions.
- Overall extraction instructions - provide context to the model that is relevant across the taxonomy, including details on the overall extraction task and the documents in the project.
- Instructions - provide context for the model on how to successfully extract data from the document. Iterating on instructions helps to improve predictions.
- Go to the Build tab, and select Manage taxonomy.
- Add your Overall extraction instructions.
Note: Project-level instructions can include a description of the industry or document type, or document-specific considerations, such as multiple occurrences of the document within one file.
- Under the Fields tab, select New field group and fill in the required fields:
Note: You can only add individual fields within their respective field groups after you create the field groups.
- Field group name: Give your field group a name using natural language.
You can use the greater-than sign (>) to define the field group hierarchy, if applicable to your use case. This sign establishes a relationship between the parent field group and the child field group. If no predictions are found for the parent field group, none are returned for the child. You can think of the parent field group as the initial classification.
Note: The instructions for a parent field group do not influence a child field group.
- Instruction: Give your field group a description, using natural language.
- Select Add.
You can configure additional fields and field types directly from the Validate predictions page at any point during the annotation process.
Use the greater-than sign (>) to separate the parent and child field group names.
Check out the following example of field groups and their hierarchies:
- Invoice
  - Invoice Number
- Invoice > Line Items
  - Unit Price
  - Quantity
  - Line Amount
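The parent/child rule described above can be illustrated with a minimal Python sketch. It is not a UiPath API: `FieldGroup` and `filter_predictions` are hypothetical names, and the dictionary of predictions is an assumed shape chosen only to show how a `>` separated name implies a hierarchy and how a child returns nothing when its parent has no predictions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldGroup:
    """Hypothetical in-memory model of a field group (not a UiPath API)."""
    name: str
    fields: list = field(default_factory=list)

    @property
    def path(self) -> list:
        # "Invoice > Line Items" -> ["Invoice", "Line Items"]
        return [part.strip() for part in self.name.split(">")]

    @property
    def parent(self) -> Optional[str]:
        # Everything before the last ">" is the parent; top-level groups have none.
        parts = self.path
        return " > ".join(parts[:-1]) if len(parts) > 1 else None


def filter_predictions(groups, predictions):
    """Keep a group's predictions only if its parent (if any) also has some.

    `predictions` is an assumed mapping of group name -> list of predictions,
    keyed by the normalized name (parts joined with " > ").
    """
    by_name = {" > ".join(g.path): g for g in groups}
    kept = {}
    # Visit parents before children by sorting on hierarchy depth.
    for name in sorted(by_name, key=lambda n: len(by_name[n].path)):
        parent = by_name[name].parent
        if parent is not None and not kept.get(parent):
            continue  # no parent predictions, so none are returned for the child
        if predictions.get(name):
            kept[name] = predictions[name]
    return kept


groups = [
    FieldGroup("Invoice", ["Invoice Number"]),
    FieldGroup("Invoice > Line Items", ["Unit Price", "Quantity", "Line Amount"]),
]
# Only the child has predictions here, so nothing is returned.
print(filter_predictions(groups, {"Invoice > Line Items": [{"Quantity": "2"}]}))
```

In this sketch the parent acts as the initial classification: a document that is not recognized as an Invoice never yields Line Items.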
- Expand the relevant field group by selecting the drop-down icon.
- Select New field to create individual fields.
- Fill in the required details as follows:
- Field name: Give your field a name that accurately describes the data it represents.
- Instructions: Give your field a relevant and detailed description to provide the necessary context for extraction.
- Each field must be assigned a field type, which can be one of the pre-configured or custom field types. Use the Field type drop-down menu and select one of the pre-configured options: Date, Exact Text, Inferred Text, Monetary Quantity, Number, or Boolean.
Note:
- You can reuse field types across different fields, which allows you to share the instructions.
- You can change the field type after the field is created. However, changing the field type results in the loss of any existing annotations.
- A text field type can take input values that are present in the document and extracted as-is (Exact Text), or values that are inferred from the document when they are not explicitly stated in the text (Inferred Text).
- If you want to create a custom field type, select the New field type option from the drop-down list. For more details, check Creating and configuring field types.
- If you want to create an additional field, select Create another field and fill in the required details as previously explained.
- Select Save.
To create new field types, follow these steps:
- Select New field type in the Field types tab from the Manage taxonomy page.
- Fill in the required fields:
- Name - the name of the field type.
- Instruction - common instructions on how the data is formatted and how it should be extracted, for all fields that share the field type.
Note:
- You can reuse field types across different fields, which allows you to share the instructions.
- The field type instruction is used as a formatting instruction to normalize the outputs into a specific format. For example, to extract all dates as YYYY-MM-DD.
- Use the Data type drop-down list to select one of the following values:
- String: can include any characters such as letters, numbers, and so on. It can also have input values that are explicitly present in or inferred from the document. For example, organization name, first name, address line, or phone number.
- Select one of the following for the Input value:
- Must be present in the document: the value must be extracted exactly as it appears within the document.
- Inferred from the document: the extracted value can be inferred from context and does not need to exactly match the text within the document.
- Date: date values that appear in varying, unstructured formats. Uses the UiPath pre-trained date field. For example, start dates, expiration dates.
- Number: numeric values that appear in varying, unstructured formats. Uses the UiPath pre-configured field type to structure the values in a standardized format. For example, the number of items, change in percentage, decimal values.
- Monetary Quantity: monetary values that appear in varying, unstructured formats. Uses the UiPath pre-trained monetary quantity model. For example, total premium value, fees due.
- Boolean: True or False values that are inferred from documents. For example, True can be for an existing customer and False for a non-existent customer.
- Choice: Inferred or Exact values that are mapped to a set of pre-defined values. For example:
- Languages: English, German, French.
- Document types: water bill, gas bill, energy bill.
- Product categories: investment account, savings account, current account.
- Customer types: tier 1, tier 2, tier 3.
Once you select Choice as your data type, the following options are displayed:
- Display value
- Alternate Values
- Add choice
You can input values and optionally annotate evidence. The value will be mapped to a set of given values where possible.
Important: Once the data type is configured, you cannot change it. Make sure you select the correct data type; otherwise, you must delete the field type and recreate it with the correct data type. This is because you cannot remap annotations for incompatible field types that have different data types.
- Select Save.
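Two of the behaviors described in the steps above, normalizing dates to YYYY-MM-DD and mapping a value to a Choice, can be sketched in Python. This is only an illustration of the intended input and output: in the product the model performs this work, and the function names, the list of date formats, and the alternate values below are all hypothetical.

```python
from datetime import datetime
from typing import Optional

# Hypothetical input formats; a real project would list the formats its
# documents actually use. Month names (%B) depend on the active locale.
KNOWN_DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %B %Y", "%B %d, %Y")


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next candidate format
    return None


# Display value -> alternate values. The display values come from the
# "Document types" example; the alternate values are made up for illustration.
DOCUMENT_TYPE_CHOICES = {
    "water bill": ["water invoice", "water statement"],
    "gas bill": ["gas invoice"],
    "energy bill": ["electricity bill", "power bill"],
}


def map_choice(raw: str, choices: dict) -> Optional[str]:
    """Map a raw value to its display value, or None if nothing matches."""
    needle = raw.strip().casefold()
    for display_value, alternates in choices.items():
        candidates = [display_value, *alternates]
        if needle in (candidate.casefold() for candidate in candidates):
            return display_value
    return None


print(normalize_date("05/03/2024"))
print(map_choice("Water Invoice", DOCUMENT_TYPE_CHOICES))
```

Returning `None` when nothing matches mirrors the "mapped to a set of given values where possible" wording: an unmapped value is left for review rather than forced into a choice.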
Inferred fields
I work for the underwriting operations team at an insurance company. We offer customers hundreds of set policy categories (for example, automotive, home, health, and luxury goods), each corresponding to a Type category (for example, Type A, B, or C).
Based on the content of the document, I want to be able to extract and identify the Type category of the policy that needs to be processed.
In this example, nothing in the message explicitly states that the email pertains to Type E. Instead, the instructions provide context for each insurance type to inform the predictions of the model. For example, claims related to luxury goods all belong to the Type E category.
Inferred fields cover the following kinds of values:
- Values that are not present anywhere in a document, but are implied from its context.
- Values that need to be concatenated across different areas in a document.
- Values that span across multiple paragraphs, lines, or columns.
Exact fields
To facilitate this request, I may need the existing policy number, the name, and the value claimed. These are values that I know must be explicitly stated in the document itself and extracted for a downstream process.
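The difference between exact and inferred values can be made concrete with a small Python check. This is a sketch under stated assumptions, not how the product decides: the sample email, the policy number, and the function names are hypothetical, and the check only tests whether a value appears verbatim in the text (ignoring case and whitespace differences).

```python
import re

# Hypothetical message used only for illustration.
SAMPLE_EMAIL = (
    "Hello, my policy number is PN-10293. I would like to claim 4,500 USD\n"
    "for a damaged luxury watch. Regards, Jane Doe"
)


def _normalize(text: str) -> str:
    # Collapse whitespace and ignore case so line breaks do not hide a match.
    return re.sub(r"\s+", " ", text).strip().casefold()


def is_present_in_document(value: str, document_text: str) -> bool:
    """True if the value appears verbatim in the document text."""
    needle = _normalize(value)
    return bool(needle) and needle in _normalize(document_text)


def classify_value(value: str, document_text: str) -> str:
    """Label a value as a candidate for an Exact Text or Inferred Text field."""
    if is_present_in_document(value, document_text):
        return "Exact Text"
    return "Inferred Text"


for value in ("PN-10293", "4,500 USD", "Type E"):
    print(value, "->", classify_value(value, SAMPLE_EMAIL))
```

Here the policy number and the claimed value are stated in the message, so they suit exact fields, while "Type E" never appears and can only be derived from context such as the mention of a luxury item.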