Overview

Unstructured and complex documents user guide
Last updated Aug 11, 2025
This section outlines the process of validating the performance of model versions in a project. Validating model performance is critical to ensuring the model's accuracy and reliability before it is deployed to a production environment.
- Evaluate model performance by comparing different model versions.
- Gather validation statistics.
- Refine the model until it reaches a performance level suitable for your use case:
  - Review model predictions.
  - Iterate on the extraction schema.
The dashboard from the Measure tab includes the following details:
- The performance of complete extractions for a specific field group and all fields of a field group.
- The average performance of all fields in a specific field group.
- The individual field-level performance.
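The field-group figures above can be thought of as simple aggregates of the field-level scores. The following minimal sketch illustrates the idea of averaging per-field scores into a field-group score; the field names and values are purely illustrative, not taken from a real project:

```python
# Hypothetical sketch: average per-field scores into a field-group score.
# Field names and score values below are illustrative only.
field_scores = {
    "invoice_number": 0.92,
    "invoice_date": 0.88,
    "total_amount": 0.75,
}

# Average performance of all fields in the field group.
group_score = sum(field_scores.values()) / len(field_scores)
print(f"Field-group average: {group_score:.2f}")
```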
The following list contains a description of all field performance indicators:
- Red dial - A red field performance dial indicates that not enough annotated examples have been provided.
- Amber circle - An amber performance indicator is displayed when a field’s performance is less than satisfactory.
- Red circle - A red performance indicator is displayed when a field is performing poorly.
- Recall - Of all the true extractions, the fraction that the model actually predicted.
- Precision - Of the extractions that the model predicted, the fraction that were actually correct.
- F1 score - The harmonic mean of precision and recall.
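The three metrics above can be computed directly from true-positive, false-positive, and false-negative counts. A minimal sketch, using illustrative counts rather than real project data:

```python
# Illustrative counts: correct, spurious, and missed extractions.
tp, fp, fn = 80, 10, 20

precision = tp / (tp + fp)  # of the predicted extractions, fraction correct
recall = tp / (tp + fn)     # of the true extractions, fraction predicted
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```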
Understanding field-level performance, and how changing field instructions affects it, helps you determine whether the model is production-ready.
- Annotate at least 10 documents and 10 fields to get a meaningful project and field score.
- Decide when to stop training the model based on your specific business needs and use-case objectives; you may require higher precision and recall for some fields than for others.
Note: High-precision models minimize false positives, while high-recall models reduce false negatives.
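This precision/recall trade-off is often governed by a confidence threshold: keeping only high-confidence predictions raises precision at the cost of recall. A hypothetical sketch with synthetic predictions (not a real model's output):

```python
# Hypothetical sketch: a confidence threshold trades recall for precision.
# Each prediction is (confidence, was_it_correct); data is synthetic.
preds = [
    (0.95, True), (0.90, True), (0.80, True), (0.70, False),
    (0.60, True), (0.50, False), (0.40, True),
]
TOTAL_TRUE = 5  # total true extractions present in the documents

def metrics(threshold):
    """Precision and recall when keeping predictions at/above threshold."""
    kept = [ok for conf, ok in preds if conf >= threshold]
    tp = sum(kept)  # True counts as 1
    return tp / len(kept), tp / TOTAL_TRUE

print(metrics(0.45))  # low threshold: higher recall, lower precision
print(metrics(0.75))  # high threshold: higher precision, lower recall
```

Raising the threshold from 0.45 to 0.75 here drops the false positives (reducing false positives, as a high-precision setting does) but also drops a correct low-confidence prediction (increasing false negatives).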