- Introduction
- Setting up your account
- Balance
- Clusters
- Concept drift
- Coverage
- Datasets
- General fields
- Labels (predictions, confidence levels, label hierarchy, and label sentiment)
- Models
- Streams
- Model Rating
- Projects
- Precision
- Recall
- Annotated and unannotated messages
- Extraction Fields
- Sources
- Taxonomies
- Training
- True and false positive and negative predictions
- Validation
- Messages
- Access Control and Administration
- Manage sources and datasets
- Understanding the data structure and permissions
- Creating or deleting a data source in the GUI
- Uploading a CSV file into a source
- Preparing data for .CSV upload
- Creating a dataset
- Multilingual sources and datasets
- Enabling sentiment on a dataset
- Amending dataset settings
- Deleting a message
- Deleting a dataset
- Exporting a dataset
- Using Exchange integrations
- Model training and maintenance
- Understanding labels, general fields, and metadata
- Label hierarchy and best practices
- Comparing analytics and automation use cases
- Turning your objectives into labels
- Overview of the model training process
- Generative Annotation
- Dataset status
- Model training and annotating best practice
- Training with label sentiment analysis enabled
- Training chat and calls data
- Understanding data requirements
- Train
- Introduction to Refine
- Precision and recall explained
- Precision and Recall
- How validation works
- Understanding and improving model performance
- Reasons for label low average precision
- Training using Check label and Missed label
- Training using Teach label (Refine)
- Training using Search (Refine)
- Understanding and increasing coverage
- Improving Balance and using Rebalance
- When to stop training your model
- Using general fields
- Generative extraction
- Using analytics and monitoring
- Automations and Communications Mining™
- Developer
- Exchange Integration with Azure service user
- Exchange Integration with Azure Application Authentication
- Exchange Integration with Azure Application Authentication and Graph
- Fetching data for Tableau with Python
- Elasticsearch integration
- Self-hosted Exchange integration
- UiPath® Automation Framework
- UiPath® Marketplace activities
- UiPath® official activities
- How machines learn to understand words: a guide to embeddings in NLP
- Prompt-based learning with Transformers
- Efficient Transformers II: knowledge distillation & fine-tuning
- Efficient Transformers I: attention mechanisms
- Deep hierarchical unsupervised intent modelling: getting value without training data
- Fixing annotating bias with Communications Mining™
- Active learning: better ML models in less time
- It's all in the numbers - assessing model performance with metrics
- Why model validation is important
- Comparing Communications Mining™ and Google AutoML for conversational data intelligence
- Licensing
- FAQs and more

Communications Mining user guide
Model training FAQs
- General model training
- Label training
The objective of training a model is to create a set of training data that is as representative as possible of the dataset as a whole, so that the platform can accurately and confidently predict the relevant labels and general fields for each message. The labels and general fields within a dataset should be intrinsically linked to the overall objectives of the use case and provide significant business value.
As soon as data is uploaded to the platform, the platform begins a process called unsupervised learning, through which it groups messages into clusters of similar semantic intent. This process can take up to a couple of hours, depending on the size of the dataset, and clusters will appear once it is complete.
To be able to train a model, you need a minimum amount of existing historical data. This is used as training data to provide the platform with the necessary information to confidently predict each of the relevant concepts for your analysis and/or automation.
The recommendation for any use case is a minimum of 12 months of historical data, in order to properly capture any seasonality or irregularity in the data, such as month-end processes and busy seasons.
No, you do not need to save your model after making changes. Every time you train the platform on your data, that is, each time you annotate messages, a new model version is created for your dataset. Performance statistics for older model versions can be viewed on the Validation page.
Check the Validation page in the platform, which reports various performance measures and provides a holistic model health rating. This page updates after every training event and it can be used to identify areas where the model may need more training examples or some label corrections in order to ensure consistency.
For complete explanations of model performance and how to improve it, check Validation.
Clusters are a helpful way to quickly build up your taxonomy, but users will spend most of their time training in the Explore page rather than in Discover.
If users spend too much time annotating via clusters, there’s a risk of overfitting the model to look for messages that only fit these clusters when making predictions. The more varied examples there are for each label, the better the model will be at finding the different ways of expressing the same intent or concept. This is one of the main reasons why we only show 30 clusters at a time.
However, once enough training has been completed or a significant volume of data has been added to the platform, Discover does retrain. When it retrains, it takes into account the existing training to date, and will try to present new clusters that are not well covered by the current taxonomy.
For more details, check Discover.
There are 30 clusters in total, each containing 12 messages. In the platform, you can adjust the number of messages shown per page, in increments between 6 and 12. We recommend annotating 6 at a time, to reduce the risk of partially annotating any messages.
Precision and recall are metrics used to measure the performance of a machine learning model. A detailed description of each can be found under the Using Validation section of our how-to guides.
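As a quick illustration of how these two metrics are computed, here is a minimal sketch using hypothetical counts for a single label (the numbers are invented for the example, not taken from the platform):

```python
# Hypothetical prediction counts for one label -- illustrative only.
true_positives = 40   # messages correctly predicted with the label
false_positives = 10  # messages predicted with the label, but incorrectly
false_negatives = 20  # messages that should have the label, but were missed

# Precision: of everything predicted, how much was correct?
precision = true_positives / (true_positives + false_positives)

# Recall: of everything that should have been found, how much was found?
recall = true_positives / (true_positives + false_negatives)

print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this example, precision is 0.80 (40 of 50 predictions were correct) and recall is roughly 0.67 (40 of the 60 relevant messages were found), showing how a model can be precise yet still miss a meaningful share of relevant messages.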
You can access the validation overview of earlier models by hovering over Model Version in the Validation page. This can be helpful for tracking and comparing progress as you train out your model.
If you need to roll your model back to a previous pinned version, check Model rollback for more details.
Yes, it’s really easy to do. You can go into the settings for each label and rename it at any point. For more details, check Label editing.
Information about your dataset, including how many messages have been annotated, is displayed on the Dataset Settings page. For more details on how to access it, check Amend dataset settings.
If the Validation page shows that your label is performing poorly, there are various ways to improve its performance. To understand more, check Understanding and improving model performance.
The little red dials next to each label/general field indicate whether more examples are needed for the platform to accurately estimate the label/general field's performance. The dials start to disappear as you provide more training examples and will disappear completely once you reach 25 examples.
After this, the platform will be able to effectively evaluate the performance of a given label/general field and may return a performance warning if the label or general field is not healthy.
The platform is able to learn from empty and uninformative messages, as long as they are annotated correctly. However, it is worth noting that uninformative labels will likely need a significant number of training examples, and these examples should be loosely grouped by concept, to ensure the best performance.
- General model training
- What is the objective of training a model?
- Why can I not see anything in Discover if I've just uploaded data into the platform?
- How much historical data do I need to train a model?
- Do I need to save my model every time I make a change?
- How do I know what the performance of the model is?
- Why are there only 30 clusters available and can we set them individually?
- How many messages are in each cluster?
- What do precision and recall mean?
- Can I return to an earlier version of my model?
- Label training
- Can I change the name of a label later on?
- How do I find out the number of messages I have annotated?
- One of my labels is performing poorly, what can I do to improve it?
- What does the red dial next to my label or general field indicate? How do I get rid of it?
- Should I avoid annotating empty or uninformative messages?