Communications Mining User Guide
Last updated Nov 19, 2024

Dataset status

Understanding the status of your dataset

Each time you apply labels or review general fields in your dataset, your model retrains and a new model version is created. To understand more about using different model versions, see here.

When the model retrains, it takes the latest information it has been supplied with and recomputes all of its predictions across the dataset. This process begins when you start training. Often, by the time Communications Mining finishes applying the predictions for one model version, it is already recalculating the predictions for a newer one. Once you stop training, Communications Mining soon catches up and applies the predictions that reflect the latest training completed in the dataset.

This process can take some time, depending on the amount of training completed, the size of the dataset, and the number of labels in the taxonomy. Communications Mining has a status feature that shows whether your model is up to date or, if it is retraining, how long that is expected to take.

When you are in a dataset, one of these two icons at the top of the page will indicate its current status:

[Icon] This icon indicates that the dataset is up to date and the predictions from the latest model version have been applied.
[Icon] This icon indicates that the model is retraining and predictions may not be up to date.

If you hover over the icon with your mouse, you'll see more detail about the status as shown below:

[Image: Dataset status modal]

Note: You may sometimes notice that Communications Mining is retraining even though you have not applied any labels or reviewed any general fields. This can happen when our team deploys improvements to the platform and its models that require the models to retrain. Any automations relying on a specific model version number will be unaffected.

Troubleshooting slow model training

Why is my model training slowly?
To begin, it is crucial to differentiate between two distinct processes that are often confused:
  1. Model training

    This process involves retraining the current model version to create a new one, incorporating any recent changes such as taxonomy updates or data annotations. Model training is generally fast, although the duration can vary based on several factors.

  2. Applying predictions

    This process occurs after model training, where the platform retrieves and applies predictions from the trained model version to each message. Applying predictions is typically slower and the duration is primarily influenced by the size and complexity of the dataset.

Several factors can contribute to a particular model version for a dataset taking longer than expected to train or apply predictions. These include:
  • Complexity of the taxonomy of labels and fields

    Impact: The more labels and fields in your dataset, the longer it takes to train the model and apply predictions across messages.

  • Use of generative extraction

    Impact: Generative extraction requires understanding complex relationships between labels and fields, necessitating a larger and more powerful model, which can slow down training.

  • Size of your dataset (annotated and unannotated data)

    Impact: High volumes of annotated messages increase the data points the model must consider during training, extending the process. Similarly, high volumes of unannotated messages can prolong the time needed to apply predictions.

    Note: Predictions are surfaced as soon as they are available, so you don’t need to wait for them to finish applying while annotating. If a newer model version finishes training before the previous version's predictions have been fully applied, the platform switches to applying predictions from the newer version.
  • Number of datasets training simultaneously

    Impact: If multiple models are training simultaneously in your Communications Mining environment, this can cause temporary slowdowns as the platform load balances the required services.

  • When to contact support
    • Training: If none of the above reasons explain the slow training and it has been ongoing for more than 4 hours, please contact Support.
    • Applying predictions: For large and complex datasets, expect applying predictions to take a long time. Only contact Support if this process has been ongoing for more than 24 hours for a single model version.
    Note: This should not block data annotation, as you will always benefit from new predictions as they become available.
Why does my model not appear to be training at all?

If your model does not start training within an hour after completing an action that should trigger training (such as annotating messages with labels or fields), please contact Support.

Checking training status: You can verify whether your model is training by checking the dataset status in the top right-hand corner of any page within a dataset.
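
The support thresholds described on this page (1 hour for training that has not started, 4 hours for ongoing training, 24 hours for applying predictions for a single model version) can be summarised as a small decision helper. The following Python sketch is illustrative only: the names `Phase`, `SUPPORT_THRESHOLDS`, and `should_contact_support` are hypothetical and are not part of any Communications Mining API, and the elapsed time is assumed to be something you track yourself from the dataset status icon.

```python
from datetime import timedelta
from enum import Enum


class Phase(Enum):
    """Hypothetical labels for the situations described on this page."""
    NOT_STARTED = "not_started"        # training has not begun after a triggering action
    TRAINING = "training"              # a new model version is being trained
    APPLYING_PREDICTIONS = "applying"  # predictions are being applied for one model version


# Thresholds taken from the guidance above; the mapping itself is an illustration.
SUPPORT_THRESHOLDS = {
    Phase.NOT_STARTED: timedelta(hours=1),
    Phase.TRAINING: timedelta(hours=4),
    Phase.APPLYING_PREDICTIONS: timedelta(hours=24),
}


def should_contact_support(
    phase: Phase,
    elapsed: timedelta,
    explained_by_known_factors: bool = False,
) -> bool:
    """Return True when the guidance on this page suggests contacting Support.

    `elapsed` is how long the given phase has been ongoing. For training,
    the guidance only applies when none of the listed factors (taxonomy
    complexity, generative extraction, dataset size, concurrent training)
    explains the delay, which the caller signals via
    `explained_by_known_factors`.
    """
    if phase is Phase.TRAINING and explained_by_known_factors:
        return False
    return elapsed > SUPPORT_THRESHOLDS[phase]
```

For example, `should_contact_support(Phase.APPLYING_PREDICTIONS, timedelta(hours=10))` returns `False`, because applying predictions to a large dataset is expected to take a long time and the 24-hour threshold has not been reached.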
