Communications Mining user guide
Dataset status
Each time you apply labels or review general fields in your dataset, your model retrains and a new model version is created. To learn more about using different model versions, check Pinning and tagging a model version.
When the model retrains, it takes the latest information it has been given and recomputes all of its predictions across the dataset. This process begins when you start training. Often, by the time Communications Mining™ finishes applying the predictions for one model version, it is already recalculating the predictions for a newer one. When you stop training, Communications Mining soon catches up and applies the predictions that reflect the latest training completed in the dataset.
This process can take some time, depending on the amount of training completed, the size of the dataset, and the number of labels in the taxonomy. Communications Mining has a status feature that shows whether your model is up to date or retraining and, if it is retraining, how long that is expected to take.
When you are in a dataset, one of the following icons indicates its current status:
- The dataset is up to date, and the predictions from the latest model version have been applied.
- The model is retraining, and predictions may not be up to date.
To view more details about the dataset status, hover over the icon with your mouse:
- Model training - This process involves retraining the current model version to create a new one, incorporating any recent changes such as taxonomy updates or data annotations. Model training is generally fast, although the duration can vary based on several factors.
- Applying predictions - This process occurs after model training, where the platform retrieves and applies predictions from the trained model version to each message. Applying predictions is typically slower and the duration is primarily influenced by the size and complexity of the dataset.
- Complexity of the taxonomy of labels and fields
Impact: the more labels and fields in your dataset, the longer it takes to train the model and apply predictions across messages.
- Use of generative extraction
Impact: generative extraction requires understanding complex relationships between labels and fields, necessitating a larger and more powerful model, which can slow down training.
- Size of your dataset (annotated and unannotated data)
Impact: high volumes of annotated messages increase the data points the model must consider during training, extending the process. Similarly, high volumes of unannotated messages can prolong the time needed to apply predictions.
Note: Predictions are surfaced as soon as they are available, so you don’t need to wait for them to finish applying while annotating. The platform will switch to applying predictions from the latest trained model version if it trains before the previous version's predictions are complete.
- Number of datasets training simultaneously
Impact: if multiple models are training simultaneously in your Communications Mining™ environment, this can cause temporary slowdowns as the platform load balances the required services.
When to contact support
- Training - If none of the previous reasons explain the slow training and it has been ongoing for more than 4 hours, contact the UiPath® Product Support team.
- Applying predictions - For large and complex datasets, expect applying predictions to take a long time. Only contact the Product Support team if this process has been ongoing for more than 24 hours for a single model version.
Note: This should not block data annotation, as you will always benefit from new predictions as they become available.
Model appears to not train at all
If your model does not start training within an hour after completing an action that should trigger training, such as annotating messages with labels or fields, contact the UiPath® Product Support team.
You can verify if your model is training by checking the dataset status on any page within a dataset.
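The support guidance above comes down to three time thresholds: 1 hour for training that never starts, 4 hours for model training, and 24 hours for applying predictions for a single model version. As a rough illustration only, the sketch below encodes those thresholds in Python. The names `Phase`, `SUPPORT_THRESHOLD_HOURS`, and `should_contact_support` are hypothetical and are not part of any Communications Mining™ API. You would read the elapsed time from the dataset status yourself.

```python
from enum import Enum


class Phase(Enum):
    """Hypothetical labels for the situations described in this guide."""
    NOT_STARTED = "training has not started"
    TRAINING = "model training"
    APPLYING_PREDICTIONS = "applying predictions"


# Thresholds in hours, taken from the guidance in this section.
SUPPORT_THRESHOLD_HOURS = {
    Phase.NOT_STARTED: 1,
    Phase.TRAINING: 4,
    Phase.APPLYING_PREDICTIONS: 24,
}


def should_contact_support(
    phase: Phase,
    hours_elapsed: float,
    explained_by_known_factors: bool = False,
) -> bool:
    """Return True when the guidance suggests contacting Product Support.

    explained_by_known_factors applies to model training only: slow training
    that is explained by taxonomy complexity, generative extraction, dataset
    size, or other datasets training at the same time is not a reason to
    contact support.
    """
    threshold = SUPPORT_THRESHOLD_HOURS[phase]
    if phase is Phase.NOT_STARTED:
        # "Does not start training within an hour": reaching the hour counts.
        return hours_elapsed >= threshold
    if phase is Phase.TRAINING and explained_by_known_factors:
        return False
    # "Ongoing for more than" the threshold: strictly greater.
    return hours_elapsed > threshold


if __name__ == "__main__":
    print(should_contact_support(Phase.TRAINING, 5))               # 5 h of unexplained training
    print(should_contact_support(Phase.APPLYING_PREDICTIONS, 12))  # 12 h applying predictions
```

A slow prediction pass does not block annotation, so the 24-hour threshold matters only when a single model version has not finished applying its predictions.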