Precision and Recall
When you build a taxonomy by annotating data, you are creating a model. This model will use the labels you have applied to a set of data to identify similar concepts and intents in other messages and predict which labels apply to them.
In doing so, each label will have its own set of precision and recall scores.
For example, consider a taxonomy that contains a label called Request for information. For this label, precision and recall relate as follows:
- Precision - the proportion of messages predicted as ‘Request for information’ for which the prediction was correct. A precision of 95% means that for every 100 messages predicted as ‘Request for information’, 95 are correctly labelled and 5 are not (i.e. the label should not have applied to them)
- Recall - the proportion of messages that should have been annotated as ‘Request for information’ that the platform actually found. A recall of 77% means that for every 100 messages where the label should have applied, the platform predicted 77 and missed the other 23 (see the short sketch after this list)
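To make these definitions concrete, here is a minimal, illustrative Python sketch (not platform code) that computes precision and recall for a single label from simple counts, using the numbers from the example above:

```python
# Illustrative only - shows how precision and recall are calculated
# for a single label, e.g. 'Request for information'.

def precision(true_positives: int, false_positives: int) -> float:
    """Share of the label's predictions that were correct."""
    return true_positives / (true_positives + false_positives)

def recall(true_positives: int, false_negatives: int) -> float:
    """Share of messages that should have the label that were found."""
    return true_positives / (true_positives + false_negatives)

# Matching the example above: 95 correct out of 100 predictions,
# and 77 found out of 100 messages that should have been labelled.
print(precision(95, 5))   # 0.95
print(recall(77, 23))     # 0.77
```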
Recall across all labels is directly related to the coverage of your model.
If you are confident that your taxonomy covers all of the relevant concepts within your dataset, and your labels have adequate precision, then the recall of those labels will determine how well covered your dataset is by label predictions. If all of your labels have high recall, then your model will have high coverage.
We also need to understand the trade-off between precision and recall within a particular model version.
The precision and recall statistics for each label in a particular model version are determined by a confidence threshold (i.e. how confident the model must be that a label applies before it predicts it).
The platform publishes live precision and recall statistics on the Validation page, and users can see how different confidence thresholds affect these scores by using the adjustable slider.
As you increase the confidence threshold, the model is more certain that a label applies and therefore, precision will typically increase. At the same time, because the model needs to be more confident to apply a prediction, it will make fewer predictions and recall will typically decrease. The opposite is also typically the case as you decrease the confidence threshold.
So, as a rule of thumb, when you adjust the confidence threshold and precision improves, recall will typically decrease, and vice versa.
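As a rough illustration of this trade-off, the hypothetical Python sketch below (not the platform's implementation) sweeps a confidence threshold over a small set of example predictions for one label; as the threshold rises, precision tends to increase while recall tends to fall:

```python
# Illustrative sketch of the precision/recall trade-off for one label.
# Each tuple is (model confidence that the label applies, whether it truly applies).
predictions = [
    (0.97, True), (0.93, True), (0.90, True), (0.82, True),
    (0.75, False), (0.68, True), (0.61, False), (0.50, True),
    (0.42, False), (0.30, False),
]

total_true = sum(truth for _, truth in predictions)  # messages the label should apply to

for threshold in (0.3, 0.5, 0.7, 0.9):
    # The label is only predicted when the confidence meets the threshold.
    predicted = [truth for conf, truth in predictions if conf >= threshold]
    tp = sum(predicted)               # correct predictions
    fp = len(predicted) - tp          # incorrect predictions
    fn = total_true - tp              # labelled messages the model missed
    precision = tp / (tp + fp) if predicted else 1.0
    recall = tp / total_true
    print(f"threshold={threshold:.1f}  precision={precision:.2f}  recall={recall:.2f}")

# Example output:
# threshold=0.3  precision=0.60  recall=1.00
# threshold=0.5  precision=0.75  recall=1.00
# threshold=0.7  precision=0.80  recall=0.67
# threshold=0.9  precision=1.00  recall=0.50
```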
Within the platform, it’s important to understand this trade-off and what it means when setting up automations. Users must set a confidence threshold for each label that forms part of their automation, and adjust it until the precision and recall statistics are acceptable for that process.
Certain processes may value high recall (catching as many instances of an event as possible), whilst others will value high precision (ensuring that the instances identified are genuinely correct).