Overview
The Explore page has various training modes, and this phase focuses primarily on three of them:
- 'Shuffle' - shows a random selection of messages for users to annotate. It's vital to complete a significant portion of training in Shuffle, in order to create a training set of examples that is representative of the wider dataset.
- 'Teach' (for unreviewed messages) - as soon as the platform is making some reasonable predictions for a label, you can improve its ability to predict that label across more varied examples by reviewing messages in the default Teach mode (which is for unreviewed messages). This mode shows you messages where the platform is unsure whether the selected label applies.
- 'Low Confidence' - shows you messages that are not well covered by informative label predictions. These messages will have either no predictions or very low-confidence predictions for labels that the platform understands to be informative.
This section of the Knowledge Base will also cover training using Search in Explore, which is very similar to training using Search in Discover.
There is another training mode in Explore - Teach (for reviewed messages) - that is explained in the 'Refining Models & Using Validation' section of the Knowledge Base here.
Layout explained:

| Key | Description |
|---|---|
| A | Adjust the date range or period of messages shown |
| B | Add various other filters based on the metadata of the messages, e.g. score or sender |
| C | Add a general field filter |
| D | Toggle from all messages to either reviewed or unreviewed messages; also adjusts pinned vs. predicted label counts |
| E | Add a label filter |
| F | Search for specific labels within your taxonomy |
| G | Add additional labels |
| H | Expand a message's metadata |
| I | Refresh the current query |
| J | Switch between different training modes (e.g. Recent, Shuffle, Teach and Low Confidence) and select a label to sort by |
| K | Search the dataset for messages containing specific words or phrases |
| L | Download all of the messages on this page, or export the dataset with applied filters as a CSV file |
The number of examples required to accurately predict each label can vary significantly, depending on the breadth or specificity of the label concept.
A label may be associated with very specific and easily identifiable words, phrases or intents, in which case the platform is able to predict it consistently with relatively few training examples. Alternatively, a label may capture a broad topic with many different variations of associated language, in which case it could require significantly more training examples before the platform can consistently identify instances where the label should apply.
The platform can often start making predictions for a label with as few as five examples, though in order to accurately estimate a label's performance (i.e. how well the platform is able to predict it), each label requires at least 25 examples.
When annotating in Explore, the little red dials (examples shown below) next to each label indicate whether more examples are needed to accurately estimate the label's performance. The dial starts to disappear as you provide more training examples and will disappear completely once you reach 25.
This does not mean that with 25 examples the platform will be able to accurately predict every label, but it will at least be able to validate how well it's able to predict each label and alert you if additional training is required.
During the Explore phase, you should therefore ensure that you've provided at least 25 examples for each of the labels that you are interested in, using a combination of the modes mentioned above (mostly Shuffle, and Teach on unreviewed messages).
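To illustrate the 25-example threshold, the sketch below counts pinned examples per label and reports which labels still fall short. Everything here is hypothetical and not part of the platform: in practice the label list could be derived from a dataset export, and the helper function is purely illustrative.

```python
from collections import Counter

# Minimum pinned examples needed before the platform can validate a label
MIN_EXAMPLES = 25

def labels_needing_examples(reviewed_labels, minimum=MIN_EXAMPLES):
    """Count pinned examples per label and return a dict of labels that
    are still below the minimum, mapped to how many more each needs."""
    counts = Counter(reviewed_labels)
    return {
        label: minimum - count
        for label, count in counts.items()
        if count < minimum
    }

# Hypothetical annotations: 30 pinned examples of "Urgent",
# 10 of "Request > Refund"
annotations = ["Urgent"] * 30 + ["Request > Refund"] * 10
print(labels_needing_examples(annotations))
# {'Request > Refund': 15}
```

A check like this only flags labels below the validation threshold; as noted above, reaching 25 examples does not guarantee accurate predictions for a label, it only lets the platform estimate its performance.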
During the Refine phase it may become clear that more training is required for certain labels to improve their performance, and this is covered in detail here.
In Explore, once you reach 25 pinned examples for a label, you may see one of the below label performance indicators in place of the training dial:
- A grey circle indicates that the platform is calculating the performance of that label; once calculated, it will either disappear or update to an amber or red circle
- An amber circle indicates that the label's performance is slightly below satisfactory and could be improved
- A red circle indicates that the label is performing poorly and needs additional training or corrective actions to improve it
- If there is no circle, the label is performing at a satisfactory level (though it may still need improving, depending on the use case and desired accuracy levels)
- To understand more about label performance and how to improve it, you can start here
If you click the tick icon (as shown below) at the top of the label filter bar to filter to reviewed messages, you will be shown the number of reviewed messages that have that label applied.
If you click the computer icon to filter to unreviewed messages, you will be shown the total number of predictions for that label (which includes the number of reviewed examples too).
In Explore, when neither reviewed nor unreviewed is selected, the platform shows the total number of pinned messages for a label by default. In Reports, the default is to show the total number predicted.
- The model can start to make predictions with only a few annotated messages, though for it to make reliable predictions you should annotate a minimum of 25 messages per label. Some labels will require more than this; how many depends on the complexity of the data, the label itself, and the consistency with which labels have been applied
- In Explore, you should also try to find messages where the model has predicted a label incorrectly. You should remove incorrect labels and apply correct ones. This process helps to prevent the model from making similar incorrect predictions in future