Communications Mining user guide
Understanding labels, general fields, and metadata
Before designing your taxonomy, you need to understand what labels, general fields, and metadata should each capture to meet your objectives. They complement each other, so there should be minimal overlap between them.
Labels capture concepts, themes, and intents. For example, change of address request, urgent, status update request, and so on. You should not use labels to capture information that is present in the metadata.
General fields capture structured data points extracted from the text. For example, policy numbers, trade IDs, URLs, dates, monetary quantities, and so on.
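Metadata captures structured properties that accompany a message rather than appearing in its text. There are three types:
- User properties - Defined and added pre-upload, such as NPS score.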
- Email properties - Captured from emails, such as sender, recipients, domains, and so on.
- Thread properties - The platform automatically derives them for threaded data, such as emails and chats. For example, the number of messages in a thread, thread duration, and so on.
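As a rough illustration of what derived thread properties look like, the sketch below computes a message count and a thread duration from message timestamps. The function name `thread_properties` and the dictionary keys are hypothetical; the platform derives these properties itself, and this is not its implementation.

```python
from datetime import datetime, timedelta


def thread_properties(timestamps: list[datetime]) -> dict:
    """Derive simple thread-level properties from message timestamps.

    Hypothetical helper for illustration only.
    """
    if not timestamps:
        return {"message_count": 0, "duration": timedelta(0)}
    return {
        "message_count": len(timestamps),
        # Duration spans the earliest to the latest message in the thread.
        "duration": max(timestamps) - min(timestamps),
    }


example_thread = [
    datetime(2024, 1, 5, 9, 0),
    datetime(2024, 1, 5, 10, 15),
    datetime(2024, 1, 5, 11, 30),
]
example_properties = thread_properties(example_thread)
```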
This section lists the key distinctions and similarities between labels and general fields. The two are typically used in combination for automation, but individually they serve different purposes:
Labels
- Capture intents, themes, and concepts.
- Normalize varied expressions into one structured data point, for example, determining whether a concept applies or not.
- Assigned at the message level.
- Learn from all of the communication text, as well as certain metadata properties.
- Structured in hierarchies to add levels of specificity.
General fields
- Capture specific values of a certain type, such as a date, extracted from the text.
- Can be entirely rules-based and follow a very specific format.
- Some types can be normalized into a structured format from varied expressions.
- Communications Mining™ learns from the value of each general field and the context of the paragraph it is in, as well as the surrounding text.
- Assigned at the paragraph level.
Common to both labels and general fields
- You can pre-train or train them from scratch.
- Pre-trained labels and general fields are predicted as soon as you enable them, and the platform automatically retrains.
- You can accept and reject label and general field predictions, and assign them when they are not predicted.
- You can use both for analytics and automation use cases.
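The difference in granularity can be pictured with a small data model: labels attach to the whole message, while general fields attach to a paragraph within it. This is a minimal sketch only. The class names, attribute names, and the " > " hierarchy separator are assumptions made for illustration, not the platform's schema.

```python
from dataclasses import dataclass, field


@dataclass
class GeneralField:
    """A typed value extracted from a span of text (hypothetical structure)."""
    kind: str        # for example "date" or "policy-number"
    value: str       # the extracted, possibly normalized, value
    paragraph: int   # index of the paragraph that contains the span


@dataclass
class LabelPrediction:
    """A concept that applies to the whole message (hypothetical structure)."""
    name: str        # hierarchy levels joined by " > " (assumed separator)
    confidence: float

    def levels(self) -> list[str]:
        # Split the hierarchical name into its levels of specificity.
        return [part.strip() for part in self.name.split(">")]


@dataclass
class Message:
    paragraphs: list[str]
    labels: list[LabelPrediction] = field(default_factory=list)
    general_fields: list[GeneralField] = field(default_factory=list)

    def fields_in_paragraph(self, index: int) -> list[GeneralField]:
        # General fields are looked up per paragraph, labels per message.
        return [f for f in self.general_fields if f.paragraph == index]


message = Message(
    paragraphs=["Hello,", "Please update policy AB123 from 2024-01-05."],
    labels=[LabelPrediction("Request > Address change", 0.93)],
    general_fields=[
        GeneralField("policy-number", "AB123", paragraph=1),
        GeneralField("date", "2024-01-05", paragraph=1),
    ],
)
```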
For general fields, the platform learns from the assigned span of text and the context of the text surrounding that span. It makes label predictions based on the text of the message, as well as some metadata properties. For emails, for example, label predictions draw on the following:
- Subject line
- Body of the text - For threaded data, Communications Mining™ makes predictions based on the latest email only, not the full thread. The emails in a thread are linked by a thread ID.
- Some metadata - Communications Mining™ learns from some properties where themes can be identified, such as the sender or recipient domains, NPS scores, and so on. It does not learn from the specific senders and recipients of emails, that is, the full email addresses, or from unique properties such as IDs.
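The selection described above can be sketched as a function that keeps the subject, the latest body, and the sender and recipient domains, and leaves out full email addresses and IDs. The dictionary keys (`subject`, `body`, `sender`, `recipients`, `thread_id`) and both function names are hypothetical, and the sketch assumes the thread is given in chronological order. It illustrates the described inputs, not how the platform processes them internally.

```python
def email_domain(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def label_inputs(thread: list[dict]) -> dict:
    """Select the inputs that label predictions are described as using.

    Hypothetical helper: assumes `thread` is ordered oldest to newest.
    """
    latest = thread[-1]  # only the latest email, not the full thread
    return {
        "subject": latest["subject"],
        "body": latest["body"],
        # Domains are kept; full addresses and unique IDs are dropped.
        "sender_domain": email_domain(latest["sender"]),
        "recipient_domains": sorted(
            {email_domain(r) for r in latest["recipients"]}
        ),
    }


example_emails = [
    {
        "thread_id": "t-1",
        "subject": "Address change",
        "body": "Please update my address.",
        "sender": "Jane.Doe@Client.example",
        "recipients": ["support@insurer.example"],
    },
    {
        "thread_id": "t-1",
        "subject": "RE: Address change",
        "body": "Here is the new postcode.",
        "sender": "Jane.Doe@Client.example",
        "recipients": ["support@insurer.example", "ops@insurer.example"],
    },
]
example_inputs = label_inputs(example_emails)
```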
The following image contains an example of a message that shows how labels, general fields, and metadata are distinct, but complementary to one another. To automate this inbound request, you may require each of them for a specific purpose:
- Labels - The Address change label is required to identify the nature of the request, that is, the intent.
- General fields - The address line, town or city, and postcode are used to capture the new values that the address would be updated to. Labels would not capture the specific values.
- Metadata - This process may only be implemented for certain clients, identifiable through the sender domain. There is no need to create labels for specific clients, as this is captured in the metadata.
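A downstream automation could combine the three along these lines. This is a sketch under stated assumptions: the function name, the field names, the allowed domain, and the 0.8 confidence threshold are all hypothetical values chosen for illustration, and the inputs are plain dictionaries rather than any platform response format.

```python
# Hypothetical configuration for the address change example.
ALLOWED_DOMAINS = {"client.example"}
REQUIRED_FIELDS = ("address-line", "city", "postcode")


def plan_address_change(labels, fields, sender_domain, threshold=0.8):
    """Return the new address values, or None if the message does not qualify.

    labels: mapping of label name to prediction confidence
    fields: mapping of general field name to extracted value
    """
    # Metadata: only automate for clients identified by sender domain.
    if sender_domain not in ALLOWED_DOMAINS:
        return None
    # Label: the intent must be predicted with enough confidence.
    if labels.get("Address change", 0.0) < threshold:
        return None
    # General fields: every value needed for the update must be present.
    if any(name not in fields for name in REQUIRED_FIELDS):
        return None
    return {name: fields[name] for name in REQUIRED_FIELDS}


example_plan = plan_address_change(
    labels={"Address change": 0.95},
    fields={
        "address-line": "1 High Street",
        "city": "Leeds",
        "postcode": "LS1 1AA",
    },
    sender_domain="client.example",
)
```

Each check maps to one of the three components, which is why no client-specific label is needed: the sender domain already carries that information.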