- Introduction
- Overview
- How businesses can use Communications Mining™
- Getting started using Communications Mining™
- Setting up your account
- Balance
- Clusters
- Concept drift
- Coverage
- Datasets
- General fields
- Labels (predictions, confidence levels, label hierarchy, and label sentiment)
- Models
- Streams
- Model Rating
- Projects
- Precision
- Recall
- Annotated and unannotated messages
- Extraction Fields
- Sources
- Taxonomies
- Training
- True and false positive and negative predictions
- Validation
- Messages
- Access Control and Administration
- Manage sources and datasets
- Understanding the data structure and permissions
- Creating or deleting a data source in the GUI
- Uploading a CSV file into a source
- Preparing data for .CSV upload
- Creating a dataset
- Multilingual sources and datasets
- Enabling sentiment on a dataset
- Amending dataset settings
- Deleting a message
- Deleting a dataset
- Exporting a dataset
- Using Exchange integrations
- Model training and maintenance
- Understanding labels, general fields, and metadata
- Label hierarchy and best practices
- Comparing analytics and automation use cases
- Turning your objectives into labels
- Overview of the model training process
- Generative Annotation
- Dataset status
- Model training and annotating best practice
- Training with label sentiment analysis enabled
- Training chat and calls data
- Understanding data requirements
- Train
- Introduction to Refine
- Precision and recall explained
- Precision and Recall
- How validation works
- Understanding and improving model performance
- Reasons for label low average precision
- Training using Check label and Missed label
- Training using Teach label (Refine)
- Training using Search (Refine)
- Understanding and increasing coverage
- Improving Balance and using Rebalance
- When to stop training your model
- Using general fields
- Generative extraction
- Using analytics and monitoring
- Automations and Communications Mining™
- Developer
- Exchange Integration with Azure service user
- Exchange Integration with Azure Application Authentication
- Exchange Integration with Azure Application Authentication and Graph
- Fetching data for Tableau with Python
- Elasticsearch integration
- Self-hosted Exchange integration
- UiPath® Automation Framework
- UiPath® Marketplace activities
- UiPath® official activities
- How machines learn to understand words: a guide to embeddings in NLP
- Prompt-based learning with Transformers
- Efficient Transformers II: knowledge distillation & fine-tuning
- Efficient Transformers I: attention mechanisms
- Deep hierarchical unsupervised intent modelling: getting value without training data
- Fixing annotating bias with Communications Mining™
- Active learning: better ML models in less time
- It's all in the numbers - assessing model performance with metrics
- Why model validation is important
- Comparing Communications Mining™ and Google AutoML for conversational data intelligence
- Licensing
- FAQs and more

Communications Mining user guide
Getting started using Communications Mining™
This page describes the key steps required to set up and deliver a Communications Mining use case.
Automation Cloud users
If you are an Automation Cloud user and have AI Units or Platform Units enabled, you can access Communications Mining through the UiPath® IXP service in Automation Cloud. If you do not have any units, but want to start using Communications Mining, contact your account manager.
To access Communications Mining on Automation Cloud, the following conditions must be met:
- An administrator must enable IXP as a service on your Automation Cloud tenant. For this action, an enterprise licence is required, and your Automation Cloud organization must have AI Units or Platform Units available. For more details, check Enabling Communications Mining.
- You must be an existing user on the Automation Cloud tenant. If not, ask an administrator from your Automation Cloud tenant to add you.
- For details on how to access Communications Mining on Automation Cloud for the first time, check Getting set up as an Automation Cloud user.
- For details on how to manage your account on Automation Cloud, check Account management.
Legacy users
You do not need to be an Automation Cloud user to access Communications Mining.
- For details on how to access Communications Mining for the first time, check Getting set up as a legacy user.
- For details on how to manage your account, check Account management (Legacy access).
Projects can be thought of as restricted workspaces. Each dataset and data source is associated with a specific project, and users need permissions in that project to work with the data it contains. A dataset in one project can be made up of data sources from multiple projects; users simply need permissions in each of those projects to view and annotate the data.
For more details on data structure, check Understanding the data structure and permissions.
For Automation Cloud users, every tenant has a Default Project that all users within the tenant can access. Before uploading data, creating datasets, and training models, it is strongly recommended to create a new project with access limited to only those individuals who need to work with that data. Once data sources and datasets are created, it is difficult to move them into a different project.
To create a new project, follow the steps described in Creating a new project (Automation Cloud).
Strict user permissions control access to Communications Mining tenants, projects, data sources, and datasets. You need to allocate permissions to each user. Permissions can provide access to sensitive data and allow users to perform a range of different actions in the platform. As a result, users should only be given the permissions they need to fulfil their roles. For a more detailed explanation of user permissions, check Roles and their underlying permissions.
- For details on creating a new legacy user, check Creating a new user (non-Automation Cloud admins).
- For details on adding a user to a project, check Adding a user to a project.
- For details on updating user permissions, check Updating roles and permissions.
Data sources are collections of raw, unannotated communications data of a similar type, for example, emails from a shared mailbox or a collection of NPS survey responses.
Creating a source in the GUI sets up an empty source with defined properties; data can then be uploaded through the API. You can also create the source itself through the API.
Once the source is created, data can be uploaded through:
- An integration, for example the Exchange or Salesforce integration.
- A static CSV upload.
- For details on creating a new data source in the GUI, check Creating or deleting a data source in the GUI.
- For details on uploading a CSV file into a source, check Uploading a CSV file into a source.
- For integration guidance and technical documentation, check the Integration guides overview.
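As a minimal, illustrative sketch of uploading data to an existing source through the API: the endpoint path, payload schema, and field names below are assumptions for demonstration only, so confirm them against the Communications Mining API reference before use.

```python
# Illustrative sketch only: the endpoint path and payload schema are assumptions,
# not the documented API - check the Communications Mining API reference.
import requests

API_BASE = "https://example-tenant.example.com/communications_mining/api/v1"  # hypothetical base URL
API_TOKEN = "YOUR_API_TOKEN"  # generated from your account settings


def upload_messages(project: str, source: str, comments: list[dict]) -> None:
    """Upload a small batch of raw communications to an existing source."""
    response = requests.post(
        f"{API_BASE}/sources/{project}/{source}/sync",  # assumed endpoint shape
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"comments": comments},
        timeout=30,
    )
    response.raise_for_status()


# Example call: one email-like message with minimal metadata.
upload_messages(
    project="my-project",
    source="shared-mailbox",
    comments=[
        {
            "id": "msg-0001",
            "timestamp": "2024-01-15T09:30:00Z",
            "messages": [{"body": {"text": "Please can you confirm my account balance?"}}],
        }
    ],
)
```

In practice, batching uploads and reusing stable message IDs makes repeated syncs idempotent, so re-running an upload does not duplicate data.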
Datasets consist of one or more data sources, up to a maximum of 20, and the model that you train.
If there are multiple sources in a dataset, they should share a similar intended purpose for your analysis or automation.
When you create a new dataset, you can choose to create a copy of a pre-existing dataset. This means that you copy over the same sources, general fields, sentiment selection, labels, and reviewed examples.
Model training involves creating and training a set of labels, that is, intents or concepts, and general fields, that is, structured data points, applied to individual communications within the dataset. As you begin training, the machine learning models within the platform train in real time and start predicting where else in the dataset these labels and general fields may apply.
Training a model requires a model trainer who knows the data inside out. The model trainer imparts their knowledge to the model by annotating a small set of training data that represents the dataset as a whole, which enables the model to make predictions across the entire dataset.
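To make this concrete, the sketch below shows what a simple taxonomy might look like for a shared mailbox. The label and field names are hypothetical; taxonomy design guidance and the parent-child naming convention are covered in Label hierarchy and best practices.

```python
# Hypothetical label taxonomy for a shared mailbox - the names are illustrative only.
# Parent and child labels are expressed here as "Parent > Child".
labels = [
    "Account > Balance Inquiry",
    "Account > Address Change",
    "Payments > Payment Not Received",
    "Payments > Refund Request",
    "No Action Required > Auto-Reply",
    "No Action Required > Spam",
]

# General fields capture structured data points within a message,
# for example an account number or a payment amount.
general_fields = ["Account Number", "Payment Amount", "Due Date"]
```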
Prerequisites before you start training a Communications Mining model include:
- Defined objectives and success criteria.
- A designed taxonomy of labels and fields.
- Business SMEs with domain-specific knowledge.
- Ring-fenced time to train the model.
The model training process consists of the following key phases: Discover, Explore, and Refine. The Train feature provides a guided training experience that walks users through each phase of training step-by-step.
Any model used in production needs to be effectively maintained to ensure continued high performance. This includes preventing concept drift, and creating an exception process.
For more details on model training, check the following resources:
- Preparing for model training
- Model training
- Model maintenance
The platform has a built-in reporting and analytics capability that can help you identify potential issues and improvement opportunities across your communications channels. For example:
- Requests that are transactional in nature can be good candidates for automation or self-service.
- Requests that get no response or follow-up can potentially be eliminated.
- Emails that require no action, that is, out-of-office replies, spam, auto-generated emails, and thank-you emails, can potentially be deleted from a mailbox.
- Urgent queries can be identified so that they are prioritized and resolved immediately.
- Root causes driving customer dissatisfaction, escalations, or chasers can be identified and addressed.
For more details on generating insight and building reports, check Using analytics and monitoring overview.
The platform enables downstream automation by creating a queue of communications that a robot can read.
Confidence thresholds drive these queues. Setting a threshold means that, for a message to enter the queue, the platform must predict the relevant label with a confidence equal to or greater than the threshold you set, as shown in the sketch after the following list.
- For details on creating and managing streams, check Selecting label confidence thresholds.
- For an overview of the Communications Mining automation framework, check the UiPath Automation Framework.
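The snippet below is a purely illustrative sketch of how a downstream consumer might read such a queue. The fetch and advance routes, request bodies, and response fields are assumptions, not the documented API; the actual endpoints and schemas are described in the UiPath Automation Framework and API documentation.

```python
# Illustrative only: endpoint paths and response fields are assumptions,
# not the documented Communications Mining API - check the API reference.
import requests

API_BASE = "https://example-tenant.example.com/communications_mining/api/v1"  # hypothetical
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}


def process_stream_batch(project: str, dataset: str, stream: str) -> None:
    """Fetch a batch of messages from a stream, act on them, then advance."""
    fetch = requests.post(
        f"{API_BASE}/datasets/{project}/{dataset}/streams/{stream}/fetch",  # assumed route
        headers=HEADERS,
        json={"size": 16},
        timeout=30,
    )
    fetch.raise_for_status()
    batch = fetch.json()

    for result in batch.get("results", []):
        # Each result carries the message plus its label predictions; only
        # predictions at or above the stream's threshold would appear here.
        predicted_labels = [p["name"] for p in result.get("predictions", [])]
        print(result["comment"]["id"], predicted_labels)  # hand off to the automation here

    # Acknowledge the batch so the next fetch returns new messages.
    requests.post(
        f"{API_BASE}/datasets/{project}/{dataset}/streams/{stream}/advance",  # assumed route
        headers=HEADERS,
        json={"sequence_id": batch.get("sequence_id")},
        timeout=30,
    ).raise_for_status()


process_stream_batch("my-project", "mailbox-dataset", "automation-stream")
```

The fetch-then-advance pattern means a batch is only acknowledged after it has been handed off, so an interrupted robot can safely re-read the same messages on its next run.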