Communications Mining user guide
Datasets
Following the IXP migration, the original Datasets page in Communications Mining™ has been replaced by the IXP homepage, which includes the Communications Mining datasets.
To get to the Datasets page, select the IXP service from Automation Cloud. The Datasets page is displayed by default because the Communications Data capability, which includes Communications Mining, is preselected.
From the Datasets page, you can:
- View all datasets you have access to.
- Edit or delete datasets.
Note: You must have the Dataset - Manage permission assigned to edit or delete datasets.
- Navigate to other IXP capabilities. For more details on each capability, check the Capability types page in the IXP Overview guide.
- Search for a specific dataset by name using the Search option.
After accessing IXP, the Datasets page is displayed. Select a dataset from the list to access Communications Mining, which allows you to handle your datasets through the following tabs: Train, Discover, Explore, Validation, Reports, Models, Streams, and Settings.
When you create a new dataset, you can choose to make a carbon copy of an existing dataset. This means you copy over the same sources, general fields, sentiment selection, labels, and reviewed examples from the dataset you are copying.
Then, you can work on the copy dataset, which requires a different name, and make changes to it without impacting the original dataset.
We recommend copying an existing dataset in the following scenarios:
- You want to make major changes to your model, for instance to the dataset structure, and want to preserve the original dataset in case you need to revert to it.
- You want to reuse the annotation work already done on the original dataset in a new dataset, to which you can add additional sources of a similar nature.
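The copy behaviour described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Dataset` class, its field names, and the `copy_dataset` helper are hypothetical and are not part of the Communications Mining product or API. It shows a copy that carries over the same sources, general fields, sentiment selection, labels, and reviewed examples, requires a different name, and can be changed without affecting the original.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Dataset:
    # Hypothetical in-memory model for illustration only;
    # this is not the Communications Mining API.
    name: str
    sources: list = field(default_factory=list)
    general_fields: list = field(default_factory=list)
    sentiment_enabled: bool = False
    labels: list = field(default_factory=list)
    reviewed_examples: list = field(default_factory=list)


def copy_dataset(original: Dataset, new_name: str) -> Dataset:
    """Return an independent copy of `original` under a different name."""
    if new_name == original.name:
        raise ValueError("The copy requires a different name from the original.")
    # A deep copy ensures later edits to the copy never touch the original.
    duplicate = copy.deepcopy(original)
    duplicate.name = new_name
    return duplicate


original = Dataset(
    name="support-emails",
    sources=["shared-mailbox"],
    general_fields=["policy-number"],
    sentiment_enabled=True,
    labels=["Complaint", "Complaint > Billing"],
    reviewed_examples=[{"message_id": "m1", "labels": ["Complaint"]}],
)

duplicate = copy_dataset(original, "support-emails-v2")

# Changes to the copy, such as a new label or an additional source,
# leave the original dataset as it was.
duplicate.labels.append("Request")
duplicate.sources.append("second-mailbox")
```

In this sketch, the original keeps its two labels and single source after the copy is modified, which mirrors the reason for copying a dataset before making major changes.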
When you create a dataset, provide the following details:
- Dataset Name
- API Name
- Project
- Model language - choose between English and Multilingual.
Each dataset has its own settings page, which contains useful information about that dataset. To access the Settings page, select the ellipsis next to a specific dataset, and then select Dataset Settings.
The page is split into the following tabs:
- Dataset - update the global settings of the dataset, including the title, description, and sources.
- Taxonomy - create, read, update, and delete labels, as well as their descriptions, extraction fields, general fields, and field types. You can also download the complete label taxonomy.
- Statistics - view annotating statistics and the message metadata properties.
You can delete a dataset in one of the following ways:
- Select the ellipsis next to a specific dataset from the homepage, and then select Delete.
- Select the Delete dataset permanently option in the Settings tab.
After signing in, you are redirected to the Datasets page.
Alternatively, you can navigate to this page anytime by selecting the Communications Mining™ logo at the top of the page.
From the Datasets page, you can:
- View all datasets you have access to.
- Edit or delete datasets.
Note: You must have the Datasets admin permission assigned to edit or delete datasets.
- Navigate to other pages in the platform.
Select one of the options listed on a dataset, such as Explore, Train, or Reports, to navigate straight to that dataset.
For the datasets you have access to, you can use the drop-down menu to filter to a specific project that you are part of. This helps to restrict the number of datasets that are displayed.
In addition, you can search for a specific dataset by name using the Search option.
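The combined effect of the project filter and the name search can be sketched as follows. This is a minimal illustration, not how the platform is implemented: the `DatasetCard` class and the `filter_datasets` function are hypothetical names, and case-insensitive substring matching on the dataset name is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetCard:
    # Hypothetical representation of a dataset card; illustration only.
    project: str
    name: str


def filter_datasets(cards, project=None, search=""):
    """Keep cards in `project` (if given) whose name contains `search`.

    Assumption: the search is a case-insensitive substring match.
    """
    needle = search.strip().lower()
    return [
        card
        for card in cards
        if (project is None or card.project == project)
        and needle in card.name.lower()
    ]


cards = [
    DatasetCard(project="claims", name="motor-claims-emails"),
    DatasetCard(project="claims", name="Home-Claims-Chat"),
    DatasetCard(project="support", name="support-emails"),
]

# Filter to one project, search by name, or combine both.
in_claims = filter_datasets(cards, project="claims")
emails = filter_datasets(cards, search="Emails")
both = filter_datasets(cards, project="claims", search="emails")
```

Filtering by project first narrows the list to the datasets of one project, and the search then narrows it further by name.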
Each dataset card shows the following information:
- the dataset title and description.
- the project the dataset is linked to and the dataset name, displayed as project/name.
- the sources connected to the dataset.
- the model family, that is, the language.
- if sentiment analysis is enabled.
- when the dataset was last changed, and, if you hover, it also shows when it was created.
Select Explore, Train, or Reports on the dataset card to navigate to the corresponding page.
When you create a new dataset, you can choose to create a carbon copy of an existing dataset. This means that you copy over the same sources, general fields, sentiment selection, labels, and reviewed examples from the dataset you are copying.
You can then work on the copied dataset, which requires a different name, and change it freely without impacting the original.
We recommend copying an existing dataset in the following scenarios:
- You want to make major changes to your model, for instance to the dataset structure, and want to preserve the original dataset in case you need to revert to it.
- You want to reuse the annotation work already done on the original dataset in a new dataset, to which you can add additional sources of a similar nature.
To copy an existing dataset, select the ellipsis on the dataset card, and then select Duplicate. This action automatically selects the same sources and sentiment selection as the original dataset.
Once you have duplicated the dataset, select any additional sources that you want to connect to it.
Each dataset has its own Settings page, which you can access by selecting the Settings tab.
The Settings page contains useful information about the dataset and lets you perform various actions on it.
The page is split into the following tabs:
- Dataset - Update the global settings of the dataset, including title, description, and sources.
- Taxonomy - Create, read, update, and delete labels and their descriptions, extraction fields, general fields, and field types. You can also download the complete label taxonomy.
- Statistics - View annotating statistics and the message metadata properties.
From the Settings page, you can also delete the dataset by selecting Delete dataset permanently.