Batch upload
BILLABLE OPERATION
You will be charged 1 AI Unit (or 0.2 Platform Units) per created comment, and per updated comment (matched on its unique ID) whose text was modified. For example, creating 1,000 new comments consumes 1,000 AI Units, or 200 Platform Units.
The CLI allows you to upload comments (including pre-annotated comments) in batches. Besides importing data into Communications Mining™ when a live connection is not required, you can use it to upload pre-existing training data, or to overwrite existing comments or labels.
The CLI expects data in JSONL format (also called newline-delimited JSON), where each line is a JSON value. Many tools can export JSONL files out of the box. Please contact support if you have any questions.
Each line in the JSONL file represents a comment object. Each comment object needs at least a unique ID, a timestamp, and a piece of text, but can carry other fields such as metadata. To learn which fields to set for your data, check the Comment reference.
Each line in the JSONL file should have the following format; only required fields are shown. (The example is indented for readability, but must be a single line in your file.)
{
    "comment": {
        "id": "<unique id>",
        "timestamp": "<timestamp>",
        "messages": [
            {
                "body": {
                    "text": "<text of the comment>"
                }
            }
        ]
    }
}
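For example, the following Python sketch serializes in-memory records into this format, one JSON object per line. The record values and the file name comments.jsonl are purely illustrative:

import json

# Hypothetical example records; replace with your own data.
comments = [
    {"id": "email-0001", "timestamp": "2020-01-01T12:00:00Z", "text": "Hello, please update my address."},
    {"id": "email-0002", "timestamp": "2020-01-02T09:30:00Z", "text": "Could you confirm receipt of my payment?"},
]

with open("comments.jsonl", "w", encoding="utf-8") as f:
    for record in comments:
        obj = {
            "comment": {
                "id": record["id"],
                "timestamp": record["timestamp"],
                "messages": [{"body": {"text": record["text"]}}],
            }
        }
        # json.dumps without an indent argument emits a single line,
        # which is exactly what the JSONL format requires.
        f.write(json.dumps(obj) + "\n")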
If you would like to upload labels alongside comments, include them as follows (again indented for readability, but each object must be a single line in your file):
{
    "comment": {
        "id": "<unique id>",
        "timestamp": "<timestamp>",
        "messages": [
            {
                "body": {
                    "text": "<text of the comment>"
                }
            }
        ]
    },
    "labelling": {
        "assigned": [
            {
                "name": "<Your Label Name>",
                "sentiment": "<positive|negative>"
            },
            {
                "name": "<Another Label Name>",
                "sentiment": "<positive|negative>"
            }
        ]
    }
}
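Before uploading, it can help to sanity-check the file. The following is a minimal Python sketch, assuming the file is named comments.jsonl; the checks mirror the required fields described above and are not part of the CLI:

import json

with open("comments.jsonl", encoding="utf-8") as f:
    for line_no, line in enumerate(f, start=1):
        # Each line must be a complete, single-line JSON value.
        obj = json.loads(line)
        comment = obj["comment"]  # raises KeyError if the wrapper is missing
        assert comment["id"], f"line {line_no}: empty or missing id"
        assert comment["timestamp"], f"line {line_no}: empty or missing timestamp"
        assert comment["messages"][0]["body"]["text"], f"line {line_no}: empty or missing text"
        # The labelling block is optional; when present, each assigned label
        # needs a name, plus a sentiment if your dataset uses sentiment.
        for label in obj.get("labelling", {}).get("assigned", []):
            assert label["name"], f"line {line_no}: label with empty name"
            if "sentiment" in label:
                assert label["sentiment"] in ("positive", "negative"), f"line {line_no}: unexpected sentiment"

print("All lines look structurally valid.")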
Uploading Comments
The command below will upload comments to the specified source. We recommend uploading comments into a new, empty source: if something goes wrong, rolling back is as easy as deleting the source.
re create comments \
    --source <project_name/source_name> \
    --file <file_name.jsonl>

To update existing comments, add the --overwrite flag:

re create comments \
    --source <project_name/source_name> \
    --file <file_name.jsonl> \
    --overwrite

Comments are overwritten based on the comment.id field. We recommend that you make a backup copy of the source before updating comments, so that you can recover the original comments if something goes wrong.
Uploading Comments with Labels
If you would like to upload labels together with your comments, specify the dataset into which the labels should be uploaded. The dataset needs to be connected to the source before you start uploading.
re create comments \
    --source <project_name/source_name> \
    --dataset <project_name/dataset_name> \
    --file <file_name.jsonl>

To update existing labels, add the --overwrite flag:

re create comments \
    --source <project_name/source_name> \
    --dataset <project_name/dataset_name> \
    --file <file_name.jsonl> \
    --overwrite

Note that this replaces existing labels with the new labels; it does not add the new labels to the existing ones. We recommend that you make a backup copy of the dataset before overwriting labels, so that you can recover the original labels if something goes wrong.