Communications Mining user guide
Communications Mining™ supports multilingual sources and datasets. Models can understand sources that contain several supported languages without translating them. The following languages are fully supported:
- English
 - Dutch
 - French
 - German
 - Italian
 - Japanese
 - Portuguese
 - Spanish
 
If you work and do business in several languages that the platform supports, you can train on messages in those languages, rather than translating everything into a single language.
- If a dataset is multilingual, you cannot view translations of any messages, as you can for translated datasets. As a result, you need to understand all of the languages in the dataset to train the model effectively.
 - Understanding multiple languages is a more complex machine-learning problem than understanding a single language. As a result, multilingual datasets may experience a slight drop in performance compared to single-language datasets.
 - If the dataset contains languages other than the supported ones, applying labels used for supported languages may cause confusion. Instead, annotate these instances with language-specific labels.
   Note: The platform cannot process or understand the content of unsupported languages.
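Before you create a multilingual dataset, it can help to check which of the languages in your data are on the fully supported list. The following is a minimal sketch, not part of the platform: `SUPPORTED_LANGUAGES` is copied from the list in this guide, while `split_by_support` is a hypothetical helper that assumes you already know which languages your messages contain. It does not detect languages itself, and it does not distinguish Preview languages from unsupported ones.

```python
# Fully supported languages, copied from the list in this guide.
SUPPORTED_LANGUAGES = frozenset({
    "English", "Dutch", "French", "German",
    "Italian", "Japanese", "Portuguese", "Spanish",
})


def split_by_support(languages):
    """Split language names into (fully_supported, other), both sorted.

    Hypothetical helper for illustration only. "other" covers every name
    that is not on the fully supported list, including Preview languages.
    """
    # Normalize spacing and capitalization so "english" matches "English".
    normalized = {name.strip().title() for name in languages}
    fully_supported = sorted(normalized & SUPPORTED_LANGUAGES)
    other = sorted(normalized - SUPPORTED_LANGUAGES)
    return fully_supported, other


fully_supported, other = split_by_support(["english", " French ", "Maltese"])
print(fully_supported)  # ['English', 'French']
print(other)            # ['Maltese']
```

Languages that land in the second list are candidates for the language-specific labels described above.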
 
When you create a data source or a dataset, the platform selects English as the default language for both.
To change the language while creating your data source or dataset, proceed as follows:
- Navigate to the "Set the language, and enable translation for your source" step.
 - In the Language dropdown menu, select Multilingual.
 
- You cannot change the language after the data source or dataset is created.
 - Multilingual datasets can contain sources of any language family that the platform supports.
 - To learn how to create data sources and datasets, check Creating a data source and Creating a dataset.
 
We currently support a wide range of additional languages in Preview mode, as shown in the following list. This means that our team refines them based on your usage.
- Afrikaans
 - Albanian
 - Amharic
 - Arabic
 - Armenian
 - Assamese
 - Azerbaijani
 - Basque
 - Belarusian
 - Bengali
 - Bengali (Romanized)
 - Bosnian
 - Breton
 - Bulgarian
 - Burmese
 - Catalan
 - Chinese (Simplified)
 - Chinese (Traditional)
 - Croatian
 - Czech
 - Danish
 - Esperanto
 - Estonian
 - Filipino
 - Finnish
 - Galician
 - Georgian
 - Greek
 - Gujarati
 - Hausa
 - Hebrew
 - Hindi
 - Hindi (Romanized)
 - Hungarian
 - Icelandic
 - Indonesian
 - Irish
 - Javanese
 - Kannada
 - Kazakh
 - Khmer
 - Korean
 - Kurdish (Kurmanji)
 - Kyrgyz
 - Lao
 - Latin
 - Latvian
 - Lithuanian
 - Macedonian
 - Malagasy
 - Malay
 - Malayalam
 - Marathi
 - Mongolian
 - Nepali
 - Norwegian
 - Oriya
 - Oromo
 - Pashto
 - Persian
 - Polish
 - Punjabi
 - Romanian
 - Russian
 - Sanskrit
 - Scottish Gaelic
 - Serbian
 - Sindhi
 - Sinhala
 - Slovak
 - Slovenian
 - Somali
 - Sundanese
 - Swahili
 - Swedish
 - Swiss German
 - Tamil
 - Tamil (Romanized)
 - Telugu
 - Telugu (Romanized)
 - Thai
 - Turkish
 - Ukrainian
 - Urdu
 - Urdu (Romanized)
 - Uyghur
 - Uzbek
 - Vietnamese
 - Welsh
 - Western Frisian
 - Xhosa
 - Yiddish