Multilingual sources and datasets
Communications Mining™ supports multilingual sources and datasets. This means that models can understand sources that contain multiple supported languages, without having to translate them. The fully supported languages are:
- English
- Dutch
- French
- German
- Italian
- Japanese
- Portuguese
- Spanish
If you work and do business in several languages that the platform supports, you can train on messages in those languages rather than translating everything into a single language. However, keep the following in mind:
- If a dataset is multilingual, you cannot view translations of its messages, as you can for translated datasets. As a result, you need to understand all of the languages in the dataset to train your model effectively.
- Understanding multiple languages is a more complex machine-learning problem than understanding a single language. As a result, multilingual datasets may see a slight drop in performance compared to single-language datasets.
- If the dataset contains languages other than the supported ones, applying the labels used for supported languages may cause confusion. Instead, annotate these instances with language-specific labels; a language-detection sketch that supports this routing follows the note below.
Note: The platform cannot process or understand the content of unsupported languages.
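If your pipeline ingests messages whose language is not known in advance, it can help to detect the language before upload, so that messages in unsupported languages can be routed to language-specific labels as described above. The following is a minimal sketch using the open-source langdetect Python library; the supported-language codes mirror the fully supported languages listed earlier, and the helper function is illustrative, not part of the Communications Mining platform or API.

```python
# pip install langdetect
from langdetect import DetectorFactory, detect
from langdetect.lang_detect_exception import LangDetectException

DetectorFactory.seed = 0  # make langdetect's guesses deterministic

# ISO 639-1 codes for the fully supported languages listed above.
FULLY_SUPPORTED = {"en", "nl", "fr", "de", "it", "ja", "pt", "es"}

def route_message(text: str) -> str:
    """Return a routing hint: 'supported' for languages the model
    understands natively, otherwise a tag (for example 'lang:pl')
    that can back a language-specific label."""
    try:
        lang = detect(text)  # best-effort guess, e.g. 'de'
    except LangDetectException:
        return "lang:unknown"  # empty or undecidable text
    return "supported" if lang in FULLY_SUPPORTED else f"lang:{lang}"

print(route_message("Bitte senden Sie mir die Rechnung."))  # supported (German)
print(route_message("Proszę o przesłanie faktury."))        # lang:pl (Polish)
```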
When you create a data source or a dataset, the platform selects English as the default language for both.
To change the language while creating your data source or dataset, proceed as follows:
- Navigate to the Set the language, and enable translation for your source step.
- In the Language dropdown menu, select Multilingual.
Note the following:
- You can no longer change the language once the data source or dataset is created, so set it correctly at creation time (see the API sketch after this list).
- Multilingual datasets can contain sources in any of the languages that the platform supports.
- To learn how to create data sources and datasets, check Creating a data source and Creating a dataset.
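You can also set the language when creating a source programmatically. The sketch below is illustrative only: it assumes an endpoint of the form PUT /api/v1/sources/<project>/<source-name> that accepts a language property with a multilingual value, so confirm the exact endpoint, field names, and accepted values against your tenant's Communications Mining API reference before relying on it. As above, the language cannot be changed after creation.

```python
# A minimal sketch, assuming the endpoint shape and field names below;
# confirm them against your tenant's Communications Mining API reference.
import os

import requests

API_BASE = os.environ["CM_API_BASE"]    # assumed, e.g. https://<host>/api/v1
API_TOKEN = os.environ["CM_API_TOKEN"]  # API bearer token

def create_multilingual_source(project: str, name: str, title: str) -> dict:
    """Create a source whose language is set to multilingual.
    The language is fixed at creation time and cannot be changed later."""
    resp = requests.put(
        f"{API_BASE}/sources/{project}/{name}",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"source": {"title": title, "language": "multilingual"}},  # assumed fields
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```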
We currently support a wide range of additional languages in Preview mode, as shown in the following list. This means that our team continues to refine them based on your usage.
- Afrikaans
- Albanian
- Amharic
- Arabic
- Armenian
- Assamese
- Azerbaijani
- Basque
- Belarusian
- Bengali
- Bengali (Romanized)
- Bosnian
- Breton
- Bulgarian
- Burmese
- Catalan
- Chinese (Simplified)
- Chinese (Traditional)
- Croatian
- Czech
- Danish
- Esperanto
- Estonian
- Filipino
- Finnish
- Galician
- Georgian
- Greek
- Gujarati
- Hausa
- Hebrew
- Hindi
- Hindi (Romanized)
- Hungarian
- Icelandic
- Indonesian
- Irish
- Javanese
- Kannada
- Kazakh
- Khmer
- Korean
- Kurdish (Kurmanji)
- Kyrgyz
- Lao
- Latin
- Latvian
- Lithuanian
- Macedonian
- Malagasy
- Malay
- Malayalam
- Marathi
- Mongolian
- Nepali
- Norwegian
- Oriya
- Oromo
- Pashto
- Persian
- Polish
- Punjabi
- Romanian
- Russian
- Sanskrit
- Scottish Gaelic
- Serbian
- Sindhi
- Sinhala
- Slovak
- Slovenian
- Somali
- Sundanese
- Swahili
- Swedish
- Swiss German
- Tamil
- Tamil (Romanized)
- Telugu
- Telugu (Romanized)
- Thai
- Turkish
- Ukrainian
- Urdu
- Urdu (Romanized)
- Uyghur
- Uzbek
- Vietnamese
- Welsh
- Western Frisian
- Xhosa
- Yiddish