Communications Mining Developer Guide
Self-hosted Exchange integration
The Exchange Sync App is delivered as a Docker image. The sections below explain how to configure and deploy the appliance.
The Exchange Sync App requires a JSON configuration file to be present at startup. This section explains the contents of that file. Refer to the Deployment section for instructions on how to make the config file available to the Exchange Sync App.
If you are using the OAuth 2.0 authentication type, you can use Graph API or EWS API. Both allow you to authenticate with client secret or with client certificate.
The token grant flow used is the client credentials flow.
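For orientation, the following is a minimal sketch of what a client credentials token request looks like when the client secret variant is used. The Exchange Sync App performs this exchange itself; the sketch only shows how the authority, client ID, and client secret from the config fit together. The `/oauth2/v2.0/token` path and the `.default` scope are the standard Microsoft identity platform conventions, and the function name `build_token_request` is hypothetical. The exact scope the app requests is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(
    authority: str,
    client_id: str,
    client_secret: str,
    scope: str = "https://graph.microsoft.com/.default",  # assumed scope for Graph API
) -> Request:
    """Build (but do not send) a client credentials token request."""
    token_url = authority.rstrip("/") + "/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope,
        }
    ).encode("ascii")
    return Request(
        token_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# The placeholders match the ones used in the config examples.
request = build_token_request(
    "https://login.microsoftonline.com/<tenant_id>/", "<client_id>", "<client_secret>"
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` with real values would return a JSON document containing an access token, which can be a quick way to check the app registration before deploying.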
Graph API with client secret
{
"microsoft_api": "graph",
"auth_oauth_authority": "https://login.microsoftonline.com/<tenant_id>/",
"auth_oauth_client_id": "<client_id>",
"auth_oauth_client_secret": "<client_secret>",
"mailboxes": {
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
},
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
}
}
}
Graph API with client certificate
{
"microsoft_api": "graph",
"auth_oauth_authority": "https://login.microsoftonline.com/<tenant_id>/",
"auth_oauth_client_id": "<client_id>",
"auth_oauth_client_credential_private_key": "<private_key>",
"mailboxes": {
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
},
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
}
}
}
EWS API with client secret
{
"ews_endpoint": "https://outlook.office365.com/EWS/Exchange.asmx",
"auth_type": "oauth2",
"auth_oauth_authority": "https://login.microsoftonline.com/<tenant_id>/",
"auth_oauth_client_id": "<client_id>",
"auth_oauth_client_secret": "<client_secret>",
"access_type": "impersonation",
"mailboxes": {
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
},
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
}
}
}
EWS API with client certificate
{
"ews_endpoint": "https://outlook.office365.com/EWS/Exchange.asmx",
"auth_type": "oauth2",
"auth_oauth_authority": "https://login.microsoftonline.com/<tenant_id>/",
"auth_oauth_client_id": "<client_id>",
"auth_oauth_client_credential_private_key": "<private_key>",
"auth_oauth_client_credential_thumbprint": "<thumbprint>",
"access_type": "impersonation",
"mailboxes": {
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
},
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
}
}
}
NTLM authentication can only be used with the EWS API.
{
"host": "https://exchange-server.example.com",
"port": 443,
"auth_type": "ntlm",
"auth_user": "[email protected]",
"access_type": "delegate",
"mailboxes": {
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
},
"[email protected]": {
"bucket": {
"owner": "project-name",
"name": "bucket-name"
},
"start_from": "bucket",
"start_timestamp": "2020-01-01T00:00:00+00:00"
}
}
}
Replace host, port, and auth_user with their real values, and change access_type if required. For a description of these parameters and their allowed values, check the configuration reference.
Provide the password through the REINFER_EWS_AUTH_PASS environment variable. For more details, check the Deployment section.
The full list of environment variables that you can set to override values in the config is the following:
NAME | DESCRIPTION |
---|---|
REINFER_EWS_AUTH_USER | Exchange server user |
REINFER_EWS_AUTH_PASS | Exchange server password |
REINFER_EWS_ACCESS_TYPE | Access type: "delegate" or "impersonation" |
REINFER_EWS_HOST | Exchange server host |
REINFER_EWS_PORT | Exchange server port |
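The override behaviour can be pictured with a short sketch. This is not the app's implementation; it only illustrates the mapping between the environment variables above and the config keys listed in the configuration reference. The function name `apply_env_overrides` and the conversion of the port to an integer are assumptions.

```python
import os
from typing import Mapping

# Environment variable -> config key, as listed in the table above
# and in the configuration reference.
ENV_OVERRIDES = {
    "REINFER_EWS_AUTH_USER": "auth_user",
    "REINFER_EWS_AUTH_PASS": "auth_password",
    "REINFER_EWS_ACCESS_TYPE": "access_type",
    "REINFER_EWS_HOST": "host",
    "REINFER_EWS_PORT": "port",
}


def apply_env_overrides(config: dict, environ: Mapping[str, str] = os.environ) -> dict:
    """Return a copy of config with any set environment variables applied."""
    merged = dict(config)
    for env_name, config_key in ENV_OVERRIDES.items():
        value = environ.get(env_name)
        if value is None:
            continue
        # Assumption: the port is numeric, so convert it from its string form.
        merged[config_key] = int(value) if config_key == "port" else value
    return merged


config = {
    "host": "https://exchange-server.example.com",
    "port": 443,
    "access_type": "delegate",
}
effective = apply_env_overrides(config, {"REINFER_EWS_PORT": "8443"})
print(effective["port"])
```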
You can specify one or more mailboxes in your configuration. For each mailbox, you have to provide the mailbox address and specify the following parameters:
NAME | DESCRIPTION |
---|---|
bucket.owner | Project of the bucket in which the mailbox should be synced. |
bucket.name | Name of the bucket in which the mailbox should be synced. |
start_from | Whether to start from last synced time ("bucket") or ignore last synced time and always start from start_timestamp ("config"). Should be set to "bucket" for normal operation, but "config" can be useful in some cases when debugging. |
start_timestamp | Timestamp from which to start syncing email. If not set, all emails will be synced. |
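Before deploying, it can help to check each mailbox entry against the table above. The sketch below is one way to do that, under stated assumptions: the helper names `validate_mailbox` and `resolve_start` are hypothetical, start_from is treated as required, and the timestamp is assumed to be ISO 8601 as in the examples. `resolve_start` mirrors the start_from description and is not the app's actual code.

```python
from datetime import datetime
from typing import List, Optional


def validate_mailbox(address: str, settings: dict) -> List[str]:
    """Return a list of problems found in one mailbox entry (empty if none)."""
    problems = []
    bucket = settings.get("bucket", {})
    for field in ("owner", "name"):
        if not bucket.get(field):
            problems.append(f"{address}: bucket.{field} is missing")
    if settings.get("start_from") not in ("bucket", "config"):
        problems.append(f'{address}: start_from must be "bucket" or "config"')
    timestamp = settings.get("start_timestamp")
    if timestamp is not None:
        try:
            datetime.fromisoformat(timestamp)
        except (TypeError, ValueError):
            problems.append(f"{address}: start_timestamp is not an ISO 8601 string")
    return problems


def resolve_start(settings: dict, last_synced: Optional[datetime]) -> Optional[datetime]:
    """Pick the sync start point; None means that all emails are synced."""
    if settings.get("start_from") == "bucket" and last_synced is not None:
        return last_synced
    timestamp = settings.get("start_timestamp")
    return datetime.fromisoformat(timestamp) if timestamp else None


mailbox = {
    "bucket": {"owner": "project-name", "name": "bucket-name"},
    "start_from": "bucket",
    "start_timestamp": "2020-01-01T00:00:00+00:00",
}
print(validate_mailbox("mailbox-address", mailbox))
print(resolve_start(mailbox, last_synced=None))
```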
The configuration uses the default values for a number of settings such as polling frequency or batch size. To customize your configuration further, refer to the configuration reference.
Buckets
The Exchange integration syncs raw email data into Communications Mining buckets. Like other Communications Mining resources, a bucket is created in a project, which allows you to control access to the bucket.
You can deploy the Exchange Sync App either with Kubernetes or with Docker.
Deploying with Kubernetes allows you to run multiple instances of the Exchange Sync App, with each instance handling a subset of mailboxes to be synced.
Using Kubernetes is a popular way to run and manage containerized applications. This section explains how to deploy the Exchange Sync App using Kubernetes.
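To follow this section, you need:
- basic Kubernetes knowledge. To get started with Kubernetes, visit Deploy to Kubernetes.
- kubectl installed.
Define the StatefulSet in a YAML file, for example uipath-exchange-sync.yaml: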
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: uipath-exchange-sync-app
  labels:
    app: uipath-exchange-sync-app
spec:
  podManagementPolicy: Parallel
  replicas: 1
  selector:
    matchLabels:
      app: uipath-exchange-sync-app
  serviceName: uipath-exchange-sync-app
  template:
    metadata:
      labels:
        app: uipath-exchange-sync-app
      name: uipath-exchange-sync-app
    spec:
      containers:
        - args:
            - "uipath-exchange-sync-app"
            - "--bind"
            - "0.0.0.0:8000"
            - "--reinfer-api-endpoint"
            - "https://<mydomain>.reinfer.io/api/"
            - "--shard-name"
            - "$(POD_NAME)"
            # This value should match `spec.replicas` above
            - "--total-shards"
            - "1"
          env:
            - name: REINFER_EWS_CONFIG
              value: "/mnt/config/example_exchange_sync_config"
            - name: REINFER_API_TOKEN
              valueFrom:
                secretKeyRef:
                  key: reinfer-api-token
                  name: reinfer-credentials
            # Only needed when using EWS API
            - name: REINFER_EWS_AUTH_PASS
              valueFrom:
                secretKeyRef:
                  key: ews-auth-pass
                  name: reinfer-credentials
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          image: "your.private.registry.com/reinfer/ews:TAG"
          name: uipath-exchange-sync-app
          resources:
            requests:
              cpu: 0.05
              memory: 128Mi
          volumeMounts:
            - mountPath: /mnt/config
              name: config-vol
      volumes:
        - configMap:
            name: exchange-sync-config
            items:
              - key: example_exchange_sync_config
                path: example_exchange_sync_config
          name: config-vol
- Replace <mydomain>.reinfer.io with your tenant API endpoint.
- Create the secrets as follows:
kubectl create secret generic reinfer-credentials \
  --from-literal=reinfer-api-token=<REINFER_TOKEN> \
  --from-literal=ews-auth-pass=<MSEXCHANGE_PASSWORD>
Note:
- To avoid storing credentials as cleartext in the YAML file, the REINFER_API_TOKEN and REINFER_EWS_AUTH_PASS environment variables are populated from Kubernetes secrets.
- The ews-auth-pass secret is only needed when using EWS API.
- To load the appliance config from a local file, mount that file into the pod by storing the data in a Kubernetes ConfigMap and mounting the ConfigMap as a volume.
- Create the ConfigMap as follows:
kubectl create configmap exchange-sync-config \
  --from-file=example_exchange_sync_config=your-exchange-sync-config.json
Note: As an alternative to storing the config file locally, you can upload it to Communications Mining and let the Exchange Sync App fetch it via the Communications Mining API. This is described in Store configuration in Communications Mining. If both local and remote config files are specified, the appliance uses the local config file.
- Create the StatefulSet and check that everything is running via the following:
kubectl apply -f uipath-exchange-sync.yaml
kubectl get sts
Alternatively, you can run the Exchange Sync App in Docker. The following command starts the appliance with the same parameters that are used in the Kubernetes section:
EWS_CONFIG_DIR=
REINFER_API_TOKEN=
TAG=
sudo docker run \
-v $EWS_CONFIG_DIR:/mnt/config \
--env REINFER_EWS_CONFIG=/mnt/config/your_exchange_sync_config.json \
--env REINFER_API_TOKEN=$REINFER_API_TOKEN \
eu.gcr.io/reinfer-gcr/ews:$TAG \
--reinfer-api-endpoint https://<mydomain>.reinfer.io/api/ \
&> ews_$(date -Iseconds).log
- Replace <mydomain>.reinfer.io with your tenant API endpoint.
- Replace your_exchange_sync_config.json with the name of your Exchange Sync App config JSON file.
The Exchange Sync App runs continuously syncing emails into the Communications Mining platform. If you stop and start it again, it picks up from the last stored bucket sync state.
The Exchange Sync App can save extracted emails locally instead of pushing them into the Communications Mining platform via the following:
EWS_LOCAL_DIR=
CONFIG_OWNER=
CONFIG_KEY=
TAG=
sudo docker run \
-v $EWS_LOCAL_DIR:/mnt/ews \
eu.gcr.io/reinfer-gcr/ews:$TAG \
--local-files-prefix /mnt/ews \
--remote-config-owner $CONFIG_OWNER --remote-config-key $CONFIG_KEY \
&> ews_$(date -Iseconds).log
Overview
- The Exchange Sync App expects to find the config in $EWS_LOCAL_DIR/config/$CONFIG_OWNER/$CONFIG_KEY.json. Alternatively, you can provide the path to the config by setting the $REINFER_EWS_CONFIG environment variable as shown in the previous example.
- The Exchange Sync App saves the sync state to $EWS_LOCAL_DIR/state. If you stop and start it again, it picks up from the last stored sync state.
- The Exchange Sync App saves data to $EWS_LOCAL_DIR/data.
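The directory layout described above can be sketched as a small helper that derives the three locations from the same values passed to the docker command. The function name `local_layout` and the sample directory `/srv/ews` are hypothetical; the paths follow the list above.

```python
from pathlib import PurePosixPath


def local_layout(local_dir: str, config_owner: str, config_key: str) -> dict:
    """Return the config, state, and data locations under the local directory."""
    root = PurePosixPath(local_dir)
    return {
        "config": root / "config" / config_owner / f"{config_key}.json",
        "state": root / "state",
        "data": root / "data",
    }


# /srv/ews stands in for $EWS_LOCAL_DIR on the host.
layout = local_layout("/srv/ews", "project-name", "config-name")
for name, path in layout.items():
    print(f"{name}: {path}")
```

Placing the config file at the printed config path before starting the container matches what the app expects when no REINFER_EWS_CONFIG variable is set.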
The Exchange Sync App can save extracted emails to Azure Blob Storage instead of pushing them into the Communications Mining platform via the following:
EWS_CONFIG_DIR=
AZ_STORAGE_ACCOUNT_NAME=
AZ_CONTAINER_NAME=
TAG=
sudo docker run \
-v $EWS_CONFIG_DIR:/mnt/config \
--env REINFER_EWS_CONFIG=/mnt/config/your_exchange_sync_config.json \
eu.gcr.io/reinfer-gcr/ews:$TAG \
--private-file-prefix az://$AZ_STORAGE_ACCOUNT_NAME/$AZ_CONTAINER_NAME \
&> ews_$(date -Iseconds).log
Overview
- Provide the path to the config by setting the $REINFER_EWS_CONFIG environment variable.
- The Exchange Sync App authenticates against Azure Blob Storage using one of the DefaultAzureCredential methods. Use whichever method suits your environment. Regardless of the method used, make sure you grant the Storage Blob Data Contributor role to the Exchange Sync App.
- The Exchange Sync App saves the sync state to az://$AZ_STORAGE_ACCOUNT_NAME/$AZ_CONTAINER_NAME/state. If you stop and start it again, it picks up from the last stored sync state.
- The Exchange Sync App saves data to az://$AZ_STORAGE_ACCOUNT_NAME/$AZ_CONTAINER_NAME/data.
Instead of providing a local config file to the appliance, as described in the Exchange Sync App deployment guide, you can manage the config file in Communications Mining. Note that if both local and remote config files are specified, the appliance uses the local config file.
First, upload your JSON config file to Communications Mining:
curl -H "Authorization: Bearer $REINFER_TOKEN" \
-H "Content-Type: multipart/form-data" \
-F 'file=@your_exchange_sync_config.json' \
-XPUT https://<mydomain>.reinfer.io/api/v1/appliance-configs/<project-name>/<config-name>
To see the current config:
curl -H "Authorization: Bearer $REINFER_TOKEN" \
-XGET https://<mydomain>.reinfer.io/api/v1/appliance-configs/<project-name>/<config-name>
To make the Exchange Sync App use the remote config, set the --remote-config-owner parameter to the project name, and the --remote-config-key parameter to the config name.
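The same lookup can be scripted. The sketch below builds the GET request from the curl example using only the Python standard library; the function name `appliance_config_request` and the sample domain and token are hypothetical, and the project and config names are the same values you would pass as --remote-config-owner and --remote-config-key.

```python
from urllib.parse import quote
from urllib.request import Request


def appliance_config_request(
    api_base: str, project: str, config_name: str, token: str
) -> Request:
    """Build (but do not send) the request that fetches a stored config."""
    url = (
        f"{api_base.rstrip('/')}/v1/appliance-configs/"
        f"{quote(project, safe='')}/{quote(config_name, safe='')}"
    )
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


# example.reinfer.io stands in for <mydomain>.reinfer.io.
request = appliance_config_request(
    "https://example.reinfer.io/api/", "project-name", "config-name", "secret-token"
)
print(request.full_url)
```

Sending it with `urllib.request.urlopen(request)` would return the stored config, which is a convenient way to confirm the names before setting the two parameters.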
The following table contains a list of available application parameters. To learn more about running the Exchange Sync App, check the Deployment section.
Parameter | Description |
---|---|
--reinfer-api-endpoint | Endpoint to connect to the Reinfer API. Mutually exclusive with --local-files-prefix. |
--local-files-prefix | Path to store synced emails and bucket sync state. Mutually exclusive with --reinfer-api-endpoint and REINFER_API_TOKEN. |
--remote-config-owner | Project that owns the remote Exchange Sync App config file. |
--remote-config-key | Name of the remote Exchange Sync App config file. |
--debug-level | Debug level, where:
1 .
|
--shard-name | Shard name, which is uipath-exchange-sync-app-N, to extract shard number from. When running in Kubernetes, you can set it to the pod name. |
--total-shards | The total number of instances in the appliance cluster. When running in Kubernetes, the parameter must be set to the same value as the number of instances in the StatefulSet. |
--restart-on-unrecoverable-errors | If enabled, unrecoverable failures will cause the entire service to restart without crashing. |
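To make the interplay of --shard-name and --total-shards concrete, here is a minimal sketch of sharding. Extracting the shard number from a name of the form uipath-exchange-sync-app-N follows the table above. How the app actually assigns mailboxes to shards is not documented here, so the CRC32 modulo assignment and the function names are assumptions chosen only for illustration.

```python
import re
import zlib
from typing import Iterable, List


def shard_index(shard_name: str) -> int:
    """Extract N from a shard name such as uipath-exchange-sync-app-N."""
    match = re.search(r"-(\d+)$", shard_name)
    if match is None:
        raise ValueError(f"no trailing shard number in {shard_name!r}")
    return int(match.group(1))


def mailboxes_for_shard(
    mailboxes: Iterable[str], shard_name: str, total_shards: int
) -> List[str]:
    """Return the mailboxes handled by one shard (assumed hash-based split)."""
    index = shard_index(shard_name)
    if not 0 <= index < total_shards:
        raise ValueError(f"shard {index} is outside 0..{total_shards - 1}")
    return [
        mailbox
        for mailbox in sorted(mailboxes)
        # Assumption: a stable hash spreads mailboxes across the shards.
        if zlib.crc32(mailbox.encode("utf-8")) % total_shards == index
    ]


boxes = ["a@example.com", "b@example.com", "c@example.com", "d@example.com"]
for n in range(3):
    print(n, mailboxes_for_shard(boxes, f"uipath-exchange-sync-app-{n}", 3))
```

Whatever the real assignment rule is, the sketch shows why --total-shards has to match the number of instances: if the two disagree, some mailboxes are handled by no instance.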
The following table contains a list of available configuration parameters. To learn more about writing the Exchange Sync App configuration file, check the Configuration section.
Name | Description |
---|---|
host | Only used with EWS API. Exchange server host. You can override it with the REINFER_EWS_HOST environment variable. |
port | Only used with EWS API. Exchange server port. The default is 80. You can override it with the REINFER_EWS_PORT environment variable. |
auth_type | Only used with EWS API. Only ntlm allowed. |
auth_user | Only used with EWS API. Exchange server user. You can override it with the REINFER_EWS_AUTH_USER environment variable. |
auth_password | Only used with EWS API. Exchange server password. You can override it with the REINFER_EWS_AUTH_PASS environment variable. |
access_type | Only used with EWS API. The access type can be delegate or impersonation. The default is delegate. You can override it with the REINFER_EWS_ACCESS_TYPE environment variable. |
ews_ssl_verify | Only used with EWS API. If set to false, it will not verify certificates. The default is true. |
poll_frequency | The waiting time between batches, in seconds. The default is 15. |
poll_message_sleep | The waiting time between individual emails in a batch, in seconds. The default is 0.1. |
max_concurrent_uploads | Number of concurrent uploads to Communications Mining, between 0 and 32. The default is 8. |
emails_per_folder | Maximum number of emails to fetch from each folder per batch, between 1 and 100,000. The default is 2520. This setting allows the Exchange Sync App to make progress on all folders evenly in case there is a very large folder. |
reinfer_batch_size | How many emails to fetch per batch, between 1 and 1000. The default is 80. |
mailboxes | List of mailboxes to fetch. For more details on how to configure the mailboxes, check the Configuration section. |
audit_email | If you have configured the appliance with a remote config, Communications Mining sends an email to this address whenever the config is updated. The default is None. |
ews_ssl_ciphers | Only used with EWS API. Make Exchange Sync App use specific ciphers. The ciphers should be a string in the OpenSSL cipher list format. The default is None. |
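The defaults and ranges in the table above can be captured in a short pre-flight check. This is a sketch rather than the app's own validation: the names `DEFAULTS`, `RANGES`, and `with_defaults` are hypothetical, and only the numeric tuning parameters from the table are covered.

```python
# Defaults and allowed ranges copied from the configuration reference table.
DEFAULTS = {
    "poll_frequency": 15,
    "poll_message_sleep": 0.1,
    "max_concurrent_uploads": 8,
    "emails_per_folder": 2520,
    "reinfer_batch_size": 80,
}
RANGES = {
    "max_concurrent_uploads": (0, 32),
    "emails_per_folder": (1, 100_000),
    "reinfer_batch_size": (1, 1000),
}


def with_defaults(config: dict) -> dict:
    """Fill in missing tuning parameters and reject out-of-range values."""
    merged = {**DEFAULTS, **config}
    for key, (low, high) in RANGES.items():
        if not low <= merged[key] <= high:
            raise ValueError(f"{key}={merged[key]} is outside {low}..{high}")
    return merged


effective = with_defaults({"reinfer_batch_size": 200})
print(effective["reinfer_batch_size"], effective["poll_frequency"])
```

Running a check like this on the JSON file before creating the ConfigMap or uploading the remote config surfaces an out-of-range value before the app starts.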