Communications Mining user guide

Comments

Each message in Communications Mining™ is represented by a single comment object in the API. As a result, they can be considered equivalent. The developer documentation and API refer primarily to comments, whilst the user guide and the Communications Mining UI refer primarily to messages.

When uploading data to Communications Mining or fetching data from Communications Mining, it is important to understand how different types of data (such as emails or support tickets) should be represented as comments. This page explains how to model your data as Communications Mining comments to prepare it for upload, and how to understand data fetched from Communications Mining.

Example of a comment created from an email

Communications Mining™ comment created from a review

The Overview section describes the overall structure of a comment object. If you want to upload data to Communications Mining™ via the API, or to understand how to process data uploaded to Communications Mining via the API, check the Comments created via the API section, which describes each of the commonly used comment types (emails and support tickets) in detail. If you want to better understand how to process data uploaded to Communications Mining via an integration, check the Comments created by integrations section. Finally, for a full list of available comment object fields, check the Reference section.

Overview

Communications Mining™ works with various types of text data such as emails, survey responses, support tickets, or customer reviews. What these types of data have in common is that they all consist of units of communication (an email, a survey response, a support ticket, a customer review). In Communications Mining, each such unit of communication, for example a single email, is represented as a comment.

No matter what kind of communication unit a comment represents, it always has the following basic structure:

{
  "id": <UNIQUE ID>,
  "timestamp": <TIMESTAMP>,
  "messages": [
    {
      "body": { "text": <TEXT> },
      ...
    }
  ],
  "user_properties": { ... },
}

As shown in the previous code snippet, in addition to the actual piece of text, a comment always has an ID and a timestamp. The ID needs to be unique within the source containing the comment. The timestamp is used in the platform UI to filter and sort by date, and to generate date-based analytics.
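
The basic structure can be assembled in a few lines of Python. This is a minimal sketch, not part of any Communications Mining client library: the helper names `to_hex_id` and `make_comment` and the sample values are assumptions made for illustration. Hex-encoding an existing identifier is one way to satisfy the hexadecimal ID format listed in the Reference section, and treating a naive datetime as UTC mirrors the documented default for timestamps without a timezone.

```python
from datetime import datetime, timezone


def to_hex_id(raw_id: str) -> str:
    """Hex-encode an arbitrary identifier so it only contains [0-9a-f]."""
    return raw_id.encode("utf-8").hex()


def make_comment(raw_id: str, created: datetime, text: str) -> dict:
    """Build the minimal comment structure: id, timestamp, one message."""
    if created.tzinfo is None:
        # Assumption: naive datetimes are treated as UTC.
        created = created.replace(tzinfo=timezone.utc)
    timestamp = created.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "id": to_hex_id(raw_id),
        "timestamp": timestamp,
        "messages": [{"body": {"text": text}}],
    }


# Hypothetical sample values.
comment = make_comment("ticket-42", datetime(2020, 2, 26, 16, 9), "No signal since Monday.")
```

Because the ID is derived deterministically from your own identifier, building the same record twice gives the same comment ID within a source.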

In addition to these required fields, other fields should be set depending on the type of the comment. If your data has been uploaded to Communications Mining™ via an integration, Communications Mining automatically populates all necessary fields. Check the following sections for a more detailed description.

Comments created via the API

Emails

While the easiest way to sync emails into Communications Mining™ is via the Exchange integration, in cases where you do your own email extraction, you can sync emails via the API. Use the sync-raw-emails endpoint for raw emails, and the sync endpoint for processed emails.

When syncing raw emails, provide the extracted MIME email headers and email body as-is (check the Reference for a description of the raw email format). Communications Mining parses the headers and cleans the email body.

Note:

The following raw email example shows only a small number of headers for brevity. Send all the extracted headers to Communications Mining; a real set of headers is likely to be much longer than in the example.

Important:

How does Communications Mining process raw emails?

Sets the email-specific fields in the message object messages[0]
Sets the thread_id field and thread_properties object
Cleans up the email body by stripping quoted emails and putting the signature into a separate signature field
Populates the user_properties object with metadata extracted from email headers. If a field is not present in the email, it won't be set in the comment at all (rather than being set to a null or empty value). For example, the comment in the following example does not contain a BCC: field.

If you enrich emails with other data prior to uploading to Communications Mining, you can provide this additional data in the user properties of the comment.

After processing, the raw email looks like the following processed email example. Note how many additional fields Communications Mining created. If you want to upload processed emails, structure them as in the processed email example.

Example email

Raw Email

{
  "raw_email": {
    "body": {
      "plain": "Hi Bob,\n\nCould you send me the figures for today?\n\nThanks,\nAlice"
    },
    "headers": {
      "raw": "From: Alice Smith <[email protected]>\nDate: Tue, 3 Aug 2021 10:57:42 +0100\nMessage-ID: <[email protected]>\nSubject: Figures for today\nTo: Bob <[email protected]>\nCc: Joe <[email protected]>"
    }
  },
  "user_properties": {
    "string:Team": "Team XYZ"
  }
}
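
If you already have an extracted email as text, a payload of this shape can be put together with Python's standard `email` package. This is a rough sketch under stated assumptions: the function name `to_raw_email_payload`, the sample addresses, and the choice to handle only a plain-text part are illustrative, and the step that sends the payload to the API is left out.

```python
from email import message_from_string
from email.message import Message
from typing import Optional


def _plain_text(msg: Message) -> str:
    """Return the first text/plain part of the message, or an empty string."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            raw = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return raw.decode(charset, errors="replace")
    return ""


def to_raw_email_payload(eml: str, user_properties: Optional[dict] = None) -> dict:
    """Split an email into its verbatim header block and plain-text body."""
    normalized = eml.replace("\r\n", "\n")
    # The header block ends at the first blank line; keep it as-is.
    header_block, _, _ = normalized.partition("\n\n")
    body = _plain_text(message_from_string(normalized))
    payload = {"raw_email": {"headers": {"raw": header_block}, "body": {"plain": body}}}
    if user_properties:
        payload["user_properties"] = user_properties
    return payload


# Hypothetical sample email.
eml = (
    "From: Alice Smith <alice@example.com>\n"
    "Subject: Figures for today\n"
    "To: Bob <bob@example.com>\n"
    "\n"
    "Hi Bob,\n\nCould you send me the figures for today?\n\nThanks,\nAlice"
)
payload = to_raw_email_payload(eml, {"string:Team": "Team XYZ"})
```

The headers are passed through untouched rather than rebuilt from parsed values, in line with the guidance to provide them as-is.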

Processed Email

{
  "comment": {
    "id": "3c6537373834623562406d61696c2e6578616d706c652e636f6d3e",
    "timestamp": "2021-08-03T09:57:42Z",
    "user_properties": {
      "string:Has Signature": "Yes",
      "string:Sender": "[email protected]",
      "string:Thread": "<[email protected]>",
      "string:Message ID": "<[email protected]>",
      "number:Recipient Count": 2,
      "number:Participant Count": 3,
      "number:Position in Thread": 1,
      "string:Sender Domain": "example.com",
      "string:Team": "Team XYZ"
    },
    "messages": [
      {
        "body": {
          "text": "Hi Bob,\n\nCould you send me the figures for today?"
        },
        "signature": {
          "text": "Thanks,\nAlice"
        },
        "subject": {
          "text": "Figures for today"
        },
        "to": ["\"Bob\" <[email protected]>"],
        "cc": ["\"Joe\" <[email protected]>"],
        "sent_at": "2021-08-03T09:57:42Z",
        "from": "\"Alice Smith\" <[email protected]>"
      }
    ],
    "thread_id": "3c6537373834623562406d61696c2e6578616d706c652e636f6d3e"
  },
  "thread_properties": {
    "duration": null,
    "response_time": null,
    "num_messages": 1,
    "num_participants": 3,
    "first_sender": "[email protected]",
    "thread_position": 0
  }
}

Thread Properties

The following thread properties are available.

NAME	DESCRIPTION
`thread_position`	Position of the comment in the thread, calculated by ordering the comments by `timestamp`. Starts at `0`.
`num_messages`	Number of comments in the thread.
`num_participants`	Total number of unique participants (From, To, CC, BCC) in the thread.
`first_sender`	Sender of the first comment in the thread.
`duration`	Difference (in seconds) between the `timestamp` values of the first and last comment in the thread. Set to `null` if `num_messages` is 1 (that is, the thread contains only one comment). Note: The `timestamp` of a comment corresponds to the `sent_at` field of the corresponding raw email.
`response_time`	Difference (in seconds) between the first comment in the thread and the first response in the thread. The first response is the oldest comment whose sender is not `first_sender`. Set to `null` if there are no responses in the thread (that is, if all emails in the thread are from the same sender).

Each time a new comment is added to the platform, the thread properties of the corresponding thread are updated.
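
To make the definitions in the table concrete, the following sketch recomputes the properties for a list of comments that belong to one thread. It illustrates the definitions only and is not the platform's implementation. The function name `thread_properties`, the sample addresses, and the lower-casing of addresses before comparison are assumptions.

```python
from datetime import datetime
from email.utils import getaddresses, parseaddr
from typing import List


def _ts(comment: dict) -> datetime:
    # Some Python versions reject a trailing "Z" in fromisoformat, so swap it.
    return datetime.fromisoformat(comment["timestamp"].replace("Z", "+00:00"))


def _sender(comment: dict) -> str:
    return parseaddr(comment["messages"][0].get("from", ""))[1].lower()


def thread_properties(comments: List[dict]) -> List[dict]:
    """Return one thread_properties dict per comment, in timestamp order."""
    ordered = sorted(comments, key=_ts)
    participants = set()
    for c in ordered:
        msg = c["messages"][0]
        fields = [msg.get("from"), *msg.get("to", []), *msg.get("cc", []), *msg.get("bcc", [])]
        fields = [f for f in fields if f]
        participants.update(addr.lower() for _, addr in getaddresses(fields) if addr)
    first_sender = _sender(ordered[0])
    first_ts, last_ts = _ts(ordered[0]), _ts(ordered[-1])
    # First response: the oldest comment whose sender differs from first_sender.
    response = next((c for c in ordered if _sender(c) != first_sender), None)
    shared = {
        "num_messages": len(ordered),
        "num_participants": len(participants),
        "first_sender": first_sender,
        "duration": None if len(ordered) == 1 else (last_ts - first_ts).total_seconds(),
        "response_time": None if response is None else (_ts(response) - first_ts).total_seconds(),
    }
    return [{**shared, "thread_position": i} for i in range(len(ordered))]


# Hypothetical two-email thread: Alice writes, Bob replies 30 minutes later.
first = {"timestamp": "2021-08-03T09:57:42Z", "messages": [{
    "from": "\"Alice Smith\" <alice@example.com>",
    "to": ["\"Bob\" <bob@example.com>"], "cc": ["\"Joe\" <joe@example.com>"]}]}
reply = {"timestamp": "2021-08-03T10:27:42Z", "messages": [{
    "from": "\"Bob\" <bob@example.com>", "to": ["\"Alice Smith\" <alice@example.com>"]}]}
props = thread_properties([reply, first])
```

Only `thread_position` differs between the returned dictionaries; the other values are shared by every comment in the thread.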

Note:

Apart from thread_position, all properties are the same for each comment in thread.

Support tickets

In addition to the main text, a typical support ticket submitted via a form may have a subject, information about the sender (such as name or email address), and additional structured data (such as the topic of the ticket) which can be uploaded as part of the user properties of the comment.

The following example shows how to format a support ticket as a Communications Mining™ comment and how that comment is displayed in the platform's UI. Your user properties may be different depending on the data you collect.

Example support ticket

{
  "id": "dbcb03ad",
  "timestamp": "2020-02-26T16:09:00Z",
  "messages": [
    {
      "body": {
        "text": "Hi Support Team\n\nPlease could you look into my broadband service network status. I don't have any signal."
      },
      "subject": {
        "text": "Network Outage for over 24 hours - Customer account number 1234567"
      },
      "from": "[email protected]"
    }
  ],
  "user_properties": {
    "string:Customer Name": "Alice Smith",
    "string:Source": "Support Form",
    "string:Topic": "Broadband"
  }
}
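
A form submission can be mapped onto this structure with a small conversion function. The sketch below is an illustration under assumptions: the input field names (`body`, `subject`, `email`, `submitted_at`, `metadata`) describe a hypothetical ticket record, and the `string:` and `number:` key prefixes follow the pattern visible in the examples on this page.

```python
from numbers import Number


def typed_user_properties(fields: dict) -> dict:
    """Prefix each key with "number:" or "string:" based on the value's type."""
    props = {}
    for name, value in fields.items():
        if value is None or value == "":
            continue  # Choice made here: leave unset fields out entirely.
        if isinstance(value, Number) and not isinstance(value, bool):
            props[f"number:{name}"] = value
        else:
            props[f"string:{name}"] = str(value)
    return props


def ticket_to_comment(ticket: dict) -> dict:
    """Convert a hypothetical ticket record into a comment object."""
    message = {"body": {"text": ticket["body"]}}
    if ticket.get("subject"):
        message["subject"] = {"text": ticket["subject"]}
    if ticket.get("email"):
        message["from"] = ticket["email"]
    return {
        "id": ticket["id"],
        "timestamp": ticket["submitted_at"],
        "messages": [message],
        "user_properties": typed_user_properties(ticket.get("metadata", {})),
    }


ticket = {
    "id": "dbcb03ad",
    "submitted_at": "2020-02-26T16:09:00Z",
    "subject": "Network Outage for over 24 hours - Customer account number 1234567",
    "body": "Hi Support Team\n\nPlease could you look into my broadband service network status.",
    "metadata": {"Customer Name": "Alice Smith", "Source": "Support Form", "Topic": "Broadband"},
}
comment = ticket_to_comment(ticket)
```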

Comments created by integrations

Emails (Microsoft Exchange)

Microsoft Exchange emails ingested into Communications Mining via the Exchange integration are automatically converted into comment objects in the same way as raw emails.

Attachments and attachment contents

Comments may have files attached to them. If a comment has attachments, the attachments field contains metadata about them:

{
  "id": "3c484531505230324d423",
  "attachments": [
    {
      "name": "account-statement.pdf",
      "size": 49078,
      "content_type": "application/pdf"
    }
  ],
  // other comment fields omitted
  ...
}

In addition, the attachment's content may be available for download. When it is, the attachment object includes the attachment_reference field:

{
  "id": "3c484531505230324d423",
  "attachments": [
    {
      "name": "account-statement.pdf",
      "size": 49078,
      "content_type": "application/pdf",
      "attachment_reference": "CjQSEIExTHEqtdntoxz2WtbZDNEiIIVqcP1Sfx2L4epyRQDasa1RSODvheQ3bvLhj3L-_81G"
    }
  ],
  // other comment fields omitted
  ...
}

Use attachment_reference to retrieve the binary file content from the attachments API. For the previous example, you fetch the following URL: https://cloud.uipath.com///reinfer_/api/v1/attachments/CjQSEIExTHEqtdntoxz2WtbZDNEiIIVqcP1Sfx2L4epyRQDasa1RSODvheQ3bvLhj3L-_81G.

Check the API Reference for further details about this type of request.

If the attachment object doesn't have an attachment_reference property, you can't download the attachment's content. This may be because:

Communications Mining™ didn't receive the attachment's content.
The attachment content exceeded the size limit for uploading to Communications Mining.
Communications Mining processed the attachment before it supported file contents.
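
When processing fetched comments, you can check for the reference before attempting a download. The following is a minimal sketch: `downloadable_attachments` and `attachment_url` are hypothetical helper names, and the API base (the part of the URL that ends in `/api/v1`) is passed in by the caller because it depends on your environment. The placeholder host in the example is not a real endpoint, and no request is sent.

```python
from urllib.parse import quote


def downloadable_attachments(comment: dict) -> list:
    """Return the attachments that carry an attachment_reference."""
    return [a for a in comment.get("attachments", []) if "attachment_reference" in a]


def attachment_url(api_base: str, attachment_reference: str) -> str:
    """Join a caller-supplied API base with the attachments path."""
    return f"{api_base.rstrip('/')}/attachments/{quote(attachment_reference, safe='')}"


comment = {
    "id": "3c484531505230324d423",
    "attachments": [
        {
            "name": "account-statement.pdf",
            "size": 49078,
            "content_type": "application/pdf",
            "attachment_reference": "CjQSEIExTHEqtdntoxz2WtbZDNEiIIVqcP1Sfx2L4epyRQDasa1RSODvheQ3bvLhj3L-_81G",
        },
        # Hypothetical attachment whose content is not available.
        {"name": "scan.png", "size": 1024, "content_type": "image/png"},
    ],
}

# Placeholder base URL: substitute the one for your own environment.
urls = [
    attachment_url("https://example.invalid/api/v1", a["attachment_reference"])
    for a in downloadable_attachments(comment)
]
```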

Learn more about the Attachment contents on the Attachment page.

Reference

Comments

Check the following table for a list of available comment fields. If you are unfamiliar with Communications Mining™ comment objects, check the Overview.

NAME	TYPE	REQUIRED	DESCRIPTION
`id`	string	yes	Identifies a comment uniquely within a source. Any hexadecimal string of up to 1024 characters is valid (conforms to `/[0-9a-f]{1,1024}/`).
`timestamp`	string	yes	An ISO-8601 timestamp indicating when the comment was created. If the timestamp does not specify a timezone, UTC will be assumed. The timestamp must be in the range 1950-01-01T00:00:00Z to 2049-12-31T23:59:59Z inclusive.
`messages`	`array<Message>`	yes	An array of zero or one message.
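`user_properties`	`map<string, string | number>`	no	User-defined metadata for the comment. In the examples on this page, each key is prefixed with its value type, such as `string:Team` or `number:Recipient Count`.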
`thread_id`	string	no	An ID uniquely identifying an email thread. Any hexadecimal string of up to 1024 characters is valid (conforms to `/[0-9a-f]{1,1024}/`).
`uid`	string	set by Communications Mining™	A combined source and comment ID in the form of `source_id.comment_id`. You should not be setting this field directly as it's automatically generated by Communications Mining for uploaded comments.
`created_at`	string	set by Communications Mining	An ISO-8601 timestamp with the same constraints as the `timestamp` field. You should not be setting this field directly as it's automatically generated by Communications Mining when the comment is created.
`updated_at`	string	set by Communications Mining	An ISO-8601 timestamp with the same constraints as the `timestamp` field. You should not be setting this field directly as it's automatically generated by Communications Mining when the comment is updated.
`attachments`	`array<Attachment>`	no	An array of zero or more attachments. An attachment represents a file attached to a comment.
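
The constraints on `id` and `timestamp` in the table above can be checked locally before upload. This is a rough sketch rather than the validation the API performs: the function name `comment_problems` is hypothetical, and `datetime.fromisoformat` accepts only a subset of ISO-8601, so some valid timestamps may be reported as unparseable.

```python
import re
from datetime import datetime, timezone

ID_PATTERN = re.compile(r"[0-9a-f]{1,1024}")
MIN_TS = datetime(1950, 1, 1, tzinfo=timezone.utc)
MAX_TS = datetime(2049, 12, 31, 23, 59, 59, tzinfo=timezone.utc)


def comment_problems(comment: dict) -> list:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if not ID_PATTERN.fullmatch(str(comment.get("id", ""))):
        problems.append("id must be a hexadecimal string of 1 to 1024 characters")
    try:
        raw_ts = str(comment.get("timestamp", "")).replace("Z", "+00:00")
        ts = datetime.fromisoformat(raw_ts)
    except ValueError:
        problems.append("timestamp is not a parseable ISO-8601 value")
    else:
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # No timezone given: assume UTC.
        if not MIN_TS <= ts <= MAX_TS:
            problems.append("timestamp is outside 1950-01-01 to 2049-12-31")
    if not isinstance(comment.get("messages"), list):
        problems.append("messages must be an array")
    return problems


problems = comment_problems(
    {"id": "dbcb03ad", "timestamp": "2020-02-26T16:09:00Z", "messages": []}
)
```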

Where Attachment has the following format:

NAME	TYPE	REQUIRED	DESCRIPTION
`name`	string	yes	The attachment's file name.
`size`	number	yes	The size of the attachment's file content in bytes.
`content_type`	string	yes	The media type of the attachment. For a list of possible values, check the IANA Media Types list.
`attachment_reference`	string	no	Used to retrieve the binary file content from the attachments API.

Where Message has the following format:

NAME	TYPE	REQUIRED	DESCRIPTION
`body`	Content	yes	An object containing the main body text of the message.
`subject`	Content	no	An object containing the message's subject.
`signature`	Content	no	An object containing the message's signature.
`from`	string	no	The message sender.
`to`	`array<string>`	no	An array of primary recipients.
`cc`	`array<string>`	no	An array of carbon-copy recipients.
`bcc`	`array<string>`	no	An array of blind carbon-copy recipients.
`sent_at`	string	no	An ISO-8601 timestamp indicating when the message was created. If the timestamp does not specify a timezone, UTC will be assumed.
`language`	string	no	The original language of the message. If this is supplied, both `text` and `translated_from` should be supplied for the Content fields.

Where Content has the following format:

NAME	TYPE	REQUIRED	DESCRIPTION
`text`	string	yes	If `language` (other than the source's `language`) has been supplied, this should be the translated text of the content. Otherwise, it should be in the original language it was collected; it will be translated if not in the source's `language` and the source has `should_translate` set to `true`. At most 65536 characters.
`translated_from`	string	no	If `language` (other than the source's `language`) has been supplied, this should be the original text of the content. Supplying this field without having supplied a `language` will result in an error. At most 65536 characters.
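
The relationship between `language`, `text`, and `translated_from` can be captured in a small constructor. This is a sketch under assumptions: `make_message` is a hypothetical helper, the `"de"` language value is a placeholder because this page doesn't specify the format of language values, and Python's `len` counts code points, which may not match how the platform counts characters.

```python
MAX_CONTENT_CHARS = 65536


def make_message(body_text, language=None, original_body=None):
    """Build a message whose body optionally carries the untranslated original."""
    if original_body is not None and language is None:
        # Mirrors the documented rule: translated_from without language is an error.
        raise ValueError("translated_from requires language to be supplied")
    for value in (body_text, original_body):
        if value is not None and len(value) > MAX_CONTENT_CHARS:
            raise ValueError(f"content exceeds {MAX_CONTENT_CHARS} characters")
    body = {"text": body_text}
    if original_body is not None:
        body["translated_from"] = original_body
    message = {"body": body}
    if language is not None:
        message["language"] = language
    return message


# Translated text goes in "text", the original in "translated_from".
msg = make_message("Hello", language="de", original_body="Hallo")
```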

Raw Emails

Check the following table for a list of available raw email fields.

NAME	TYPE	REQUIRED	DESCRIPTION
`headers`	Headers	yes	An object containing the headers of the email.
`body`	Body	yes	An object containing the main body of the email.

Where Headers has the following format:

NAME	TYPE	REQUIRED	DESCRIPTION
`raw`	string	no	One of `raw` and `parsed` is required. The raw email headers, given as a single string, with each header on its own line.
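`parsed`	`map<string, string | array>`	no	One of `raw` and `parsed` is required. The parsed email headers, given as a map from header name to a string or array value.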

Where Body has the following format:

NAME	TYPE	REQUIRED	DESCRIPTION
`plain`	string	no	At least one of `plain` and `html` is required. The plaintext content of the email. At most 65536 characters.
`html`	string	no	At least one of `plain` and `html` is required. The HTML content of the email.
