- API docs
- Introduction
- Using the API
- API tutorial
- CLI
- Integration guides
- Blog
- How machines learn to understand words: a guide to embeddings in NLP
- Prompt-based learning with Transformers
- Efficient Transformers II: knowledge distillation & fine-tuning
- Efficient Transformers I: attention mechanisms
- Deep hierarchical unsupervised intent modelling: getting value without training data
- Fixing annotating bias with Communications Mining
- Active learning: better ML models in less time
- It's all in the numbers - assessing model performance with metrics
- Why model validation is important
- Comparing Communications Mining and Google AutoML for conversational data intelligence
API tutorial
This is a tutorial style introduction to the API - jump straight to the reference if you know what you are looking for.
All data, individual pieces of which are called messages, are grouped into sources. A source should correspond to the origin of the data, like a single mailbox, or a particular feedback channel. These can be combined for the purposes of a single inference model, so it's better to err on the side of multiple different sources than a single monolith if you're in any doubt.
A dataset is a combination of sources together with the associated label categories. For instance one dataset may be built on a website feedback source, with labels like Ease of Use or Available Information, while a different dataset could base itself on various post-purchase survey response sources and apply completely different labels about Packaging or Speed of Delivery.
So before adding any comments, you need to create a source to put them in.
- Bash
curl -X PUT 'https://<my_api_endpoint>/api/v1/sources/<project>/example' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "source": { "description": "An optional long form description.", "title": "An Example Source" } }'
curl -X PUT 'https://<my_api_endpoint>/api/v1/sources/<project>/example' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "source": { "description": "An optional long form description.", "title": "An Example Source" } }' - Node
const request = require("request"); request.put( { url: "https://<my_api_endpoint>/api/v1/sources/<project>/example", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { source: { description: "An optional long form description.", title: "An Example Source", }, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.put( { url: "https://<my_api_endpoint>/api/v1/sources/<project>/example", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { source: { description: "An optional long form description.", title: "An Example Source", }, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.put( "https://<my_api_endpoint>/api/v1/sources/<project>/example", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "source": { "title": "An Example Source", "description": "An optional long form description.", } }, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.put( "https://<my_api_endpoint>/api/v1/sources/<project>/example", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "source": { "title": "An Example Source", "description": "An optional long form description.", } }, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "source": { "created_at": "2018-10-16T10:43:56.463000Z", "description": "An optional long form description.", "id": "22f0f76e82fd8867", "language": "en", "last_modified": "2018-10-16T10:43:56.463000Z", "name": "example", "owner": "<project>", "sensitive_properties": [], "should_translate": false, "title": "An Example Source", "updated_at": "2018-10-16T10:43:56.463000Z" }, "status": "ok" }
{ "source": { "created_at": "2018-10-16T10:43:56.463000Z", "description": "An optional long form description.", "id": "22f0f76e82fd8867", "language": "en", "last_modified": "2018-10-16T10:43:56.463000Z", "name": "example", "owner": "<project>", "sensitive_properties": [], "should_translate": false, "title": "An Example Source", "updated_at": "2018-10-16T10:43:56.463000Z" }, "status": "ok" }
To create a source you need four things:
- A project. This is an existing project you are a part of.
- A name. Alphanumeric characters, hyphens and underscores are all OK (e.g. 'post-purchase').
- A title. A nice, short human-readable title for your source to display in the UI (e.g. 'Post Purchase Survey Responses').
- A description. Optionally, a longer form description of the source to show on the sources overview page.
The first two form the 'fully qualified' name of your source, which is used to refer to it programatically. The latter two are meant for human consumption in the UI.
example
source.
You should now be the proud owner of a source! Check out your sources page, then come back.
Let's programmatically retrieve the same information available on the sources page with all metadata for all sources. You should see your source.
- Bash
curl -X GET 'https://<my_api_endpoint>/api/v1/sources' \ -H "Authorization: Bearer $REINFER_TOKEN"
curl -X GET 'https://<my_api_endpoint>/api/v1/sources' \ -H "Authorization: Bearer $REINFER_TOKEN" - Node
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/sources", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/sources", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/sources", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/sources", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "sources": [ { "created_at": "2018-10-16T10:43:56.463000Z", "description": "An optional long form description.", "id": "22f0f76e82fd8867", "language": "en", "last_modified": "2018-10-16T10:43:56.463000Z", "name": "example", "owner": "<project>", "sensitive_properties": [], "should_translate": false, "title": "An Example Source", "updated_at": "2018-10-16T10:43:56.463000Z" } ], "status": "ok" }
{ "sources": [ { "created_at": "2018-10-16T10:43:56.463000Z", "description": "An optional long form description.", "id": "22f0f76e82fd8867", "language": "en", "last_modified": "2018-10-16T10:43:56.463000Z", "name": "example", "owner": "<project>", "sensitive_properties": [], "should_translate": false, "title": "An Example Source", "updated_at": "2018-10-16T10:43:56.463000Z" } ], "status": "ok" }
If you only want the sources belonging to a specific project you can add its name to the endpoint.
Deleting a source irretrievably destroys all messages and any other information associated with it. Any datasets which use this source will also lose the training data supplied by any labels which have been added to messages in this source, so this endpoint should be used with caution. That said, it should be safe to delete the source we created for your project in the previous section.
- Bash
curl -X DELETE 'https://<my_api_endpoint>/api/v1/sources/id:22f0f76e82fd8867' \ -H "Authorization: Bearer $REINFER_TOKEN"
curl -X DELETE 'https://<my_api_endpoint>/api/v1/sources/id:22f0f76e82fd8867' \ -H "Authorization: Bearer $REINFER_TOKEN" - Node
const request = require("request"); request.delete( { url: "https://<my_api_endpoint>/api/v1/sources/id:22f0f76e82fd8867", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.delete( { url: "https://<my_api_endpoint>/api/v1/sources/id:22f0f76e82fd8867", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.delete( "https://<my_api_endpoint>/api/v1/sources/id:22f0f76e82fd8867", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.delete( "https://<my_api_endpoint>/api/v1/sources/id:22f0f76e82fd8867", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "status": "ok" }
{ "status": "ok" }
{"status": "ok"}
. To be sure it's gone, you can request all sources again.
- Bash
curl -X GET 'https://<my_api_endpoint>/api/v1/sources' \ -H "Authorization: Bearer $REINFER_TOKEN"
curl -X GET 'https://<my_api_endpoint>/api/v1/sources' \ -H "Authorization: Bearer $REINFER_TOKEN" - Node
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/sources", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/sources", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/sources", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/sources", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "sources": [], "status": "ok" }
{ "sources": [], "status": "ok" }
Sources would be useless without the comments that go in them. A comment in Communications Mining is either an individual piece of text, or multiple text items that are combined into a conversation. Examples of the former include survey responses, support tickets, and customer reviews, while examples of the latter include email chains.
We will go ahead and add a couple of comments to the 'example' source created in the previous section.
Adding emails
- Bash
curl -X POST 'https://<my_api_endpoint>/api/v1/sources/<project>/example/sync' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "comments": [ { "id": "0123456789abcdef", "messages": [ { "body": { "text": "Hi Bob,\n\nCould you send me today'"'"'s figures?\n\nThanks,\nAlice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:02:03.000000+00:00", "to": [ "[email protected]" ] }, { "body": { "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob" }, "from": "[email protected]", "sent_at": "2011-12-11T11:05:10.000000+00:00", "to": [ "[email protected]" ] }, { "body": { "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:18:43.000000+00:00", "to": [ "[email protected]" ] } ], "timestamp": "2011-12-11T01:02:03.000000+00:00" }, { "id": "abcdef0123456789", "messages": [ { "body": { "text": "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob" }, "from": "[email protected]", "sent_at": "2011-12-12T10:04:30.000000+00:00", "to": [ "[email protected]", "[email protected]" ] }, { "body": { "text": "Hi Bob,\n\nCould you estimate when you'"'"'ll be finished?\n\nThanks,\nCarol" }, "from": "[email protected]", "sent_at": "2011-12-12T10:06:22.000000+00:00", "to": [ "[email protected]", "[email protected]" ] }, { "body": { "text": "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob" }, "from": "[email protected]", "sent_at": "2011-12-11T10:09:40.000000+00:00", "to": [ "[email protected]", "[email protected]" ] } ], "timestamp": "2011-12-11T02:03:04.000000+00:00", "user_properties": { "number:severity": 3, "string:Recipient Domain": "company.com", "string:Sender Domain": "organisation.org" } } ] }'
curl -X POST 'https://<my_api_endpoint>/api/v1/sources/<project>/example/sync' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "comments": [ { "id": "0123456789abcdef", "messages": [ { "body": { "text": "Hi Bob,\n\nCould you send me today'"'"'s figures?\n\nThanks,\nAlice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:02:03.000000+00:00", "to": [ "[email protected]" ] }, { "body": { "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob" }, "from": "[email protected]", "sent_at": "2011-12-11T11:05:10.000000+00:00", "to": [ "[email protected]" ] }, { "body": { "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:18:43.000000+00:00", "to": [ "[email protected]" ] } ], "timestamp": "2011-12-11T01:02:03.000000+00:00" }, { "id": "abcdef0123456789", "messages": [ { "body": { "text": "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob" }, "from": "[email protected]", "sent_at": "2011-12-12T10:04:30.000000+00:00", "to": [ "[email protected]", "[email protected]" ] }, { "body": { "text": "Hi Bob,\n\nCould you estimate when you'"'"'ll be finished?\n\nThanks,\nCarol" }, "from": "[email protected]", "sent_at": "2011-12-12T10:06:22.000000+00:00", "to": [ "[email protected]", "[email protected]" ] }, { "body": { "text": "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob" }, "from": "[email protected]", "sent_at": "2011-12-11T10:09:40.000000+00:00", "to": [ "[email protected]", "[email protected]" ] } ], "timestamp": "2011-12-11T02:03:04.000000+00:00", "user_properties": { "number:severity": 3, "string:Recipient Domain": "company.com", "string:Sender Domain": "organisation.org" } } ] }' - Node
const request = require("request"); request.post( { url: "https://<my_api_endpoint>/api/v1/sources/<project>/example/sync", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { comments: [ { id: "0123456789abcdef", messages: [ { body: { text: "Hi Bob,\n\nCould you send me today's figures?\n\nThanks,\nAlice", }, from: "[email protected]", sent_at: "2011-12-11T11:02:03.000000+00:00", to: ["[email protected]"], }, { body: { text: "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob", }, from: "[email protected]", sent_at: "2011-12-11T11:05:10.000000+00:00", to: ["[email protected]"], }, { body: { text: "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice", }, from: "[email protected]", sent_at: "2011-12-11T11:18:43.000000+00:00", to: ["[email protected]"], }, ], timestamp: "2011-12-11T01:02:03.000000+00:00", }, { id: "abcdef0123456789", messages: [ { body: { text: "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob", }, from: "[email protected]", sent_at: "2011-12-12T10:04:30.000000+00:00", to: ["[email protected]", "[email protected]"], }, { body: { text: "Hi Bob,\n\nCould you estimate when you'll be finished?\n\nThanks,\nCarol", }, from: "[email protected]", sent_at: "2011-12-12T10:06:22.000000+00:00", to: ["[email protected]", "[email protected]"], }, { body: { text: "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob", }, from: "[email protected]", sent_at: "2011-12-11T10:09:40.000000+00:00", to: ["[email protected]", "[email protected]"], }, ], timestamp: "2011-12-11T02:03:04.000000+00:00", user_properties: { "number:severity": 3, "string:Recipient Domain": "company.com", "string:Sender Domain": "organisation.org", }, }, ], }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.post( { url: "https://<my_api_endpoint>/api/v1/sources/<project>/example/sync", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { comments: [ { id: "0123456789abcdef", messages: [ { body: { text: "Hi Bob,\n\nCould you send me today's figures?\n\nThanks,\nAlice", }, from: "[email protected]", sent_at: "2011-12-11T11:02:03.000000+00:00", to: ["[email protected]"], }, { body: { text: "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob", }, from: "[email protected]", sent_at: "2011-12-11T11:05:10.000000+00:00", to: ["[email protected]"], }, { body: { text: "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice", }, from: "[email protected]", sent_at: "2011-12-11T11:18:43.000000+00:00", to: ["[email protected]"], }, ], timestamp: "2011-12-11T01:02:03.000000+00:00", }, { id: "abcdef0123456789", messages: [ { body: { text: "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob", }, from: "[email protected]", sent_at: "2011-12-12T10:04:30.000000+00:00", to: ["[email protected]", "[email protected]"], }, { body: { text: "Hi Bob,\n\nCould you estimate when you'll be finished?\n\nThanks,\nCarol", }, from: "[email protected]", sent_at: "2011-12-12T10:06:22.000000+00:00", to: ["[email protected]", "[email protected]"], }, { body: { text: "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob", }, from: "[email protected]", sent_at: "2011-12-11T10:09:40.000000+00:00", to: ["[email protected]", "[email protected]"], }, ], timestamp: "2011-12-11T02:03:04.000000+00:00", user_properties: { "number:severity": 3, "string:Recipient Domain": "company.com", "string:Sender Domain": "organisation.org", }, }, ], }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.post( "https://<my_api_endpoint>/api/v1/sources/<project>/example/sync", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "comments": [ { "id": "0123456789abcdef", "timestamp": "2011-12-11T01:02:03.000000+00:00", "messages": [ { "from": "[email protected]", "to": ["[email protected]"], "sent_at": "2011-12-11T11:02:03.000000+00:00", "body": { "text": "Hi Bob,\n\nCould you send me today's figures?\n\nThanks,\nAlice" }, }, { "from": "[email protected]", "to": ["[email protected]"], "sent_at": "2011-12-11T11:05:10.000000+00:00", "body": { "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob" }, }, { "from": "[email protected]", "to": ["[email protected]"], "sent_at": "2011-12-11T11:18:43.000000+00:00", "body": { "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice" }, }, ], }, { "id": "abcdef0123456789", "timestamp": "2011-12-11T02:03:04.000000+00:00", "messages": [ { "from": "[email protected]", "to": ["[email protected]", "[email protected]"], "sent_at": "2011-12-12T10:04:30.000000+00:00", "body": { "text": "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob" }, }, { "from": "[email protected]", "to": ["[email protected]", "[email protected]"], "sent_at": "2011-12-12T10:06:22.000000+00:00", "body": { "text": "Hi Bob,\n\nCould you estimate when you'll be finished?\n\nThanks,\nCarol" }, }, { "from": "[email protected]", "to": ["[email protected]", "[email protected]"], "sent_at": "2011-12-11T10:09:40.000000+00:00", "body": { "text": "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob" }, }, ], "user_properties": { "string:Sender Domain": "organisation.org", "string:Recipient Domain": "company.com", "number:severity": 3, }, }, ] }, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.post( "https://<my_api_endpoint>/api/v1/sources/<project>/example/sync", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "comments": [ { "id": "0123456789abcdef", "timestamp": "2011-12-11T01:02:03.000000+00:00", "messages": [ { "from": "[email protected]", "to": ["[email protected]"], "sent_at": "2011-12-11T11:02:03.000000+00:00", "body": { "text": "Hi Bob,\n\nCould you send me today's figures?\n\nThanks,\nAlice" }, }, { "from": "[email protected]", "to": ["[email protected]"], "sent_at": "2011-12-11T11:05:10.000000+00:00", "body": { "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob" }, }, { "from": "[email protected]", "to": ["[email protected]"], "sent_at": "2011-12-11T11:18:43.000000+00:00", "body": { "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice" }, }, ], }, { "id": "abcdef0123456789", "timestamp": "2011-12-11T02:03:04.000000+00:00", "messages": [ { "from": "[email protected]", "to": ["[email protected]", "[email protected]"], "sent_at": "2011-12-12T10:04:30.000000+00:00", "body": { "text": "All,\n\nJust to let you know that processing is running late today.\n\nRegards,\nBob" }, }, { "from": "[email protected]", "to": ["[email protected]", "[email protected]"], "sent_at": "2011-12-12T10:06:22.000000+00:00", "body": { "text": "Hi Bob,\n\nCould you estimate when you'll be finished?\n\nThanks,\nCarol" }, }, { "from": "[email protected]", "to": ["[email protected]", "[email protected]"], "sent_at": "2011-12-11T10:09:40.000000+00:00", "body": { "text": "Carol,\n\nWe should be done by 12pm. Sorry about the delay.\n\nBest,\nBob" }, }, ], "user_properties": { "string:Sender Domain": "organisation.org", "string:Recipient Domain": "company.com", "number:severity": 3, }, }, ] }, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
null
null
This example shows how to add a comment that consists of multiple messages. This is most commonly used for adding emails.
id
, timestamp
, and messages.body.text
. You can learn more about available fields in the Comment Reference.
user_properties
field that holds arbitrary user-defined metadata.
The timestamp should be in UTC and refer to the time when the comment was recorded (e.g. the survey was responded to), not the current time.
The response should confirm that two new comments have been created.
- Bash
curl -X POST 'https://<my_api_endpoint>/api/v1/sources/<project>/example/sync' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "comments": [ { "id": "fedcba098765", "messages": [ { "body": { "text": "I was impressed with the speed of your service, but the price is quite high.", "translated_from": "J'"'"'ai \u00e9t\u00e9 impressionn\u00e9 par la rapidit\u00e9 de votre service, mais le prix est assez \u00e9lev\u00e9." }, "language": "fr" } ], "timestamp": "2011-12-12T20:00:00.000000+00:00" } ] }'
curl -X POST 'https://<my_api_endpoint>/api/v1/sources/<project>/example/sync' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "comments": [ { "id": "fedcba098765", "messages": [ { "body": { "text": "I was impressed with the speed of your service, but the price is quite high.", "translated_from": "J'"'"'ai \u00e9t\u00e9 impressionn\u00e9 par la rapidit\u00e9 de votre service, mais le prix est assez \u00e9lev\u00e9." }, "language": "fr" } ], "timestamp": "2011-12-12T20:00:00.000000+00:00" } ] }' - Node
const request = require("request"); request.post( { url: "https://<my_api_endpoint>/api/v1/sources/<project>/example/sync", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { comments: [ { id: "fedcba098765", messages: [ { body: { text: "I was impressed with the speed of your service, but the price is quite high.", translated_from: "J'ai \u00e9t\u00e9 impressionn\u00e9 par la rapidit\u00e9 de votre service, mais le prix est assez \u00e9lev\u00e9.", }, language: "fr", }, ], timestamp: "2011-12-12T20:00:00.000000+00:00", }, ], }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.post( { url: "https://<my_api_endpoint>/api/v1/sources/<project>/example/sync", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { comments: [ { id: "fedcba098765", messages: [ { body: { text: "I was impressed with the speed of your service, but the price is quite high.", translated_from: "J'ai \u00e9t\u00e9 impressionn\u00e9 par la rapidit\u00e9 de votre service, mais le prix est assez \u00e9lev\u00e9.", }, language: "fr", }, ], timestamp: "2011-12-12T20:00:00.000000+00:00", }, ], }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.post( "https://<my_api_endpoint>/api/v1/sources/<project>/example/sync", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "comments": [ { "id": "fedcba098765", "timestamp": "2011-12-12T20:00:00.000000+00:00", "messages": [ { "language": "fr", "body": { "text": "I was impressed with the speed of your service, but the price is quite high.", "translated_from": "J'ai été impressionné par la rapidité de votre service, mais le prix est assez élevé.", }, } ], } ] }, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.post( "https://<my_api_endpoint>/api/v1/sources/<project>/example/sync", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "comments": [ { "id": "fedcba098765", "timestamp": "2011-12-12T20:00:00.000000+00:00", "messages": [ { "language": "fr", "body": { "text": "I was impressed with the speed of your service, but the price is quite high.", "translated_from": "J'ai été impressionné par la rapidité de votre service, mais le prix est assez élevé.", }, } ], } ] }, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "new": 1, "status": "ok", "unchanged": 0, "updated": 0 }
{ "new": 1, "status": "ok", "unchanged": 0, "updated": 0 }
This example shows how to add a comment that contains a single message. This format can suit data such as survey responses, customer reviews, etc.
messages
field should contain a single entry. You can skip email-specific fields that don't fit your data, as they are not required.
The response should confirm that one new comment has been created.
Once added, a comment may be retrieved by its ID. You should see the comment added in the previous section.
- Bash
curl -X GET 'https://<my_api_endpoint>/api/v1/sources/<project>/example/comments/0123456789abcdef' \ -H "Authorization: Bearer $REINFER_TOKEN"
curl -X GET 'https://<my_api_endpoint>/api/v1/sources/<project>/example/comments/0123456789abcdef' \ -H "Authorization: Bearer $REINFER_TOKEN" - Node
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/sources/<project>/example/comments/0123456789abcdef", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/sources/<project>/example/comments/0123456789abcdef", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/sources/<project>/example/comments/0123456789abcdef", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/sources/<project>/example/comments/0123456789abcdef", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "comment": { "context": "0", "id": "0123456789abcdef", "last_modified": "2018-10-16T10:51:46.247000Z", "messages": [ { "body": { "text": "Hi Bob,\n\nCould you send me today's figures?\n\nThanks,\nAlice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:02:03.000000+00:00", "to": ["[email protected]"] }, { "body": { "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob" }, "from": "[email protected]", "sent_at": "2011-12-11T11:05:10.000000+00:00", "to": ["[email protected]"] }, { "body": { "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:18:43.000000+00:00", "to": ["[email protected]"] } ], "source_id": "22f0f76e82fd8867", "text_format": "plain", "timestamp": "2011-12-11T01:02:03Z", "uid": "22f0f76e82fd8867.0123456789abcdef", "user_properties": {} }, "status": "ok" }
{ "comment": { "context": "0", "id": "0123456789abcdef", "last_modified": "2018-10-16T10:51:46.247000Z", "messages": [ { "body": { "text": "Hi Bob,\n\nCould you send me today's figures?\n\nThanks,\nAlice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:02:03.000000+00:00", "to": ["[email protected]"] }, { "body": { "text": "Alice,\n\nHere are the figures for today.\n\nRegards,\nBob" }, "from": "[email protected]", "sent_at": "2011-12-11T11:05:10.000000+00:00", "to": ["[email protected]"] }, { "body": { "text": "Hi Bob,\n\nI think these are the wrong numbers - could you check?\n\nThanks again,\nAlice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:18:43.000000+00:00", "to": ["[email protected]"] } ], "source_id": "22f0f76e82fd8867", "text_format": "plain", "timestamp": "2011-12-11T01:02:03Z", "uid": "22f0f76e82fd8867.0123456789abcdef", "user_properties": {} }, "status": "ok" }
Having successfully added some raw data to Communications Mining, we can now start to add datasets. A dataset corresponds to a taxonomy of labels along with the training data supplied by applying those labels to the messages in a series of selected sources. You can create many datasets which refer to the same source(s) without the act of labeling messages using the taxonomy of one dataset having any impact on the other datasets (or the underlying sources), allowing different teams to use Communications Mining to gather insights independently.
- Bash
curl -X PUT 'https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "dataset": { "description": "An optional long form description.", "source_ids": [ "22f0f76e82fd8867" ], "title": "An Example Dataset" } }'
curl -X PUT 'https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "dataset": { "description": "An optional long form description.", "source_ids": [ "22f0f76e82fd8867" ], "title": "An Example Dataset" } }' - Node
const request = require("request"); request.put( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { dataset: { description: "An optional long form description.", source_ids: ["22f0f76e82fd8867"], title: "An Example Dataset", }, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.put( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { dataset: { description: "An optional long form description.", source_ids: ["22f0f76e82fd8867"], title: "An Example Dataset", }, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.put( "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "dataset": { "title": "An Example Dataset", "description": "An optional long form description.", "source_ids": ["22f0f76e82fd8867"], } }, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.put( "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "dataset": { "title": "An Example Dataset", "description": "An optional long form description.", "source_ids": ["22f0f76e82fd8867"], } }, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "dataset": { "created": "2018-10-16T10:57:44.667000Z", "description": "An optional long form description.", "has_sentiment": true, "id": "b2ad67f9dfd2e76b", "last_modified": "2018-10-16T10:57:44.667000Z", "limited_access": false, "model_family": "english", "name": "my-dataset", "owner": "<project>", "source_ids": ["22f0f76e82fd8867"], "title": "An Example Dataset" }, "status": "ok" }
{ "dataset": { "created": "2018-10-16T10:57:44.667000Z", "description": "An optional long form description.", "has_sentiment": true, "id": "b2ad67f9dfd2e76b", "last_modified": "2018-10-16T10:57:44.667000Z", "limited_access": false, "model_family": "english", "name": "my-dataset", "owner": "<project>", "source_ids": ["22f0f76e82fd8867"], "title": "An Example Dataset" }, "status": "ok" }
Once sources have been created, appropriately-permissioned users can also create datasets in the UI, which may be more convenient.
- Bash
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset' \ -H "Authorization: Bearer $REINFER_TOKEN"
curl -X GET 'https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset' \ -H "Authorization: Bearer $REINFER_TOKEN" - Node
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.get( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.get( "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "dataset": { "created": "2018-10-16T10:57:44.667000Z", "description": "An optional long form description.", "has_sentiment": true, "id": "b2ad67f9dfd2e76b", "last_modified": "2018-10-16T10:57:44.667000Z", "limited_access": false, "model_family": "random", "name": "my-dataset", "owner": "<project>", "source_ids": ["22f0f76e82fd8867"], "title": "An Example Dataset" }, "status": "ok" }
{ "dataset": { "created": "2018-10-16T10:57:44.667000Z", "description": "An optional long form description.", "has_sentiment": true, "id": "b2ad67f9dfd2e76b", "last_modified": "2018-10-16T10:57:44.667000Z", "limited_access": false, "model_family": "random", "name": "my-dataset", "owner": "<project>", "source_ids": ["22f0f76e82fd8867"], "title": "An Example Dataset" }, "status": "ok" }
Like sources, datasets have several GET routes corresponding to:
- all the datasets the user has access to;
- datasets belonging to the specified project;
- a single dataset specified by project and name.
We supply an example of the latter in action.
has_sentiment
, which is fixed for a given dataset.
- Bash
curl -X POST 'https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "dataset": { "description": "An updated description." } }'
curl -X POST 'https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "dataset": { "description": "An updated description." } }' - Node
const request = require("request"); request.post( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { dataset: { description: "An updated description." } }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.post( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { dataset: { description: "An updated description." } }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.post( "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={"dataset": {"description": "An updated description."}}, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.post( "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={"dataset": {"description": "An updated description."}}, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "dataset": { "created": "2018-10-16T10:57:44.667000Z", "description": "An updated description.", "has_sentiment": true, "id": "b2ad67f9dfd2e76b", "last_modified": "2018-10-16T10:57:44.667000Z", "limited_access": false, "model_family": "random", "name": "my-dataset", "owner": "<project>", "source_ids": ["22f0f76e82fd8867"], "title": "An Example Dataset" }, "status": "ok" }
{ "dataset": { "created": "2018-10-16T10:57:44.667000Z", "description": "An updated description.", "has_sentiment": true, "id": "b2ad67f9dfd2e76b", "last_modified": "2018-10-16T10:57:44.667000Z", "limited_access": false, "model_family": "random", "name": "my-dataset", "owner": "<project>", "source_ids": ["22f0f76e82fd8867"], "title": "An Example Dataset" }, "status": "ok" }
Deleting a dataset will completely remove the associated taxonomy as well as all of the labels which have been applied to its sources. You will no longer be able to get predictions based on this taxonomy and would have to start the training process of annotating messages from the beginning in order to reverse this operation, so use it with care.
- Bash
curl -X DELETE 'https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset' \ -H "Authorization: Bearer $REINFER_TOKEN"
curl -X DELETE 'https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset' \ -H "Authorization: Bearer $REINFER_TOKEN" - Node
const request = require("request"); request.delete( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.delete( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.delete( "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.delete( "https://<my_api_endpoint>/api/v1/datasets/<project>/my-dataset", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "status": "ok" }
{ "status": "ok" }
- Bash
curl -X POST 'https://<my_api_endpoint>/api/v1/datasets/<project>/<dataset>/labellers/<model_version>/predict' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "documents": [ { "messages": [ { "body": { "text": "Hi Bob, has my trade settled yet? Thanks, Alice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:02:03.000000+00:00", "subject": { "text": "Trade Ref: 8726387 Settlement" }, "to": [ "[email protected]" ] } ], "user_properties": { "number:Deal Value": 12000, "string:City": "London" } }, { "messages": [ { "body": { "text": "All, just to let you know that processing is running late today. Regards, Bob" }, "from": "[email protected]", "sent_at": "2011-12-12T10:04:30.000000+00:00", "subject": { "text": "Trade Processing Delay" }, "to": [ "[email protected]", "[email protected]" ] } ], "user_properties": { "number:Deal Value": 4.9, "string:City": "Luton" } } ], "labels": [ { "name": [ "Trade", "Settlement" ], "threshold": 0.8 }, { "name": [ "Delay" ], "threshold": 0.75 } ], "threshold": 0 }'
curl -X POST 'https://<my_api_endpoint>/api/v1/datasets/<project>/<dataset>/labellers/<model_version>/predict' \ -H "Authorization: Bearer $REINFER_TOKEN" \ -H "Content-Type: application/json" \ -d '{ "documents": [ { "messages": [ { "body": { "text": "Hi Bob, has my trade settled yet? Thanks, Alice" }, "from": "[email protected]", "sent_at": "2011-12-11T11:02:03.000000+00:00", "subject": { "text": "Trade Ref: 8726387 Settlement" }, "to": [ "[email protected]" ] } ], "user_properties": { "number:Deal Value": 12000, "string:City": "London" } }, { "messages": [ { "body": { "text": "All, just to let you know that processing is running late today. Regards, Bob" }, "from": "[email protected]", "sent_at": "2011-12-12T10:04:30.000000+00:00", "subject": { "text": "Trade Processing Delay" }, "to": [ "[email protected]", "[email protected]" ] } ], "user_properties": { "number:Deal Value": 4.9, "string:City": "Luton" } } ], "labels": [ { "name": [ "Trade", "Settlement" ], "threshold": 0.8 }, { "name": [ "Delay" ], "threshold": 0.75 } ], "threshold": 0 }' - Node
const request = require("request"); request.post( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/<dataset>/labellers/<model_version>/predict", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { documents: [ { messages: [ { body: { text: "Hi Bob, has my trade settled yet? Thanks, Alice" }, from: "[email protected]", sent_at: "2011-12-11T11:02:03.000000+00:00", subject: { text: "Trade Ref: 8726387 Settlement" }, to: ["[email protected]"], }, ], user_properties: { "number:Deal Value": 12000, "string:City": "London", }, }, { messages: [ { body: { text: "All, just to let you know that processing is running late today. Regards, Bob", }, from: "[email protected]", sent_at: "2011-12-12T10:04:30.000000+00:00", subject: { text: "Trade Processing Delay" }, to: ["[email protected]", "[email protected]"], }, ], user_properties: { "number:Deal Value": 4.9, "string:City": "Luton" }, }, ], labels: [ { name: ["Trade", "Settlement"], threshold: 0.8 }, { name: ["Delay"], threshold: 0.75 }, ], threshold: 0, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } );
const request = require("request"); request.post( { url: "https://<my_api_endpoint>/api/v1/datasets/<project>/<dataset>/labellers/<model_version>/predict", headers: { Authorization: "Bearer " + process.env.REINFER_TOKEN, }, json: true, body: { documents: [ { messages: [ { body: { text: "Hi Bob, has my trade settled yet? Thanks, Alice" }, from: "[email protected]", sent_at: "2011-12-11T11:02:03.000000+00:00", subject: { text: "Trade Ref: 8726387 Settlement" }, to: ["[email protected]"], }, ], user_properties: { "number:Deal Value": 12000, "string:City": "London", }, }, { messages: [ { body: { text: "All, just to let you know that processing is running late today. Regards, Bob", }, from: "[email protected]", sent_at: "2011-12-12T10:04:30.000000+00:00", subject: { text: "Trade Processing Delay" }, to: ["[email protected]", "[email protected]"], }, ], user_properties: { "number:Deal Value": 4.9, "string:City": "Luton" }, }, ], labels: [ { name: ["Trade", "Settlement"], threshold: 0.8 }, { name: ["Delay"], threshold: 0.75 }, ], threshold: 0, }, }, function (error, response, json) { // digest response console.log(JSON.stringify(json, null, 2)); } ); - Python
import json import os import requests response = requests.post( "https://<my_api_endpoint>/api/v1/datasets/<project>/<dataset>/labellers/<model_version>/predict", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "documents": [ { "messages": [ { "body": { "text": "Hi Bob, has my trade settled yet? Thanks, Alice" }, "subject": {"text": "Trade Ref: 8726387 Settlement"}, "from": "[email protected]", "sent_at": "2011-12-11T11:02:03.000000+00:00", "to": ["[email protected]"], } ], "user_properties": { "number:Deal Value": 12000, "string:City": "London", }, }, { "messages": [ { "body": { "text": "All, just to let you know that processing is running late today. Regards, Bob" }, "subject": {"text": "Trade Processing Delay"}, "from": "[email protected]", "sent_at": "2011-12-12T10:04:30.000000+00:00", "to": ["[email protected]", "[email protected]"], } ], "user_properties": { "number:Deal Value": 4.9, "string:City": "Luton", }, }, ], "labels": [ {"name": ["Trade", "Settlement"], "threshold": 0.8}, {"name": ["Delay"], "threshold": 0.75}, ], "threshold": 0, }, ) print(json.dumps(response.json(), indent=2, sort_keys=True))
import json import os import requests response = requests.post( "https://<my_api_endpoint>/api/v1/datasets/<project>/<dataset>/labellers/<model_version>/predict", headers={"Authorization": "Bearer " + os.environ["REINFER_TOKEN"]}, json={ "documents": [ { "messages": [ { "body": { "text": "Hi Bob, has my trade settled yet? Thanks, Alice" }, "subject": {"text": "Trade Ref: 8726387 Settlement"}, "from": "[email protected]", "sent_at": "2011-12-11T11:02:03.000000+00:00", "to": ["[email protected]"], } ], "user_properties": { "number:Deal Value": 12000, "string:City": "London", }, }, { "messages": [ { "body": { "text": "All, just to let you know that processing is running late today. Regards, Bob" }, "subject": {"text": "Trade Processing Delay"}, "from": "[email protected]", "sent_at": "2011-12-12T10:04:30.000000+00:00", "to": ["[email protected]", "[email protected]"], } ], "user_properties": { "number:Deal Value": 4.9, "string:City": "Luton", }, }, ], "labels": [ {"name": ["Trade", "Settlement"], "threshold": 0.8}, {"name": ["Delay"], "threshold": 0.75}, ], "threshold": 0, }, ) print(json.dumps(response.json(), indent=2, sort_keys=True)) - Response
{ "entities": [ [ { "formatted_value": "2019-01-01 00:00 UTC", "kind": "date", "span": { "content_part": "body", "message_index": 0, "utf16_byte_end": 120, "utf16_byte_start": 94 } }, { "formatted_value": "Bob", "kind": "person", "span": { "content_part": "body", "message_index": 0, "utf16_byte_end": 6, "utf16_byte_start": 12 } } ], [] ], "model": { "time": "2018-12-20T15:05:43.906000Z", "version": "1" }, "predictions": [ [ { "name": ["Trade", "Settlement"], "probability": 0.8668700814247131 } ], [ { "name": ["Delay"], "probability": 0.26687008142471313 } ] ], "status": "ok" }
{ "entities": [ [ { "formatted_value": "2019-01-01 00:00 UTC", "kind": "date", "span": { "content_part": "body", "message_index": 0, "utf16_byte_end": 120, "utf16_byte_start": 94 } }, { "formatted_value": "Bob", "kind": "person", "span": { "content_part": "body", "message_index": 0, "utf16_byte_end": 6, "utf16_byte_start": 12 } } ], [] ], "model": { "time": "2018-12-20T15:05:43.906000Z", "version": "1" }, "predictions": [ [ { "name": ["Trade", "Settlement"], "probability": 0.8668700814247131 } ], [ { "name": ["Delay"], "probability": 0.26687008142471313 } ] ], "status": "ok" }
Once you have a trained model, you can now use this model to predict labels against other pieces of data. To do this you simply need to provide the following:
- Documents: This is an array of message data that the model will predict labels for and each message object can only contain one message along with any optional properties. For optimal model performance, the data provided needs to be consistent with the data and format that was annotated on the platform, as the model takes all available data and metadata into consideration. E.g emails should include subject, from/bcc/cc fields, etc (if these were present in the training data). Additionally, user properties in the training dataset should also be included in the API request body.
- Labels: This is an array of the model trained labels that you want the model to predict in the data provided. Additionally, for each label a confidence threshold to filter labels by should be provided. The optimal threshold can be decided based on your precision vs recall trade off. Further information regarding how to choose a threshold can be found in the user guide, under the "Using Validation" section.
- Default threshold (optional): This is a default threshold value that will be applied across all labels provided. Please note, if default and per-label thresholds are provided together in a request, the per-label thresholds will override the default threshold. As best practice, default thresholds can be used for testing or exploring data. For optimal results when using predictions for automated decision making, it is highly recommended to use per-label thresholds.
Within the API URL it is important to pass in the following arguments:
- Project name: This is an existing project you are a part of.
- Dataset name: This is a dataset the model has been trained on.
- Model version: The model version is a number that can be found on the "Models" page for your chosen dataset.
Understanding the Response
Because a specific model version is being used, the response to the same request will always return the same results even if the model is being trained further. Once you have validated the results of the new model and would like to submit a request against the new model, you should update the model version in your request. Additionally, you should also update the label thresholds to fit the new model. For every new model you will have to iterate through the steps again.
By default, the response will always provide a list of predicted labels for each message with a confidence greater than the threshold levels provided.
However, the response of a request can vary if entity recognition and sentiments are enabled for your model:
- General fields Enabled. The response will also provide a list of general fields that have been identified for each label (first response example)
- Sentiments Enabled. The response will also provide a sentiment score between -1 (perfectly negative) and 1 (perfectly positive) to every label object classified above the confidence threshold. (second response example)
{
"model": { "time": "2018-12-20T15:05:43.906000Z", "version": "1" },
"predictions": [
[
{
"name": ["Trade", "Settlement"],
"probability": 0.86687008142471313,
"sentiment": 0.8762539502232571
}
],
[
{
"name": ["Delay"],
"probability": 0.26687008142471313,
"sentiment": 0.8762539502232571
}
]
],
"status": "ok"
}
{
"model": { "time": "2018-12-20T15:05:43.906000Z", "version": "1" },
"predictions": [
[
{
"name": ["Trade", "Settlement"],
"probability": 0.86687008142471313,
"sentiment": 0.8762539502232571
}
],
[
{
"name": ["Delay"],
"probability": 0.26687008142471313,
"sentiment": 0.8762539502232571
}
]
],
"status": "ok"
}