Building the taxonomy structure
One of the most fundamental factors that will determine how well your model performs, as well as how well it meets your business objectives, is the structure of your taxonomy, including what is captured by each of the labels within it.
It is, therefore, important to think about your target taxonomy structure in advance of model training. Having said that, you should have a degree of flexibility to adapt, expand and enhance it as necessary as you progress through the training. This is what we call the data-led training approach.
Ultimately, the labels in the taxonomy, and the training examples provided for each label, should create an accurate and balanced representation of the dataset as a whole. But every label should also be valuable in its own right, clearly representing the messages for which it is predicted.
If labels are used to capture very broad, vague, or confused concepts, not only are they more likely to perform badly, they are also less likely to provide business value, whether that value is useful insights into the concept, or helping a downstream process be fully or partially automated.
This is an example of a high-level taxonomy with typical labels applicable across various use cases or industries. Not all of them will be applicable to your model.
A company receives millions of emails each year into different inboxes from clients about a multitude of issues, queries, suggestions, complaints, etc.
This company decides to increase their operational efficiency, process standardisation, and visibility as to what's going on in their business by automatically turning these emails from customers into workflow tickets. These can then be tracked and actioned, using specified processes and within set timelines.
To do this, they decide to use the platform to interpret these inbound, unstructured communications and provide a classification of the process and sub-process that each email relates to. This classification is used to update the workflow ticket that will be automatically created using an automation service, and to ensure that it's routed to the correct team or individual.
To make sure that this use case is as successful as possible, and to minimise the number of exceptions (wrong classifications, or emails the platform is not able to confidently classify), every inbound email should receive a confident prediction that has a parent label and a child label, i.e. [Process X] > [Sub-Process Y].
Given that the objective is to classify every inbound email with a [Process] and [Sub-process], every label in the taxonomy should conform to this [Process] > [Sub-process] format.
In this use case, any email that does not have a confident prediction for both a parent and child label could be an exception, sent for manual review and ticket creation. Alternatively, if it has a high confidence parent label prediction, but not a confident child label prediction, this could still be used to partially route the email or create a ticket, with some additional manual work to add the relevant sub-process.
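To make these two handling options concrete, here is a minimal sketch of how the routing decision could be expressed in code. The confidence threshold, label names, and routing outcomes are illustrative assumptions, not part of the platform's API.

```python
# Illustrative sketch only: the threshold, label names, and routing
# outcomes are assumptions, not Communications Mining API calls.
CONFIDENCE_THRESHOLD = 0.9  # assumed per-label confidence cut-off

def route_email(predictions):
    """Decide how to handle an email given its label predictions.

    `predictions` maps label names (for example "Invoicing > Status Request")
    to confidence scores between 0 and 1.
    """
    confident = {
        label: score
        for label, score in predictions.items()
        if score >= CONFIDENCE_THRESHOLD
    }
    # Full [Process] > [Sub-Process] prediction: create and route the ticket.
    full = [label for label in confident if " > " in label]
    if full:
        return ("auto_route", full[0])
    # Only a parent [Process] label is confident: partially route the email,
    # with manual work to add the relevant sub-process.
    parent_only = [label for label in confident if " > " not in label]
    if parent_only:
        return ("partial_route", parent_only[0])
    # No confident prediction at all: send to the manual exception queue.
    return ("manual_exception", None)

# Example: a confident parent label but no confident child label.
print(route_email({"Invoicing": 0.95, "Invoicing > Status Request": 0.62}))
# -> ('partial_route', 'Invoicing')
```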
If we imagine the former is true, and every email without a high confidence prediction in the form of [Process] > [Sub-Process] becomes a manual exception, we want to make sure that all of the examples we provide for each label when training the model reflect this format.
Each parent label in the taxonomy should relate to a broad process relevant to the content in the emails, for example, Invoicing. Each child label should then be a more specific sub-process that sits under a parent label, for example, Invoicing > Status Request.
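A fragment of such a taxonomy can be sketched as a simple parent-to-children mapping. Apart from Invoicing > Status Request, which is taken from the example above, the label names below are invented purely for illustration.

```python
# Hypothetical taxonomy fragment: each parent label is a broad process,
# each child label a specific sub-process beneath it.
taxonomy = {
    "Invoicing": ["Status Request", "Dispute", "Copy Request"],
    "Payments": ["Confirmation", "Amendment"],
}

# Flatten into the [Process] > [Sub-Process] label names used for training.
labels = [
    f"{parent} > {child}"
    for parent, children in taxonomy.items()
    for child in children
]
print(labels)  # ['Invoicing > Status Request', 'Invoicing > Dispute', ...]
```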
Extremely broad labels such as General Query or Everything else can be very unhelpful if they are used to group together lots of distinct topics, with no clear pattern or commonality between the pinned examples.
In this use case, they would also not provide much business value when a workflow ticket was created and classified as General Query or Everything else. Someone would still need to read it carefully to understand what it was about and whether it was relevant for their team before it could be actioned.
This negates any time-saving benefit and would not provide useful management information (MI) to the business on what work was actually being done by the teams.