
Automation Cloud admin guide

Last updated Mar 10, 2026

Bringing your own vector database

Use your existing vector database to ground agent responses in trusted enterprise data, without duplicating content or changing your current architecture.

This guide shows how to connect externally managed vector databases (such as Databricks Vector Search or Azure AI Search) to UiPath agents using API Workflows, enabling retrieval-augmented generation (RAG) with your own data sources.

By the end of this guide, you will be able to:

  • Query an external vector database from a UiPath agent.
  • Return the most relevant content as structured context.
  • Ground agent responses in your organization’s data securely and in real time.

When to use the Bring Your Own Vector Database (BYOVD) pattern

Use BYOVD when:

  • Your data is already indexed in an external vector store.
  • You want agents to access up-to-date enterprise knowledge.
  • You need to avoid copying or re-indexing data into UiPath.
  • You require full control over data storage, security, and embeddings.

How it works

BYOVD allows agents to ground generative AI responses in your trusted data sources. Instead of relying on a built-in Context Grounding index, you use API workflows that securely query your external vector database and return relevant context to the agent’s large language model.

This approach gives you flexibility to integrate any vector database with a public API, while maintaining control over data access, authentication, and retrieval logic.

UiPath enables BYOVD through API workflows that act as tools for agents. At runtime:

  1. User query: The user submits a prompt to the agent.
  2. Tool selection: The agent’s LLM determines that additional context is required and selects the custom vector search tool.
  3. API Workflow execution: The agent invokes the published API Workflow, passing the user’s query as input.
  4. Vector search: The workflow queries the vector database to retrieve the most semantically relevant content.
  5. Context return: The workflow returns the retrieved content as structured JSON.
  6. Response formulation: The agent uses this context to generate a grounded, accurate response.

This approach supports Retrieval-Augmented Generation (RAG) without requiring native ingestion into the Context Grounding service.
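The six-step runtime flow above boils down to a simple tool contract: query text in, structured JSON out. A minimal Python sketch of that contract, with the vector-database call stubbed out and the result field names (chunk, source_document, score) chosen for illustration rather than mandated by UiPath:

```python
import json

def vector_search_tool(query_text: str, top_k: int = 5) -> str:
    """Sketch of the BYOVD tool contract: query in, structured JSON out."""
    # In a real API workflow this is where you query your vector database;
    # here the matches are hard-coded for illustration.
    matches = [
        {"chunk": "Refund requests are processed within 5 business days.",
         "source_document": "refund-policy.pdf", "score": 0.91},
        {"chunk": "Refunds over $500 require manager approval.",
         "source_document": "approvals.pdf", "score": 0.87},
    ]
    # Return the retrieved content as structured JSON (step 5 above),
    # so the agent's LLM can cite each chunk and its source.
    return json.dumps({"query": query_text, "results": matches[:top_k]})

print(vector_search_tool("What is the refund policy?", top_k=2))
```

Returning sources alongside chunks lets the agent attribute its answer, which is useful when grounded responses need to be auditable.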

Architecture overview

The BYOVD solution consists of three main components:

  • Vector database: Your existing system (for example, Databricks Vector Search or Azure AI Search).
  • API workflow: A UiPath Integration Service workflow that:

    • Accepts a query.
    • Calls the vector database API.
    • Returns relevant results.
  • Agent tool: The published API Workflow, added as a tool that the agent can invoke.

Security and credential management

Before building the workflow, store all API keys and secrets securely. Do not hard-code credentials in your workflow. Instead, use the Orchestrator credentials store:

  • Store credentials in Orchestrator: Add your API keys and other secrets as credential assets in your Orchestrator tenant. This provides centralized, secure management of sensitive information.
  • Retrieve credentials at runtime: In your API Workflow, use the Get Credential activity to access stored credentials when the workflow runs. The activity returns the username as a string and the password (for example, an API key) as a SecureString, preventing secrets from being exposed in logs or workflow definitions.
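The same principle applies outside UiPath: keep the secret out of source code and out of logs, and resolve it only at runtime. A minimal Python analogue of the Get Credential pattern (the environment-variable name is an assumption; inside an API workflow, the Get Credential activity plays this role):

```python
import os

def auth_header(env_var: str = "VECTOR_DB_API_KEY") -> dict:
    """Build an Authorization header from a secret resolved at runtime."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Missing credential: set {env_var}")
    # Never log or print api_key; it should only ever appear in the header.
    return {"Authorization": f"Bearer {api_key}"}

# Hypothetical usage; in practice the platform injects the secret.
os.environ["VECTOR_DB_API_KEY"] = "example-key"
print("Bearer" in auth_header()["Authorization"])
```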

Prerequisites

Before you begin, ensure you have:

  • An active vector database (such as Databricks Vector Search or Azure AI Search) with indexed data.

  • A valid API endpoint and authentication credentials stored as credential assets in Orchestrator.

  • An embedding model endpoint and key, also stored securely (for Azure client-side vectorization only).

Setup

You can implement BYOVD using one of three approaches: model-native endpoints, client-side vectorization (where the API workflow performs the vectorization), or integrated vectorization.

The following sections provide step-by-step instructions for configuring each approach. The examples use Databricks and Azure AI Search, but the same pattern applies to other vector databases. Choose the setup that aligns with how your vector database handles query vectorization.

Databricks vector search (model-native endpoint)

Use this option when Databricks handles query vectorization natively.

Why use this option:

  • Simple configuration
  • Only one API call per query
  • No separate embedding model required

Steps

  1. Get the Databricks details:
    1. Retrieve the Endpoint URL.
    2. Store your Databricks Personal Access Token as a credential asset in Orchestrator.
  2. In Studio, create a new API workflow project and define the following arguments:
    • in_QueryText (String)
    • in_TopK (Int32, with a default value of 5)
    • out_Results (String)
  3. Use the Get Credential activity to retrieve the Databricks Personal Access Token from Orchestrator at runtime.
  4. Add an HTTP Request activity to call the Databricks endpoint:
    • Endpoint: the Databricks Vector Search endpoint
    • Method: POST
    • Headers: Authorization: Bearer <Personal Access Token>
    • Body: Construct the JSON body required by the Databricks API, mapping your input variables.
  5. Publish the workflow to your Orchestrator tenant.
  6. Add the workflow as a tool to your agent, providing a clear name and description for the LLM to use.
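Steps 3 and 4 amount to assembling one authenticated POST. A Python sketch of the request the HTTP Request activity sends (the endpoint URL and column names are placeholders, and the body fields query_text, columns, and num_results are modeled on the Databricks Vector Search query API; verify them against your Databricks API version):

```python
import json

def build_databricks_query(endpoint_url: str, token: str,
                           query_text: str, top_k: int = 5) -> dict:
    """Assemble the POST request for a Databricks Vector Search query."""
    return {
        "url": endpoint_url,  # the endpoint URL from step 1
        "headers": {
            "Authorization": f"Bearer {token}",  # Personal Access Token
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "query_text": query_text,      # Databricks embeds this server-side
            "columns": ["chunk", "source_document"],  # columns to return
            "num_results": top_k,
        }),
    }

req = build_databricks_query("https://<workspace-endpoint>", "dapi-example",
                             "What is the refund policy?")
print(req["headers"]["Content-Type"])
```

Because Databricks vectorizes the query itself, the workflow never touches an embedding model, which is what keeps this option down to a single API call.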

Azure AI Search (client-side vectorization)

Use this option when your Azure AI Search index expects vector inputs.

Why use this option:

  • Full control over embedding models
  • Compatibility with existing vector indexes

Steps

  1. Get the API details:
    • For Azure AI Search: Retrieve the Endpoint URL, Index name, and store your API Key as a credential asset in Orchestrator.
    • For the embedding model: Retrieve the Endpoint URL and store the API Key for your embedding service as a credential asset in Orchestrator.
  2. In Studio, create a new API workflow project and define the following arguments:
    • in_QueryText (String)
    • in_TopK (Int32, with a default value of 5)
    • out_Results (String)
  3. First, vectorize the query:
    1. Add a Get Credential activity to retrieve your embedding model's API key.
    2. Add an HTTP Request activity to call your embedding model with the in_QueryText.
    3. Deserialize the JSON response and store the resulting embedding vector in a variable (e.g., queryVector).
  4. Query Azure AI Search:
    1. Add a Get Credential activity to retrieve your Azure AI Search API key.
    2. Add an HTTP Request activity and configure it as follows:
      • Endpoint: Your Azure AI Search endpoint.
      • Method: POST.
      • Headers: Add an api-key header with your Azure AI Search API key variable, as follows: api-key: <API key>.
      • Body: Construct the JSON body for the Azure AI Search vector search query, embedding your queryVector variable.
  5. Publish the workflow to your Orchestrator tenant.
  6. Add the published workflow as a tool to your agent, providing a clear description for the LLM to use.
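The two HTTP calls in steps 3 and 4 can be sketched as payload builders: one for the embedding request, one for the vector search. The input field follows OpenAI-style embeddings endpoints, and vectorQueries with kind "vector" follows the Azure AI Search REST API; the contentVector field name is an assumption about your index schema:

```python
import json

def build_embedding_body(query_text: str) -> str:
    """Body for an OpenAI-style embeddings endpoint (step 3)."""
    return json.dumps({"input": query_text})

def build_vector_search_body(query_vector: list, top_k: int = 5) -> str:
    """Body for an Azure AI Search query using a precomputed vector (step 4)."""
    return json.dumps({
        "vectorQueries": [{
            "kind": "vector",           # the workflow supplies the embedding
            "vector": query_vector,     # output of the embedding call
            "fields": "contentVector",  # vector field in the index (assumed name)
            "k": top_k,
        }],
        "select": "chunk, source_document",
    })

print(build_vector_search_body([0.1, 0.2, 0.3], top_k=2))
```

The embedding model used here must match the one that indexed your documents; mixing models produces vectors in incompatible spaces and silently degrades retrieval quality.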

Azure AI Search (integrated vectorization)

Use this option when your Azure AI Search index supports built-in vectorization.

Why use this option:

  • Simplest Azure setup
  • No embedding calls in the workflow
  • Single API request per query

Steps

  1. Get the API details:
    • Retrieve your Azure AI Search Endpoint URL, Index name, and store your API Key as a credential asset in Orchestrator.
  2. In Studio, create a new API workflow project and define the following arguments:
    • in_QueryText (String)
    • in_TopK (Int32, with a default value of 5)
    • out_Results (String)
  3. Add a Get Credential activity to retrieve your Azure AI Search API key from Orchestrator.
  4. Add an HTTP Request activity and configure it as follows:
    • Endpoint:
      https://<service>.search.windows.net/indexes/<index-name>/docs/search?api-version=2023-11-01
    • Method: POST
    • Headers: Add an api-key header with your Azure AI Search API key variable, as follows: api-key: <API key>
    • Body: Construct the JSON body to perform a vector search using the query text. Azure AI Search handles vectorization automatically.
      {
        "vectorQueries": [
          {
            "kind": "text",
            "text": "<%= in_QueryText %>",
            "fields": "contentVector",
            "k": "<%= in_TopK %>"
          }
        ],
        "select": "chunk, source_document"
      }
  5. Publish the workflow to your Orchestrator tenant.
  6. Add the published workflow as a tool to your agent, providing a clear description for the LLM.
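With integrated vectorization, the whole workflow reduces to one POST carrying the body shown above. A Python sketch assembling that request (the service name, index name, and contentVector field are placeholders for your own values):

```python
import json

def build_integrated_search(service: str, index: str, api_key: str,
                            query_text: str, top_k: int = 5) -> dict:
    """Single-call vector search with server-side (integrated) vectorization."""
    return {
        "url": (f"https://{service}.search.windows.net/indexes/{index}"
                "/docs/search?api-version=2023-11-01"),
        "headers": {"api-key": api_key, "Content-Type": "application/json"},
        "body": json.dumps({
            "vectorQueries": [{
                "kind": "text",        # Azure AI Search embeds the text itself
                "text": query_text,
                "fields": "contentVector",
                "k": top_k,
            }],
            "select": "chunk, source_document",
        }),
    }

req = build_integrated_search("myservice", "docs-index", "example-key",
                              "What is the refund policy?")
print(req["url"])
```

Note that kind is "text" here, versus "vector" in the client-side approach: that single field is what shifts embedding work from your workflow to Azure AI Search.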
