
Automation Suite installation guide
Adding a Dedicated Agent Node With GPU Support
Automation Suite currently supports only Nvidia GPU drivers. See the list of GPU-supported operating systems.
For more on the cloud-specific instance types, refer to your cloud provider's documentation.
Before adding a dedicated agent node with GPU support, make sure to check Hardware requirements.
Installing a GPU driver on the machine
- The following instructions apply to both online and offline Automation Suite installations. For offline installations, you must ensure temporary internet access to retrieve the required GPU driver dependencies. If you encounter issues while installing the GPU driver, contact Nvidia support.
- The GPU driver is stored under the /opt/nvidia and /usr folders. We highly recommend that these folders have at least 5 GB and 15 GB of free space, respectively, on the GPU agent machine.
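If you want to confirm that enough space is available before installing the driver, a quick read-only check such as the following can help. This is a minimal sketch, not part of the official procedure; adjust the paths to match your disk layout:
# Check free space on the folders that will hold the GPU driver
df -h /opt /usr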
- To install the GPU driver on the agent node, run the following command:
sudo yum install kernel kernel-tools kernel-headers kernel-devel
sudo reboot
sudo yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo sed 's/$releasever/8/g' -i /etc/yum.repos.d/epel.repo
sudo sed 's/$releasever/8/g' -i /etc/yum.repos.d/epel-modular.repo
sudo yum config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
sudo yum install cuda
- To install the container toolkits, run the following command:
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum-config-manager --enable nvidia-container-toolkit-experimental
sudo yum install -y nvidia-container-toolkit
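Optionally, you can confirm that the toolkit and its container runtime are present on the node before moving on. This is only a sanity check under the assumption that the packages above installed cleanly:
# Confirm the toolkit package and its runtime binary are available
rpm -q nvidia-container-toolkit
which nvidia-container-runtime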
Verify that the drivers are installed properly
Run the sudo nvidia-smi command on the node to verify that the drivers were installed properly.
At this point, the GPU drivers have been installed; you can proceed to add the GPU node to the cluster.
Adding a GPU node to the cluster
Step 1: Configuring the machine
Follow the steps for configuring the machine to ensure the disk is partitioned correctly and all networking requirements are met.
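Before continuing, it can be useful to review the disk layout and mount points on the prospective GPU node against the configuration requirements. The commands below are read-only and purely a convenience sketch, assuming a standard RHEL setup:
# Inspect disks, partitions, and mount points on the GPU node
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
df -h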
Step 2: Copying the interactive installer to the target machine
For online installation
- SSH to any of the server machines.
- Run the following command to copy the contents of the UiPathAutomationSuite folder to the GPU node (username and DNS are specific to the GPU node):
sudo su -
scp -r /opt/UiPathAutomationSuite <username>@<node dns>:/opt/
scp -r ~/* <username>@<node dns>:/opt/UiPathAutomationSuite/
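To confirm the copy succeeded before joining the node, you can list the target folder over SSH. This is an optional check; the <username> and <node dns> placeholders are the same ones used above:
# Verify the installer contents landed on the GPU node
ssh <username>@<node dns> "ls /opt/UiPathAutomationSuite"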
For Offline Installation
- SSH to any of the server nodes.
- Ensure that the /opt/UiPathAutomationSuite directory contains the sf-infra.tar.gz file (it is part of the installation package download step), then run the following command to copy it to the GPU node:
scp -r ~/opt/UiPathAutomationSuite <username>@<node dns>:/var/tmp
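As a quick sanity check, you can verify that the bundle arrived on the GPU node before starting the offline installation (same placeholders as above):
# Confirm sf-infra.tar.gz was copied to the GPU node
ssh <username>@<node dns> "ls -lh /var/tmp/UiPathAutomationSuite/sf-infra.tar.gz"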
Step 3: Running the Interactive Installation Wizard to Configure the Dedicated Node
For online installation
- SSH to the GPU Node.
- Run the following commands:
sudo su -
cd /opt/UiPathAutomationSuite
chmod -R 755 /opt/UiPathAutomationSuite
yum install unzip jq -y
CONFIG_PATH=/opt/UiPathAutomationSuite/cluster_config.json UNATTENDED_ACTION="accept_eula,download_bundle,extract_bundle,join_gpu" ./installUiPathAS.sh
For Offline Installation
- Connect via SSH to the GPU dedicated node.
- Install the platform bundle on the GPU dedicated node using the following script:
sudo su
mv /var/tmp/UiPathAutomationSuite /opt
cd /opt/UiPathAutomationSuite
chmod -R 755 /opt/UiPathAutomationSuite
./install-uipath.sh -i ./cluster_config.json -o ./output.json -k -j gpu --offline-bundle ./sf-infra.tar.gz --offline-tmp-folder /opt/UiPathAutomationSuite/tmp --install-offline-prereqs --accept-license-agreement
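After either installation variant completes, you can check from a server node that the GPU node has joined the cluster and is in the Ready state. This assumes kubectl is already configured on the server node, as it is after a standard Automation Suite installation:
# Run from a server node; the GPU node should appear with status Ready
kubectl get nodes -o wide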
Configuring the GPU Driver on the Cluster
Step 1: Installing the GPU Driver on the Cluster
- Ensure you are connected via SSH to the GPU machine.
- Update the containerd configuration of the GPU node by running the following commands:
cat <<EOF > gpu_containerd.sh
if ! nvidia-smi &>/dev/null; then
  echo "GPU Drivers are not installed on the VM. Please refer the documentation."
  exit 0
fi
if ! which nvidia-container-runtime &>/dev/null; then
  echo "Nvidia container runtime is not installed on the VM. Please refer the documentation."
  exit 0
fi
grep "nvidia-container-runtime" /var/lib/rancher/rke2/agent/etc/containerd/config.toml &>/dev/null && echo "GPU containerd changes already applied" && exit 0
awk '1;/plugins.cri.containerd]/{print " default_runtime_name = \"nvidia-container-runtime\""}' /var/lib/rancher/rke2/agent/etc/containerd/config.toml > /var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl
echo -e '\n[plugins.linux]\n runtime = "nvidia-container-runtime"' >> /var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl
echo -e '\n[plugins.cri.containerd.runtimes.nvidia-container-runtime]\n runtime_type = "io.containerd.runc.v2"\n [plugins.cri.containerd.runtimes.nvidia-container-runtime.options]\n BinaryName = "nvidia-container-runtime"' >> /var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl
EOF
sudo bash gpu_containerd.sh
- Restart rke2-agent by running the following command:
systemctl restart rke2-agent
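Once rke2-agent has restarted, a quick check that the containerd template now references the Nvidia runtime and that the agent came back up can save troubleshooting later. This is an optional sketch using the same paths as the script above:
# Confirm the Nvidia runtime was added to the containerd template
grep -n "nvidia-container-runtime" /var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl
# Confirm the agent is running again after the restart
systemctl is-active rke2-agent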
Step 2: Enabling the GPU in the Cluster
- Run the following commands from any of the server nodes.
- Navigate to the UiPathAutomationSuite folder:
cd /opt/UiPathAutomationSuite
Enabling the GPU in an Online Installation
DOCKER_REGISTRY_URL=$(cat defaults.json | jq -er ".registries.docker.url")
sed -i "s/REGISTRY_PLACEHOLDER/${DOCKER_REGISTRY_URL}/g" ./Infra_Installer/gpu_plugin/nvidia-device-plugin.yaml
kubectl apply -f ./Infra_Installer/gpu_plugin/nvidia-device-plugin.yaml
kubectl -n kube-system rollout restart daemonset nvidia-device-plugin-daemonset
Enabling the GPU in an Offline Installation
DOCKER_REGISTRY_URL=localhost:30071
sed -i "s/REGISTRY_PLACEHOLDER/${DOCKER_REGISTRY_URL}/g" ./Infra_Installer/gpu_plugin/nvidia-device-plugin.yaml
kubectl apply -f ./Infra_Installer/gpu_plugin/nvidia-device-plugin.yaml
kubectl -n kube-system rollout restart daemonset nvidia-device-plugin-daemonset
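After applying the device plugin in either scenario, you can verify from a server node that the daemonset rolled out and that the GPU node now advertises the nvidia.com/gpu resource. This is an optional check; <gpu-node-name> is a placeholder for your GPU node's name:
# Wait for the device plugin daemonset to finish rolling out
kubectl -n kube-system rollout status daemonset nvidia-device-plugin-daemonset
# Check that the GPU node exposes the nvidia.com/gpu resource
kubectl describe node <gpu-node-name> | grep -i "nvidia.com/gpu"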