2021.10
OUT OF SUPPORT
Automation Suite Installation Guide
Last updated Nov 21, 2024

Backing up and Restoring the Cluster

To use the backup and restore functionality, you need to set up an NFS Server, the Backup Cluster, and the Restore Cluster. All three are defined below.

Terminology

The NFS Server is the server that stores the backup data and facilitates the restoration. You can set up the NFS server on any machine or a PaaS service offered by cloud providers. Note that we do not support Windows-based NFS and Azure blob-based NFS.

The Backup Cluster is where the Automation Suite is installed. This refers to the cluster you set up during installation.

The Restore Cluster is the cluster where you would like to restore all the data from the Backup Cluster. This cluster becomes the new cluster where you run the Automation Suite after the restoration is complete.

The following steps show how to set all three up.

Environment Prerequisites

Important:
  • This step does not enable the backup of external data sources (such as SQL Server). You need to enable the external data source backup separately.
  • We do not support cross-zone backup and restore.
  • The NFS Server should be reachable from all cluster nodes (both backup and restore clusters).
  • The cluster you want to back up and the NFS server must be in the same region.
  • Before the cluster restore, make sure to disable the backup as described in Disabling the cluster backup.
  • Make sure to enable the following ports:

    | Port | Protocol | Source | Destination | Purpose | Requirements |
    |---|---|---|---|---|---|
    | 2049, 111 | TCP | NFS Server | All nodes in backup cluster | Data sync between backup cluster and NFS Server | This communication should be allowed from the NFS Server to the backup cluster nodes before running Step 2: Enabling the cluster backup. |
    | 2049, 111 | TCP | All nodes in backup cluster | NFS Server | Data sync between backup cluster and NFS Server | This communication should be allowed from the backup cluster nodes to the NFS Server before running Step 2: Enabling the cluster backup. |
    | 2049, 111 | TCP | NFS Server | All nodes in restore cluster | Data sync between NFS Server and restore cluster | This communication should be allowed from the NFS Server to the restore cluster nodes before running Step 3: Setting up the Restore Cluster. |
    | 2049, 111 | TCP | All nodes in restore cluster | NFS Server | Data sync between NFS Server and restore cluster | This communication should be allowed from the restore cluster nodes to the NFS Server before running Step 3: Setting up the Restore Cluster. |
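The port requirements above can be spot-checked from a cluster node before enabling the backup. The following is a minimal sketch, not part of the installer: the function names port_open and check_nfs_ports are hypothetical, and it assumes a bash build with /dev/tcp support plus the coreutils timeout command. It only tests the node-to-NFS-Server direction.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a TCP connection to host:port
# can be opened within 3 seconds (uses bash's /dev/tcp redirection).
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Check the two NFS-related ports listed in the table (2049 and 111)
# and print one result line per port. Returns non-zero if any port fails.
check_nfs_ports() {
  local host="$1" port rc=0
  for port in 2049 111; do
    if port_open "$host" "$port"; then
      echo "OK   ${host}:${port}"
    else
      echo "FAIL ${host}:${port}"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, running check_nfs_ports nfs.example.com on each backup and restore node (nfs.example.com is a placeholder for your NFS Server) gives a quick view of which nodes cannot reach the server.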

Step 1: Setting up the External NFS Server

Requirements

The NFS Server must meet the following requirements:

  • You can set up the NFS server on any machine and any OS of your choice or alternatively use any PaaS service offered by cloud providers. Note that we do not support Windows-based NFS and Azure blob-based NFS.

  • The NFS Server version must be NFSv4 on Linux.

  • The NFS Server must run outside the Backup Cluster and the Restore Cluster.

  • The NFS Server disk size must be greater than the data disk size of the primary server node.

See Hardware requirements for more details.

Step 1.1: Installing NFS Libraries

Important: Ignore Step 1.1 if you already have an NFS server.
Install the nfs-utils library on the node you plan to use as the NFS Server:
dnf install nfs-utils -y
systemctl start nfs-server.service
systemctl enable nfs-server.service

Step 1.2: Configuring the Mount Path

Configure the mount path that you want to expose from the NFS Server.

chown -R nobody: "/datadisk"
chmod -R 777 "/datadisk"
systemctl restart nfs-utils.service
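The commands above assume that the mount path directory already exists. As a minimal sketch, the hypothetical helper below reports the octal permission mode of a directory so you can confirm the change took effect. It assumes GNU stat, as found on typical Linux distributions.

```shell
# Hypothetical pre-check: print the octal permission mode of an export
# directory, or fail with a message if the directory does not exist.
export_dir_mode() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  stat -c '%a' "$dir"   # GNU stat: %a is the access mode in octal
}
```

For example, export_dir_mode /datadisk can be run after the chmod command to inspect the resulting mode.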

Step 1.3: Disabling the Firewall

Firewalld is a security library that manages networking and firewall rules.

See official Firewalld documentation for more details.

To disable Firewalld, run the following commands:

systemctl stop firewalld
systemctl disable firewalld

Step 1.4: Allowing Access of NFS Mount Path to All Backup and Restore Nodes

All nodes must be able to access the NFS mount path. On the NFS Server, go to the /etc/exports file, and add an entry for the FQDN for each node (both server and agent) for both the Backup Cluster and the Restore Cluster.

The following example adds an entry that specifies the FQDN of a machine and the corresponding permissions for that machine:

echo "/datadisk sfdev1868610-d053997f-node.eastus.cloudapp.azure.com(rw,sync,no_all_squash,root_squash)" >> /etc/exports

Then run the following commands to export the mount path:

exportfs -arv
exportfs -s
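Because every server and agent node of both clusters needs its own entry, it can help to generate the entries from a list. The following is a sketch under assumptions: the function names and the idea of a node list file (one FQDN per line) are hypothetical, and the export options are copied from the example above.

```shell
# Print one /etc/exports entry for a mount path and a node FQDN,
# using the same export options as the documented example.
exports_entry() {
  local mount_path="$1" fqdn="$2"
  printf '%s %s(rw,sync,no_all_squash,root_squash)\n' "$mount_path" "$fqdn"
}

# Print one entry per FQDN listed in a file (one FQDN per line).
# Blank lines are skipped; nothing is written to /etc/exports here.
exports_entries_from_file() {
  local mount_path="$1" list_file="$2" fqdn
  while IFS= read -r fqdn || [ -n "$fqdn" ]; do
    [ -n "$fqdn" ] && exports_entry "$mount_path" "$fqdn"
  done < "$list_file"
  return 0
}
```

For example, exports_entries_from_file /datadisk nodes.txt prints the entries for review (nodes.txt is a hypothetical file name). After checking the output, append it to /etc/exports and run the exportfs commands.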

Step 2: Enabling the Cluster Backup

Important:
  • Make sure you have followed the Environment prerequisites step.
  • Make sure to back up the cluster_config.json file used for installation.
  • This step does not enable the backup of external data sources (such as SQL Server). You need to enable the external data source backup separately.
  • It is not recommended to reduce the backup interval to less than 15 minutes.
  • Automation Suite does not make a backup of all the Persistent Volumes, such as the volumes attached to the training pipeline in AI Center. A backup is created only for a few Persistent Volumes such as Alert Manager, Prometheus, Docker Registry, MongoDB, RabbitMQ, Ceph Objectstore, and Insights.
Create a file and name it backup.json. Make sure to fill it out based on the field definitions below.

Backup.json

{
  "backup": {
    "etcdBackupPath": "PLACEHOLDER",
    "nfs": {
      "endpoint": "PLACEHOLDER",
      "mountpath": "PLACEHOLDER"
    }
  },
  "backup_interval": "15"
}
  • backup.etcdBackupPath — Relative path where the backup data will be stored on the NFS Server
  • backup.nfs.endpoint — Endpoint of the NFS Server (IP address or DNS name)
  • backup.nfs.mountpath — Path on the NFS Server (endpoint)
  • backup_interval — The backup time interval in minutes.
In the following example, the backup data will be stored under /datadisk/backup/cluster0 on the NFS server:
{
  "backup": {
    "etcdBackupPath": "cluster0",
    "nfs": {
      "endpoint": "20.224.01.66",
      "mountpath": "/datadisk"
    }
  }
}
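As a sketch, a backup.json file with the fields described above could be rendered by a small shell function. The function names are hypothetical, and treating the 15-minute recommendation as a hard limit is an assumption of this sketch, not a rule enforced by the installer. Values are inserted as-is, without JSON escaping.

```shell
# Succeed only if the interval is a whole number of minutes, at least 15.
# (Assumption: the documented 15-minute recommendation is enforced here.)
interval_ok() {
  local minutes="$1"
  [[ "$minutes" =~ ^[0-9]+$ ]] && [ "$minutes" -ge 15 ]
}

# Print a backup.json document to stdout.
# Arguments: etcdBackupPath, NFS endpoint, NFS mount path, interval (default 15).
render_backup_json() {
  local etcd_path="$1" endpoint="$2" mount_path="$3" interval="${4:-15}"
  if ! interval_ok "$interval"; then
    echo "backup_interval must be a number of minutes >= 15: $interval" >&2
    return 1
  fi
  cat <<EOF
{
  "backup": {
    "etcdBackupPath": "${etcd_path}",
    "nfs": {
      "endpoint": "${endpoint}",
      "mountpath": "${mount_path}"
    }
  },
  "backup_interval": "${interval}"
}
EOF
}
```

For example, render_backup_json cluster0 nfs.example.com /datadisk 30 > backup.json writes a file with a 30-minute interval, where nfs.example.com stands in for your NFS Server endpoint.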

Step 2.1: Enabling the Backup on the Primary Node of the Cluster

To enable the backup on the primary node of the cluster, run the following command:

./install-uipath.sh -i backup.json -o output.json -b --accept-license-agreement

Step 2.2: Enabling the Backup on Secondary Nodes of the Cluster

To enable the backup on secondary nodes of the cluster, run the following command on each secondary server node:

./install-uipath.sh -i backup.json -o output.json -b -j server --accept-license-agreement

Step 2.3: Enabling the Backup on Agent Nodes of the Cluster

To enable the backup on agent nodes of the cluster, run the following command:

./install-uipath.sh -i backup.json -o output.json -b -j agent --accept-license-agreement
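The three commands in Steps 2.1 to 2.3 differ only in the -j flag. The sketch below makes that mapping explicit. The function name backup_enable_cmd and the role labels are hypothetical; the flags are the ones shown above. The function only prints the command and does not run it.

```shell
# Print the backup-enable command for a node role.
# Roles: primary (no -j flag), server (secondary server node), agent.
backup_enable_cmd() {
  local role="$1" join=""
  case "$role" in
    primary)      join="" ;;
    server|agent) join=" -j ${role}" ;;
    *) echo "unknown role: $role" >&2; return 1 ;;
  esac
  echo "./install-uipath.sh -i backup.json -o output.json -b${join} --accept-license-agreement"
}
```

For example, backup_enable_cmd server prints the command to run on a secondary server node.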

Step 3: Setting up the Restore Cluster

Important:
  • Make sure the backup is disabled before restoring the cluster. See Disabling the cluster backup.
  • Make sure the wget, unzip, and jq packages are available on all restore nodes.
  • Make sure you have followed the Environment prerequisites step.
  • All external data sources (such as SQL Server) must be the same as the ones used by the Backup Cluster.
  • Restart the NFS Server before cluster restoration. Execute the following command on the NFS Server node: systemctl restart nfs-server.
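The package requirement in the list above can be verified on each restore node before starting. The following is a minimal sketch; the function name missing_tools is hypothetical, and it only checks that the commands are on the PATH.

```shell
# Print the name of every listed command that is not found on the PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
  local tool rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, missing_tools wget unzip jq prints nothing and succeeds when all three commands are available.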

Restore Cluster Requirements

Restore.json

{
  "fixed_rke_address": "PLACEHOLDER",
  "gpu_support": false,
  "fqdn": "PLACEHOLDER",
  "rke_token": "PLACEHOLDER",
  "restore": {
    "etcdRestorePath": "PLACEHOLDER",
    "nfs": {
      "endpoint": "PLACEHOLDER",
      "mountpath": "PLACEHOLDER"
    }
  },
  "infra": {
    "docker_registry": {
      "username": "PLACEHOLDER",
      "password": "PLACEHOLDER"
    }
  }
}
  • fqdn — The load balancer FQDN for the multi-node HA-ready production mode or the machine FQDN for the single-node evaluation mode
  • fixed_rke_address — The fqdn of the load balancer if one is configured, otherwise it is the fqdn of the first restore server node. Used to load balance node registration and kube API request.
  • gpu_support — Use true or false to enable or disable GPU support for the cluster (use if you have agent nodes with GPUs).
  • rke_token — This is a pre-shared, cluster-specific secret. This should be the same as Backup Cluster and can be found in the cluster_config.json file. It is needed for all the nodes joining the cluster.
  • restore.etcdRestorePath — Path where backup data is stored for the cluster in NFS Server. Configured at Backup with etcdBackupPath.
  • restore.nfs.endpoint — Endpoint of NFS Server.
  • restore.nfs.mountpath: Mount Path of NFS Server.
  • infra.docker_registry.username — The username that you have set in the Backup Cluster. It can be found in the cluster_config.json file and is needed for the docker registry.
  • infra.docker_registry.password — The password that you have set in the Backup Cluster. It can be found in the cluster_config.json file and is needed for the docker registry installation.
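Before running the restore, it can be useful to confirm that every PLACEHOLDER value in restore.json has been filled in. The following is a sketch with a hypothetical function name; it performs a plain text search and does not validate the JSON itself.

```shell
# Succeed only if the file exists and contains no remaining PLACEHOLDER text.
# Any offending lines are printed with their line numbers.
no_placeholders() {
  local file="$1"
  if [ ! -f "$file" ]; then
    echo "not found: $file" >&2
    return 1
  fi
  if grep -n 'PLACEHOLDER' "$file"; then
    return 1
  fi
  return 0
}
```

For example, no_placeholders restore.json returns a non-zero status and lists the unfilled fields if any remain.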

Online Installation

Step 3.1: Restoring etcd on the primary node of the cluster

To restore etcd on the primary node of the cluster, run the following command:
./install-uipath.sh -i restore.json -o output.json -r --accept-license-agreement --install-type online

Step 3.2: Restoring etcd on secondary nodes of the cluster

To restore etcd on secondary nodes of the cluster, run the following command:
./install-uipath.sh -i restore.json -o output.json -r -j server --accept-license-agreement --install-type online
Important: Node role is mandatory for all secondary server nodes.

Step 3.3: Restoring etcd on agent nodes of the cluster

To restore etcd on agent nodes of the cluster, run the following command:
./install-uipath.sh -i restore.json -o output.json -r -j agent --accept-license-agreement --install-type online

Step 3.4: Disabling maintenance mode

Note: This step is required only if the restore is part of the rollback operation during upgrade.
Once the etcd restore is complete, make sure you disable the maintenance mode:
/path/to/old-installer/configureUiPathAS.sh disable-maintenance-mode

To verify the maintenance mode is disabled, run the following command:

/path/to/old-installer/configureUiPathAS.sh is-maintenance-enabled

Step 3.5: Running volume restore on primary node

Once the etcd restore is complete, run volume restore on the primary node using the following command:
./install-uipath.sh -i restore.json -o output.json -r --volume-restore --accept-license-agreement --install-type online

Step 3.6: Installing the Automation Suite cluster certificate on the restore primary node

sudo ./configureUiPathAS.sh tls-cert get --outpath /opt/
cp /opt/ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust

Enabling AI Center on the Restored Cluster

After restoring an Automation Suite cluster with AI Center™ enabled, follow the steps from the Enabling AI Center on the Restored Cluster procedure.

Offline Installation

Step 3.1: Restoring etcd on the primary node of the cluster

To restore etcd on the primary node of the cluster, run the following command:
./install-uipath.sh -i restore.json -o output.json -r --offline-bundle "/uipath/sf-infra-bundle.tar.gz" --offline-tmp-folder /uipath --install-offline-prereqs --accept-license-agreement --install-type offline

Step 3.2: Restoring etcd on secondary nodes of the cluster

To restore etcd on secondary nodes of the cluster, run the following command:
./install-uipath.sh -i restore.json -o output.json -r -j server --offline-bundle "/uipath/sf-infra-bundle.tar.gz" --offline-tmp-folder /uipath --install-offline-prereqs --accept-license-agreement --install-type offline

Step 3.3: Restoring etcd on agent nodes of the cluster

To restore etcd on agent nodes of the cluster, run the following command:
./install-uipath.sh -i restore.json -o output.json -r -j agent --offline-bundle "/uipath/sf-infra-bundle.tar.gz" --offline-tmp-folder /uipath --install-offline-prereqs --accept-license-agreement --install-type offline

Step 3.4: Disabling maintenance mode

Note: This step is required only if the restore is part of the rollback operation during upgrade.
Once the etcd restore is complete, make sure you disable the maintenance mode:
/path/to/old-installer/configureUiPathAS.sh disable-maintenance-mode

To verify the maintenance mode is disabled, run the following command:

/path/to/old-installer/configureUiPathAS.sh is-maintenance-enabled

Step 3.5: Running volume restore on primary node

Once the etcd restore is complete, run volume restore on the primary node using the following command:
./install-uipath.sh -i restore.json -o ./output.json -r --volume-restore --accept-license-agreement --install-type offline

Step 3.6: Installing the Automation Suite cluster certificate on the restore primary node

sudo ./configureUiPathAS.sh tls-cert get --outpath /opt/
cp /opt/ca.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust

Enabling AI Center on the Restored Cluster

After restoring an Automation Suite cluster with AI Center™ enabled, follow the steps from the Enabling AI Center on the Restored Cluster procedure.

Disabling the Cluster Backup

Important: When enabled, the cluster backup saves data at the interval set by the backup_interval parameter. If you disable the cluster backup, any data created between the last scheduled run and the time you disabled the backup is not backed up and will be lost.

To disable the backup, run the following commands in this order:

  1. Disable the backup on the primary node of the cluster.
    ./install-uipath.sh -i backup.json -o output.json -b --disable-backup --accept-license-agreement
  2. Disable the backup on secondary nodes of the cluster.
    ./install-uipath.sh -i backup.json -o output.json -b -j server --disable-backup --accept-license-agreement
  3. Disable the backup on agent nodes of the cluster.
    ./install-uipath.sh -i backup.json -o output.json -b -j agent --disable-backup --accept-license-agreement

Additional Configurations

Updating the NFS Server

Important: Make sure the backup is disabled before updating the NFS server. See Disabling the cluster backup for details.

To update the NFS server, do the following:

  1. Re-run the following steps:
    1. Step 1: Setting up the External NFS Server
    2. Step 2: Enabling the Cluster Backup
    3. Step 3: Setting up the Restore Cluster
  2. Update the NFS Server information, and then include the new nfs.endpoint in both the backup.json and restore.json files.
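The endpoint update in both files can be scripted. The following is a sketch under assumptions: the function name set_nfs_endpoint is hypothetical, the files are assumed to use the layout shown earlier in this guide (a single "endpoint" key with a quoted string value), and the new endpoint is assumed to be a plain IP address or DNS name without the characters | or &. It is a text substitution with sed, not a JSON-aware edit.

```shell
# Replace the value of every "endpoint" key in a JSON file.
# The result is written back to the same file, keeping its permissions.
set_nfs_endpoint() {
  local file="$1" new_endpoint="$2" tmp rc
  tmp="$(mktemp)" || return 1
  sed -E "s|(\"endpoint\"[[:space:]]*:[[:space:]]*\")[^\"]*\"|\1${new_endpoint}\"|" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

For example, calling set_nfs_endpoint backup.json nfs2.example.com and then set_nfs_endpoint restore.json nfs2.example.com points both files at a new server, where nfs2.example.com is a placeholder.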

Adding a New Node to Cluster

To add a new node to the cluster, re-run the following steps:

  1. Step 1: Setting up the External NFS Server
  2. Step 2: Enabling the Cluster Backup

Known Issues

Redis Restore

The Redis restore fails when you run the cluster restore, so you need to perform a few additional steps.

Follow the steps in the Troubleshooting section.

Important: Once Redis is restored, make sure to restart the Orchestrator pods.

Insights Looker Pod Fails to Start After Restore

You can fix this issue by deleting the Looker pod from the Insights application in ArgoCD UI. The deployment will create a new pod that should start successfully.
