Automation Suite on EKS/AKS installation guide

Last updated Nov 13, 2025

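Kubernetes cluster and nodes

Cluster and permissions

You can bring your own Kubernetes cluster and follow your standard practices to provision and manage it.

If you grant admin privileges to the Automation Suite installer, UiPath™ installs and manages all the components required to run Automation Suite. If you cannot grant the installer admin privileges on the cluster, it cannot install some of the required components. In that case, an admin user must install those components separately before installing the Automation Suite platform, by taking the following main steps:

Install and configure the Istio service mesh. For details, see Installing and configuring the service mesh.
Bring your own ArgoCD. For details, see Installing and configuring the GitOps tool.
Create and manage the certificates yourself. For details, see Certificates generated during installation.
Create a service account and grant it the permissions required for the Automation Suite installation. For details, see Granting installation permissions.

Once the required components are installed, you can run the installer with lower privileges. For the list of required permissions, see Granting installation permissions.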

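Supported EKS/AKS versions

Each Automation Suite long-term support version comes with a compatibility matrix. For compatible EKS or AKS versions, see the Compatibility matrix.

We tested Automation Suite compatibility with the following Linux operating systems: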

Cloud provider	OS
AKS	Ubuntu 22.04
EKS	Amazon Linux 2 (up to EKS 1.32); Amazon Linux 2023 (all EKS versions); Bottlerocket 1.48.0

Automation Suite on EKS/AKS supports only the x86 architecture; ARM64 is not supported.

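Node capacity

To estimate the node capacity based on your product and scale requirements, use the UiPath Automation Suite Install Sizing Calculator.

The root volume requirement for agent (worker) nodes is 256 GB.

To start with the mandatory platform services (Identity, licensing, and routing) and Orchestrator, you must provision 8 vCPUs and 16 GB of RAM per node.

Note:

We do not recommend using spot instances with Automation Suite in production scenarios, due to stability and performance issues.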

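Swap memory

You must disable swap memory before installing Automation Suite. Swap memory is known to cause issues with container workloads. In addition, Automation Suite workloads do not benefit from swap memory, and Kubernetes already optimizes memory usage.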
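One possible way to verify that swap is off on a Linux node is to read the SwapTotal field of /proc/meminfo. The following is a minimal sketch, not part of the Automation Suite tooling; the function names are hypothetical, and it assumes you have shell access to the node, which managed node pools may not offer directly.

```python
def swap_total_kib(meminfo_text: str) -> int:
    """Return the SwapTotal value (in kB as reported) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        # Lines look like "SwapTotal:       2097148 kB"
        if line.startswith("SwapTotal:"):
            return int(line.split()[1])
    raise ValueError("SwapTotal not found in input")


def swap_is_disabled(meminfo_text: str) -> bool:
    """Swap counts as disabled when the kernel reports no swap space at all."""
    return swap_total_kib(meminfo_text) == 0


# Usage on a node (hypothetical helper names, real file path):
#   with open("/proc/meminfo") as f:
#       print("swap disabled:", swap_is_disabled(f.read()))
```

A result of False suggests that swap is still configured on that node and should be turned off before the installation.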

Autoscaling

We recommend enabling autoscaling on your cluster to ensure high reliability and avoid business interruptions.

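Additional Automation Suite Robots requirements

Automation Suite Robots require additional worker nodes.

The hardware requirements for the Automation Suite Robots node depend on how you plan to use your resources. On top of the other agent node requirements, you also need at least 10 GB of file storage to enable package caching.

For details, see the Storage documentation.

The following sections describe the factors that affect the amount of hardware the Automation Suite Robots node requires.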

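Robot sizes

The following table describes the CPU, memory, and storage required for each robot size.

Size	CPU	Memory	Storage
Small	0.5	1 GB	1 GB
Standard	1	2 GB	2 GB
Medium	2	4 GB	4 GB
Large	6	10 GB	10 GB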

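Agent node size

The resources of the Automation Suite Robots agent node determine how many jobs can run concurrently: the number of CPU cores and the amount of RAM are divided by the CPU and memory requirements of the job.

For example, a node with 16 CPUs and 32 GB of RAM can run any of the following:

32 Small jobs
16 Standard jobs
8 Medium jobs
2 Large jobs

Job sizes can be mixed, so at any given moment the same node can run a combination of jobs, such as the following:

10 Small jobs (consuming 5 CPUs and 10 GB of memory)
4 Standard jobs (consuming 4 CPUs and 8 GB of memory)
3 Medium jobs (consuming 6 CPUs and 12 GB of memory)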
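The arithmetic behind these figures can be sketched as follows. This is an illustrative calculation only, using the robot sizes table; the names ROBOT_SIZES, max_jobs, and mix_fits are hypothetical, and CPU is expressed in milli-CPU to keep the arithmetic in integers.

```python
# Robot sizes from the table: (milli-CPU, memory in GB) per job
ROBOT_SIZES = {
    "small": (500, 1),
    "standard": (1000, 2),
    "medium": (2000, 4),
    "large": (6000, 10),
}


def max_jobs(node_mcpu: int, node_mem_gb: int, size: str) -> int:
    """Jobs of one size that fit: the scarcer of CPU and memory sets the limit."""
    mcpu, mem_gb = ROBOT_SIZES[size]
    return min(node_mcpu // mcpu, node_mem_gb // mem_gb)


def mix_fits(node_mcpu: int, node_mem_gb: int, mix: dict) -> bool:
    """Check whether a combination of jobs stays within the node's CPU and memory."""
    used_mcpu = sum(ROBOT_SIZES[size][0] * count for size, count in mix.items())
    used_mem = sum(ROBOT_SIZES[size][1] * count for size, count in mix.items())
    return used_mcpu <= node_mcpu and used_mem <= node_mem_gb


# A node with 16 CPUs (16000 milli-CPU) and 32 GB of RAM
for size in ROBOT_SIZES:
    print(size, max_jobs(16000, 32, size))
print(mix_fits(16000, 32, {"small": 10, "standard": 4, "medium": 3}))
```

For Large jobs, CPU is the limiting resource: two jobs use 12 of the 16 CPUs, and a third would need 18.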

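Kubernetes resource consumption

Because the node is part of a Kubernetes cluster, the Kubernetes agent on the server (kubelet) consumes a small amount of resources. Based on our measurements, the kubelet consumes the following resources:

0.6 CPU
0.4 GB of memory

A node like the one described above therefore has approximately 15.4 CPUs and 31.6 GB of RAM available.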
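As a rough illustration of how this overhead reduces job capacity, the sketch below subtracts the measured kubelet usage before dividing by the job size. It is an estimate under stated assumptions: the helper names are hypothetical, 1 GB is treated as 1000 MB to keep the arithmetic in integers, and any other system workloads on the node are ignored.

```python
KUBELET_MCPU = 600     # 0.6 CPU, in milli-CPU
KUBELET_MEM_MB = 400   # 0.4 GB, assuming 1 GB = 1000 MB for simplicity

# Robot sizes: (milli-CPU, memory in MB) per job
SIZES = {
    "small": (500, 1000),
    "standard": (1000, 2000),
    "medium": (2000, 4000),
    "large": (6000, 10000),
}


def usable_resources(node_cpus: int, node_mem_gb: int) -> tuple:
    """Return (milli-CPU, MB) left for jobs after the kubelet overhead."""
    return node_cpus * 1000 - KUBELET_MCPU, node_mem_gb * 1000 - KUBELET_MEM_MB


def max_jobs_after_overhead(node_cpus: int, node_mem_gb: int, size: str) -> int:
    """Jobs of one size that fit once the kubelet share is removed."""
    mcpu, mem_mb = usable_resources(node_cpus, node_mem_gb)
    need_mcpu, need_mem_mb = SIZES[size]
    return min(mcpu // need_mcpu, mem_mb // need_mem_mb)


print(usable_resources(16, 32))
for size in SIZES:
    print(size, max_jobs_after_overhead(16, 32, size))
```

Under these assumptions, the 16 CPU / 32 GB node would fit slightly fewer Small, Standard, and Medium jobs than the nominal figures, which may be worth factoring into node pool sizing.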

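Automatic machine size selection

By default, the Automation Suite Robots option of every cross-platform process is set to Automatic. This setting selects the appropriate machine size for running the process using serverless robots.

When the size is selected automatically, the criteria listed in the following table are evaluated in order. As soon as one criterion is satisfied, the corresponding machine size is chosen and the remaining criteria are not evaluated.

Order	Criterion	Machine size
1	Remote debugging job	Medium
2	Process depends on UI Automation OR process depends on UiPath Document Understanding Activities	Standard
3	Other unattended process	Small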
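The first-match evaluation described by the table can be sketched as a chain of early returns. This is only an illustration of the ordering; the function and parameter names are hypothetical and do not correspond to a UiPath API.

```python
def select_machine_size(
    is_remote_debugging: bool,
    uses_ui_automation: bool,
    uses_document_understanding: bool,
) -> str:
    """Evaluate the criteria in table order and stop at the first match."""
    # 1. Remote debugging jobs take precedence over everything else.
    if is_remote_debugging:
        return "Medium"
    # 2. A dependency on UI Automation or Document Understanding Activities.
    if uses_ui_automation or uses_document_understanding:
        return "Standard"
    # 3. Any other unattended process.
    return "Small"


print(select_machine_size(True, True, False))
print(select_machine_size(False, False, True))
print(select_machine_size(False, False, False))
```

Because evaluation stops at the first match, a remote debugging job for a process that depends on UI Automation still resolves to Medium rather than Standard.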

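Additional Document Understanding recommendations

For increased performance, you can install Document Understanding on an additional agent node with GPU support. Note, however, that AI Center-based projects in Document Understanding are fully functional without a GPU node. Document Understanding uses CPU VMs for all its extraction and classification tasks, while for OCR we strongly recommend a GPU VM.

For more details about CPU/GPU usage within the Document Understanding framework, see CPU and GPU usage.

If you want to use an additional node with GPU support, you must meet the following requirements:

Hardware	Minimum requirement
Processor	8 (v-)CPU/cores
RAM	52 GB
OS disk	256 GB SSD; min IOPS: 1100
Data disk	N/A
GPU RAM	11 GB

When adding a GPU node pool, make sure that you use --node-taints nvidia.com/gpu=present:NoSchedule instead of --node-taints sku=gpu:NoSchedule.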

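Important: To ensure that GPU workloads are scheduled correctly, make sure the YAML configuration of your DaemonSet (NFD or NVIDIA GPU Operator) contains a matching tolerations block. You can use the following example: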

tolerations:
  - key: "nvidia.com/gpu"
    operator: "Equal"
    value: "present"
    effect: "NoSchedule"
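To see why the toleration has to match the taint exactly, the sketch below mirrors the general Kubernetes matching rule as we understand it: the effect and key must agree, and the value is compared only when the operator is Equal. The function name and the dictionaries are illustrative; this is not code from Kubernetes or Automation Suite.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if the toleration matches the taint (simplified sketch)."""
    # An empty effect on the toleration matches every effect.
    effect = toleration.get("effect", "")
    if effect and effect != taint.get("effect", ""):
        return False
    # An empty key on the toleration matches every key.
    key = toleration.get("key", "")
    if key and key != taint.get("key", ""):
        return False
    operator = toleration.get("operator", "Equal")  # Equal is the default
    if operator == "Exists":
        return True  # the value is ignored
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


gpu_taint = {"key": "nvidia.com/gpu", "value": "present", "effect": "NoSchedule"}
sku_taint = {"key": "sku", "value": "gpu", "effect": "NoSchedule"}
example_toleration = {
    "key": "nvidia.com/gpu",
    "operator": "Equal",
    "value": "present",
    "effect": "NoSchedule",
}

print(tolerates(example_toleration, gpu_taint))
print(tolerates(example_toleration, sku_taint))
```

With the sku=gpu:NoSchedule taint, the example toleration does not match, so pods carrying it would not be scheduled on those nodes.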

Automation Suite supports NVIDIA GPUs. To learn how to configure NVIDIA GPUs (for example, the drivers), see the respective documentation for Azure or AWS.

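Additional Document Understanding modern projects requirements

With CPU inference activated, a minimum of 2 GPUs is required. To enable CPU inference, set the enable_cpu_inference property to true, as described in the Enabling or disabling Document Understanding section.

Note:

Inference can be up to 10 times slower.
We recommend using it for documents of at most 125 pages. No hard limit is enforced, but inference may fail for documents larger than 125 pages.

Without CPU inference, Document Understanding modern projects require a minimum of 5 GPUs. The example scenario in the following table shows that 5 GPUs are enough to process 300 pages.

Note: For Document Understanding modern projects, the minimum recommended GPU is NVIDIA T4.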

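Function	Number
Custom model pages processed per hour	300
Out-of-the-box model pages processed per hour	0
Models trained in parallel	1
Number of pages in all projects - design time	200
Number of document types per project version	3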

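The 5 GPUs are distributed among the different functions as detailed in the following table:

Service	Number of GPUs
OCR replicas	1
Custom model training replicas	1
Custom model replicas	2
Out-of-the-box model replicas	1
Total	5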

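For more information on how to allocate GPUs to each service, see the Allocating GPU resources for Document Understanding modern projects page.

In addition to the GPU requirements, Document Understanding modern projects need a minimum of 18 vCPUs for optimal performance.

Document Understanding modern projects require an additional 4 TB of objectstore to run the activities from the provided example continuously for one year. You can start with less, but the activities fail once the storage is full unless you explicitly scale it up.

If you provision for one year of continuous processing, you need 4 TB for Document Understanding modern projects and 512 GB for the other products, for a total of 4.5 TB of storage. Similarly, if you start with six months of processing, you need 2 TB for Document Understanding modern projects and 512 GB for the other products, for a total of 2.5 TB.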
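The two totals above follow the same pattern, which can be sketched as a small calculation. This assumes that the Document Understanding share scales linearly with the processing period, as the one-year and six-month figures suggest, and that 512 GB is counted as 0.5 TB; the constant and function names are hypothetical. For real sizing, the calculator remains the reference.

```python
DU_TB_PER_YEAR = 4.0       # modern projects, for the example scenario
OTHER_PRODUCTS_TB = 0.5    # 512 GB for the other products


def objectstore_tb(months: int) -> float:
    """Estimated objectstore size in TB for a given processing period in months."""
    return DU_TB_PER_YEAR * months / 12 + OTHER_PRODUCTS_TB


print(objectstore_tb(12))
print(objectstore_tb(6))
```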

Note: For more detailed calculations and the capacity your needs require, see the UiPath Automation Suite Install Sizing Calculator.

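Provisioning a MIG-enabled GPU

Automation Suite Document Understanding workloads support running on virtual GPUs (VGPUs) created with NVIDIA MIG (Multi-Instance GPU) technology.

To run Document Understanding under these conditions, keep the following requirements in mind:

GPU memory (VRAM): at least 16 GB per VGPU

Note: UiPath supports only the single strategy (the default strategy), which means that all VGPUs are identical.
Storage: at least 80 GB per VGPU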

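Enabling MIG-enabled GPUs in Kubernetes

After you provision MIG-enabled GPUs in the cluster with a profile that meets or exceeds the minimum requirements above, make sure the GPUs are schedulable by Kubernetes. A node must report a non-zero number of GPUs before workloads can be scheduled on it.

To make the GPUs schedulable, you have two options:

Option A: Follow the official GPU setup documentation of your cloud provider:
- Azure Kubernetes Service (AKS)
- Amazon Elastic Kubernetes Service (Amazon EKS)

Option B (optional): Deploy the NVIDIA device plugin directly:

Create a new namespace:

kubectl create namespace gpu-resources

Apply the following configuration, replacing migEnabledPoolName with the label value that matches your GPU nodes: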

apiVersion: v1
kind: Pod
metadata:
  name: nvidia-device-plugin-pod
  namespace: gpu-resources
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: agentpool
            operator: In
            values:
            # To be changed to a selector that matches the GPU nodes
            - migEnabledPoolName
  containers:
  - args:
    - --fail-on-init-error=false
    env:
    - name: MPS_ROOT
      value: /run/nvidia/mps
    - name: MIG_STRATEGY
      # We only support the single strategy for now
      value: single
    - name: NVIDIA_MIG_MONITOR_DEVICES
      value: all
    - name: NVIDIA_VISIBLE_DEVICES
      value: all
    - name: NVIDIA_DRIVER_CAPABILITIES
      value: compute,utility
    image: nvcr.io/nvidia/k8s-device-plugin:v0.17.3
    imagePullPolicy: IfNotPresent
    name: nvidia-device-plugin-ctr
    securityContext:
      allowPrivilegeEscalation: true
      capabilities:
        add:
        - SYS_ADMIN
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/lib/kubelet/device-plugins
      name: device-plugin
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  - effect: NoSchedule
    key: nvidia.com/gpu
    operator: Exists
  terminationGracePeriodSeconds: 30
  volumes:
  - hostPath:
      path: /var/lib/kubelet/device-plugins
      type: ""
    name: device-plugin

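After the plugin is deployed, the Allocatable section of the node should show the correct number of VGPUs under nvidia.com/gpu, according to the MIG profile you configured. The node should now be schedulable and ready to run Document Understanding workloads.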
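One way to check this programmatically is to read the status.allocatable field from the output of kubectl get node <node_name> -o json. The sketch below is a minimal example under that assumption; the helper names are hypothetical, and the sample JSON, including the GPU count of 7, is made up for illustration.

```python
import json


def allocatable_gpus(node: dict) -> int:
    """Return the allocatable nvidia.com/gpu count of a node object, or 0 if absent."""
    allocatable = node.get("status", {}).get("allocatable", {})
    return int(allocatable.get("nvidia.com/gpu", "0"))


def is_gpu_schedulable(node: dict) -> bool:
    """GPU workloads can only land on a node that reports a non-zero GPU count."""
    return allocatable_gpus(node) > 0


# Hypothetical, trimmed output of: kubectl get node <node_name> -o json
sample = json.loads(
    '{"status": {"allocatable": {"cpu": "7820m", "nvidia.com/gpu": "7"}}}'
)
print(allocatable_gpus(sample), is_gpu_schedulable(sample))
```

A count of zero, or a missing nvidia.com/gpu entry, would suggest that the device plugin is not running on that node or that the node selector does not match it.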

Node scheduling

We recommend enabling node taints on the dedicated worker nodes for Automation Suite Robots and Document Understanding.

AI Center and DU example:

For CPU:

kubectl taint node <node_name> aic.ml/cpu=present:NoSchedule

For GPU:

kubectl taint node <node_name> nvidia.com/gpu=present:NoSchedule

Automation Suite Robots example:

Use the following command to add a taint for serverless robots:

kubectl taint node <node_name> serverless.robot=present:NoSchedule

Use the following command to add labels for serverless robots:

kubectl label node <node_name> serverless.robot=true serverless.daemon=true