automation-suite
2.2510
- Overview
- Requirements
- Deployment templates
- Manual: Preparing the installation
- Step 2: Configuring the OCI-compliant registry for offline installations
- Step 3: Configuring the external objectstore
- Step 4: Configuring the High Availability Add-on
- Step 5: Configuring the SQL database
- Step 7: Configuring the DNS
- Step 8: Configuring the disks
- Step 9: Configuring kernel and OS-level settings
- Step 10: Configuring the node ports
- Step 11: Applying miscellaneous settings
- Step 12: Validating and installing the required RPM packages
- Cluster_config.json sample
- General configuration
- Profile configuration
- Certificate configuration
- Database configuration
- External objectstore configuration
- Pre-signed URL configuration
- ArgoCD configuration
- Kerberos authentication configuration
- External OCI-compliant registry configuration
- Disaster recovery: Active/Passive and Active/Active configurations
- High Availability Add-on configuration
- Orchestrator-specific configuration
- Insights-specific configuration
- Process Mining-specific configuration
- Document Understanding-specific configuration
- Automation Suite Robots-specific configuration
- AI Center-specific configuration
- Monitoring configuration
- Optional: Configuring the proxy server
- Optional: Enabling resilience to zonal failures in a multi-node HA-ready production cluster
- Optional: Passing custom resolv.conf
- Optional: Increasing fault tolerance
- Adding a dedicated agent node with GPU support
- Adding a dedicated agent node for Automation Suite Robots
- Step 15: Configuring the temporary Docker registry for offline installations
- Step 16: Validating the prerequisites of the installation
- Manual: Performing the installation
- Post-installation
- Cluster administration
- Monitoring and alerting
- Migration and upgrade
- Product-specific configuration
- Best practices and maintenance
- Troubleshooting
- How to troubleshoot services during installation
- How to uninstall the cluster
- How to clean up offline artifacts to improve disk space
- How to clear Redis data
- How to enable Istio logging
- How to manually clean up logs
- How to clean up old logs stored in the sf-logs bucket
- How to disable streaming logs for AI Center
- How to debug failed Automation Suite installations
- How to delete images from the old installer after upgrade
- How to disable TX checksum offloading
- How to manually set the ArgoCD log level to Info
- How to expand AI Center storage
- How to generate an encoded pull_secret_value for external registries
- How to address weak ciphers in TLS 1.2
- How to check the TLS version
- How to work with certificates
- How to schedule Ceph backup and restore data
- How to collect DU usage data using the in-cluster objectstore (Ceph)
- How to install RKE2 SELinux in offline environments
- How to clean up old differential backups on an NFS server
- Unable to upload or download data in objectstore
- StatefulSet volume attachment error
- Unable to compact metrics due to corrupted blocks in Thanos
- Running the diagnostics tool
- Using the Automation Suite support bundle
- Exploring logs
- Exploring aggregated telemetry
Important:
Note that this content has been partially localized using machine translation.
The localization of newly released content may take 1-2 weeks to become available.

Automation Suite on Linux installation guide
Last updated: November 13, 2025
To fix this issue, take the following steps:
- On any server node, run the following script:
thanosns=monitoring && if kubectl get application -n argocd rancher-monitoring; then thanosns=cattle-monitoring-system; fi && cat <<EOF | kubectl apply -f -
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
  labels:
    app.kubernetes.io/component: thanos-cleaner
    app.kubernetes.io/instance: thanos-block-cleaner
    app.kubernetes.io/name: thanos-block-cleaner
  name: thanos-cleaner-role
  namespace: ${thanosns}
rules:
- apiGroups:
  - apps
  resources:
  - statefulsets
  - statefulsets/scale
  verbs:
  - list
  - get
  - update
  - patch
- apiGroups:
  - batch
  resources:
  - jobs
  - cronjobs
  verbs:
  - delete
  - list
  - get
  - update
  - create
  - watch
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - delete
  - list
  - get
  - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app.kubernetes.io/component: thanos-cleaner
    app.kubernetes.io/instance: thanos-block-cleaner
    app.kubernetes.io/name: thanos-block-cleaner
  name: thanos-cleaner-role-binding
  namespace: ${thanosns}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: thanos-cleaner-role
subjects:
- kind: ServiceAccount
  name: thanos-cleaner
  namespace: ${thanosns}
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: thanos-cleaner
  namespace: ${thanosns}
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: thanos-cleaner
  namespace: uipath
spec:
  groups:
  - name: thanos
    rules:
    - alert: ThanosCompactorNotWorking
      annotations:
        description: Thanos compactor is not working. This will disable metrics compaction in objectstore bucket. Please check thanos compact pod in ${thanosns} namespace for any error. Compactor in faulty state will exhaust object store space
        message: Thanos compactor is not working. Please check if thanos cleaner job is functional and able to fix corruption
        runbook_url: https://docs.uipath.com/automation-suite/docs/alert-runbooks
        summary: Thanos compactor is not working
      expr: thanos_compactor_issue{job="thanos-cleaner"} >= 1
      for: 1d
      labels:
        app: thanos
        severity: critical
---
EOF
- On any server node, run the following script:
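The command above first detects which namespace hosts the monitoring stack before applying the manifests. That detection idiom can be sketched in isolation; here `kubectl` is a stub standing in for the real CLI (an assumption for illustration), forced to behave as if the rancher-monitoring ArgoCD application exists, so the branch taken is fixed by assumption:

```shell
# Stub kubectl (illustration only): pretend the rancher-monitoring ArgoCD
# application exists, as on clusters that use cattle-monitoring-system.
kubectl() {
  return 0
}

# Same selection logic as the step above: default to "monitoring", switch to
# "cattle-monitoring-system" when rancher-monitoring is managed by ArgoCD.
thanosns=monitoring
if kubectl get application -n argocd rancher-monitoring >/dev/null 2>&1; then
  thanosns=cattle-monitoring-system
fi
echo "$thanosns"  # → cattle-monitoring-system
```

On a real cluster the actual CLI decides which branch is taken, which is why the same prefix appears verbatim in every command of this procedure.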
cat <<'EOF' | kubectl apply -f -
---
apiVersion: v1
data:
  thanos-cleanup.sh: |
    #!/bin/bash
    # Copyright UiPath 2021
    #
    # =================
    # LICENSE AGREEMENT
    # -----------------
    # Use of paid UiPath products and services is subject to the licensing agreement
    # executed between you and UiPath. Unless otherwise indicated by UiPath, use of free
    # UiPath products is subject to the associated licensing agreement available here:
    # https://www.uipath.com/legal/trust-and-security/legal-terms (or successor website).
    # You must not use this file separately from the product it is a part of or is associated with.

    set -eu -o pipefail

    export PATH=$PATH:/thanos-bin/

    # Below script removes the blocks which are overlapping or having index issue or having duplicated compaction
    #
    # In few cases with above mentioned scenarios, thanos may skip the compaction and halt the compaction module.
    # Compaction halt requires manual deletion of corrupted blocks and restart of compact pod.

    config_file=/etc/thanos/${THANOS_CONFIG_KEY}

    function info() {
      echo "[INFO] [$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*"
    }

    function warn() {
      echo -e "\e[0;33m[WARN] [$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
    }

    function error_without_exit() {
      echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
    }

    function error() {
      echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
      exit 1
    }

    function is_compaction_halted() {
      info "Checking if thanos compactor running"
      IFS=" " read -r -a compactor_addresses <<<"$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=thanos-compact -o jsonpath="{.items[*].status.podIP}")"
      is_compactor_halted=0
      if [[ "${#compactor_addresses[@]}" -eq 0 ]]; then
        info "Thanos compactor pod is not running"
        is_compactor_halted=1
      fi
      for ip in "${compactor_addresses[@]}"; do
        #shellcheck disable=SC2086
        halted=$(curl -s http://${ip}:10902/metrics | grep thanos_compact_halted | grep -v '#' | awk -F ' ' '{print $2}')
        if [[ "$halted" -eq "1" ]]; then
          warn "Compaction is halted"
          is_compactor_halted=1
          break
        fi
      done
      return $is_compactor_halted
    }

    function execute_thanos_issue_command() {
      if [[ $# -ne 1 ]]; then
        error "missing issue name for execute_thanos_issue_command function"
      fi
      issue=$1
      info "Checking for issue $issue"
      cmd_ret=0
      #shellcheck disable=SC2086
      verify_output=$(thanos tools bucket --objstore.config-file=${config_file} verify --log.format=json -i $issue 2>&1) && true || cmd_ret=1
      if [[ $cmd_ret -eq 1 ]]; then
        error_without_exit "Output of $issue command: -> $verify_output"
        error "Failed to verify bucket for $issue"
      fi
      #shellcheck disable=SC2086
      echo $verify_output
    }

    function fix_index_issue() {
      info "Fixing index_known_issue issue"
      verify_output=$(execute_thanos_issue_command "index_known_issues")
      #shellcheck disable=SC2086
      for b in $(echo $verify_output | sed 's/} {/\r\n/g' | grep err | grep "detected issue" | awk -F '"id":' '{print $2}' | awk -F ',' '{print $1}' | tr -d '"'); do
        info "Block=$b is having the issue, removing it.."
        thanos tools bucket mark --id="$b" \
          --marker=deletion-mark.json \
          --details="deleted by job" \
          --objstore.config-file="${config_file}"
        info "Block=$b is marked for deletion"
      done
      info "Fixing index_known_issue issue done"
    }

    function fix_overlapping_issue() {
      info "Fixing overlapped_blocks issue"
      overlap_output=$(execute_thanos_issue_command "overlapped_blocks")
      while IFS= read -r line; do
        #shellcheck disable=SC2086
        for b in $(echo $line | awk -F '"overlap":' '{print $2}' | awk -v search="ulid" 'match($0, search) {print substr($0, RSTART)}' | sed 's/ulid/\r\nulid/g' | awk -F ',' '{print $1}' | grep '^ulid' | awk -F ': ' '{print $2}'); do
          info "Block=$b is having the issue, removing it.."
          thanos tools bucket mark --id="$b" \
            --marker=deletion-mark.json \
            --details="deleted by job" \
            --objstore.config-file="${config_file}"
          info "Block=$b is marked for deletion"
        done
      done < <(echo "$overlap_output" | sed 's/} {/\r\n/g' | grep "found overlapped blocks")
      info "Fixing overlapped_blocks issue done"
    }

    function fix_duplicate_issue() {
      info "Fixing duplicated_compaction issue"
      duplicate_output=$(execute_thanos_issue_command "duplicated_compaction")
      #shellcheck disable=SC2086,SC2006
      for b in $(echo $duplicate_output | sed 's/ts=2/\r\n2/g' | grep "Found duplicated blocks that are ok to be removed" | awk -F 'ULIDs="' '{print $2}' | tr -d '[]' | awk -F '"' '{print $1}'); do
        info "Block=$b is having the issue, removing it.."
        thanos tools bucket mark --id="$b" \
          --marker=deletion-mark.json \
          --details="deleted by job" \
          --objstore.config-file="${config_file}"
        info "Block=$b is marked for deletion"
      done
      info "Fixing duplicated_compaction issue done"
    }

    if [[ -z "$NAMESPACE" ]]; then
      error "NAMESPACE is not set"
    fi

    # We will check if compaction is halted or not before checking for issues
    if is_compaction_halted; then
      info "Thanos compaction is working"
      echo "thanos_compactor_issue 0" | curl --data-binary @- "http://pushgateway-prometheus-pushgateway.uipath.svc.cluster.local:9091/metrics/job/thanos-cleaner"
      exit 0
    fi

    warn "Thanos compactor is not working. Checking for corrupted blocks..."
    echo "thanos_compactor_issue 1" | curl --data-binary @- "http://pushgateway-prometheus-pushgateway.uipath.svc.cluster.local:9091/metrics/job/thanos-cleaner"

    if [[ "$DISABLE_BLOCK_CLEANER" == true ]]; then
      info "DISABLE_BLOCK_CLEANER is set to $DISABLE_BLOCK_CLEANER, skipping block clean"
      exit 0
    fi

    info "DISABLE_BLOCK_CLEANER is set to $DISABLE_BLOCK_CLEANER, removing corrupted blocks"
    replica=$(kubectl get sts -n "$NAMESPACE" thanos-compact -o jsonpath='{.spec.replicas}')

    # compactor must not be running while deleting blocks
    info "Stopping compactor"
    kubectl scale sts -n "$NAMESPACE" thanos-compact --replicas=0
    kubectl delete pods -n "$NAMESPACE" -l app.kubernetes.io/instance=thanos-compact --force

    # fixing index_known_issues
    info "Checking blocks having issue"
    fix_index_issue
    fix_overlapping_issue
    fix_duplicate_issue

    info "Triggering deletion of all marked blocks"
    #shellcheck disable=SC2086
    thanos tools bucket cleanup --delete-delay=0 --objstore.config-file=${config_file}
    info "Corrupted blocks are deleted"

    info "Scaling thanos compactor's replica to $replica"
    #shellcheck disable=SC2086
    kubectl scale sts -n "$NAMESPACE" thanos-compact --replicas=$replica
    info "Thanos compactor started"
  validate-cronjob.sh: |
    #!/bin/bash
    # Copyright UiPath 2021
    #
    # =================
    # LICENSE AGREEMENT
    # -----------------
    # Use of paid UiPath products and services is subject to the licensing agreement
    # executed between you and UiPath. Unless otherwise indicated by UiPath, use of free
    # UiPath products is subject to the associated licensing agreement available here:
    # https://www.uipath.com/legal/trust-and-security/legal-terms (or successor website).
    # You must not use this file separately from the product it is a part of or is associated with.

    set -eu -o pipefail

    function info() {
      echo "[INFO] [$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*"
    }

    function warn() {
      echo -e "\e[0;33m[WARN] [$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
    }

    function error_without_exit() {
      echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
    }

    function error() {
      echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
      exit 1
    }

    alias kubectl='kubectl --cache-dir=/tmp/'

    IFS="," read -ra cronjobs <<<"$CRONJOB_LIST"
    for cr in "${cronjobs[@]}"; do
      #shellcheck disable=SC2206
      name=(${cr//// })
      cronNs=default
      cronName=""
      if [[ ${#name[@]} -gt 2 || ${#name[@]} -lt 1 ]]; then
        error "Invalid cronjob name=$cr"
      fi
      if [[ ${#name[@]} -eq 2 ]]; then
        cronNs=${name[0]}
        cronName=${name[1]}
      else
        cronName=${name[0]}
      fi
      info "Validating cronjob=$cr"
      jobName="${cronName}-sf-job-validation"
      created=1
      info "Creating validation job for $cr"
      kubectl delete job -n "${cronNs}" "${jobName}" --ignore-not-found --timeout=3m
      #shellcheck disable=SC2086
      kubectl create job -n "${cronNs}" --from=cronjob/${cronName} "$jobName" || created=0
      if [[ $created == 0 ]]; then
        error "Failed to create job for $cr"
      fi
      #shellcheck disable=SC2086
      kubectl wait --timeout=20m --for=condition=complete -n "${cronNs}" job/$jobName &
      cpid=$!
      #shellcheck disable=SC2086
      kubectl wait --timeout=20m --for=condition=failed -n "${cronNs}" job/${jobName} && exit 1 &
      fpid=$!
      ret=0
      wait -n $cpid $fpid || ret=1
      kill -9 $cpid || true
      kill -9 $fpid || true
      if [[ $ret -eq 0 ]]; then
        info "Job for $cr is validated/completed"
        #ignore deletion error. if deletion fail then will get caught in next sync. This is to reduce failure during installation
        kubectl delete job -n "${cronNs}" "${jobName}" --timeout=3m || true
      else
        error "Job for $cr failed"
      fi
    done
kind: ConfigMap
metadata:
  name: thanos-cleaner-script
  namespace: monitoring
---
EOF
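The cleanup script decides whether to act by polling each compactor pod's `/metrics` endpoint on port 10902 and reading the `thanos_compact_halted` gauge. The parsing it uses can be exercised offline against a sample of Prometheus exposition text (the sample below and its values are assumptions for illustration, not real cluster output):

```shell
# Sample Prometheus exposition text as served on the compactor's
# /metrics endpoint (illustrative values only).
metrics='# HELP thanos_compact_halted Set to 1 if the compactor halted.
# TYPE thanos_compact_halted gauge
thanos_compact_halted 1'

# Same parsing as is_compaction_halted in thanos-cleanup.sh: keep lines
# mentioning the metric, drop the # HELP/# TYPE comments, take the value column.
halted=$(echo "$metrics" | grep thanos_compact_halted | grep -v '#' | awk -F ' ' '{print $2}')
echo "$halted"  # → 1
```

A value of 1 means compaction is halted, so the script proceeds to mark and delete the corrupted blocks; 0 means it only pushes `thanos_compactor_issue 0` to the Pushgateway and exits.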
- Replace SF_K8S_TAG with the correct image tag, and then apply the cron job. From the installer directory on any server node, get the latest tag:
cat versions/docker-images.json | grep uipath/sf-k8-utils-rhel | tr -d ',"' | awk -F ':' '{print $2}' | sort | uniq | tail -1
Then update the cron job block by replacing SF_K8S_TAG with the returned value. Once updated, paste the entire block into a terminal on any server node:
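To see what the tag-extraction pipeline does, it can be run against a mocked excerpt of versions/docker-images.json (the file content and tags below are made up for illustration; the real file ships with the installer):

```shell
# Hypothetical excerpt of versions/docker-images.json (tags are made up).
cat > /tmp/docker-images-sample.json <<'JSON'
  "uipath/sf-k8-utils-rhel:2023.10.1",
  "uipath/sf-k8-utils-rhel:2023.10.5",
  "uipath/other-image:1.0.0",
JSON

# Same pipeline as above: strip quotes and commas, take the tag after the
# colon, de-duplicate, and keep the last (highest) entry after sorting.
tag=$(cat /tmp/docker-images-sample.json | grep uipath/sf-k8-utils-rhel \
  | tr -d ',"' | awk -F ':' '{print $2}' | sort | uniq | tail -1)
echo "$tag"  # → 2023.10.5
```

The returned value is what replaces SF_K8S_TAG in the cron job manifest.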
thanosns=monitoring && if kubectl get application -n argocd rancher-monitoring; then thanosns=cattle-monitoring-system; fi && thanosimage=$(kubectl get statefulset -n $thanosns thanos-compact -o jsonpath='{.spec.template.spec.containers[0].image}') && cat <<EOF | kubectl apply -f -
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: thanos-cleaner
  namespace: ${thanosns}
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 3
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      backoffLimit: 3
      template:
        metadata:
          annotations:
            sidecar.istio.io/inject: "false"
          creationTimestamp: null
          labels:
            app.kubernetes.io/name: thanos-cleaner-cronjob
        spec:
          containers:
          - args:
            - /script/thanos-cleanup.sh
            command:
            - /bin/bash
            env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: THANOS_CONFIG_KEY
              value: thanos.yaml
            - name: DISABLE_BLOCK_CLEANER
              value: "false"
            image: docker.io/uipath/sf-k8-utils-rhel:SF_K8S_TAG
            imagePullPolicy: IfNotPresent
            name: thanos-cleaner
            resources:
              limits:
                cpu: 200m
                memory: 400Mi
              requests:
                cpu: 20m
                memory: 64Mi
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
            - mountPath: /script/
              name: script
            - mountPath: /etc/thanos/
              name: thanos-objectstore-vol
            - mountPath: /thanos-bin/
              name: thanos
            - mountPath: /.kube/
              name: kubedir
            - mountPath: /tmp/
              name: tmpdir
          dnsPolicy: ClusterFirst
          initContainers:
          - args:
            - set -e; cp /bin/thanos /thanos-bin/thanos && chmod +x /thanos-bin/thanos
            command:
            - /bin/sh
            - -c
            image: ${thanosimage}
            imagePullPolicy: IfNotPresent
            name: copy-uipathcore-binary
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
            - mountPath: /thanos-bin/
              name: thanos
          nodeSelector:
            kubernetes.io/os: linux
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext:
            fsGroup: 3000
            runAsGroup: 2000
            runAsNonRoot: true
            runAsUser: 1000
          serviceAccount: thanos-cleaner
          serviceAccountName: thanos-cleaner
          terminationGracePeriodSeconds: 120
          volumes:
          - emptyDir: {}
            name: kubedir
          - emptyDir: {}
            name: tmpdir
          - emptyDir: {}
            name: thanos
          - name: thanos-objectstore-vol
            secret:
              defaultMode: 420
              secretName: thanos-objectstore-config
          - configMap:
              defaultMode: 420
              name: thanos-cleaner-script
            name: script
  schedule: 0 1/6 * * *
  successfulJobsHistoryLimit: 2
  suspend: false
---
EOF
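Once the cron job runs (every six hours starting at 01:00, per the `0 1/6 * * *` schedule), thanos-cleanup.sh parses the Thanos verify output to find deletable blocks. The extraction used by fix_duplicate_issue can be exercised against a hypothetical log line (the ULIDs and the line itself are made up for illustration, not real Thanos output):

```shell
# Hypothetical Thanos verify log line (ULIDs are made up for illustration).
logline='ts=2024-05-01T10:00:00Z level=info msg="Found duplicated blocks that are ok to be removed" ULIDs="[01HVXA 01HVXB]"'

# Same extraction as fix_duplicate_issue in thanos-cleanup.sh: split on
# timestamps, keep the "duplicated blocks" line, and pull the space-separated
# ULID list out of ULIDs="[...]".
ulids=$(echo "$logline" | sed 's/ts=2/\r\n2/g' \
  | grep "Found duplicated blocks that are ok to be removed" \
  | awk -F 'ULIDs="' '{print $2}' | tr -d '[]' | awk -F '"' '{print $1}')
echo "$ulids"  # → 01HVXA 01HVXB
```

Each extracted ULID is then passed to `thanos tools bucket mark --marker=deletion-mark.json`, and the marked blocks are removed by the final `thanos tools bucket cleanup --delete-delay=0` step.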