Automation Suite on Linux installation guide

Last updated Nov 13, 2025

Unable to compact metrics due to corrupted blocks in Thanos

Description

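When corrupted blocks are detected in the object store, the Thanos compactor may fail to compact metrics. This prevents the compactor from processing metrics, which causes storage usage in the Ceph bucket to grow.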
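
As a quick way to confirm the symptom, the following minimal sketch reads Prometheus text-format metrics on standard input and prints the value of the `thanos_compact_halted` gauge, the same metric the cleanup script on this page checks. The function name `halted_value` and the sample input are illustrative assumptions, not part of the product.

```shell
#!/bin/bash

# Print the value of thanos_compact_halted from Prometheus text-format
# metrics read on stdin. Prints nothing if the metric is absent.
# halted_value is a hypothetical helper name used only for this sketch.
halted_value() {
  # Comment lines start with "#", so their first field never matches.
  awk '$1 == "thanos_compact_halted" || index($1, "thanos_compact_halted{") == 1 { print $2; exit }'
}

# Illustrative sample input, not captured from a real cluster.
sample='# TYPE thanos_compact_halted gauge
thanos_compact_halted 1'

printf '%s\n' "$sample" | halted_value
```

Against a live compactor pod, the input could instead come from `curl -s http://<pod-ip>:10902/metrics`, which is the endpoint the cleanup script queries. A printed value of 1 indicates that compaction is halted.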

Solution

To fix the issue, take the following steps:
  1. On any server node, run the following script:
    thanosns=monitoring && if kubectl get application -n argocd rancher-monitoring; then thanosns=cattle-monitoring-system; fi && cat <<EOF | kubectl apply -f -
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      annotations:
      labels:
        app.kubernetes.io/component: thanos-cleaner
        app.kubernetes.io/instance: thanos-block-cleaner
        app.kubernetes.io/name: thanos-block-cleaner
      name: thanos-cleaner-role
      namespace: ${thanosns}
    rules:
    - apiGroups:
      - apps
      resources:
      - statefulsets
      - statefulsets/scale
      verbs:
      - list
      - get
      - update
      - patch
    - apiGroups:
      - batch
      resources:
      - jobs
      - cronjobs
      verbs:
      - delete
      - list
      - get
      - update
      - create
      - watch
    - apiGroups:
      - ""
      resources:
      - pods
      verbs:
      - delete
      - list
      - get
      - update
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      labels:
        app.kubernetes.io/component: thanos-cleaner
        app.kubernetes.io/instance: thanos-block-cleaner
        app.kubernetes.io/name: thanos-block-cleaner
      name: thanos-cleaner-role-binding
      namespace: ${thanosns}
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: thanos-cleaner-role
    subjects:
    - kind: ServiceAccount
      name: thanos-cleaner
      namespace: ${thanosns}
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: thanos-cleaner
      namespace: ${thanosns}
    ---
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: thanos-cleaner
      namespace: uipath
    spec:
      groups:
      - name: thanos
        rules:
        - alert: ThanosCompactorNotWorking
          annotations:
            description: Thanos compactor is not working. This will disable metrics compaction
              in objectstore bucket. Please check thanos compact pod in ${thanosns} namespace
              for any error. Compactor in faulty state will exhaust object store space
            message: Thanos compactor is not working. Please check if thanos cleaner job
              is functional and able to fix corruption
            runbook_url: https://docs.uipath.com/automation-suite/docs/alert-runbooks
            summary: Thanos compactor is not working
          expr: thanos_compactor_issue{job="thanos-cleaner"} >= 1
          for: 1d
          labels:
            app: thanos
            severity: critical
    ---
    EOF
  2. On any server node, run the following script:
    cat <<'EOF' | kubectl apply -f -
    ---
    apiVersion: v1
    data:
      thanos-cleanup.sh: |
        #!/bin/bash
    
        # Copyright UiPath 2021
        #
        # =================
        # LICENSE AGREEMENT
        # -----------------
        #   Use of paid UiPath products and services is subject to the licensing agreement
        #   executed between you and UiPath. Unless otherwise indicated by UiPath, use of free
        #   UiPath products is subject to the associated licensing agreement available here:
        #   https://www.uipath.com/legal/trust-and-security/legal-terms (or successor website).
        #   You must not use this file separately from the product it is a part of or is associated with.
    
        set -eu -o pipefail
    
        export PATH=$PATH:/thanos-bin/
        # Below script removes the blocks which are overlapping or having index issue or having duplicated compaction
        #
        # In few cases with above mentioned scenarios, thanos may skip the compaction and halt the compaction module.
        # Compaction halt requires manual deletion of corrupted blocks and restart of compact pod.
    
        config_file=/etc/thanos/${THANOS_CONFIG_KEY}
    
        function info() {
          echo "[INFO] [$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*"
        }
    
        function warn() {
          echo -e "\e[0;33m[WARN] [$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
        }
    
        function error_without_exit() {
          echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
        }
    
        function error() {
          echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
          exit 1
        }
    
        function is_compaction_halted() {
          info "Checking if thanos compactor running"
    
          IFS=" " read -r -a compactor_addresses <<<"$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/instance=thanos-compact -o jsonpath="{.items[*].status.podIP}")"
    
          is_compactor_halted=0
    
          if [[ "${#compactor_addresses[@]}" -eq 0 ]]; then
            info "Thanos compactor pod is not running"
            is_compactor_halted=1
          fi
    
          for ip in "${compactor_addresses[@]}"; do
            #shellcheck disable=SC2086
            halted=$(curl -s http://${ip}:10902/metrics | grep thanos_compact_halted | grep -v '#' | awk -F ' ' '{print $2}')
            if [[ "$halted" -eq "1" ]]; then
              warn "Compaction is halted"
              is_compactor_halted=1
              break
            fi
          done
    
          return $is_compactor_halted
        }
    
        function execute_thanos_issue_command() {
          if [[ $# -ne 1 ]]; then
            error "missing issue name for execute_thanos_issue_command function"
          fi
    
          issue=$1
    
          info "Checking for issue $issue"
          cmd_ret=0
          #shellcheck disable=SC2086
          verify_output=$(thanos tools bucket --objstore.config-file=${config_file} verify --log.format=json -i $issue 2>&1) && true || cmd_ret=1
          if [[ $cmd_ret -eq 1 ]]; then
            error_without_exit "Output of $issue command: -> $verify_output"
            error "Failed to verify bucket for $issue"
          fi
    
          #shellcheck disable=SC2086
          echo $verify_output
        }
    
        function fix_index_issue() {
          info "Fixing index_known_issue issue"
    
          verify_output=$(execute_thanos_issue_command "index_known_issues")
          #shellcheck disable=SC2086
          for b in $(echo $verify_output | sed 's/} {/\r\n/g' | grep err | grep "detected issue" | awk -F '"id":' '{print $2}' | awk -F ',' '{print $1}' | tr -d '"'); do
            info "Block=$b is having the issue, removing it.."
    
            thanos tools bucket mark --id="$b" \
              --marker=deletion-mark.json \
              --details="deleted by job" \
              --objstore.config-file="${config_file}"
    
            info "Block=$b is marked for deletion"
          done
    
          info "Fixing index_known_issue issue done"
        }
    
        function fix_overlapping_issue() {
          info "Fixing overlapped_blocks issue"
    
          overlap_output=$(execute_thanos_issue_command "overlapped_blocks")
    
          while IFS= read -r line; do
            #shellcheck disable=SC2086
            for b in $(echo $line | awk -F '"overlap":' '{print $2}' | awk -v search="ulid" 'match($0, search) {print substr($0, RSTART)}' | sed 's/ulid/\r\nulid/g' | awk -F ',' '{print $1}' | grep '^ulid' | awk -F ': ' '{print $2}'); do
              info "Block=$b is having the issue, removing it.."
              thanos tools bucket mark --id="$b" \
                --marker=deletion-mark.json \
                --details="deleted by job" \
                --objstore.config-file="${config_file}"
    
              info "Block=$b is marked for deletion"
            done
          done < <(echo "$overlap_output" | sed 's/} {/\r\n/g' | grep "found overlapped blocks")
    
          info "Fixing overlapped_blocks issue done"
        }
    
        function fix_duplicate_issue() {
          info "Fixing duplicated_compaction issue"
          duplicate_output=$(execute_thanos_issue_command "duplicated_compaction")
          #shellcheck disable=SC2086,SC2006
          for b in $(echo $duplicate_output | sed 's/ts=2/\r\n2/g' | grep "Found duplicated blocks that are ok to be removed" | awk -F 'ULIDs="' '{print $2}' | tr -d '[]' | awk -F '"' '{print $1}'); do
            info "Block=$b is having the issue, removing it.."
            thanos tools bucket mark --id="$b" \
              --marker=deletion-mark.json \
              --details="deleted by job" \
              --objstore.config-file="${config_file}"
    
            info "Block=$b is marked for deletion"
          done
    
          info "Fixing duplicated_compaction issue done"
        }
    
        if [[ -z "$NAMESPACE" ]]; then
          error "NAMESPACE is not set"
        fi
    
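        # Check whether compaction is halted before looking for block issues.
        # Note: is_compaction_halted returns 0 (success) when the compactor is running
        # and not halted, so the branch below is the healthy path despite the name.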
        if is_compaction_halted; then
          info "Thanos compaction is working"
          echo "thanos_compactor_issue 0" | curl --data-binary @- "http://pushgateway-prometheus-pushgateway.uipath.svc.cluster.local:9091/metrics/job/thanos-cleaner"
          exit 0
        fi
    
        warn "Thanos compactor is not working. Checking for corrupted blocks..."
        echo "thanos_compactor_issue 1" | curl --data-binary @- "http://pushgateway-prometheus-pushgateway.uipath.svc.cluster.local:9091/metrics/job/thanos-cleaner"
    
        if [[ "$DISABLE_BLOCK_CLEANER" == true ]]; then
          info "DISABLE_BLOCK_CLEANER is set to $DISABLE_BLOCK_CLEANER, skipping block clean"
          exit 0
        fi
    
        info "DISABLE_BLOCK_CLEANER is set to $DISABLE_BLOCK_CLEANER, removing corrupted blocks"
    
        replica=$(kubectl get sts -n "$NAMESPACE" thanos-compact -o jsonpath='{.spec.replicas}')
    
        # compactor must not be running while deleting blocks
    
        info "Stopping compactor"
        kubectl scale sts -n "$NAMESPACE" thanos-compact --replicas=0
        kubectl delete pods -n "$NAMESPACE" -l app.kubernetes.io/instance=thanos-compact --force
    
        # fixing index_known_issues
        info "Checking blocks having issue"
    
        fix_index_issue
        fix_overlapping_issue
        fix_duplicate_issue
    
        info "Triggering deletion of all marked blocks"
    
        #shellcheck disable=SC2086
        thanos tools bucket cleanup --delete-delay=0 --objstore.config-file=${config_file}
    
        info "Corrupted blocks are deleted"
    
        info "Scaling thanos compactor's replica to $replica"
        #shellcheck disable=SC2086
        kubectl scale sts -n "$NAMESPACE" thanos-compact --replicas=$replica
        info "Thanos compactor started"
      validate-cronjob.sh: |
        #!/bin/bash
    
        # Copyright UiPath 2021
        #
        # =================
        # LICENSE AGREEMENT
        # -----------------
        #   Use of paid UiPath products and services is subject to the licensing agreement
        #   executed between you and UiPath. Unless otherwise indicated by UiPath, use of free
        #   UiPath products is subject to the associated licensing agreement available here:
        #   https://www.uipath.com/legal/trust-and-security/legal-terms (or successor website).
        #   You must not use this file separately from the product it is a part of or is associated with.
    
        set -eu -o pipefail
    
        function info() {
          echo "[INFO] [$(date +'%Y-%m-%dT%H:%M:%S%z')]: $*"
        }
    
        function warn() {
          echo -e "\e[0;33m[WARN] [$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
        }
    
        function error_without_exit() {
          echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
        }
    
        function error() {
          echo -e "\e[0;31m[ERROR][$(date +'%Y-%m-%dT%H:%M:%S%z')]:\e[0m $*" >&2
          exit 1
        }
    
        alias kubectl='kubectl --cache-dir=/tmp/'
        IFS="," read -ra cronjobs <<<"$CRONJOB_LIST"
    
        for cr in "${cronjobs[@]}"; do
          #shellcheck disable=SC2206
          name=(${cr//// })
          cronNs=default
          cronName=""
    
          if [[ ${#name[@]} -gt 2 || ${#name[@]} -lt 1 ]]; then
            error "Invalid cronjob name=$cr"
          fi
    
          if [[ ${#name[@]} -eq 2 ]]; then
            cronNs=${name[0]}
            cronName=${name[1]}
          else
            cronName=${name[0]}
          fi
    
          info "Validating cronjob=$cr"
    
          jobName="${cronName}-sf-job-validation"
    
          created=1
          info "Creating validation job for $cr"
          kubectl delete job -n "${cronNs}" "${jobName}" --ignore-not-found --timeout=3m
    
          #shellcheck disable=SC2086
          kubectl create job -n "${cronNs}" --from=cronjob/${cronName} "$jobName" || created=0
    
          if [[ $created == 0 ]]; then
            error "Failed to create job for $cr"
          fi
    
          #shellcheck disable=SC2086
          kubectl wait --timeout=20m --for=condition=complete -n "${cronNs}" job/$jobName &
          cpid=$!
    
          #shellcheck disable=SC2086
          kubectl wait --timeout=20m --for=condition=failed -n "${cronNs}" job/${jobName} && exit 1 &
          fpid=$!
    
          ret=0
          wait -n $cpid $fpid || ret=1
    
          kill -9 $cpid || true
          kill -9 $fpid || true
    
          if [[ $ret -eq 0 ]]; then
            info "Job for $cr is validated/completed"
            #ignore deletion error. if deletion fail then will get caught in next sync. This is to reduce failure during installation
            kubectl delete job -n "${cronNs}" "${jobName}" --timeout=3m || true
          else
            error "Job for $cr failed"
          fi
        done
    kind: ConfigMap
    metadata:
      name: thanos-cleaner-script
      namespace: monitoring
    ---
    EOF
  3. Replace SF_K8S_TAG with the correct image tag, then apply the CronJob.

    From the installer directory on any server node, get the latest tag:

    cat versions/docker-images.json | grep uipath/sf-k8-utils-rhel | tr -d ',"' | awk -F ':' '{print $2}' | sort | uniq | tail -1
    
    Then update the CronJob block by replacing SF_K8S_TAG with the returned value.

    After updating, paste the entire block into a terminal on any server node:

    thanosns=monitoring && if kubectl get application -n argocd rancher-monitoring; then thanosns=cattle-monitoring-system; fi && thanosimage=$(kubectl  get statefulset -n $thanosns thanos-compact -o jsonpath='{.spec.template.spec.containers[0].image}') &&  cat <<EOF | kubectl apply -f -
    ---
    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: thanos-cleaner
      namespace: ${thanosns}
    spec:
      concurrencyPolicy: Forbid
      failedJobsHistoryLimit: 3
      jobTemplate:
        metadata:
          creationTimestamp: null
        spec:
          backoffLimit: 3
          template:
            metadata:
              annotations:
                sidecar.istio.io/inject: "false"
              creationTimestamp: null
              labels:
                app.kubernetes.io/name: thanos-cleaner-cronjob
            spec:
              containers:
              - args:
                - /script/thanos-cleanup.sh
                command:
                - /bin/bash
                env:
                - name: NAMESPACE
                  valueFrom:
                    fieldRef:
                      apiVersion: v1
                      fieldPath: metadata.namespace
                - name: THANOS_CONFIG_KEY
                  value: thanos.yaml
                - name: DISABLE_BLOCK_CLEANER
                  value: "false"
                image: docker.io/uipath/sf-k8-utils-rhel:SF_K8S_TAG
                imagePullPolicy: IfNotPresent
                name: thanos-cleaner
                resources:
                  limits:
                    cpu: 200m
                    memory: 400Mi
                  requests:
                    cpu: 20m
                    memory: 64Mi
                terminationMessagePath: /dev/termination-log
                terminationMessagePolicy: File
                volumeMounts:
                - mountPath: /script/
                  name: script
                - mountPath: /etc/thanos/
                  name: thanos-objectstore-vol
                - mountPath: /thanos-bin/
                  name: thanos
                - mountPath: /.kube/
                  name: kubedir
                - mountPath: /tmp/
                  name: tmpdir
              dnsPolicy: ClusterFirst
              initContainers:
              - args:
                - set -e; cp /bin/thanos /thanos-bin/thanos && chmod +x /thanos-bin/thanos
                command:
                - /bin/sh
                - -c
                image: ${thanosimage}
                imagePullPolicy: IfNotPresent
                name: copy-uipathcore-binary
                resources: {}
                terminationMessagePath: /dev/termination-log
                terminationMessagePolicy: File
                volumeMounts:
                - mountPath: /thanos-bin/
                  name: thanos
              nodeSelector:
                kubernetes.io/os: linux
              restartPolicy: Never
              schedulerName: default-scheduler
              securityContext:
                fsGroup: 3000
                runAsGroup: 2000
                runAsNonRoot: true
                runAsUser: 1000
              serviceAccount: thanos-cleaner
              serviceAccountName: thanos-cleaner
              terminationGracePeriodSeconds: 120
              volumes:
              - emptyDir: {}
                name: kubedir
              - emptyDir: {}
                name: tmpdir
              - emptyDir: {}
                name: thanos
              - name: thanos-objectstore-vol
                secret:
                  defaultMode: 420
                  secretName: thanos-objectstore-config
              - configMap:
                  defaultMode: 420
                  name: thanos-cleaner-script
                name: script
      schedule: 0 1/6 * * *
      successfulJobsHistoryLimit: 2
      suspend: false
    ---
    EOF
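
Once the manifest has been applied, it can help to confirm that the CronJob exists and, optionally, to start a one-off run instead of waiting for the next scheduled slot. The following is a minimal sketch and not part of the original procedure. The function names (`thanos_namespace`, `check_thanos_cleaner`, `run_thanos_cleaner_once`) and the `thanos-cleaner-manual-` job name prefix are hypothetical; the namespace detection mirrors the logic used in the script above.

```shell
# Hypothetical helpers; only the kubectl subcommands themselves are standard.

# Pick the monitoring namespace the same way the install script does:
# cattle-monitoring-system if the rancher-monitoring ArgoCD application
# exists, otherwise monitoring.
thanos_namespace() {
  if kubectl get application -n argocd rancher-monitoring >/dev/null 2>&1; then
    echo cattle-monitoring-system
  else
    echo monitoring
  fi
}

# Show the CronJob created by the manifest (schedule, suspend flag, last run).
check_thanos_cleaner() {
  kubectl get cronjob thanos-cleaner -n "$(thanos_namespace)"
}

# Start a single Job from the CronJob template without waiting for the schedule.
# The job name prefix is an assumption chosen for illustration.
run_thanos_cleaner_once() {
  kubectl create job "thanos-cleaner-manual-$(date +%s)" \
    --from=cronjob/thanos-cleaner -n "$(thanos_namespace)"
}

# Example usage on a server node:
#   check_thanos_cleaner
#   run_thanos_cleaner_once
```

With `schedule: 0 1/6 * * *`, the job is expected to start at minute 0 of hours 1, 7, 13, and 19, assuming the usual Kubernetes reading of `1/6` as every sixth hour starting at hour 1.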
© 2005-2025 UiPath. All rights reserved.