 

AI Center
Run the AI Fabric infrastructure installer. Completing this installer produces the KotsAdmin console, where you can manage application updates and application configuration, monitor resource usage (CPU/memory pressure), and download support bundles to troubleshoot any issues.
The first step is to download the installer archive and move it to the AI Fabric server. Alternatively, you can download it directly from the machine using the following command.
The script downloads some files locally as part of the installation process, so make sure you have 4 GB available in the directory from which you execute the script.
By default, Azure RHEL VMs have only 1 GB available in the home directory, which is the default directory.
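Because a run can fail late when scratch space runs out, it can help to verify free space in the working directory before starting. This is only a sketch, using the 4 GB figure from the note above:

```shell
#!/bin/bash
# Pre-flight check: ensure the current directory has at least 4 GB free,
# per the space requirement noted above. Adjust REQUIRED_KB if needed.
REQUIRED_KB=$((4 * 1024 * 1024))                  # 4 GB in kilobytes
avail_kb=$(df -Pk . | awk 'NR==2 {print $4}')     # available KB on this mount

if [ "$avail_kb" -lt "$REQUIRED_KB" ]; then
    echo "Only $((avail_kb / 1024)) MB free; the installer needs about 4 GB." >&2
    exit 1
fi
echo "Disk space OK: $((avail_kb / 1024)) MB free."
```

On an Azure RHEL VM, running this from the 1 GB home directory mentioned above would fail the check, prompting you to switch to a larger mount first.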
wget https://download.uipath.com/aifabric/online-installer/v2020.10.5/aifabric-installer-v20.10.5.tar.gz
Then untar the file and go inside the main folder using the following commands:
tar -xvf aifabric-installer-v20.10.5.tar.gz
cd ./aifabric-installer-v20.10.5
You can then run the AI Fabric installer by running:
./setup.sh
The first step is to accept the license agreement by pressing Y. The script then asks what type of platform you want to install; enter onebox and press Enter, as in the image below:
You are then asked whether a GPU is available for your setup; answer Y or N depending on your hardware. Make sure the GPU drivers are already installed.
Depending on your system, you might be asked to press Y a few times for the installation to complete.
This step takes between 15 and 25 minutes to complete. Upon completion, the terminal output shows the message Installation Complete.
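If you drive the installer from automation rather than a terminal, the prompts described above can be pre-answered on stdin. This is only a sketch and assumes the prompt order given here (license, platform type, then optional extra confirmations); verify it against your installer version:

```shell
# Answer the license prompt (Y) and the platform prompt (onebox),
# then feed Y to any remaining confirmation prompts.
# Assumes the prompt order described above -- check your installer version.
{ printf 'Y\nonebox\n'; yes Y; } | ./setup.sh
```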
On a local machine with access to a browser (e.g. a Windows server), download the install bundle using the link provided by your account manager.
Untar the bundle from a machine that supports tar:
tar -zxvf aifabric-installer-v2020.10.5.tar.gz
This creates two files:
- aif_infra_20.10.5.tar.gz, containing the infrastructure components (about 3.6 GB)
- ai-fabric-v2020.10.5.airgap, containing the application components (about 8.7 GB); this file is uploaded to the UI in step 5, Run the AI Fabric Application Installer
Copy aif_infra_20.10.5.tar.gz to the airgapped AI Fabric machine, then run the following commands to start the infrastructure installer:
tar -zxvf aif_infra_20.10.5.tar.gz
cd aif_infra_20.10.5
sudo ./setup.sh
In both cases, a successful installation outputs the address and password of the KotsAdmin UI:
...
Install Successful:
configmap/kurl-config created
Installation Complete ✔
Kotsadm: http://13.59.108.17:8800
Login with password (will not be shown again): NNqKCY82S
The UIs of Prometheus, Grafana and Alertmanager have been exposed on NodePorts 30900, 
30902 and 30903 respectively.
To access Grafana use the generated user:password of admin:msDX5VZ9m .
To access the cluster with kubectl, reload your shell:
    bash -l
    
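Since the password is shown only once, it can be worth capturing the installer output (for example with `sudo ./setup.sh | tee install.log`, where the log name is illustrative) and extracting the credentials afterwards. A sketch matched to the sample output above:

```shell
# Pull the KotsAdmin URL and one-time password out of a saved installer log.
# Assumes the output above was captured to install.log (name is illustrative).
kots_url=$(grep -o 'Kotsadm: http://[^ ]*' install.log | head -n 1 | cut -d' ' -f2)
kots_pw=$(grep 'Login with password' install.log | head -n 1 | awk '{print $NF}')
echo "Console:  $kots_url"
echo "Password: $kots_pw"
```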
The KotsAdmin console is then available at <machine-ip>:8800. In some cases the output may display the internal IP instead of the public IP; make sure to use the public IP if you are accessing the console from outside.
If you need to reset the KotsAdmin password, reload your shell and run the reset command:
bash -l
kubectl kots reset-password -n default
- Check if the GPU drivers are correctly installed by running the following command:
nvidia-smi
If the GPU drivers are installed correctly, your GPU information is displayed. If an error occurs, the GPU is not accessible or the drivers are not installed correctly. Fix this issue before proceeding.
- Check if the NVIDIA Container Runtime is correctly installed by running the following command:
/usr/bin/nvidia-container-runtime
- Download the two available scripts for adding the GPU from the following link: GPU scripts.
- Run a script to add the GPU to the cluster so that Pipelines and ML Skills can use it. Depending on your installation, choose one of the following options:
 - In case of an online installation, run the following script:
# navigate to where you untarred the installer (or untar it again if you have removed it)
cd ./aicenter-installer-v21.4.0/infra/common/scripts
./attach_gpu_drivers.sh
 - In case of an airgapped installation, first create the attach_gpu_drivers.sh file in the aif_infra directory, making sure nvidia-device-plugin.yaml is located in the same folder. To create the file, paste in the content of the attach_gpu_drivers.sh file downloaded at step 1. Then run the script:
./attach_gpu_drivers.sh
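For the airgapped option, creating the script file next to nvidia-device-plugin.yaml can be sketched as below; the heredoc body is a placeholder, not the real script, so paste in the actual content of the attach_gpu_drivers.sh downloaded at step 1:

```shell
# Run this in the aif_infra directory, alongside nvidia-device-plugin.yaml.
cat > attach_gpu_drivers.sh <<'EOF'
# ...paste the content of the attach_gpu_drivers.sh from step 1 here...
EOF
chmod +x attach_gpu_drivers.sh
./attach_gpu_drivers.sh
```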
The infrastructure installer is not idempotent. This means that running the installer again (after you have already run it once) will not work. If this installer fails, you will need to reprovision a new machine with fresh disks.
The most common sources of error are that the boot disk becomes full during the install, or that the external data disks are mounted or formatted. Remember to only attach the disks, not format them.
If the installation fails even with unformatted disks and a sufficiently large boot disk, contact our support team and include a support bundle in your email. A support bundle can be generated by running the following commands:
curl https://krew.sh/support-bundle | bash
kubectl support-bundle https://kots.io
Alternatively, if you do not have access to the internet, you can create a file support-bundle.yaml with the following content:
apiVersion: troubleshoot.replicated.com/v1beta1
kind: Collector
metadata:
  name: collector-sample
spec:
  collectors:
    - clusterInfo: {}
    - clusterResources: {}
    - exec:
        args:
          - "-U"
          - kotsadm
        collectorName: kotsadm-postgres-db
        command:
          - pg_dump
        containerName: kotsadm-postgres
        name: kots/admin_console
        selector:
          - app=kotsadm-postgres
        timeout: 10s
    - logs:
        collectorName: kotsadm-postgres-db
        name: kots/admin_console
        selector:
          - app=kotsadm-postgres
    - logs:
        collectorName: kotsadm-api
        name: kots/admin_console
        selector:
          - app=kotsadm-api
    - logs:
        collectorName: kotsadm-operator
        name: kots/admin_console
        selector:
          - app=kotsadm-operator
    - logs:
        collectorName: kotsadm
        name: kots/admin_console
        selector:
          - app=kotsadm
    - logs:
        collectorName: kurl-proxy-kotsadm
        name: kots/admin_console
        selector:
          - app=kurl-proxy-kotsadm
    - secret:
        collectorName: kotsadm-replicated-registry
        includeValue: false
        key: .dockerconfigjson
        name: kotsadm-replicated-registry
    - logs:
        collectorName: rook-ceph-agent
        selector:
          - app=rook-ceph-agent
        namespace: rook-ceph
        name: kots/rook
    - logs:
        collectorName: rook-ceph-mgr
        selector:
          - app=rook-ceph-mgr
        namespace: rook-ceph
        name: kots/rook
    - logs:
        collectorName: rook-ceph-mon
        selector:
          - app=rook-ceph-mon
        namespace: rook-ceph
        name: kots/rook
    - logs:
        collectorName: rook-ceph-operator
        selector:
          - app=rook-ceph-operator
        namespace: rook-ceph
        name: kots/rook
    - logs:
        collectorName: rook-ceph-osd
        selector:
          - app=rook-ceph-osd
        namespace: rook-ceph
        name: kots/rook
    - logs:
        collectorName: rook-ceph-osd-prepare
        selector:
          - app=rook-ceph-osd-prepare
        namespace: rook-ceph
        name: kots/rook
    - logs:
        collectorName: rook-ceph-rgw
        selector:
          - app=rook-ceph-rgw
        namespace: rook-ceph
        name: kots/rook
    - logs:
        collectorName: rook-discover
        selector:
          - app=rook-discover
        namespace: rook-ceph
        name: kots/rook
And then create the support-bundle file using the following command:
kubectl support-bundle support-bundle.yaml
This creates a file called supportbundle.tar.gz, which you can upload when raising a support ticket.