Enabling Application Machines with GPUs in On-Premise Podman Deployments


With Hyperscience, you can leverage ORCA VLMs to process your submissions. However, VLMs require more computing resources than our other automation capabilities do.

In order to use ORCA VLMs in on-premise deployments of Hyperscience, you need to have at least one application machine in your instance that has both a GPU (graphics processing unit) and a CPU (central processing unit). GPUs have specialized cores that allow the system to perform multiple computations in parallel, reducing the time required to complete the complex operations required to apply visual machine learning to your use case. When you add a correctly sized machine with a GPU to your instance, you can maximize the benefits of ORCA VLMs. To learn more about this feature, see “ORCA (Optical Reasoning and Cognition Agent) VLMs” ( v41 | v42 ).

This article describes how to enable an application machine with both a GPU and a CPU in an on-premise Podman deployment of Hyperscience. Steps 1-3 must be completed before untarring the Hyperscience bundle on the application machine. For more information on installing Hyperscience, see Technical Installation / Upgrade Instructions.

1. Make sure your GPU hardware meets the requirements.

See Infrastructure Requirements for more information.

2. Make sure your application machine meets the software-compatibility requirements.

There are several software-compatibility considerations to keep in mind when setting up your application machine.

a. Verify that the lspci command is enabled.

To do so:

  1. Install the pciutils package by running the following command on RHEL:

    yum -y install pciutils
  2. Run lspci to make sure the command has been enabled.
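The two steps above can be combined into a quick availability check. The sketch below only reports whether lspci is on the PATH; the install hint echoes the same yum command shown above.

```shell
# Check whether the lspci command is available; if not, suggest installing pciutils.
if command -v lspci >/dev/null 2>&1; then
  msg="lspci is available"
else
  msg="lspci not found; install it with: yum -y install pciutils"
fi
echo "$msg"
```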

b. Verify that your GPU supports CUDA.

CUDA is a parallel computing platform and programming model created by NVIDIA. Machine learning often uses CUDA-based libraries, SDKs, and other tools.

You can find out whether your GPU supports CUDA by running the following command:

lspci | grep -i nvidia

For more information, see NVIDIA’s CUDA GPUs - Compute Capability and NVIDIA CUDA Installation Guide for Linux.
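As a sketch of what the grep above is looking for, the function below succeeds when a given lspci line mentions an NVIDIA device. The sample line is illustrative only; on a real machine, pipe the actual `lspci` output through `grep -i nvidia` as shown above.

```shell
# has_nvidia_gpu: succeeds if the given lspci output lists an NVIDIA device.
has_nvidia_gpu() {
  printf '%s\n' "$1" | grep -qi nvidia
}

# Illustrative lspci line (hypothetical device), not real output from this system.
sample="3b:00.0 3D controller: NVIDIA Corporation GA100 [A100 PCIe 40GB]"
if has_nvidia_gpu "$sample"; then
  echo "NVIDIA device found"
fi
```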

c. Verify that you have a supported version of Linux.

Follow the instructions in NVIDIA’s NVIDIA CUDA Installation Guide for Linux to check your version of Linux. Then, make sure your Linux version is supported by the latest CUDA Toolkit by reviewing NVIDIA’s NVIDIA CUDA Toolkit Release Notes.  

You should also ensure that you are running a version of Linux that is supported by Hyperscience. For a list of supported Linux distributions and versions, see Infrastructure Requirements.
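A minimal sketch of the version check, assuming your distribution provides /etc/os-release (standard on modern Linux systems): print the machine architecture and the distribution name and version, then compare them against the supported lists referenced above.

```shell
# Print the machine architecture and distribution details for the compatibility check.
arch=$(uname -m)
echo "Architecture: $arch"
if [ -r /etc/os-release ]; then
  . /etc/os-release
  echo "Distribution: $NAME $VERSION_ID"
fi
```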

d. Verify that the system has gcc installed.

The gcc compiler is required for development using the CUDA Toolkit. To make sure it is installed, follow the instructions in NVIDIA’s NVIDIA CUDA Installation Guide for Linux.
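One way to sketch this check: report the installed gcc version, or a hint if the compiler is missing. The dnf package name in the hint is the usual one on RHEL-family systems.

```shell
# Report the installed gcc version, or a hint if gcc is missing.
if command -v gcc >/dev/null 2>&1; then
  gcc_info=$(gcc --version | head -n1)
else
  gcc_info="gcc not found; install it (for example: sudo dnf install gcc)"
fi
echo "$gcc_info"
```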

e. Verify that the system has the current kernel headers and development packages installed.

Kernel headers are header files that specify the interface between the Linux kernel and userspace libraries and programs. The CUDA driver requires that the kernel headers and development packages for the running version of the kernel be installed at the time of the driver installation, as well as whenever the driver is rebuilt. For example, if your system is running kernel version 3.17.4-301, the 3.17.4-301 kernel headers and development packages must also be installed.

To install the kernel headers and development packages for the currently running kernel, run the following command:

sudo dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r)

For more information and commands for various Linux distributions, see NVIDIA’s NVIDIA CUDA Installation Guide for Linux.
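To spot-check whether matching headers are already present before running the install command, the sketch below looks in the locations commonly used on RHEL-family systems (/usr/src/kernels/<version>) and via the /lib/modules/<version>/build link; exact paths can vary by distribution.

```shell
# Check for kernel headers matching the running kernel version.
kernel_ver=$(uname -r)
if [ -d "/usr/src/kernels/${kernel_ver}" ] || [ -d "/lib/modules/${kernel_ver}/build" ]; then
  msg="kernel headers present for ${kernel_ver}"
else
  msg="kernel headers missing for ${kernel_ver}"
fi
echo "$msg"
```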

3. Install the CUDA Driver and NVIDIA Container Toolkit.

  1. Ensure that your system meets the prerequisites for the driver installation, as described in NVIDIA's NVIDIA CUDA Installation Guide for Linux.

  2. Install the driver by running the following command on RHEL:

    sudo dnf module install nvidia-driver:latest-dkms
  3. Install the container toolkit by completing the steps in the "Installing with Yum or Dnf" section of NVIDIA's Installing the NVIDIA Container Toolkit.

  4. Configure Podman to use NVIDIA devices in the container:

    1. Complete the steps in the "Procedure" section of NVIDIA's Support for Container Device Interface.

    2. Edit /usr/share/containers/containers.conf:

      1. Set NVIDIA as the runtime (i.e., runtime = "nvidia"). If any other runtime is set, comment it out.

      2. Add nvidia = ["/usr/bin/nvidia-container-runtime"] to [engine.runtimes].
        After completing these steps, your file should look similar to the one shown below.

        runtime = "nvidia"

        ...

        [engine.runtimes]

        ...

        nvidia = [
          "/usr/bin/nvidia-container-runtime",
        ]
  5. (Optional) Reboot the system:

    sudo reboot

While we recommend installing the latest version of the CUDA driver, only the minimum version required for your version of Hyperscience is necessary. See Infrastructure Requirements for more information.
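If you want to confirm that an installed driver meets the minimum, a dotted-version comparison like the one below can help. Both version strings here are hypothetical placeholders: take the minimum from Infrastructure Requirements, and read the installed version from `nvidia-smi` on the GPU machine.

```shell
# version_ge A B: succeeds if dotted version A >= dotted version B (uses GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Hypothetical example values; substitute your actual minimum and installed versions.
min_required="535.104.05"
installed="550.54.14"
if version_ge "$installed" "$min_required"; then
  echo "driver version OK"
else
  echo "driver too old; upgrade required"
fi
```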

4. Initialize the VLM on the application machine.

Before you can use ORCA, you need to initialize it on the application machine with the GPU. Doing so ensures that the GPU-enabled machine is used for processing tasks that require ORCA.

Run the following commands on the machine that has the GPU:

./run.sh init # standard installation step, skip if already done
./run.sh # standard installation step, skip if already done
./run.sh ipm VISION_LANGUAGE_MODEL_GPU # ORCA-specific installation step (only in the GPU machine)

Next steps

After you’ve finished preparing your infrastructure to use ORCA VLMs, you are ready to apply ORCA to your use case.

To learn how, see “ORCA (Optical Reasoning and Cognition Agent) VLMs” ( v41 | v42 ).