Helm Chart

Compatibility

Below is a list of supported Platform versions for the respective Hyperscience Helm chart and Operator versions:

Chart Version	Operator Version	Supported Platform Versions
>= 8.7.1	>= 5.4.3	>= 37.0.5
<= 8.7.0	<= 5.4.2	<= 37.0.5
7.*	4.*	33.1.32+, 34.0.9+, 35.0.0+
6.*	4.*	33.1.32+, 34.0.9+, 35.0.0+
<= 5.*	<= 3.*	<= 33.1.31, <= 34.0.8

IMPORTANT: It's recommended to use the default operator version in the respective Helm Chart version, instead of hard-coding an operator version in values.yaml.

Prerequisites

Before attempting to install Hyperscience, please be sure to follow the infrastructure requirements and guidelines to ensure that your cluster is compliant with Hyperscience's requirements.

Then follow the hsk8s (Hyperscience Kubernetes CLI) instructions to install hsk8s and the Helm repo.

Make sure that you have imported environment variables from the previous step:

source hs_env.bash
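
As a quick sanity check, the sketch below verifies that the variables used later in this guide are set in the current shell. The variable list is an assumption based on the commands in this guide (HS_HELM_CHART, HS_HELM_RELEASE, HS_K8S_NAMESPACE); hs_env.bash may define others, and the helper name require_env is hypothetical.

```shell
# require_env: hypothetical helper; prints each unset variable name and
# returns non-zero if any of the given variables is empty or unset.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called "$name" (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, after `source hs_env.bash`:
# require_env HS_HELM_CHART HS_HELM_RELEASE HS_K8S_NAMESPACE
```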

Create a `values.yaml` file

Use the examples below to create a values.yaml file for AWS, GCP, or GDC Connected.

Minimal `values.yaml` file for AWS

secrets:
  platform: "hyperscience-platform" # required: platform secret

app:
  # Required, for example: 0123456789.dkr.ecr.us-east-1.amazonaws.com/forms
  repository: "0123456789.dkr.ecr.us-east-1.amazonaws.com/forms"
  tag: "40.0.3"
  # Application-specific environment variables
  dotenv:
    FORMS_DB_TYPE: postgres
    FORMS_DB_HOST: hyperscience.xxxxxxxx.us-east-1.rds.amazonaws.com  # your RDS database's endpoint
  storage_mode:
    s3:
      bucket: my-hyperscience-bucket # your S3 bucket, if using S3 or GCS as Object storage
      prefix: my-hyperscience-prefix # Optional
  # If it is not possible to use IRSA (IAM role for service account), reference AWS credentials here
  secret_env_vars:
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        key: AWS_ACCESS_KEY_ID
        name: hyperscience-platform
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        key: AWS_SECRET_ACCESS_KEY
        name: hyperscience-platform

  # Used to access the Object Storage
  serviceAccount:
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::0123456789:role/my-hyperscience-role

blocks:
  # The Docker repository under which all Block images are stored
  # Example: if repository is equal to 0123456789.dkr.ecr.us-east-1.amazonaws.com/sdm_blocks,
  # the operator will attempt to create a block with the image 0123456789.dkr.ecr.us-east-1.amazonaws.com/sdm_blocks:finetune...
  repository: "0123456789.dkr.ecr.us-east-1.amazonaws.com/sdm_blocks"

operator:
  # Required, for example: 0123456789.dkr.ecr.us-east-1.amazonaws.com/hyperoperator
  repository: "0123456789.dkr.ecr.us-east-1.amazonaws.com/hyperoperator" # required

trainer:
  # Required, for example: 0123456789.dkr.ecr.us-east-1.amazonaws.com/forms
  repository: 0123456789.dkr.ecr.us-east-1.amazonaws.com/forms

  tags:
  - 40.0.3

Minimal `values.yaml` file for GCP

# The Docker repository in this example assumes you are using a single Google Artifact Registry named "hyperscience".
# Ensure you replace all occurrences of "gcp-project-name" and "us-central1" with your GCP project name and GCP region, respectively.
secrets:
  platform: hyperscience-platform # required: "platform secret"

app:
  # Required
  repository: us-central1-docker.pkg.dev/gcp-project-name/hyperscience/forms
  # Set the Hyperscience platform version that is synced to your container registry.
  tag: 40.0.3                                               
  # Application-specific environment variables
  dotenv:
    FORMS_DB_TYPE: postgres
    # CloudSQL endpoint
    FORMS_DB_HOST: xxxxxxxx.xxxxxxxx.us-central1.sql.goog.  
  storage_mode:
    gcs:
      # Your Google Cloud Storage bucket name
      bucket: my-hyperscience-bucket 
      # Optional
      prefix: my-hyperscience-prefix 
  # ".env" file variables attached to every container. The keys for each secretKeyRef need to match those in your "platform secret".
  secret_env_vars:
    # Reference to the secret value holding the base64 service account JSON file
  - name: FILE_STORE_GOOGLE_CLOUD_KEY 
    valueFrom:
      secretKeyRef:
        key: FILE_STORE_GOOGLE_CLOUD_KEY 
        name: hyperscience-platform

blocks:
  # The Docker repository under which all Block images are stored.
  # Example: if repository is equal to "us-central1-docker.pkg.dev/production/hyperscience", the operator will
  # attempt to create blocks with an image such as "us-central1-docker.pkg.dev/production/hyperscience/finetune:40.0.3"
  repository: us-central1-docker.pkg.dev/gcp-project-name/hyperscience/sdm_blocks

operator:
  repository: us-central1-docker.pkg.dev/gcp-project-name/hyperscience/hyperoperator # required

trainer:
  repository: us-central1-docker.pkg.dev/gcp-project-name/hyperscience/forms
  # Set the Hyperscience platform version that is synced to your container registry. Unless you have specific needs, this version is the same as the application version.
  tags:
  - 40.0.3
cloud:
  aws:
    # Include a ConfigMap that contains AWS RDS certs for TLS connections.
    # Also adds ConfigMap mounts to relevant containers.
    # Not needed when installing in GKE.
    includeRdsCerts: false

Minimal `values.yaml` file for GDC Connected

# The Docker repository in this example assumes you are using a single Google Artifact Registry named "hyperscience".
# Ensure you replace all occurrences of "gcp-project-name" and "us-central1" with your GCP project name and GCP region, respectively.
secrets:
  platform: hyperscience-platform # required: "platform secret"

app:
  # Required
  repository: us-central1-docker.pkg.dev/gcp-project-name/hyperscience/forms
  # Set the Hyperscience platform version that is synced to your container registry.
  tag: 40.0.3                                               
  # Application-specific environment variables
  dotenv:
    FORMS_DB_TYPE: postgres
    # CloudSQL endpoint
    FORMS_DB_HOST: xxxxxxxx.xxxxxxxx.us-central1.sql.goog.  
  storage_mode:
    gcs:
      # Your Google Cloud Storage bucket name
      bucket: my-hyperscience-bucket 
      # Optional
      prefix: my-hyperscience-prefix 
  # ".env" file variables attached to every container. The keys for each secretKeyRef need to match those in your "platform secret".
  secret_env_vars:
    # Reference to the secret value holding the base64 service account JSON file
  - name: FILE_STORE_GOOGLE_CLOUD_KEY 
    valueFrom:
      secretKeyRef:
        key: FILE_STORE_GOOGLE_CLOUD_KEY 
        name: hyperscience-platform
  # Assuming a three-node cluster, set app components replicas to "3" for high availability.
  replicas:
    backend: 3
    frontend: 3
    hyperflow_engine: 3
    idp_sync_manager: 3
  # Override default node affinity to allow scheduling on all three nodes.
  nodeAffinity: null

blocks:
  # The Docker repository under which all Block images are stored.
  # Example: if repository is equal to "us-central1-docker.pkg.dev/production/hyperscience", the operator will
  # attempt to create blocks with an image such as "us-central1-docker.pkg.dev/production/hyperscience/finetune:40.0.3".
  repository: us-central1-docker.pkg.dev/gcp-project-name/hyperscience/sdm_blocks
  # Set block autoscaling to a maximum of 3 replicas due to the ephemeral-storage issue listed in the docs.
  autoscaling:
    enabled: true
    min_replicas: 0
    max_replicas: 3
  # Override default node affinity to allow scheduling on all three nodes.
  nodeAffinity: null
  gpu:
    nodeAffinity: null
  llm:
    nodeAffinity: null
operator:
  repository: us-central1-docker.pkg.dev/gcp-project-name/hyperscience/hyperoperator # required
  env:
    # GDC-specific ".env" file variables for GPU blocks and trainer functionality
    HO_GPU_RESOURCE_LABEL: "nvidia.com/gpu-pod-NVIDIA_L4"
    BLOCK_VISION_LANGUAGE_MODEL_GPU_ENV_VAR_LD_LIBRARY_PATH: "/usr/local/nvidia/lib64"
    BLOCK_ENTRYPOINT: "[\"/block/command\"]"
    GPU_BLOCK_SECURITY_CONTEXT: '{"fsGroup":1000,"runAsGroup":1000,"runAsNonRoot":true,"runAsUser":1000,"supplementalGroups":[27]}'
    TRAINER_GPU_SECURITY_CONTEXT: '{"runAsGroup":1000,"runAsNonRoot":true,"runAsUser":1000,"supplementalGroups":[27]}'
    
trainer:
  repository: us-central1-docker.pkg.dev/gcp-project-name/hyperscience/forms
  # Set the Hyperscience platform version that is synced to your container registry. Unless you have specific needs, this version is the same as the application version.
  tags:
  - 40.0.3
  # Override default node affinity to allow scheduling on all three nodes.
  nodeAffinity: null
cloud:
  aws:
    # Include a ConfigMap that contains AWS RDS certs for TLS connections.
    # Also adds ConfigMap mounts to relevant containers.
    # Not needed when installing in GKE.
    includeRdsCerts: false

Advanced `values.yaml`

The following command can be used to retrieve all the possible options of the Helm chart:

helm show values $HS_HELM_CHART

The command prints the default values for the latest Helm chart version. Save the output as values-full.yaml. As a best practice, copy only the options you want to change from values-full.yaml into your values.yaml.
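
A minimal sketch of saving that output to a file. It assumes HS_HELM_CHART is set by hs_env.bash; the function name save_full_values is hypothetical. helm show values writes to standard output, so the redirect is what creates values-full.yaml.

```shell
# save_full_values: hypothetical wrapper that writes the chart's default
# values to values-full.yaml in the current directory.
save_full_values() {
  helm show values "$HS_HELM_CHART" > values-full.yaml
}

# Usage:
# save_full_values
# Add --version X.Y.Z to `helm show values` to inspect a specific chart version.
```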

Kubernetes Secrets

Platform Secret

We require a native Kubernetes Secret to store database credentials and shared tokens that allow intra-app communication. This secret must contain at least the following keys:

FORMS_DB_NAME
FORMS_DB_USER
FORMS_DB_PASS
BLOCK_ORCHESTRATOR_TOKEN # Token used by the operator to authenticate against the platform

Optionally, it can also contain:

TRAINER_TOKEN # Token used by the trainer to authenticate against the platform. Must be present if trainer.tags is not empty

You should obtain the FORMS_DB_NAME, FORMS_DB_USER, and FORMS_DB_PASS from your database configuration.

The BLOCK_ORCHESTRATOR_TOKEN and TRAINER_TOKEN must be randomly generated strings of URL-safe ASCII characters (letters and numbers), at most 40 characters long.
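
One way to produce such tokens is sketched below. It assumes a system that provides /dev/urandom; the helper name gen_token and the default length of 32 are illustrative choices, not requirements of the platform.

```shell
# gen_token: hypothetical helper; prints a random token of the given length
# (default 32, maximum useful value 40) made only of letters and digits.
gen_token() {
  length="${1:-32}"
  # Read 1024 random bytes, keep only ASCII letters and digits (about 250
  # characters on average), then cut the result down to the requested length.
  head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c1-"$length"
}

BLOCK_ORCHESTRATOR_TOKEN="$(gen_token)"
TRAINER_TOKEN="$(gen_token)"
```

Each call reads fresh random bytes, so the two tokens differ. Store the values in your secrets manager rather than printing them to the terminal.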

We recommend that you use an external secrets manager and associated Kubernetes tooling to securely store secrets inside your Kubernetes cluster.

WARNING: Example Only! Do not use in production!

The following are examples of a secrets.yaml file:

AWS

apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: hyperscience-platform
stringData:
  # If not possible to use IRSA (IAM role for service account), specify AWS creds here
  # AWS_ACCESS_KEY_ID: <AWS user key id>
  # AWS_SECRET_ACCESS_KEY: <AWS secret key>
  FORMS_DB_NAME: my-postgres-db
  FORMS_DB_PASS: my-postgres-password
  FORMS_DB_USER: my-db-role
  BLOCK_ORCHESTRATOR_TOKEN: e271dc47fa80ddc9e6590042ad9ed2b7
  TRAINER_TOKEN: 25f9e794323b453885f5181f1b624d0b

GCP

apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: hyperscience-platform
stringData:
  # If you use Google service accounts, base64-encode the JSON key file and set it in the key below. Not needed for Workload Identity.
  # FILE_STORE_GOOGLE_CLOUD_KEY: <base_64_sa_json_file> 
  FORMS_DB_NAME: my-postgres-db
  FORMS_DB_PASS: my-postgres-password
  FORMS_DB_USER: my-db-role
  BLOCK_ORCHESTRATOR_TOKEN: e271dc47fa80ddc9e6590042ad9ed2b7
  TRAINER_TOKEN: 25f9e794323b453885f5181f1b624d0b

Once you have it, you can import it to Kubernetes with:

kubectl apply -f secrets.yaml
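
Before applying the file, it can help to confirm that the required keys are present. The sketch below is a plain text check, not a YAML parser: it assumes each key sits on its own indented line, as in the examples above. The helper name check_secret_keys is hypothetical.

```shell
# check_secret_keys: hypothetical helper; scans a Secret manifest for the
# required keys and returns non-zero if any is absent or commented out.
check_secret_keys() {
  file="$1"
  status=0
  for key in FORMS_DB_NAME FORMS_DB_USER FORMS_DB_PASS BLOCK_ORCHESTRATOR_TOKEN; do
    # Match an indented "KEY:" at the start of a line; "# KEY:" does not match.
    if ! grep -Eq "^[[:space:]]+${key}:" "$file"; then
      echo "missing key: $key" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage:
# check_secret_keys secrets.yaml && kubectl apply -f secrets.yaml
```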

Other Secrets

We also provide an option to reference additional native Kubernetes secrets in the Hyperscience environment. In values.yaml, you can reference secrets in the app.secret_env_vars or app.gunicorn_env_vars keys. These keys are useful for passing telemetry-related secrets or third-party API tokens that need to be made available to the application.

Application Configuration

There are two ways to configure the Hyperscience application environment. The first and recommended way is to create environment key-value pairs in the values.yaml path app.dotenv. Optionally, you can create your own ConfigMap with your desired configuration and pass the ConfigMap name to the app.dotenv_configmap_name setting in values.yaml.

Since the Hyperscience application supports a wide variety of options, please refer to Hyperscience's configuration documentation for more details. Please keep in mind that not all settings will be supported in a Kubernetes deployment.

HyperOperator

The application depends on a custom operator, called the HyperOperator. This operator's related CRDs will be installed when you run helm install.

IMPORTANT: We use Helm's built-in CRD installation mechanism. Because Helm 3 installs CRDs only on helm install and does not update them on helm upgrade, our development team avoids CRD spec changes and makes them only when absolutely necessary.

IAM Configuration

AWS

Create IAM Policy

Hyperscience requires S3 read/write permissions. You need to create an IAM Policy with the following definition:

{
    "Statement": [
        {
            "Action": [
                "s3:ListBucketMultipartUploads",
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::your-hs-bucket",
            "Sid": ""
        },
        {
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:AbortMultipartUpload"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::your-hs-bucket/*",
            "Sid": ""
        }
    ],
    "Version": "2012-10-17"
}

IMPORTANT: Replace "arn:aws:s3:::your-hs-bucket" with the ARN of the S3 bucket that you are using for Hyperscience data storage.
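
The substitution can be scripted, as in the sketch below. It writes a policy with the same actions and resources as the definition above (the empty Sid fields are omitted, since Sid is optional). The helper name write_s3_policy, the output file name, and the policy name in the usage note are placeholders.

```shell
# write_s3_policy: hypothetical helper; writes the S3 policy to a file with
# the given bucket name substituted into both Resource entries.
write_s3_policy() {
  bucket="$1"
  out="${2:-hs-s3-policy.json}"
  cat > "$out" <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucketMultipartUploads", "s3:ListBucket"],
            "Resource": "arn:aws:s3:::${bucket}"
        },
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:AbortMultipartUpload"],
            "Resource": "arn:aws:s3:::${bucket}/*"
        }
    ]
}
EOF
}

# Usage (bucket and policy names are placeholders):
# write_s3_policy my-hyperscience-bucket
# aws iam create-policy --policy-name hyperscience-s3 \
#   --policy-document file://hs-s3-policy.json
```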

Create Identity Provider

Please follow Steps 1 and 2 of AWS's IRSA documentation to create an IAM OIDC provider. You don't need to do Step 3 (Configuring pods to use a Kubernetes service account). Instead, we'll configure it in values.yaml below.

Create IAM Role for Service Account (IRSA)

You need to create an IAM role (type: Custom trust policy) with this custom definition:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/ABCDEFGHIJKLMNOPQRSTUVWXYZ123456"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "oidc.eks.us-east-1.amazonaws.com/id/ABCDEFGHIJKLMNOPQRSTUVWXYZ123456:sub": "system:serviceaccount:your_namespace:*"
                }
            }
        }
    ]
}

IMPORTANT:
Replace "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/ABCDEFGHIJKLMNOPQRSTUVWXYZ123456" with your cluster's OIDC provider ARN.
Replace the "ABCDE..." in "oidc.eks.us-east-1.amazonaws.com/id/ABCDEFGHIJKLMNOPQRSTUVWXYZ123456:sub" with your cluster's OIDC provider ID.
Replace your_namespace in "system:serviceaccount:your_namespace:*" with the Kubernetes namespace in which you are installing Hyperscience. You should have it in the HS_K8S_NAMESPACE environment variable.

Now attach the policy you created earlier to that role.
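
The values that go into the trust policy can be derived from the cluster's OIDC issuer URL, as sketched below. The helper names are hypothetical, and the cluster, role, policy, and file names in the usage note are placeholders; the aws commands shown are the standard AWS CLI calls for reading the issuer, creating the role, and attaching a policy.

```shell
# oidc_provider_path: strips the scheme from an EKS OIDC issuer URL, giving
# the "oidc.eks.<region>.amazonaws.com/id/<ID>" form used in the trust policy.
oidc_provider_path() {
  printf '%s\n' "${1#https://}"
}

# oidc_provider_arn: builds the provider ARN from an account ID and issuer URL.
oidc_provider_arn() {
  printf 'arn:aws:iam::%s:oidc-provider/%s\n' "$1" "$(oidc_provider_path "$2")"
}

# Usage (all names are placeholders):
# issuer="$(aws eks describe-cluster --name my-cluster \
#   --query 'cluster.identity.oidc.issuer' --output text)"
# oidc_provider_arn 123456789012 "$issuer"
# aws iam create-role --role-name your_role_name \
#   --assume-role-policy-document file://trust-policy.json
# aws iam attach-role-policy --role-name your_role_name \
#   --policy-arn arn:aws:iam::123456789012:policy/hyperscience-s3
```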

Reference IAM role in values.yaml

Edit your values.yaml file. Under app.serviceAccount.annotations, there is an entry for eks.amazonaws.com/role-arn. Set the value to the ARN of the IAM role you just created. It should look like:

app:
  serviceAccount:
    create: true
    # Annotations to add to the service account
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/your_role_name # The service account role created in a previous step

Make sure to replace 123456789012 and your_role_name with the actual values.

GCP

The Hyperscience Platform supports Google Workload Identity and Google service accounts for authenticating to Google’s services. We recommend using Google Workload Identity for GKE. To learn more, see About Workload Identity Federation for GKE in the Google Cloud documentation.

If you can’t use Workload Identity, set a dedicated service account by following the steps in Google’s Create service accounts.

For Workload Identity, ensure you’ve properly set the principal:

#SERVICEACCOUNT here should be .Values.app.serviceAccount.name
principal://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/PROJECT_ID.svc.id.goog/subject/ns/NAMESPACE/sa/SERVICEACCOUNT

Access to the bucket

In both cases, grant the roles/storage.objectUser role so the Hyperscience application can read and write in the Google Cloud Storage bucket. Instructions on how to assign the role through the Console or CLI can be found in Google’s Set and manage IAM policies on buckets.
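
For Workload Identity, the principal string and the bucket binding can be assembled as in the sketch below. The helper name wi_principal is hypothetical, and every value in the usage note (project number, project ID, bucket, service account name) is a placeholder.

```shell
# wi_principal: hypothetical helper; builds the Workload Identity principal
# from project number, project ID, namespace, and service account name.
wi_principal() {
  printf 'principal://iam.googleapis.com/projects/%s/locations/global/workloadIdentityPools/%s.svc.id.goog/subject/ns/%s/sa/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Usage (all values are placeholders):
# gcloud storage buckets add-iam-policy-binding gs://my-hyperscience-bucket \
#   --member="$(wi_principal 123456789012 gcp-project-name "$HS_K8S_NAMESPACE" my-service-account)" \
#   --role="roles/storage.objectUser"
```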

Ingress Controller (optional)

The Helm chart can optionally create an application Ingress resource, but it does not make assumptions about which ingress you are using. To take advantage of this functionality, please configure an ingress controller before attempting to install Hyperscience.

Installation

Make sure you have followed the hsk8s (Hyperscience Kubernetes CLI) instructions to install hsk8s and the Helm repo.

Run the helm install command

Run helm install to install Hyperscience, specifying an application name and chart version if so desired.

The following commands install the latest version of the Hyperscience Helm chart:

helm repo update
helm install $HS_HELM_RELEASE -f values.yaml $HS_HELM_CHART --create-namespace

Updating `values.yaml`

To apply a change made to values.yaml, run helm upgrade. Make sure to specify a chart version with --version; otherwise, the latest chart version is used, which could introduce breaking changes.

helm upgrade $HS_HELM_RELEASE -f values.yaml $HS_HELM_CHART --version X.Y.Z
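
To make it harder to forget the version pin, the upgrade can be wrapped in a small function, as sketched below. The wrapper name upgrade_pinned is hypothetical; HS_HELM_RELEASE and HS_HELM_CHART are assumed to come from hs_env.bash.

```shell
# upgrade_pinned: hypothetical wrapper that refuses to run `helm upgrade`
# unless a chart version is given as the first argument.
upgrade_pinned() {
  if [ -z "${1:-}" ]; then
    echo "usage: upgrade_pinned <chart-version>" >&2
    return 1
  fi
  helm upgrade "$HS_HELM_RELEASE" -f values.yaml "$HS_HELM_CHART" --version "$1"
}

# Usage:
# helm list              # the CHART column shows the version currently deployed
# upgrade_pinned X.Y.Z
```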

Post-Installation Steps (development only)

To test your Hyperscience installation, you can create a default admin user with custom credentials. This approach is recommended for development or testing ONLY and is NOT suitable for production.

Set the following environment variables in values.yaml:

app:
  dotenv:
    FORMS_USER: <username> 
    FORMS_PASS: <password>

Run helm upgrade to apply the change. Then, run the following command to create the user:

kubectl exec $(kubectl get pod -l 'app.kubernetes.io/component=backend' -o name) -c shell-command -- /bin/bash -c 'cd forms/forms && /var/www/venv/bin/python manage.py add_default_admin'
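
When more than one backend pod is running (for example, with app.replicas.backend set to 3), the label query returns several pod names, while kubectl exec expects one. The sketch below picks the first match; the function names are hypothetical, and the label, container name, and command are copied from the command above.

```shell
# first_backend_pod: prints the name of the first pod carrying the backend
# component label (in the "pod/<name>" form produced by `-o name`).
first_backend_pod() {
  kubectl get pod -l 'app.kubernetes.io/component=backend' -o name | head -n 1
}

# create_default_admin: runs the add_default_admin command in that single pod.
create_default_admin() {
  kubectl exec "$(first_backend_pod)" -c shell-command -- /bin/bash -c \
    'cd forms/forms && /var/www/venv/bin/python manage.py add_default_admin'
}

# Usage:
# create_default_admin
```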

Development and testing ONLY
Changes to FORMS_USER and FORMS_PASS in values.yaml can be applied asynchronously during your next Helm change or HS application update. This method for setting the credentials is ONLY for testing. You should integrate the Hyperscience application with an external identity provider (IdP) as soon as possible to prevent access-related issues in the future. For information on supported IdPs, see Application Authentication Overview.

Scaling

By default, only one instance of each block type will be run. Although it may be sufficient for a demo of the Hyperscience Platform, in most real-world cases, one instance of each block is not enough. For more information on how to scale the system, refer to the Scaling article.