Installing ORCA VLMs


This article explains how to install ORCA (Optical Reasoning and Cognition Agent) VLMs and configure your instance to use them for document processing.

For information about the capabilities of ORCA VLMs, see our ORCA (Optical Reasoning and Cognition Agent) VLMs article.

Prerequisites

To successfully use ORCA VLMs:

  • Your instance must support GPU-enabled application machines.

    • On-premise deployment: follow the instructions in the “Enabling Application Machines with GPUs” article for Docker, Podman, or Kubernetes.

    • SaaS deployment: contact your Hyperscience representative to enable GPU support in your instance.

    • Learn more about Hyperscience’s GPU requirements in Infrastructure Requirements.

  • Your release must contain at least one Semi-structured layout with fields.

  • You must enable the ORCA flows to start processing. Learn more in the sections below.

Important limits and considerations

  • ORCA VLMs extract data from fields only.

  • ORCA VLMs do not support tables. If your layout contains only tables and no fields, the ORCA VLM will not process your data.

  • The Document Processing with ORCA Subflow is designed for Semi-structured layouts. Structured documents submitted through this flow will not be processed. To support both Structured and Semi-structured documents, a custom flow must be configured. Contact your Hyperscience representative for additional information.

  • The ORCA Quality Assurance Subflow is not selected by default when configuring the Document Processing with ORCA Subflow. If QA is not configured, no QA tasks will be created, and accuracy reporting will not be available.

Learn how to install and activate ORCA VLMs for both SaaS and air-gapped instances in the sections below.

Installing the ORCA base model in deployments with internet access

In deployments with internet access (i.e., SaaS deployments and non-air-gapped on-premise deployments), the base model is available through a pre-configured artifact repository or Cloudsmith and can be installed directly from the platform.

Base model

A foundational model that provides core general-purpose capabilities and is not directly trained on customer-specific examples. Use-case specialization is achieved through additional training on top of the base model using customer-specific data. To learn more, see Training a Specialized Model.

Currently, Hyperscience uses the ORCA 1.0 base model.

The installation runs in the background and, once complete, enables you to use ORCA VLMs and specialize them for your use case. Follow the steps below to install the ORCA base model.

  1. Go to Administration > Assets.

  2. Click Install on the ORCA 1.0 card.

  3. Choose how the ORCA base model should be downloaded:

    • Fetch via Artifact Repository — SaaS deployments use a pre-configured repository to download the base model.

    • Fetch via Cloudsmith — On-premise deployments use a Cloudsmith key to download the base model, as your organization manages artifact access. If you don’t have the key for your account, file a ticket on our Support portal.

  4. Click Start.

  • The ORCA 1.0 card will indicate that installation is in progress.

  • You’ll receive a notification in the notifications panel once installation is complete.

Installing the ORCA base model in air-gapped instances

In some on-premise deployments, the instance may be air-gapped, meaning it does not have direct internet access and cannot download assets from external repositories such as Cloudsmith.

In these cases, the ORCA base model must be installed through a manual transfer process. Instead of downloading the model directly on your machine, you will:

  • Download the ORCA base model assets on a machine with internet access.

  • Transfer the files to your secured internal storage (e.g., S3).

  • Configure an Artifacts Repository pointing to that storage.

  • Install the ORCA base model from the configured repository.

Important considerations

  • This process is required only for air-gapped or restricted instances.

  • ORCA model assets are split into multiple chunks to improve reliability in restricted or unstable network conditions:

    • Each chunk can be downloaded independently.

    • If a download fails, only the affected chunk needs to be retried. This setup eliminates the need to restart the entire download process.

    • Downloads and uploads can be performed in parallel (e.g., uploading one chunk while others are still downloading), improving overall transfer efficiency.

    • Chunked downloads also help mitigate firewall timeouts and bandwidth limitations.
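The chunked-transfer behavior described above can be sketched as follows. This is an illustrative simulation only: the chunk names and the failure pattern are hypothetical, and the real files are fetched from Cloudsmith over HTTP. The point is that each chunk retries independently, and chunks can transfer in parallel.

```python
import concurrent.futures

# Illustrative simulation only: the chunk names and failure pattern below are
# hypothetical. The real chunk files are downloaded from Cloudsmith.
CHUNK_IDS = [f"block_asset_chunk__{i:02d}" for i in range(44)]  # 44 chunks, per this article
SIMULATED_FAILURES = {"block_asset_chunk__03": 1, "block_asset_chunk__17": 2}

def download_chunk(chunk_id: str, max_attempts: int = 3) -> str:
    """Fetch a single chunk, retrying only this chunk on transient failure."""
    for attempt in range(1, max_attempts + 1):
        # Placeholder for the real HTTP GET of one chunk file.
        if attempt <= SIMULATED_FAILURES.get(chunk_id, 0):
            continue  # simulated transient failure: retry this chunk only
        return chunk_id
    raise IOError(f"{chunk_id} failed after {max_attempts} attempts")

# Chunks are independent, so transfers can run in parallel; a failed chunk
# never forces the whole 44-chunk download to restart.
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    downloaded = list(pool.map(download_chunk, CHUNK_IDS))

print(f"{len(downloaded)} of {len(CHUNK_IDS)} chunks downloaded")
```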

Each step is described below:

  1. Download ORCA VLM assets from Cloudsmith:

    • Download all ORCA VLM assets:

      • 44 chunk files: block_asset_chunk__14554188-cf8e-4f10...

      • 1 block asset: block_asset_14554188-cf8e-4f10...

      • 1 manifest file: orca_repo_manifest.json

  2. Upload the downloaded ORCA VLM assets to internal storage (S3):

    • Go to your S3 bucket (e.g., s3://hs-build-artifact/block-asset/).

    • Create a new folder called orca.

      • Final path: s3://hs-build-artifact/block-asset/orca/

    • Place the assets in the folder.

  3. Configure the Artifact Repository in your instance:

    • Add /admin at the end of your instance’s URL:

      • instance.hs.ai/admin

    • Using the browser’s search, find Artifact repositorys.

    • Click Add Artifacts Repository.

    • Enter a name for the repository.

      • We recommend choosing a human-friendly repository name, for example, S3.

    • From the Repo type drop-down list, select S3 bucket with ZIPs.

    • Enter the following in the Configs field:

{
  "s3_path": "s3://hs-build-artifact/block-asset/",
  "s3_region": null
}

  4. Go to Administration > Assets.

  5. Click Install on the ORCA 1.0 card.

  6. Choose Fetch via Artifact Repository. The system will now install the ORCA base model using the files from your internal storage.
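Before installing, it can help to sanity-check that the orca folder in your internal storage contains everything the installer expects: 44 chunk files, one block asset, and the manifest. The helper below is a hypothetical sketch; the filename prefixes follow this article, and the sample listing uses placeholder names rather than real asset IDs.

```python
# Hypothetical pre-install check of the uploaded assets. The prefixes follow
# this article; the sample listing uses placeholder names, not real asset IDs.
EXPECTED_CHUNKS = 44

def validate_orca_assets(filenames: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the folder looks complete."""
    problems = []
    chunks = [f for f in filenames if f.startswith("block_asset_chunk__")]
    assets = [f for f in filenames
              if f.startswith("block_asset_") and not f.startswith("block_asset_chunk__")]
    if len(chunks) != EXPECTED_CHUNKS:
        problems.append(f"expected {EXPECTED_CHUNKS} chunk files, found {len(chunks)}")
    if len(assets) != 1:
        problems.append(f"expected 1 block asset file, found {len(assets)}")
    if "orca_repo_manifest.json" not in filenames:
        problems.append("orca_repo_manifest.json is missing")
    return problems

# Simulated listing of the orca/ folder, with placeholder names.
listing = ([f"block_asset_chunk__{i:04d}" for i in range(44)]
           + ["block_asset_0000", "orca_repo_manifest.json"])
print(validate_orca_assets(listing))  # → []
```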

Create a Semi-structured layout for ORCA VLMs

ORCA VLMs extract data only from fields defined in a Semi-structured layout. Before associating the layout with an ORCA flow, ensure it’s configured correctly.

The layout configuration affects the training payload sent to the trainer. Learn more about the trainer in our Trainer article. Proper layout configuration ensures consistent extraction behavior and reliable model performance.

Use the checklist below when creating your layout:

  • Define explicit fields.

    • Ensure that:

      • At least one field is defined.

      • Field names reflect business meaning (e.g., “Invoice number,” “Total amount,” “Merchant name”).

        • Note that field names are included in the training payload and influence prompt construction during training.

        • Avoid field names that don’t bring information to the model, such as “Names,” “Amount,” and “Field1.”

        • Ensure the field names are specific and reflect their intention.

          • For example, “Last Name” instead of “Name.”

  • Configure data types.

    • Data types define semantic expectations (e.g., numeric, date).

      Multiline fields

      Ensure the Bounding Box spans the entire multiline value. The transcription will always appear on one line only.

    • Configure Multiple Occurrences (MOs).

      • The MOs determine how the trainer interprets repeated field values.

      • Enable MOs when a field appears multiple times in a document or when you expect multiple extracted values.

  • Use Notes for extraction guidance

Use Notes to provide additional context that guides how ORCA interprets and extracts values.

The data in fields’ Notes behaves similarly to a prompt: it influences how the model understands the field, but it does not override the document content. Notes do not replace ground truth or enforce formatting.

  • Clarify what the field represents.

  • Provide semantic details.

  • Define extraction intent (e.g., “Exclude VAT,” “Use header value only”).

    • For example, consider a field named “Bill to email.” If the document contains both a company name and an email address, you can use the Notes section in the Layout Editor to clarify that only the email address is required. ORCA still reads the full value from the document, but during a Flexible Extraction task, only the email address is extracted under the “Bill to email” field.
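As a rough illustration of the naming guidance in the checklist above, a helper like the following could flag overly generic field names before you finalize a layout. The GENERIC_NAMES set is an assumption drawn from the examples in this article, not part of the product.

```python
# Hypothetical helper illustrating the naming guidance above. Field names feed
# prompt construction during training, so generic names give the model little
# to work with. The GENERIC_NAMES set is an assumption, not part of the product.
GENERIC_NAMES = {"names", "name", "amount", "value", "text", "field1"}

def flag_generic_field_names(field_names: list[str]) -> list[str]:
    """Return the field names that are too generic to guide extraction."""
    return [n for n in field_names if n.strip().lower() in GENERIC_NAMES]

layout_fields = ["Invoice number", "Total amount", "Merchant name", "Field1"]
print(flag_generic_field_names(layout_fields))  # → ['Field1']
```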

For more information about Semi-structured layouts, see our Creating Semi-structured Layouts article.

Once your layout is ready, include it in a release and assign the release to a flow. Learn more in What is a Release?.

Configure ORCA subflows

In v42.3, the included flows relevant to ORCA are:

  • Document Processing

  • Document Processing with ORCA Subflow

  • ORCA Quality Assurance Subflow

Ensure that the ORCA base model is installed before configuring the subflow.

To enable ORCA VLMs, you need to configure the Document Processing flow:

  1. Go to Flows and click the Document Processing flow.

  2. In the flow’s menu, click Duplicate.

Duplicating the original flow

To avoid overriding default settings, duplicate the original flow and configure the copy based on your requirements.

  3. Rename the new flow.

  4. Select a layout release from the drop-down list.

    • If no release is selected, a Missing Layout Release UUID warning appears, and the flow cannot be enabled.

  5. Click Start Document Processing Subflow to open the flow’s settings.

  6. In the Flow Identifier setting, under Block Details, click Document Processing with ORCA Subflow.

    • Ensure that the correct base model is selected in ORCA Base Model, as shown in the image below.

  7. Select the ORCA Quality Assurance flow to generate accuracy reporting.

ORCA QA Sample rate

We recommend setting the sample rate to 100% for model evaluation.

  8. From Settings Type, choose Flexible Extraction, and select the Flexible Extraction Show Machine Predictions checkbox. That way, you’ll be able to see how ORCA extracts fields.

  9. Once the flow is configured, click Save, then enable it by toggling the Live switch at the top of the page.

ORCA base model automation

Installing the ORCA base model does not enable automation. Each processed document generates a Flexible Extraction task for human review.

Next steps

After ORCA is installed and configured, you can start processing documents.

Because ORCA is delivered as a base model, it provides general-purpose extraction capabilities and is not adapted to your specific document types or business requirements. By default, every processed document generates a Flexible Extraction task. You can also view the approximate field locations used by ORCA, as shown in the screenshot below. Note that to complete the task, you must click through all fields.

To optimize extraction performance for your use case, you should:

  • Create a model definition for your layout. To learn more, see Model Definitions and TDM for ORCA VLMs.

  • Annotate documents specific to your use case.

  • Train a model on top of the ORCA base model. Doing so allows ORCA to learn patterns specific to your use case. Learn more in our Training a Specialized Model article.