Installing ORCA VLMs


This article explains how to install ORCA (Optical Reasoning and Cognition Agent) VLMs and configure your instance to use them for document processing.

For information about the capabilities of ORCA VLMs, see our ORCA (Optical Reasoning and Cognition Agent) VLMs article.

Prerequisites

To successfully use ORCA VLMs:

  • Your instance must support GPU-enabled application machines.

    • On-premise deployment: follow the instructions in the “Enabling Application Machines with GPUs” article for Docker, Podman, or Kubernetes.

    • SaaS deployment: contact your Hyperscience representative to enable GPU support in your instance.

    • Learn more about Hyperscience’s GPU requirements in Infrastructure Requirements.

  • Your release must contain at least one Semi-structured layout with fields.

  • You must enable the ORCA flows to start processing. Learn more in the sections below.

Important limits and considerations

  • ORCA VLMs extract data from fields only.

  • ORCA VLMs do not support tables. If your layout contains only tables and no fields, the ORCA VLM will not process your data.

  • The Document Processing with ORCA Subflow is designed for Semi-structured layouts. Structured documents submitted through this flow will not be processed. To support both Structured and Semi-structured documents, a custom flow must be configured. Contact your Hyperscience representative for additional information.

  • The ORCA Quality Assurance Subflow is not selected by default when configuring the Document Processing with ORCA Subflow. If QA is not configured, no QA tasks will be created, and accuracy reporting will not be available.

Learn how to install and activate ORCA VLMs for both SaaS and air-gapped instances in the sections below.

Installing the ORCA base model in deployments with internet access

In deployments with internet access (i.e., SaaS deployments and non-air-gapped on-premise deployments), the base model is available through a pre-configured artifact repository or Cloudsmith and can be installed directly from the platform.

Base model

A foundational model that provides core general-purpose capabilities and is not directly trained on customer-specific examples. Use-case specialization is achieved through additional training on top of the base model using customer-specific data. To learn more, see Training a Specialized Model.

Currently, Hyperscience uses the ORCA 1.0 base model.

The installation runs in the background and, once complete, enables you to use ORCA VLMs and specialize them for your use case. Follow the steps below to install the ORCA base model.

  1. Go to Administration > Assets.

  2. Click Install on the ORCA 1.0 card.

  3. Choose how the ORCA base model should be downloaded:

    • Fetch via Artifact Repository — SaaS deployments use a pre-configured repository to download the base model.

    • Fetch via Cloudsmith — On-premise deployments use a Cloudsmith key to download the base model, as your organization manages artifact access. If you don’t have the key for your account, file a ticket on our Support portal.

  4. Click Start.

  • The ORCA 1.0 card will indicate that installation is in progress.

  • You’ll receive a notification in the notifications panel once installation is complete.

Installing the ORCA base model in air-gapped instances

In some on-premise deployments, the instance may be air-gapped, meaning it does not have direct internet access and cannot download assets from external repositories such as Cloudsmith.

In these cases, the ORCA base model must be installed through a manual transfer process. Instead of downloading the model directly on your machine, you will:

  • Download the ORCA base model assets on a machine with internet access.

  • Transfer the files to your secured internal storage (e.g., S3).

  • Configure an Artifacts Repository pointing to that storage.

  • Install the ORCA base model from the configured repository.

Important considerations

  • This process is required only for air-gapped or restricted instances.

  • ORCA model assets are split into multiple chunks to improve reliability in restricted or unstable network conditions:

    • Each chunk can be downloaded independently.

    • If a download fails, only the affected chunk needs to be retried. This setup eliminates the need to restart the entire download process.

    • Downloads and uploads can be performed in parallel (e.g., uploading one chunk while others are still downloading), improving overall transfer efficiency.

    • Chunked downloads also help mitigate firewall timeouts and bandwidth limitations.
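The chunked-transfer behavior described above can be sketched as follows. This is an illustrative simulation only: the chunk names and the failure pattern are hypothetical, and the real files are fetched from Cloudsmith over HTTP. The point is that each chunk retries independently, and chunks can transfer in parallel.

```python
import concurrent.futures

# Illustrative simulation only: the chunk names and failure pattern below are
# hypothetical. The real chunk files are downloaded from Cloudsmith.
CHUNK_IDS = [f"block_asset_chunk__{i:02d}" for i in range(44)]  # 44 chunks, per this article
SIMULATED_FAILURES = {"block_asset_chunk__03": 1, "block_asset_chunk__17": 2}

def download_chunk(chunk_id: str, max_attempts: int = 3) -> str:
    """Fetch a single chunk, retrying only this chunk on transient failure."""
    for attempt in range(1, max_attempts + 1):
        # Placeholder for the real HTTP GET of one chunk file.
        if attempt <= SIMULATED_FAILURES.get(chunk_id, 0):
            continue  # simulated transient failure: retry this chunk only
        return chunk_id
    raise IOError(f"{chunk_id} failed after {max_attempts} attempts")

# Chunks are independent, so transfers can run in parallel; a failed chunk
# never forces the whole 44-chunk download to restart.
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    downloaded = list(pool.map(download_chunk, CHUNK_IDS))

print(f"{len(downloaded)} of {len(CHUNK_IDS)} chunks downloaded")
```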

Each step is described below:

  1. Download ORCA VLM assets from Cloudsmith:

    • Download all ORCA VLM assets:

      • 44 chunk files: block_asset_chunk__14554188-cf8e-4f10...

      • 1 block asset: block_asset_14554188-cf8e-4f10...

      • 1 manifest file: orca_repo_manifest.json

  2. Upload the downloaded ORCA VLM assets to internal storage (S3):

    • Go to your S3 bucket (e.g., s3://hs-build-artifact/block-asset/).

    • Create a new folder called orca.

      • Final path: s3://hs-build-artifact/block-asset/orca/

    • Place the assets in the folder.

  3. Configure the Artifact Repository in your instance:

    • Add /admin at the end of your instance’s URL:

      • instance.hs.ai/admin

    • Using the browser’s search, find Artifact repositorys.

    • Click Add Artifacts Repository.

    • Enter a name for the repository.

      • We recommend choosing a human-friendly repository name, for example, S3.

    • From the Repo type drop-down list, select S3 bucket with ZIPs.

    • Enter the following in the Configs field:

{
  "s3_path": "s3://hs-build-artifact/block-asset/",
  "s3_region": null
}

  4. Go to Administration > Assets.

  5. Click Install on the ORCA 1.0 card.

  6. Choose Fetch via Artifact Repository. The system will now install the ORCA base model using the files from your internal storage.
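Before installing, it can help to sanity-check that the orca folder in your internal storage contains everything the installer expects: 44 chunk files, one block asset, and the manifest. The helper below is a hypothetical sketch; the filename prefixes follow this article, and the sample listing uses placeholder names rather than real asset IDs.

```python
# Hypothetical pre-install check of the uploaded assets. The prefixes follow
# this article; the sample listing uses placeholder names, not real asset IDs.
EXPECTED_CHUNKS = 44

def validate_orca_assets(filenames: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the folder looks complete."""
    problems = []
    chunks = [f for f in filenames if f.startswith("block_asset_chunk__")]
    assets = [f for f in filenames
              if f.startswith("block_asset_") and not f.startswith("block_asset_chunk__")]
    if len(chunks) != EXPECTED_CHUNKS:
        problems.append(f"expected {EXPECTED_CHUNKS} chunk files, found {len(chunks)}")
    if len(assets) != 1:
        problems.append(f"expected 1 block asset file, found {len(assets)}")
    if "orca_repo_manifest.json" not in filenames:
        problems.append("orca_repo_manifest.json is missing")
    return problems

# Simulated listing of the orca/ folder, with placeholder names.
listing = ([f"block_asset_chunk__{i:04d}" for i in range(44)]
           + ["block_asset_0000", "orca_repo_manifest.json"])
print(validate_orca_assets(listing))  # → []
```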

Create a Semi-structured layout for ORCA VLMs

ORCA VLMs extract data only from fields defined in a Semi-structured layout. Before associating the layout with an ORCA flow, ensure it’s configured correctly.

The layout configuration affects the training payload sent to the trainer. Learn more about the trainer in our Trainer article. Proper layout configuration ensures consistent extraction behavior and reliable model performance.

Use the checklist below when creating your layout:

  • Define explicit fields.

    • Ensure that:

      • At least one field is defined.

      • Field names reflect business meaning (e.g., “Invoice number,” “Total amount,” “Merchant name”).

        • Note that field names are included in the training payload and influence prompt construction during training.

        • Avoid field names that don’t bring information to the model, such as “Names,” “Amount,” and “Field1.”

        • Ensure the field names are specific and reflect their intention.

          • For example, “Last Name” instead of “Name.”

  • Configure data types.

    • Data types define semantic expectations (e.g., numeric, date).

      Multiline fields

      Ensure the Bounding Box spans the entire multiline value. The transcription will always appear on one line only.

    • Configure Multiple Occurrences (MOs).

      • The MOs determine how the trainer interprets repeated field values.

      • Enable MOs when a field appears multiple times in a document or when you expect multiple extracted values.

  • Use Notes for extraction guidance

Use Notes to provide additional context that guides how ORCA interprets and extracts values.

The data in fields’ Notes behaves similarly to a prompt: it influences how the model understands the field, but it does not override the document content. Notes do not replace ground truth or enforce formatting.

  • Clarify what the field represents.

  • Provide semantic details.

  • Define extraction intent (e.g., “Exclude VAT,” “Use header value only”).

    • For example, consider a field named “Bill to email.” If the document contains both a company name and an email address, you can use the Notes section in the Layout Editor to clarify that only the email address is required. ORCA still reads the full value from the document, but during a Flexible Extraction task, only the email address is extracted under the “Bill to email” field.
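As a rough illustration of the naming guidance in the checklist above, a helper like the following could flag overly generic field names before you finalize a layout. The GENERIC_NAMES set is an assumption drawn from the examples in this article, not part of the product.

```python
# Hypothetical helper illustrating the naming guidance above. Field names feed
# prompt construction during training, so generic names give the model little
# to work with. The GENERIC_NAMES set is an assumption, not part of the product.
GENERIC_NAMES = {"names", "name", "amount", "value", "text", "field1"}

def flag_generic_field_names(field_names: list[str]) -> list[str]:
    """Return the field names that are too generic to guide extraction."""
    return [n for n in field_names if n.strip().lower() in GENERIC_NAMES]

layout_fields = ["Invoice number", "Total amount", "Merchant name", "Field1"]
print(flag_generic_field_names(layout_fields))  # → ['Field1']
```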

For more information about Semi-structured layouts, see our Creating Semi-structured Layouts article.

Once your layout is ready, include it in a release and assign the release to a flow. Learn more in What is a Release?.

Configure ORCA subflows

In v42.3, the included flows relevant to ORCA are:

  • Document Processing

  • Document Processing with ORCA Subflow

  • ORCA Quality Assurance Subflow

Ensure that the ORCA base model is installed before configuring the subflow.

To enable ORCA VLMs, you need to configure the Document Processing flow:

  1. Go to Flows and click the Document Processing flow.

  2. In the flow’s menu, click Duplicate.

Duplicating the original flow

To avoid overriding default settings, duplicate the original flow and configure the copy based on your requirements.

  3. Rename the new flow.

  4. Select a layout release from the drop-down list.

    • If no release is selected, a Missing Layout Release UUID warning appears, and the flow cannot be enabled.

  5. Click Start Document Processing Subflow to open the flow’s settings.

  6. In the Flow Identifier setting, under Block Details, click Document Processing with ORCA Subflow.

    • Ensure that the correct base model is selected in ORCA Base Model, as shown in the image below.

  7. Select the ORCA Quality Assurance flow to generate accuracy reporting.

ORCA QA Sample rate

We recommend setting the sample rate to 100% for model evaluation.

  8. From Settings Type, choose Flexible Extraction, and select the Flexible Extraction Show Machine Predictions checkbox. That way, you’ll be able to see how ORCA extracts fields.

  9. Once the flow is configured, click Save, then enable it by toggling the Live switch at the top of the page.

ORCA base model automation

Installing the ORCA base model does not enable automation. Each processed document generates a Flexible Extraction task for human review.

Next steps

After ORCA is installed and configured, you can start processing documents.

Because ORCA is delivered as a base model, it provides general-purpose extraction capabilities and is not adapted to your specific document types or business requirements. By default, every processed document generates a Flexible Extraction task. You can also view the approximate field locations used by ORCA, as shown in the screenshot below. Note that to complete the task, you must click through all fields.

To optimize extraction performance for your use case, you should:

  • Create a model definition for your layout. To learn more, see Model Definitions and TDM for ORCA VLMs.

  • Annotate documents specific to your use case.

  • Train a model on top of the ORCA base model. Doing so allows ORCA to learn patterns specific to your use case. Learn more in our Training a Specialized Model article.