This feature is available in v42.3 and later.
Building an effective document-processing solution requires understanding how the components involved in model training work together. The Model Definitions table is designed to help manage these components. Model definitions separate configuration from the underlying models, providing a structured way to manage a model’s lifecycle. As training progresses, new model versions are created incrementally using updated training data within the same definition.
Model definitions availability
Beginning in v42.3, model definitions are available for VLMs, and the same framework will be used for other model types in future releases. To learn more about ORCA VLMs, see our [v42.3] ORCA (Optical Reasoning and Cognition Agent) VLMs article.
In this article, you will learn how to:
Understand what each row in the model definitions table represents.
Create a model definition.
Understanding model definitions
A model definition is the configuration layer that represents a specific combination of scope (what the model operates on), task (what the model does), and model type (the model architecture used). Each row in the table acts as the control layer for models used in a specific task and scope and serves as the container that manages all models trained for that configuration. This approach allows you to retrain, evaluate, and deploy new model versions while maintaining a stable reference for the system.
Each row in the model definitions table represents a model definition associated with a specific layout, and each column displays key information about the model’s configuration, training status, deployment state, and version compatibility.

| Column | Description | Notes and examples |
|---|---|---|
| Scope | The data or objects the model operates on (for example, a layout or set of fields). | For example, ORCA VLMs extract fields from documents, such as invoices. In this case, the scope of ORCA VLMs is field processing. |
| Task | The type of problem the model is trained to solve. | For example, ORCA VLMs perform field extraction: they extract data (e.g., fields) from documents, based on the layout configuration. |
| Type | The model family used for this task and scope. | For example, VLM. |
| Compatibility | Compatibility of the most recently live model for this definition. | Learn more about compatibility in our Model Compatibility Logic article. |
| State | Shows whether the model is Live or Inactive. | The state is Live when the model is deployed. The state is Inactive when the model is not deployed. |
| Training status | The status of the current model training. | Training status could be: |
| Date deployed | The timestamp of the last deployment for this model definition. | Displays the date and time when the model was last deployed. |
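To make the table's columns concrete, one row can be pictured as a small record. The following is a minimal sketch for illustration only: the class name `ModelDefinitionRow`, its field names, and the `layout` attribute are assumptions, not part of any Hyperscience API, and the training status is left as free text.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional, Tuple


class State(Enum):
    """The two states described in the State column."""
    LIVE = "Live"
    INACTIVE = "Inactive"


@dataclass
class ModelDefinitionRow:
    """Hypothetical mirror of one row in the Model Definitions table."""
    layout: str                                # layout the definition is associated with
    scope: str                                 # what the model operates on
    task: str                                  # what the model does
    model_type: str                            # model family, e.g. "VLM"
    compatibility: Optional[str] = None        # compatibility of the most recently live model
    state: State = State.INACTIVE              # Live only while a model is deployed
    training_status: Optional[str] = None      # free text in this sketch
    date_deployed: Optional[datetime] = None   # timestamp of the last deployment

    def key(self) -> Tuple[str, str, str]:
        """The scope/task/type combination that the definition represents."""
        return (self.scope, self.task, self.model_type)


row = ModelDefinitionRow(
    layout="Invoices",
    scope="Field processing",
    task="Field extraction",
    model_type="VLM",
)
print(row.key(), row.state.value)
```

In this sketch, a newly created row starts as Inactive with no deployment date, which matches the description of State as Live only when a model is deployed.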
Creating a model definition
This section explains how to create a model definition for ORCA VLM. Learn more about ORCA VLMs in our [v42.3] ORCA (Optical Reasoning and Cognition Agent) VLMs article.
Before you start, ensure that:
An ORCA base model is installed. Follow the process described in [v42.3] Installing ORCA VLMs to install and configure the ORCA base model.
The layout you select is Semi-structured and contains at least one field. ORCA VLMs cannot extract data from tables.
The latest version of the layout is locked.
The layout is not already linked to another ORCA VLM model definition.
Layouts and model definitions
For ORCA VLMs, each layout can be linked to only one model definition.
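The prerequisites above can be read as a checklist. The sketch below expresses them as a validation function, assuming a hypothetical `Layout` record and `check_prerequisites` helper; neither name comes from the product, and the application performs its own checks independently of anything shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Layout:
    """Hypothetical stand-in for a layout; not a Hyperscience object."""
    name: str
    layout_type: str                          # e.g. "Semi-structured"
    field_count: int                          # number of fields in the layout
    latest_version_locked: bool
    linked_definition: Optional[str] = None   # existing ORCA VLM model definition, if any


def check_prerequisites(layout: Layout, base_model_installed: bool) -> List[str]:
    """Return the unmet prerequisites; an empty list means the layout is ready."""
    problems = []
    if not base_model_installed:
        problems.append("An ORCA base model is not installed.")
    if layout.layout_type != "Semi-structured":
        problems.append("The layout is not Semi-structured.")
    if layout.field_count < 1:
        problems.append("The layout does not contain any fields.")
    if not layout.latest_version_locked:
        problems.append("The latest version of the layout is not locked.")
    if layout.linked_definition is not None:
        problems.append("The layout is already linked to an ORCA VLM model definition.")
    return problems


ready = Layout("Invoices", "Semi-structured", field_count=3, latest_version_locked=True)
blocked = Layout("Receipts", "Semi-structured", field_count=2,
                 latest_version_locked=False, linked_definition="receipts-orca")

print(check_prerequisites(ready, base_model_installed=True))
print(check_prerequisites(blocked, base_model_installed=True))
```

Returning every unmet prerequisite, rather than stopping at the first, makes it easier to fix a layout in a single pass.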
Use the interactive demo below to learn how to create a model definition:
Flows and live models
Each model definition maintains a model history, which contains all models trained or imported for that definition. A model definition can have multiple trained models in its history, but only one model can be Live at a time. The model definition acts as the source of truth for which model is currently used for document processing.
Flows are configured to use a model definition, not a specific model version. When a document is processed, the system automatically selects the model that is currently Live for that definition. If you deploy a new model version, the flow continues to work without any changes. Learn more about ORCA VLM flows in [v42.3] Installing ORCA VLMs.
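The relationship between flows, model definitions, and Live models can be sketched as a simple indirection. The example below is an illustrative model, not the product's implementation: `Model`, `ModelDefinition`, `Flow`, and their methods are hypothetical names, and raising an error when no model is Live is an assumption made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Model:
    """One trained or imported model in a definition's history (hypothetical)."""
    version: int
    live: bool = False


@dataclass
class ModelDefinition:
    """Holds the model history and tracks which model is Live (hypothetical)."""
    name: str
    history: List[Model] = field(default_factory=list)

    def add_model(self, version: int) -> Model:
        model = Model(version)
        self.history.append(model)
        return model

    def deploy(self, version: int) -> None:
        """Make one version Live; every other version stops being Live."""
        if all(m.version != version for m in self.history):
            raise ValueError(f"version {version} is not in the model history")
        for m in self.history:
            m.live = (m.version == version)

    def live_model(self) -> Optional[Model]:
        return next((m for m in self.history if m.live), None)


@dataclass
class Flow:
    """References a model definition, never a specific model version."""
    definition: ModelDefinition

    def process(self, document: str) -> str:
        model = self.definition.live_model()
        if model is None:
            # Assumption for this sketch: processing fails without a Live model.
            raise RuntimeError("no Live model for this definition")
        return f"{document} processed with {self.definition.name} v{model.version}"


definition = ModelDefinition("invoices-orca")
definition.add_model(1)
definition.add_model(2)
flow = Flow(definition)

definition.deploy(1)
first = flow.process("doc-001")    # uses version 1

definition.deploy(2)               # the flow itself is not modified
second = flow.process("doc-002")   # uses version 2
print(first)
print(second)
```

Because the flow holds a reference to the definition rather than to a model version, deploying version 2 changes which model is used without any change to the flow.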
Next steps
Because ORCA is delivered as a base model, it provides general-purpose extraction capabilities and is not adapted to your specific document types or business requirements. To optimize extraction performance for your use case, you should:
Annotate documents specific to your use case.
Train a model on top of the ORCA base model. Doing so allows ORCA VLM to learn patterns specific to your use case.
Base model
A foundational model that provides core general-purpose capabilities and is not directly trained on customer-specific examples. Use-case specialization is achieved through additional training on top of the base model using customer-specific data. Currently, Hyperscience uses the ORCA 1.0 base model.
Learn how to train a specialized model in [v42.3] Training a Specialized Model.