V42 Release Notes

Versions 42.1.x and 42.2.x are available to SaaS customers only.

42.2.8 (13 Mar 2026)

Version 42.2.7 was not released and is not supported.

Field Identification

Fixed

Duplicate responses to Field Identification tasks — We've fixed an issue that created duplicate responses to Field Identification tasks in some situations, causing those tasks' flows to fail.

42.2.6 (26 Feb 2026)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

42.2.5 (26 Feb 2026)

Reporting

Updated

Hourly page metrics CSV file in Usage Bundle — The Usage Bundle includes a new CSV file, hourly_pages_metrics, in the product_analytics folder. This file enables review of hourly page volumes.

Hourly reporting data is available for the past 30 days only. If a Usage Bundle is created for a period that includes dates that are more than 30 days in the past, the CSV file will contain data only for the days within the available reporting window.
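
As a rough illustration of the 30-day window described above, the following Python sketch computes which dates in a requested Usage Bundle period would still have hourly data. The function name and the inclusive treatment of the window boundary are assumptions made for illustration; they are not taken from the product.

```python
from datetime import date, timedelta

# From the note above: hourly data is available for the past 30 days only.
RETENTION_DAYS = 30


def available_hourly_dates(period_start, period_end, today):
    """Return the dates in [period_start, period_end] that fall inside the window.

    Assumption: the window is inclusive and starts RETENTION_DAYS before `today`.
    """
    earliest = today - timedelta(days=RETENTION_DAYS)
    first = max(period_start, earliest)
    last = min(period_end, today)
    if first > last:
        return []  # the whole period is older than the reporting window
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]
```

For a bundle that spans both older and recent dates, only the recent part of the period is returned, which mirrors the partial CSV contents described above.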

Licenses

Updated

Additional licensing enforcements — We’ve extended support to enforce license keys that restrict usage across other dimensions. Note that this change will not impact existing license keys in use.

42.2.4 (12 Feb 2026)

Flows

Updated

Python 3.11 deprecation messages — Info-level deprecation messages are shown when importing or validating flows that use Python 3.11 blocks. These messages notify you that Python 3.11 support will be removed in v43 and help identify flows that should be updated before upgrading to v43 of the application.

In v41 and v42, flows using Python 3.11 remain supported.
In v43, blocks using Python 3.11 will no longer be supported, and flows using them will be undeployed.

File Storage

Fixed

Generating signed URLs with Google Cloud Storage client — We've fixed an issue that caused an AttributeError to occur when generating signed URLs with the Google Cloud Storage (GCS) client. The issue primarily affected instances where the GCS client ran on Google Kubernetes Engine (GKE) or used Workload Identity.

42.2.3 (28 Jan 2026)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

42.2.2 (28 Jan 2026)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

42.2.1 (21 Jan 2026)

Version 42.2.0 was not released and is not supported.

File sharing

New

Hyperscience Secure Share — When sharing files for a TVE, demo, or implementation, you can now upload them using Hyperscience Secure Share, an HTTPS-based file-sharing application. This solution, available to both current and prospective customers, eliminates the need to share files through SFTP or Jira Service Management tickets.

After going to the Secure Share URL and verifying your identity, you can upload up to 100GB of data. Then, you can monitor the progress of the upload, confirm its contents, and submit it to Hyperscience.

For more information, see Hyperscience Secure Share.

Languages

New

New languages — We now support automation on Structured and Semi-structured documents containing printed text in the following languages:

Finnish
Swedish
Norwegian
Danish

Note that automation on handwritten text in these languages is not supported.

For more information on supported languages, see Supported Languages.

Updated

Support for Japanese Semi-structured printed documents — You can now automate the processing of Semi-structured documents containing printed text in Japanese.

Note that automation on handwritten text in Japanese is not supported for any document type.

To learn more about our supported languages, see Supported Languages.

Training Data Management

In progress

Models page improvements (ORCA VLM models) – We’re introducing an updated Models page experience for ORCA VLM models. In v42.2:

The updated Models page applies only to ORCA VLM models.
Other model types will continue to use the existing interface.

This work is still in progress, and additional changes and details will be introduced in upcoming versions.

New

Training-data tagging — You can now organize and manage your documents more efficiently in Training Data Management (TDM) by adding tags. This feature allows you to add, filter, import, and export tags for documents, making it easier to categorize and find the information you need.

It provides the following key capabilities:

Manual tagging — Hover over the Tags cell in the Training Data table to reveal a + button. Click it to open the drop-down list with all existing tags. From there, you can select an existing tag or create a new tag.
Tag filtering — Filter documents in the Training Data table by tag to find relevant items quickly.
Import tags — If the training data contains tags, they will be automatically imported.
Special-character handling — Tags cannot contain “;” or spaces (spaces are replaced with underscores).
Unused tags — Unassigned tags are automatically deleted.

To learn more about tags, see TDM for Identification Models and TDM for Classification Models.
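
If you prepare tags outside the application before importing training data, a small helper along these lines can apply the special-character rule listed above. This is a minimal sketch: the function name, the whitespace trimming, and the empty-tag check are assumptions, and the application's own validation may differ.

```python
def normalize_tag(raw):
    """Apply the tag rule described above: no ';', spaces become underscores."""
    tag = raw.strip()  # assumption: surrounding whitespace is not meaningful
    if not tag:
        raise ValueError("tag must not be empty")
    if ";" in tag:
        raise ValueError("tag must not contain ';'")
    return tag.replace(" ", "_")
```

For example, `normalize_tag("needs review")` returns `needs_review`, while a tag containing ";" is rejected.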

Updated

Removed “Ignore Anomaly” confirmation message — To streamline the anomaly-review process in Identification Model Management, the confirmation message that appeared after clicking Ignore anomaly has been removed. Previously, users were required to confirm their decision in a secondary dialog, resulting in excessive manual clicks, especially in high-volume annotation and training scenarios.

Removed “Display Suggestions” toggle — When Annotation Co-pilot is disabled at the instance level, the Display Suggestions checkbox is hidden from the annotations interface. For more information on this feature, please contact Hyperscience Support.

Flows

Updated

Supported Python versions — Hyperscience now supports the use of Python 3.13 in flows, including Code Blocks and external Python packages. The use of Python 3.9 is no longer supported.

For more information about supported Python versions, see Developing Flows.

Flows SDK

Updated

Improvements to the developer experience — To streamline the flow-development process, we've added support for the unit testing of Code Blocks, and we've added more information about the PythonBlock to our Flows SDK documentation.

To learn more, see our Flows SDK documentation.
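
The general idea of unit testing the logic that goes into a Code Block can be sketched with Python's standard unittest module. The example below does not use the Flows SDK's own testing support (see the Flows SDK documentation for that); `normalize_amount` is a hypothetical function standing in for Code Block logic.

```python
import unittest


def normalize_amount(text):
    """Hypothetical Code Block logic: turn a string like '1,234.50' into a float."""
    return float(text.replace(",", "").strip())


class NormalizeAmountTest(unittest.TestCase):
    def test_strips_thousands_separator(self):
        self.assertEqual(normalize_amount("1,234.50"), 1234.5)

    def test_rejects_non_numeric_input(self):
        with self.assertRaises(ValueError):
            normalize_amount("n/a")


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeAmountTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the logic in a plain function makes it testable without deploying a flow.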

Manual Classification

Updated

Searching layout variations in Manual Classification — You can now search for a layout variation directly from the Layout Variations drop-down list when creating a new Structured document in Manual Classification. This update improves classification speed and reduces errors, especially for layouts with a large number of variations.

Additionally, you can now specify only the layout variation, and the system will automatically assign the correct layout group. This feature is especially useful when you know the variation name but not the layout name, or when you prefer to skip selecting a group first. Previously, selecting a layout group before choosing a variation was required.

Orange highlight for appended pages in Manual Classification and manually classified pages in Flexible Extraction — When a keyer appends a page to a document in Manual Classification, an orange border highlight appears, and a tooltip indicates that the page was manually classified. This visual cue is now consistent for both manually matched and appended pages, making them easier to identify and streamlining the user experience of Manual Classification and Flexible Extraction tasks.

[Screenshots omitted: the orange highlight as it appears in Manual Classification and in Flexible Extraction.]

Flexible Extraction

Fixed

Displaying layout variation name in Flexible Extraction — We’ve fixed an issue where only the layout group name was displayed in Flexible Extraction. This update improves clarity for users when selecting layouts, ensuring the correct layout group and layout variation names are displayed.

Custom Supervision

Updated

Adding and removing occurrences of transcription fields — When completing Custom Supervision tasks, keyers can now add and remove occurrences of any transcription field defined in the task's sv_template.

This update provides a more flexible Custom Supervision experience to keyers working in any flow with Custom Supervision blocks defined in v42.2 or later, including those that incorporate ORCA VLMs.

Note that it is not currently possible to draw bounding boxes for added occurrences. The ability to do so will be available in an upcoming version of Hyperscience.

More information can be found in the Custom Supervision section of the Flows SDK documentation.

Editing case-level decisions for documents outside of the current submission — You can now apply or update case decisions when reviewing documents from submissions other than the one currently being worked on, as long as the case is part of the active flow. This update supports the use of flows that contain Case Collation Blocks, which often require keyers to reference previous submissions.

Decision editing remains disabled for documents that belong to cases outside the current submission, preserving data integrity.

Reporting

Updated

New CSV files in the Usage Bundle — We've added the following CSV files to the product_analytics folder of the Usage Bundle:

entry_accuracy — Includes data on the daily accuracy of Identification and Transcription tasks based on QA results. This data is grouped by flow and layout and differentiates between tasks performed by the machine and those performed by keyers.
classification_automation_metrics — Provides data on the number of pages classified by the machine and by keyers each day, along with the resulting automation rate. This data is grouped by flow and layout.
throughput_metrics — Includes data on the daily volume of processed objects (i.e., submissions, documents, pages, fields, and table cells) and the average processing time for submissions and documents. This data is grouped by flow and layout.

More information about the Usage Bundle can be found in Usage Bundle.

Checksum for internal validation — We’ve added a checksum column to the Usage Bundle’s application_usage CSV file to support internal data validation.

Connections

Updated

Additional Email Listener settings — To give you more options in the processing of email-based submissions, we've added the following settings to the Email Listener:

Original Email File — When this setting is enabled, copies of the original emails ingested through the Listener are kept in the file store. This setting is disabled by default. If you choose to enable it, ensure that you have available space in your file store where the emails can be saved.
Attachment Headers — This setting determines whether or not attachment headers are extracted during processing. If it is enabled, attachment headers found in the emails are made available in the attachment_metadata element of the Input Block's output. This information can be used by Code Blocks to differentiate between inline and regular attachments and filter out inline attachments, if needed.

To learn more about these settings, see Email Listener.
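
The kind of filtering a Code Block could perform with attachment headers might look like the sketch below. The shape of `attachment_metadata` used here (a list of dictionaries with `filename` and `headers` keys) is an assumption for illustration only; check the Input Block's actual output before relying on it. The inline check itself uses the standard MIME Content-Disposition header.

```python
def is_inline(headers):
    """Return True if the Content-Disposition header marks the part as inline."""
    disposition = ""
    for name, value in headers.items():
        if name.lower() == "content-disposition":  # header names are case-insensitive
            disposition = value
            break
    # The disposition type is the token before the first ';'.
    return disposition.split(";", 1)[0].strip().lower() == "inline"


def regular_attachments(attachment_metadata):
    """Keep only attachments that are not marked inline.

    Assumption: each item is a dict with an optional "headers" mapping.
    """
    return [a for a in attachment_metadata if not is_inline(a.get("headers", {}))]
```

Attachments without a Content-Disposition header are treated as regular attachments in this sketch, which is a design choice rather than documented behavior.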

Permissions

New

“Full Object Access” permission — We've created a Full Object Access permission, which is assigned to users in the System Admin default permission group. This permission grants users access to all flows and layouts in the system, regardless of any flow- or layout-specific restrictions that may exist.

To learn more about the available permissions and the default permission groups, see Permission Groups.

Authentication

Fixed

Population of default values when saving SAML settings — We've fixed an issue that caused default values of SAML settings (Security & Identity in Administration > System Settings) to be saved when users attempted to enter and save different values for those properties.

42.1.4 (15 Jan 2026)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

42.1.3 (15 Dec 2025)

Reporting

Updated

New CSV files in the Usage Bundle — We've added the following CSV files to the Usage Bundle:

entry_accuracy — Includes data on the daily accuracy of Identification and Transcription tasks based on QA results. This data is grouped by flow and layout and differentiates between tasks performed by the machine and those performed by keyers.
classification_automation_metrics — Provides data on the number of pages classified by the machine and by keyers each day, along with the resulting automation rate. This data is grouped by flow and layout.

Security

Fixed

Addressing security vulnerabilities — To ensure security, we've updated our versions of django, urllib3, fonttools, ray, and pip.

42.1.2 (25 Nov 2025)

Custom Supervision

Updated

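Editing case-level decisions for documents outside of the current submission — You can now apply or update case decisions when reviewing documents from submissions other than the one currently being worked on, as long as the case is part of the active flow. This update supports the use of flows that contain Case Collation Blocks, which often require keyers to reference previous submissions.

Decision editing remains disabled for documents that belong to cases outside the current submission, preserving data integrity.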

Usage Bundle

Updated

Manual work metrics in Usage Bundle — The Usage Bundle now includes detailed metrics about manual work performed in your instance. This update provides greater visibility into manual tasks, helping you better understand your business processes and identify opportunities for increased automation.

These metrics are contained in a new CSV file, manual_task_metrics, which includes daily metrics for manual tasks at both the layout and flow levels. The following manual task types are tracked:

Identification
Transcription
Flexible Extraction
Custom Supervision
Classification — The report now includes the actual time spent on classification at the flow level.

The file contains the following key metrics:

Number of manual tasks of each type completed for documents from a given layout.
Number of documents from a selected layout for which a manual task (per task type) was created.
Wait and Active times: Time spent waiting for a keyer to pick up a task, and time spent by the keyer to complete a task for a document from a given layout.

Permissions

New

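“Full Object Access” permission — We've created a Full Object Access permission, which is assigned to users in the System Admin default permission group. This permission grants users access to all flows and layouts in the system, regardless of any flow- or layout-specific restrictions that may exist.

To learn more about the available permissions and the default permission groups, see Permission Groups.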

42.1.1 (5 Nov 2025)

Version 42.1.0 was not released and is not supported.

ORCA VLMs

Updated

Highlighting predicted field locations during ORCA VLM QA tasks — To increase efficiency and reduce potential for errors, the system highlights the predicted locations of fields during ORCA VLM QA tasks. A field's approximate location is shown when it is selected in the interface's right-hand panel. This update makes the VLM QA user experience consistent with that of VLM Supervision tasks.

More information about ORCA VLM QA tasks can be found in Vision Language Model Quality Assurance.

Training Data Management

Updated

Easier handling of anomalies for missing fields — We streamlined the process of marking fields as not present when resolving anomalies by adding a new Mark field as missing button. You can now:

Quickly mark fields as not present when annotations are missing by clicking the Mark field as missing button.
Use the new keyboard shortcut (Command + Option + M for Mac, Ctrl + Alt + M for Windows) to speed up the process.

This update reduces manual work when handling anomalies during the model-training process.

For more information, see Labeling Anomaly Detection.

Flows

Updated

Enhancements to flow visualizations — To improve the user experience when viewing flows in Flow Studio and on Flow Run pages, we've made the following updates:

Graying out of read-only flows — If a flow is read-only (e.g., the "Document Processing" subflow), it will no longer be grayed out when viewed on the Flow Studio canvas. This update increases the readability of read-only flows.
Highlighting of executed path — When showing a flow's run, the system highlights the path the submission took in the flow. This highlighting makes submissions' paths more apparent, especially in complex, nested flows.
Expanding executed or in-progress routes only — If a flow has multiple branches, only the branches that were executed or are being executed are expanded on flow-run pages.
Buttons for expanding / collapsing all branches — We've added Expand All and Collapse All buttons to the Flow Studio canvas and flow-run pages. Clicking these buttons expands or collapses all of the flow's branches, respectively. Keyboard shortcuts for these buttons are also available.

Division of "On-Error with included Submission data" flow into top-level flow and subflow — Beginning in v42.1, each Hyperscience version comes with the flow "On-Error with included Submission data V3," which consists of:

a top-level flow, where input and output connections are configured, and
a read-only subflow, which contains the logic for gathering information about the flow run and submission.

This update simplifies the process for upgrading the "On-Error with included Submission data" flow.

To learn more, see On-Error Flows.

Maximum size of block logs — Block logs are offloaded to the file store if their JSON files are greater than 50MB in size. This update preserves database capacity and improves the responsiveness of flow-run pages.

Fixed

Resubmitting submissions to disabled or archived flows — We've fixed an issue that caused errors to occur when submissions were resubmitted to flows that were archived or disabled. As part of this update, the system notifies the user that these flows must be live in order for the submissions to be resubmitted.

Manual Classification

Updated

Simplified selection of layout variations — Keyers can now select a layout variation directly when creating a new Structured document in Manual Classification, without needing to select a layout group beforehand. With this update, the selection process is faster and more intuitive, allowing users to search for and assign the correct variation in a single step.

Machine Transcription

Updated

Improved processing of single characters — We've introduced enhancements to reduce cases where single (or "lonely") characters were misread or dropped during processing. Specifically, we've added contextual information to models that prevents individual characters from being incorrectly transcribed as empty strings. This update improves recognition accuracy for documents that include small or isolated characters.

Custom Supervision

Updated

Support for Multiple Occurrences — Keyers can now provide values for multiple occurrences of Transcription fields and Decisions in Custom Supervision tasks. If you anticipate that there will be multiple occurrences of a field, you can add the occurrence_index property to each instance of the field in the supervision_template JSON.

Table Identification

Fixed

Side-by-side tables routed to Manual Identification — We’ve fixed an issue where side-by-side tables were routed to Supervision even if a table locator model was available. Documents with side-by-side tables are now processed through Machine Identification without manual intervention.

42.0.15 (12 Mar 2026)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

42.0.14 (3 Mar 2026)

Field Identification

Fixed

Duplicate responses to Field Identification tasks — We've fixed an issue that created duplicate responses to Field Identification tasks in some situations, causing those tasks' flows to fail.

Reporting

Updated

Disabling Machine Working Time reporting — To reduce latency, you can disable the collection of machine-time data upon the completion of submissions. If you would like to disable this data collection, set the HS_MACHINE_TIME_REPORT_ENABLED ".env" variable to false.

Note that setting this variable to false will result in empty Machine Working Time reports. The data in these reports will not become available retroactively if you set HS_MACHINE_TIME_REPORT_ENABLED to true in the future.
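
As an illustration, the setting would appear in the ".env" file as the line `HS_MACHINE_TIME_REPORT_ENABLED=false`. The Python sketch below shows one way to check a ".env" file's text for that value, for example in a pre-deployment script. The helper names are hypothetical, and treating the feature as enabled when the variable is absent is an assumption based on the description above.

```python
def parse_env_line(line):
    """Return (name, value) for a NAME=value line, or None for blanks and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return None
    name, value = stripped.split("=", 1)
    return name.strip(), value.strip().strip('"').strip("'")


def machine_time_reporting_enabled(env_text):
    """Report whether machine-time data collection is left on in the given .env text."""
    enabled = True  # assumption: collection stays on unless explicitly set to false
    for line in env_text.splitlines():
        parsed = parse_env_line(line)
        if parsed and parsed[0] == "HS_MACHINE_TIME_REPORT_ENABLED":
            enabled = parsed[1].lower() != "false"
    return enabled
```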

42.0.13 (26 Feb 2026)

Reporting

Updated

Licenses

Updated

42.0.12 (12 Feb 2026)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

42.0.11 (4 Feb 2026)

File Storage

Fixed

42.0.10 (28 Jan 2026)

Updates

This version includes a number of updates that optimize our internal testing and deployment processes.

42.0.9 (15 Jan 2026)

Reporting

Updated

New CSV files in the Usage Bundle — We've added the following CSV files to the product_analytics folder of the Usage Bundle:

entry_accuracy — Includes data on the daily accuracy of Identification and Transcription tasks based on QA results. This data is grouped by flow and layout and differentiates between tasks performed by the machine and those performed by keyers.
classification_automation_metrics — Provides data on the number of pages classified by the machine and by keyers each day, along with the resulting automation rate. This data is grouped by flow and layout.

42.0.8 (25 Nov 2025)

Usage Bundle

Updated

Manual work metrics in Usage Bundle — The Usage Bundle now includes detailed metrics about manual work performed in your instance. This update provides greater visibility into manual tasks, helping you better understand your business processes and identify opportunities for increased automation.

These metrics are contained in a new CSV file, manual_task_metrics, which includes daily metrics for manual tasks at both the layout and flow levels. The following manual task types are tracked:

Identification
Transcription
Flexible Extraction
Custom Supervision
Classification — The report now includes the actual time spent on classification at the flow level.

The file contains the following key metrics:

Number of manual tasks of each type completed for documents from a given layout.
Number of documents from a selected layout for which a manual task (per task type) was created.
Wait and Active times: Time spent waiting for a keyer to pick up a task, and time spent by the keyer to complete a task for a document from a given layout.

Custom Supervision

Updated

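Editing case-level decisions for documents outside of the current submission — You can now apply or update case decisions when reviewing documents from submissions other than the one currently being worked on, as long as the case is part of the active flow. This update supports the use of flows that contain Case Collation Blocks, which often require keyers to reference previous submissions.

Decision editing remains disabled for documents that belong to cases outside the current submission, preserving data integrity.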

Permissions

New

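“Full Object Access” permission — We've created a Full Object Access permission, which is assigned to users in the System Admin default permission group. This permission grants users access to all flows and layouts in the system, regardless of any flow- or layout-specific restrictions that may exist.

To learn more about the available permissions and the default permission groups, see Permission Groups.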

Training Data Management

Fixed

Removal of training data during flow-run cleanup — We’ve fixed an issue where older training data uploaded in Training Data Management was sometimes removed during the automated cleanup of expired flow runs. The cleanup process has been updated to ensure training data remains intact and fully accessible.

42.0.7 (30 Oct 2025)

Tables

Fixed

Manual Transcription for nested tables — We've fixed an issue that caused submissions with documents containing nested tables to halt in Manual Transcription. The issue occurred when the Table Identification models for the documents' layouts predicted the presence of empty rows in child tables.

Reporting

Updated

Environment ID in the Usage Bundle — The name of the Usage Bundle's application_usage CSV file now contains the instance's environment ID, if available. The environment ID also appears as metadata at the beginning of the file (# environment_id=<environment_id>).
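
If you process the application_usage CSV file with a script, the metadata line needs to be handled before the CSV rows are parsed. The following is a minimal sketch that reads the `# environment_id=<environment_id>` line described above; the function name is hypothetical, and the sketch assumes that all metadata lines start with "#" and precede the CSV header.

```python
def read_environment_id(csv_text):
    """Return the environment ID from the leading metadata lines, or None if absent."""
    for line in csv_text.splitlines():
        if not line.startswith("#"):
            break  # metadata appears only at the beginning of the file
        key, sep, value = line[1:].strip().partition("=")
        if sep and key.strip() == "environment_id":
            return value.strip()
    return None
```

When the environment ID is not available, the function returns None, which matches the "if available" wording above.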

42.0.6 (22 Oct 2025)

Flow Blocks

Fixed

Extracting data when Document Renderer Blocks are used after Classification — We've resolved an issue that prevented data from being extracted from submissions whose flows had Document Renderer Blocks after Classification Blocks. In these cases, the submissions halted in their flows' Complete Blocks.

Cases

Fixed

"Perform Task" links for Custom Supervision tasks on Case details pages — We've fixed an issue on Case details pages that caused Custom Supervision "Perform Task" links to appear for pages not requiring Custom Supervision instead of for pages requiring it. This issue occurred when Custom Supervision tasks were created for pages not assigned to a layout.

42.0.5 (17 Oct 2025)

Flows

Fixed

Restoring IDP_SYNC functionality after database failure — We’ve resolved an issue where the IDP_SYNC block manager could enter an infinite loop following a database failure. The block manager now recovers correctly from transient database issues, ensuring the environment remains stable and processing continues as expected.

Authentication

Fixed

Configuring signing algorithms for SAML authentication — We've fixed an issue that caused the signing algorithm for SAML requests to default to SHA-1, regardless of the algorithm specified in the SAML configuration. This update also enables the use of the RSA-SHA256 signing algorithm.

42.0.4 (9 Oct 2025)

Image Correction

Fixed

Halted submissions when Image Correction is used multiple times in a flow — We’ve resolved an issue where submissions failed when Image Correction was applied both within Classification and as a separate step in the same flow. Submissions are now completed when Image Correction is enabled in multiple parts of a flow.

42.0.3 (2 Oct 2025)

Machine Transcription

Updated

Improved processing of single characters — We've introduced enhancements to reduce cases where single (or "lonely") characters were misread or dropped during processing. Specifically, we've added contextual information to models that prevents individual characters from being incorrectly transcribed as empty strings and later filtered out. This update improves recognition accuracy for documents that include small or isolated characters.

Custom Supervision

Updated

42.0.2 (26 Sept 2025)

Versions 42.0.0 and 42.0.1 were not released and are not supported.

Highlights

A leader in intelligent document processing, Hyperscience strives to consistently add value, drive innovation, and improve the user experience with every new version of Hypercell. As such, we’re introducing the following key features in Hyperscience v42.

ORCA

ORCA General Prompting Block — The General Prompting Block allows for more flexibility in the application of ORCA VLMs, with prompt inputs now being possible outside of a layout interface and directly in flows. This block allows ORCA to work independently of Semi-structured layouts and creates more possibilities for ORCA prompts to power data-processing tasks or to act as a Model-in-the-Loop for a larger variety of use cases. As a result, you can leverage the ORCA General Prompting Block to power features like Document Chat, or you can call it through an API for data manipulation.

Highlighting predicted field locations during ORCA Supervision — During ORCA Supervision tasks, the system now highlights the predicted locations of the document’s fields. A field's approximate location appears when it is selected in the interface's right-hand panel. By helping keyers find fields in documents, this update reduces the time required to complete ORCA Supervision tasks.

Note that ORCA field locations differ from Field ID locations in that the suggested locations tend to be larger and more dynamic than Field ID locations, with no explicit initial training from the user.

For more information about ORCA VLMs, see ORCA (Optical Reasoning and Cognition Agent) VLMs.

Structured Document Classification and Flexible Extraction

Placing unmatched pages in Structured documents — Previously, if you filled a gap in a machine-classified layout (e.g., a missing page 3), the unmatched page was always added to the end of the document, even if you attempted to append the page in a specific position. In v42, when you drag and drop an unmatched page into an empty page position (e.g., between page 2 and page 4), the system respects the manual override and places it exactly where you set it, without appending it to the end.

We’ve added the following options for placing pages in Structured documents:

Append page — We’ve introduced an Append page button that allows you to place the selected page at the end of the document. Note that if you choose to append it, the machine’s confidence will be reduced, and it will be sent for manual extraction.
Add to Doc — Add the selected page to the specific document you’re currently working on. Note that the classification will be treated as a manual match, and the content on the page will require manual extraction.

As part of this update, the submission's JSON output reflects the order of the pages determined during Document Classification. This output also ensures that only manually matched pages are sent to Flexible Extraction.

To learn more, see Structured Document Classification.

Displaying fields from manually matched pages only — During Flexible Extraction tasks, use the Show fields from manually matched pages only toggle to display only the fields from the manually classified pages.

For more information, see Flexible Extraction.

Major UI updates

To improve the user experience, we've updated the Hypercell user interface. The features described below include updates that change how certain actions are completed in Hypercell. We recommend notifying your team members of these changes or offering them updated training, as needed.

Models

Models are now visible in the main menu of the product — We’ve moved the Models section out of the Library, reducing the amount of time it takes to find the models you’re looking for. “Models” is now a standalone category in the main menu. You can manage your models directly through this section.

[Screenshots omitted: the main menu before and after the change.]

Tasks

Tasks Overview is updated with a refreshed design — We’ve updated the layout of the Tasks Overview tab (Tasks > Overview), removing the graphs and the side-by-side cards.

[Screenshots omitted: the Tasks Overview tab before and after the change.]

Upgrade notes

These updates may impact your upgrade process or affect initial processing times after upgrading. For more information or assistance, contact your Hyperscience representative.

Operating systems

Supported Ubuntu versions — We've removed support for Ubuntu 20. For information on supported operating systems, see Infrastructure Requirements.

Databases

Supported PostgreSQL versions — We've removed support for PostgreSQL 14 and added support for PostgreSQL 17. To learn more about supported databases, see Infrastructure Requirements.

Required disk space

Minimum disk space required for v42.x.x — Due to the growing sizes of packages and dependencies, each application VM running v42 requires a minimum of 400GB of available disk space (300GB in the root volume, 100GB in the /var volume). This space is in addition to the disk space consumed by the operating system (i.e., the space needs to be available after starting a new VM).

To learn more about minimum infrastructure requirements, see Infrastructure Requirements.
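
Before upgrading, you may want to verify the available disk space on each application VM. The sketch below is one way to do that in Python. The helper names are hypothetical, and interpreting "GB" as 1024^3 bytes is an assumption (it is the stricter of the two common readings).

```python
import shutil

GIB = 1024 ** 3  # assumption: 1 GB is counted as 1024^3 bytes

# Minimum available space per volume, in GB, from the requirement above.
REQUIRED_FREE_GB = {"/": 300, "/var": 100}


def free_gb(path):
    """Return the free space on the filesystem that contains `path`, in GB."""
    return shutil.disk_usage(path).free / GIB


def check_disk_space(free_by_mount, required=REQUIRED_FREE_GB):
    """Return (mount, free_gb, required_gb) for every volume below its minimum."""
    return [
        (mount, free_by_mount.get(mount, 0.0), needed)
        for mount, needed in required.items()
        if free_by_mount.get(mount, 0.0) < needed
    ]
```

On a VM, `check_disk_space({m: free_gb(m) for m in REQUIRED_FREE_GB})` returns an empty list when both volumes meet the minimum. If /var is not a separate volume, `free_gb("/var")` reports the root filesystem's free space, so the two figures would need to be added together by hand.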

Additional features and enhancements

User experience

Design updates — The following key components were updated with a refreshed design in this release:

Perform Tasks interface enhancements — On smaller screens (less than 1500px in width), the Supervision and QA task type tables on the Perform Tasks page (Tasks > Perform Tasks) are displayed one below the other.
- Numbers are right-aligned for better readability.
- Buttons now have a consistent width across both tables.
“File Name” column added to the Training Data table in Training Data Management (TDM) for all models — You can now view document names directly in the TDM interface to help identify and manage training data faster. Additionally, the original file name is searchable from the Training Data table in TDM.
Improved “Training Data Health” card in TDM for Classification — You’ll now see the number of eligible layouts in both the Training Data Health and the Summary cards. The Training Data Health card is expandable, so you can view all individual layouts when multiple are present.
Layouts are eligible for training when they meet the minimum number of pages required for training.
Consistent file-upload dialog boxes across the platform — We’ve updated the file-upload dialog boxes across the platform to improve consistency.

Flows

Updated design of flows and blocks in Flow Studio — To improve the user experience, we’ve made the following enhancements to the Flow Studio user interface:

Information included in flow blocks — We’ve enhanced the design of flow blocks in Flow Studio to include more block-specific information, including:
- the names of connections configured in Input Blocks and Output Blocks,
- the number of blocks and branches in each Routing Block, and
- configuration errors.
You can click on a block to reveal or hide its subflows or notifications.
Visualization of branch merging after Routing Blocks in Flow Studio — When branches merge together in a flow after a Routing Block, the flow’s visualization in Flow Studio includes a Merge card to indicate where the merging occurs. Clicking a Merge card reveals or hides the merged branch.

By allowing you to reveal or hide parts of the flow, these changes reduce the amount of scrolling needed to view a flow’s contents and the effort required to know your relative location in the flow.

Reducing RAM used by block processes — In v42, the system does not start block processes automatically after a flow is deployed. Instead, these processes are started only if tasks have been scheduled for the blocks, reducing the total amount of RAM consumed by block processes.

As part of this update, we've added the HS_RUNNABLE_BLOCKS_DISCOVERY_POLICY and HS_ACTIVE_BLOCK_CUTOFF_SECONDS “.env” file variables, which allow you to specify the conditions under which block processes start and stop.

For more information, see Reducing RAM Used by Block Processes.

Maximum size of block inputs, block outputs, and workflow-engine payloads — To prevent out-of-memory errors, block inputs, block outputs, and workflow-engine payloads are now limited to 500MB. If this limit is exceeded, the data is offloaded to the file store.
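The offloading behavior can be pictured as a size check at the point where a payload is handed off. The sketch below only illustrates the idea and does not reflect the application's internals: `route_payload`, the dict-based file store, and the reading of 500MB as 500 MiB are all assumptions.

```python
import json

MAX_INLINE_BYTES = 500 * 1024 * 1024  # assumption: 500MB interpreted as 500 MiB


def route_payload(payload, file_store, key, limit=MAX_INLINE_BYTES):
    """Keep small payloads inline; store oversized ones and return a reference.

    `file_store` is a hypothetical stand-in for the file store, modeled as a dict.
    """
    data = json.dumps(payload).encode("utf-8")
    if len(data) <= limit:
        return {"inline": payload}
    file_store[key] = data
    return {"file_ref": key, "size_bytes": len(data)}
```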

Recursion in nested subflows — We've fixed a recursion-related issue that caused errors when processing submissions through flows containing a large number of nested subflows. As part of this fix, we've created the HYPERFLOW_ENGINE_MAX_SUBFLOWS_DEPTH_LIMIT ".env" file variable, which has a default value of 100.

To learn how to change the default limit, see Recursion in Nested Subflows.
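To estimate whether a flow design approaches the limit, nesting depth can be measured without recursion. The following is a sketch under stated assumptions: flows are modeled as plain dicts with a hypothetical `subflows` key, the top-level flow counts as depth 0, and raising an error on overflow is illustrative rather than a description of the engine's actual behavior.

```python
import os

DEFAULT_DEPTH_LIMIT = 100  # default stated in the release notes
VAR_NAME = "HYPERFLOW_ENGINE_MAX_SUBFLOWS_DEPTH_LIMIT"


def depth_limit(env=os.environ):
    """Read the limit from the environment, falling back to the documented default."""
    return int(env.get(VAR_NAME, DEFAULT_DEPTH_LIMIT))


def max_subflow_depth(flow):
    """Compute nesting depth iteratively, so the check itself cannot overflow the stack."""
    deepest = 0
    stack = [(flow, 0)]
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        for child in node.get("subflows", []):
            stack.append((child, depth + 1))
    return deepest


def check_flow(flow, env=os.environ):
    """Return the flow's depth, or raise ValueError if it exceeds the limit."""
    limit = depth_limit(env)
    depth = max_subflow_depth(flow)
    if depth > limit:
        raise ValueError(f"subflow depth {depth} exceeds limit {limit}")
    return depth
```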

New version of “On-error with included Submission data” subflow — To reduce the time required to retrieve submission data, we’ve created a new version (V2) of our “On-error with included Submission data” subflow. The first version of this flow continues to work in v42.

More details about on-error flows can be found in On-Error Flows.

Flow Blocks

Specifying page limits for files — We've added an Enable File Page-Limit Check setting to the Submission Bootstrap Block. If you are using the Document Processing flow included in your instance, this setting appears under the Submission Bootstrap settings type when editing the flow.

When this setting is enabled, the Maximum Pages Allowed Per File setting becomes available, allowing you to specify a page limit for files in a submission. If a file in a submission contains more than this maximum number of pages, the submission will fail.

For more information, see Flow Blocks and Document Processing Subflow Settings.
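The page-limit rule can be mirrored in a pre-submission check on the client side. This is a minimal sketch with hypothetical function names. It assumes page counts are already known per file and uses the same "more than the maximum" comparison described above.

```python
def files_over_page_limit(page_counts, max_pages):
    """Return the names of files whose page count exceeds max_pages, sorted."""
    return sorted(name for name, pages in page_counts.items() if pages > max_pages)


def submission_passes(page_counts, max_pages, check_enabled=True):
    """A submission passes if the check is off or no file exceeds the limit."""
    if not check_enabled:
        return True
    return not files_over_page_limit(page_counts, max_pages)
```

For example, with a limit of 50, a submission containing a 50-page file passes, while one containing a 51-page file does not.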

Flow Runs

Limiting task parallelism in flow runs — We’ve created a new “.env” file variable, HYPERFLOW_ENGINE_MAX_PARALLEL_TASKS_PER_WORKFLOW, which prevents Fork and Foreach Blocks from creating more than a specified maximum number of subflow runs or tasks. This limit keeps flow runs from generating more parallel tasks than the system can effectively process at once. The variable’s default value is 1000000.

For more information, see Limiting Task Parallelism in Flow Runs.
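When sizing a Fork or Foreach fan-out, it can help to compare the planned number of items with the value configured in the ".env" file. The sketch below assumes the file uses simple KEY=VALUE lines. The helper names are hypothetical, and the sketch only compares numbers. It does not model what the engine does when the cap is reached.

```python
DEFAULT_MAX_PARALLEL_TASKS = 1_000_000  # default stated in the release notes
VAR = "HYPERFLOW_ENGINE_MAX_PARALLEL_TASKS_PER_WORKFLOW"


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def effective_cap(env_text):
    """Return the configured cap, or the documented default if the variable is unset."""
    return int(parse_env(env_text).get(VAR, DEFAULT_MAX_PARALLEL_TASKS))


def exceeds_cap(item_count, env_text):
    """True if a planned fan-out of item_count tasks is larger than the cap."""
    return item_count > effective_cap(env_text)
```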

Layouts

Uploading files of multiple types when creating Structured layouts — We've fixed an issue that occurred when users uploaded files of different types while creating a Structured layout. In previous versions, the application allowed these uploads but became unresponsive when they were attempted. In v42, if you try to upload multiple file types when creating a Structured layout, an error is shown, additional uploads are blocked, and the Next button is disabled.

Semi-structured Document Classification

Page limit for training Semi-structured Classification models — The default training limit for Semi-structured Classification models has been increased to 100,000 pages. This new maximum allows you to train models on larger datasets without additional configuration. If your use case requires training on more than 100,000 pages, reach out to your Hyperscience representative for assistance.

Structured Document Classification

Improved layout variation selection in Document Classification — We’ve added a Layout Variation drop-down list in the Document Classification task. You can search for a layout variation by name directly in the drop-down list, or you can scroll through the list to find the layout variation you’re looking for. This change makes assigning layouts and variations more efficient, especially when working with a large number of options.

Note that the drop-down list is visible only when you have variations in a Structured layout group that has already been selected, either by the machine or manually, before the Document Classification task.

Flexible Extraction

Filtering fields by variation — You can use the Show fields from selected variation only toggle to see fields from the variation you’re currently working on. Doing so makes it faster and easier to focus on what matters, especially in complex documents with many variations within the layout group.

To learn more about the options available during Flexible Extraction, see Flexible Extraction.

Full Page Transcription

Faster processing enabled by configuring Full Page Transcription Block — You can now configure Full Page Transcription Blocks to selectively process any combination of text, signatures, and checkboxes. For example, if a use case does not require the transcription of checkboxes, you can disable the transcription of fields of that data type. Models for unselected data types are not run, so page elements that are irrelevant to the submission are skipped, reducing submission-completion times.

For more information, contact your Hyperscience representative.

ORCA (Optical Reasoning and Cognition Agent) VLMs

ORCA Composite Block — ORCA now supports composite block functionality in flows, allowing for simplified implementation of ORCA VLMs. The block removes duplicate calls and offers improved compatibility between block versions and across application versions. Additionally, the inclusion of this block in v42's Flows SDK reduces the complexity of incorporating ORCA VLMs into custom flows.

Increased throughput of ORCA Blocks — We’ve improved the throughput of ORCA Blocks, with an average increase of 5-10%.

Custom Supervision

Updated version of the Custom Supervision Block — Existing flows will continue running as expected, while new flows will now use the updated CUSTOM_SUPERVISION_3 block by default. The new version introduces tighter validations around the Supervision template to improve long-term stability and compatibility. Because of these validations, it may not be fully backwards compatible. No changes to results, throughput, or accuracy are expected.

“Show Custom Supervision Tasks as separate items” setting — The Show Custom Supervision Tasks as separate items setting in System Settings (Administration > System Settings) allows you to control how Custom Supervision tasks appear on the Perform Tasks page (Tasks > Perform Tasks).

When enabled:
- Each Custom Supervision Task type is shown separately, but only if there are active tasks of that type.
  - The same applies if task- or flow-level restrictions prevent the user from accessing that task type.
When disabled (default behavior):
- All Custom Supervision tasks are grouped under Custom Supervision on the Perform Tasks page.

Enabling this setting gives teams the flexibility to surface specific task types to keyers for better focus.

For more details on the Show Custom Supervision Tasks as separate items setting and other available settings, see Application Settings Overview.

Leveraging ORCA VLMs in Document Chat — With the introduction of the General Prompting Block mentioned in the Highlights section of these release notes, you can now use ORCA VLMs in Document Chat. This update makes it possible to use Document Chat in environments that cannot be connected to the internet or for use cases where the use of publicly available LLMs is not allowed.

Quality Assurance

Tooltip for “Submit” button for Full Page Transcription QA and Vision Language Model QA tasks — If a keyer hasn’t completed all of the entries for a Full Page Transcription QA or Vision Language Model QA task, a tooltip will appear when the keyer hovers over or clicks the Submit button, telling them to complete all entries before submitting. This tooltip lets the keyer know why the Submit button is disabled when attempting to submit tasks.

Document Renderer Block

Reducing final PDF file size with colors in generated PDFs — You can now set an Output Image Mode in the Document Renderer Block to control how images appear in the block’s generated PDF. This reduces the final PDF file size, optimizing performance. You have three options to choose from:

Keep original colors
Convert to grayscale
Convert to black & white

Additionally, the existing Image Quality setting now applies only when Output Image Mode is set to Keep original colors or Convert to grayscale.

More information about the Document Renderer Block can be found in Flow Blocks.

Connections

Notifiers for Microsoft Azure Blob Storage and Google Cloud Storage (GCS) — With the addition of the Azure Blob and GCS Notifier Output Blocks, you can send submission data to the Azure blob or GCS bucket of your choosing.

Each Notifier can create a single JSON for all of a submission's processed documents, individual JSON files for each processed document, or individual JSON files for each document matched to a layout and a JSON file for each unmatched page. You can also choose whether to send all of a submission's data or only high-level data.

For more information about these Notifiers, see Azure Blob Notifier and GCS Notifier.

Workload Identity authentication option for GCS Listener — We’ve added a Use Workload Identity setting to the GCS Listener, which allows you to obtain credentials via Workload Identity Federation. This option is applicable to SaaS deployments and to on-premise Kubernetes deployments inside Google Kubernetes Engine (GKE) clusters.

To learn more, see GCS Listener.

Reporting

“Usage Report” is now “Usage Bundle” — We’ve renamed the Usage Report to “Usage Bundle” to better reflect the export’s contents.

User Performance reporting for Full Page Transcription QA — Metrics for Full Page Transcription QA have been added to the following reports:

Keyer Projection report
- KeyerPerformance.csv
  - Full Page Transcription QA Time Spent (Seconds)
  - Segments Reviewed in Full Page Transcription QA
  - Segments Reviewed in Full Page Transcription QA per Hour
  - Segment Characters Reviewed in Full Page Transcription QA
  - Segment Characters Reviewed in Full Page Transcription QA per Hour
- HourlyReportingTaskOverview.csv
  - Workflow UUID
  - Total Users
  - Total Time Spent (Seconds)
  - Tasks in Starting Work Queue
  - Tasks Added to Work Queue
  - Tasks Completed
- HourlyReportingSubmissionOverview.csv
  - Users Performing Full Page Transcription QA
  - Time Spent in Full Page Transcription QA (Seconds)
  - Full Page Transcription QA Segments in Starting Work Queue
  - Full Page Transcription QA Segments Added to Work Queue
  - Full Page Transcription QA Segments Completed
Supervision Volume — Segments per day

More details on these reports can be found in Keyer Projection Report and Supervision Volume.
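The new KeyerPerformance.csv columns can be combined in a short script, for example to recompute a per-hour review rate. This is a sketch under assumptions: the two metric column names come from the list above, but the `User` identifier column and the sample rows are made up for illustration, and the report's own "per Hour" columns may be calculated differently.

```python
import csv
import io

TIME_COL = "Full Page Transcription QA Time Spent (Seconds)"
SEGMENTS_COL = "Segments Reviewed in Full Page Transcription QA"

# Made-up sample data; "User" is a hypothetical identifier column.
SAMPLE = (
    f"User,{TIME_COL},{SEGMENTS_COL}\n"
    "alice,1800,60\n"
    "bob,0,0\n"
)


def segments_per_hour(csv_text, user_col="User"):
    """Return {user: segments reviewed per hour}, using 0.0 when no time was logged."""
    rates = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        seconds = float(row[TIME_COL] or 0)
        segments = float(row[SEGMENTS_COL] or 0)
        rates[row[user_col]] = segments / (seconds / 3600) if seconds else 0.0
    return rates
```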

Full Page Transcription in Machine Working Time report — The Machine Working Time report now includes the active time and waiting time for Full Page Transcription tasks.

To learn more about this report, see Working Time.

Metrics for Full Page Transcription QA responses in the Usage Bundle — We’ve added the following metrics to the application_usage CSV in the Usage Bundle:

Number of QA Responses on Machine Full Page Transcription
Number of QA Correct Responses on Machine Full Page Transcription

Files included in the Usage Bundle — For customers using the automated usage-transmission option, the majority of the files in the manual download are now included in the nightly automated transmissions to Hyperscience.

For more information about transmitting usage data automatically, see Automatic Transmission of Usage Data.

Versions of flows included in the Usage Bundle — The Usage Bundle now includes data for only the most recent versions of flows, reducing the size of the bundle.

To learn more about the contents of the Usage Bundle, see Usage Bundle.

Permissions

“Complete Full Page Transcription QA Tasks” permission — We’ve created a Complete Full Page Transcription QA Tasks permission, which is enabled for the System Admin, Business Admin, Data Keyer Admin, Data Keyer Staff, and Knowledge Worker permission groups.

For more information about available permissions and the default permission groups, see Permission Groups.

Authentication

Windows authentication for Microsoft SQL Server (MSSQL) — If your instance uses an MSSQL database, you can now use Windows authentication, Microsoft's recommended authentication method, instead of SQL Server authentication. Windows authentication leverages Linux's Kerberos Ticket Granting Ticket (TGT) and provides a higher level of security than SQL Server authentication. As a result, taking advantage of this feature can reduce potential compliance overhead and facilitate installations and upgrades in Microsoft ecosystems.

Note that Windows authentication is not supported for Kubernetes deployments.

More information about Windows authentication can be found in MSSQL Server - Windows Authentication.

Caching local identity-provider client objects — Local identity-provider client objects are cached in memory for 5 seconds after their creation. This update minimizes database access for HTTP requests made by machine accounts, whose credentials change infrequently.
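The caching behavior described above follows a common time-to-live pattern in which an entry expires a fixed interval after it is created, regardless of how often it is read. The class below is a generic sketch of that pattern, not the application's implementation. `TTLCache` and `get_or_create` are hypothetical names.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire ttl_seconds after creation."""

    def __init__(self, ttl_seconds=5.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested deterministically
        self._entries = {}   # key -> (created_at, value)

    def get_or_create(self, key, factory):
        """Return the cached value if still fresh; otherwise build and cache a new one."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = factory()
        self._entries[key] = (now, value)
        return value
```

With a 5-second lifetime, a burst of requests from the same machine account reuses one client object instead of triggering a database read for each request.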

Infrastructure

Retrieving data from a replica database — You can now set up a replica database to retrieve reporting and audit-log data in on-premise instances. This configuration can improve system responsiveness and prevent timeouts when attempting to read these kinds of data, particularly in instances that process a high volume of submissions.

For more information on how to set up a replica database, see Retrieving Data From a Replica Database.

TLS enabled by default for MSSQL database connections — In v42 and later, MSSQL database connections use TLS encryption by default, due to the upgrade of the ODBC driver to v18. This change applies to both the application's database connection configured in the ".env" file and to any database connections created through Database Access Blocks.

If you enabled TLS in your previous version of Hyperscience, no action is required on your part to continue using TLS encryption in v42. If necessary, you can disable TLS by implementing one of the options outlined in MSSQL.
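For context, Microsoft's ODBC Driver 18 for SQL Server treats connections as encrypted unless told otherwise, which is why the driver upgrade changes the default. The sketch below builds a raw ODBC connection string to make the relevant `Encrypt` and `TrustServerCertificate` keywords visible. The function name is hypothetical, and the sketch does not show how the application's ".env" file or Database Access Blocks express these options. For the supported ways to change them, see MSSQL.

```python
def mssql_odbc_connection_string(server, database, encrypt=True,
                                 trust_server_certificate=False):
    """Build an ODBC Driver 18 connection string with explicit TLS options."""
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",
        f"Server={server}",
        f"Database={database}",
        # Driver 18 encrypts by default; stating the value makes the intent explicit.
        f"Encrypt={'yes' if encrypt else 'no'}",
        # 'no' means the server certificate is validated against trusted CAs.
        f"TrustServerCertificate={'yes' if trust_server_certificate else 'no'}",
    ]
    return ";".join(parts) + ";"
```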

Django upgrade — We've updated the version of Django that our application uses from 4.2.23 to 5.2.1. Version 5.2 is the latest long-term support version of Django.

Submission Retrieval Store

Support for Google Cloud Storage (GCS) — You can now use Google Cloud Storage as a submission retrieval store. When connected to GCS, the system receives file URLs from the bucket you specify, which it then uses to download the files and process them as submissions. You can configure individual flows to ingest data from a particular bucket by editing the Submission Bootstrap settings in each flow.

More information on setting up GCS retrieval stores can be found in Flow Blocks.

API

Updated “Monitor a Flow Run” endpoint — We’ve updated our Monitor a Flow Run endpoint to allow you to retrieve a summary of a flow run’s data more efficiently.

To learn more about this endpoint, see our API documentation.