A submission is created when a set of files is submitted to Hyperscience together. The system then interprets each image as a page and matches each page to a live layout in the system. Pages are grouped into documents based on their matched layouts.
Accepted file types
We support the following file types:
File type | Notes |
|---|---|
DOC | Files are converted to PDF format and then paginated. Because of this conversion, output consistency cannot be guaranteed. If possible, consider using an alternate file type. |
DOCX | Files are converted to PDF format and then paginated. Because of this conversion, output consistency cannot be guaranteed. If possible, consider using an alternate file type. |
EML (files and their attachments) | |
HEIC | |
HEIF | |
HTM | Files are converted to PDF format and then paginated. Because of this conversion, output consistency cannot be guaranteed. If possible, consider using an alternate file type. |
HTML | Files are converted to PDF format and then paginated. Because of this conversion, output consistency cannot be guaranteed. If possible, consider using an alternate file type. |
JPEG | |
MSG (files and their attachments) | |
PDF | Editable PDFs are not supported. |
PNG | |
TIFF | |
TXT | |
XLS | Files are converted to PDF format and then paginated. Because of this conversion, output consistency cannot be guaranteed. If possible, consider using an alternate file type. |
XLSX | Files are converted to PDF format and then paginated. Because of this conversion, output consistency cannot be guaranteed. If possible, consider using an alternate file type. |
XPS | |
ZIP | Password-protected ZIP files are not supported. |
Not all of these file types can be used to train Classification models. For more information, see Model Management.
Encrypted files are not supported.
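As an illustration of how the table above might be applied before upload, here is a minimal sketch of a client-side pre-check. It is not part of Hyperscience: the function name `check_file_type` and the three return labels are hypothetical, and the check matches only the type names listed in the table. Aliases such as `.jpg` or `.tif` are not mentioned in the table, so the sketch treats them as unsupported; extend the set if your deployment accepts them. An extension check also cannot detect encrypted files, editable PDFs, or password-protected ZIP files.

```python
from pathlib import Path

# File types listed in the table above, lowercased (assumed to map 1:1 to extensions).
ACCEPTED_EXTENSIONS = {
    "doc", "docx", "eml", "heic", "heif", "htm", "html", "jpeg", "msg",
    "pdf", "png", "tiff", "txt", "xls", "xlsx", "xps", "zip",
}

# Types the table flags as converted to PDF before pagination,
# for which output consistency cannot be guaranteed.
CONVERTED_TO_PDF = {"doc", "docx", "htm", "html", "xls", "xlsx"}


def check_file_type(filename: str) -> str:
    """Return 'unsupported', 'converted', or 'accepted' (hypothetical labels)."""
    # Path.suffix yields the final extension including the dot, e.g. ".DOCX".
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in ACCEPTED_EXTENSIONS:
        return "unsupported"
    if ext in CONVERTED_TO_PDF:
        return "converted"
    return "accepted"
```

A caller could use the `converted` result to warn users that an alternate file type may give more consistent output.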
Supported languages
Hyperscience supports the automation of submissions written in any of our supported languages.
For more information, see Supported Languages.
Submission statuses
A submission goes through multiple steps in Hyperscience to complete extraction, and completing each step changes the status of the document.
Below is a simplified Document Processing Graph that shows the machine and manual tasks a submission can go through. It includes both Structured and Semi-structured document statuses.
For more information on the manual tasks shown in this diagram, see the article What is Supervision?

[Figure: Document Processing Graph]
Status aggregation
If pages and documents in a submission have different statuses, the earliest status in the Document Processing Graph will be shown. The same is true for pages in a document.
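The aggregation rule can be sketched as taking the minimum over positions in the graph. The example below is an illustration only: it assumes the Document Processing Graph can be flattened into a single ordered list, and the status names (`stage_a`, `stage_b`, `stage_c`, `done`) are placeholders rather than actual Hyperscience statuses. The function name `earliest_status` is also hypothetical.

```python
from typing import Iterable, Sequence


def earliest_status(statuses: Iterable[str], graph_order: Sequence[str]) -> str:
    """Return the status that appears earliest in graph_order.

    Raises ValueError if statuses is empty and KeyError if a status
    is not present in graph_order.
    """
    # Map each status name to its position in the (assumed linear) graph.
    rank = {name: position for position, name in enumerate(graph_order)}
    return min(statuses, key=lambda status: rank[status])


# Placeholder ordering, earliest first; not real status names.
GRAPH_ORDER = ["stage_a", "stage_b", "stage_c", "done"]

# A document shows the earliest status among its pages.
document_status = earliest_status(["stage_c", "stage_b", "stage_c"], GRAPH_ORDER)

# A submission shows the earliest status among its documents and pages.
submission_status = earliest_status([document_status, "done", "stage_a"], GRAPH_ORDER)
```

Here the document resolves to `stage_b` because one of its pages is still at that step, and the submission resolves to `stage_a` because one page is earlier still.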