This feature is available in v41.2 and later.
The Azure Blob Listener allows you to ingest files from a specified Azure Blob Storage URI.
Contents of submissions
The connector accepts both single files and prefixes with files as submissions. Whether a file is processed as its own submission or as part of a larger submission depends on where it is located relative to the source URI:
If it is directly under the source URI, the system processes it as an individual submission.
If it is under a prefix of the source URI, the system treats it as part of a larger submission, which consists of all files directly under that prefix.
The connector recognizes only one level of nesting when creating submissions with multiple documents. If a submission prefix contains further prefixes, files under those nested prefixes are ignored.
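The grouping rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the connector's implementation: the function name `group_into_submissions` and the idea of passing blob names as plain strings are assumptions, and the sketch does not handle metadata JSON files or extension filtering.

```python
from collections import defaultdict
from typing import Dict, List


def group_into_submissions(blob_names: List[str], source_prefix: str = "") -> Dict[str, List[str]]:
    """Group blob names into submissions (hypothetical helper).

    Keys are the file name for a single-file submission, or the prefix
    name with a trailing slash for a multi-file submission.
    """
    submissions: Dict[str, List[str]] = defaultdict(list)
    for name in blob_names:
        if not name.startswith(source_prefix):
            continue  # blob is outside the source URI
        parts = name[len(source_prefix):].split("/")
        if len(parts) == 1 and parts[0]:
            # Directly under the source URI: its own submission.
            submissions[parts[0]].append(name)
        elif len(parts) == 2 and parts[1]:
            # Exactly one prefix deep: part of that prefix's submission.
            submissions[parts[0] + "/"].append(name)
        # Anything nested more deeply is ignored.
    return dict(submissions)
```

With a source prefix of `inbox/`, a blob named `inbox/a.pdf` becomes its own submission, `inbox/claim1/p1.jpg` and `inbox/claim1/p2.jpg` are grouped under `claim1/`, and `inbox/claim1/extra/p3.jpg` is skipped.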
Metadata
For each submission, the Azure Blob Listener can accept a JSON file that contains metadata, case data, and an external_id. The name of this file depends on whether the submission consists of a single file or a set of files:
filename.ext.json for individual files, where filename is the name of the file and ext is the file's extension
prefixname.json for files under an Azure Blob prefix, where prefixname is the name of a prefix under the source URI.
The metadata files must be located directly under the source URI. If you place them under a prefix of the source URI, they are ignored and no submission is created.
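As a minimal sketch of the naming convention above, the helper below derives the expected metadata blob name for a submission. The function name `metadata_blob_name` and its parameters are hypothetical and exist only for illustration.

```python
def metadata_blob_name(source_prefix: str, submission_name: str) -> str:
    """Return the expected metadata blob name for a submission (hypothetical helper).

    submission_name is a file name such as "XYZ.jpg" for a single-file
    submission, or a prefix name such as "claim1/" for a multi-file one.
    """
    # A prefix name loses its trailing slash; a file name keeps its extension.
    base = submission_name.rstrip("/")
    # Metadata files sit directly under the source URI, never under a prefix.
    return f"{source_prefix}{base}.json"
```

For example, `metadata_blob_name("inbox/", "XYZ.jpg")` gives `inbox/XYZ.jpg.json`, and `metadata_blob_name("inbox/", "claim1/")` gives `inbox/claim1.json`.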
An example metadata file for an Azure Blob Listener submission appears below.
{
  "metadata": {
    "test": "Metadata for file in Azure Blob container"
  },
  "cases": [{
    "external_case_id": "900",
    "filenames": ["div_lic_1.jpg", "div_lic_2.jpg"]
  }],
  "external_id": "123"
}
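If the upstream system generates these files programmatically, a sketch like the following could produce the same structure. The function name `build_submission_metadata` is hypothetical, and the sketch assumes all three top-level keys are supplied; which of them the connector actually requires is not stated here.

```python
import json
from typing import Any, Dict, List


def build_submission_metadata(
    external_id: str,
    metadata: Dict[str, Any],
    cases: List[Dict[str, Any]],
) -> str:
    """Serialize submission-level parameters to a JSON string (hypothetical helper)."""
    payload = {
        "metadata": metadata,
        "cases": cases,
        "external_id": external_id,
    }
    return json.dumps(payload, indent=2)


# Reproduces the structure of the example file above.
document = build_submission_metadata(
    external_id="123",
    metadata={"test": "Metadata for file in Azure Blob container"},
    cases=[{"external_case_id": "900", "filenames": ["div_lic_1.jpg", "div_lic_2.jpg"]}],
)
```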
Archiving processed files
As files are ingested, they are copied to an archive URI and deleted from the source URI. The names of files do not change when they are archived and include the files’ prefixes, if any.
If the deletion of files from the source URI leaves empty prefixes, those prefixes will not be deleted and will remain in the container.
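One way to picture the archive naming is the sketch below. It assumes that the part of the blob name below the source URI is appended unchanged to the archive URI; the exact path layout used by the connector is not specified here, and the function name `archive_blob_name` is hypothetical.

```python
def archive_blob_name(blob_name: str, source_prefix: str, archive_prefix: str) -> str:
    """Map a source blob name to its archive name (hypothetical helper).

    Assumption: the path relative to the source URI, including any
    submission prefix, is kept as-is under the archive URI.
    """
    if not blob_name.startswith(source_prefix):
        raise ValueError(f"{blob_name!r} is not under {source_prefix!r}")
    relative = blob_name[len(source_prefix):]
    return archive_prefix + relative
```

Under that assumption, `inbox/claim1/p1.jpg` with source `inbox/` and archive `archive/` maps to `archive/claim1/p1.jpg`.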
Sample use cases
Another system places documents in an Azure Blob container on a regular basis. I want to ingest those files one by one.
I want to regularly scan and ingest certain types of files under a certain prefix in an Azure Blob container.
Block settings table
In addition to the settings outlined below, you can also configure the settings described in Universal Integration Block Settings.
Name | Required? | Description |
---|---|---|
Azure Blob Source URI | Yes | The location the connector will scan for blob files. Can contain a prefix and trailing slash. |
Azure Blob Archive URI | Yes | The location the connector will move files to after they have been ingested into Hyperscience. |
File Extensions | Yes | A list of the extensions that image files will need to have to be eligible for processing. If there are file extensions that you want to support but do not see in the drop-down list, select other, and enter the extensions in Other File Extensions. |
Other File Extensions | No | A comma-separated list of file extensions that do not appear in File Extensions. This field only appears if other is selected in File Extensions. |
Include Submission Level Parameters | No | Indicates whether the system will ingest JSON files along with document files and submission Azure Blob prefixes. These JSON files can contain information such as metadata, case data, and external_id values. These JSON file names should match the names of the related files or Azure Blob prefixes (e.g., XYZ.jpg.json for XYZ.jpg). |
Authentication Type | Yes | Can be one of the following: Service Principal, which uses machine-to-machine (M2M) Azure Entra ID authentication with an Application Service Principal; or Storage Account Key, which requires an account name and access key. |
Client ID | Yes, if Service Principal is selected for Authentication Type | Azure Entra ID Application Service Principal Client ID. This setting is only available if Service Principal is selected for Authentication Type. |
Client Secret | Yes, if Service Principal is selected for Authentication Type | Azure Entra ID Application Service Principal Client Secret. To edit the secret, click Edit value, modify the secret, and then click Done. This setting is only available if Service Principal is selected for Authentication Type. |
Tenant ID | Yes, if Service Principal is selected for Authentication Type | Azure Entra ID Application Service Principal Tenant ID. The unique identifier of the tenant in which the corresponding application is registered. This setting is only available if Service Principal is selected for Authentication Type. |
Storage Account Name | Yes, if Storage Account Key is selected for Authentication Type | Azure Storage Account name. This setting is only available if Storage Account Key is selected for Authentication Type. |
Storage Account Key | Yes, if Storage Account Key is selected for Authentication Type | Azure Storage Account key. To edit the key, click Edit value, modify the key, and then click Done. This setting is only available if Storage Account Key is selected for Authentication Type. |
Poll Interval (In Seconds) | No | The frequency at which the connector will monitor the source URI for submissions. Defaults to 10. |
Warm-Up Interval (In Seconds) | No | The length of time that a file must remain unmodified before it is eligible for processing. Defaults to 15. |
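To illustrate how File Extensions and Warm-Up Interval interact, the sketch below checks whether a single blob would be eligible during a poll. It is not the connector's code: the function name `is_eligible` is hypothetical, and case-insensitive extension matching is an assumption.

```python
import posixpath
from datetime import datetime, timedelta, timezone
from typing import Iterable


def is_eligible(
    blob_name: str,
    last_modified: datetime,
    now: datetime,
    allowed_extensions: Iterable[str],
    warm_up_seconds: int = 15,  # default taken from the table above
) -> bool:
    """Return True if the blob passes the extension and warm-up checks (hypothetical helper)."""
    # Assumption: extensions are compared case-insensitively, without the dot.
    extension = posixpath.splitext(blob_name)[1].lstrip(".").lower()
    allowed = {ext.lstrip(".").lower() for ext in allowed_extensions}
    if extension not in allowed:
        return False
    # The file must have been unmodified for at least the warm-up interval.
    return now - last_modified >= timedelta(seconds=warm_up_seconds)
```

With the defaults, a `.jpg` file last modified 20 seconds ago is eligible, while one modified 5 seconds ago is picked up on a later poll once the warm-up interval has elapsed.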
Setting up the Azure Blob Listener
To set up the Azure Blob Listener, enter the settings as described in the Block settings table above.
Before deploying a flow with the Azure Blob Listener enabled, ensure that the credentials you’ve specified in the block settings have the following permissions assigned:
Read Blob and Put Blob for both the source and the archive URIs
List Container and Delete Blob for the source URI
To test if the permissions have been properly set, click Test Connection at the bottom of the connector settings in Flow Studio. If the required permissions are present, no errors will be detected.
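As a pre-deployment checklist, the required permissions can also be written down as data and compared against what you believe has been granted. This sketch does not call Azure and is not a substitute for Test Connection; the names `REQUIRED_PERMISSIONS` and `missing_permissions` are hypothetical, and the permission labels are copied from the list above.

```python
from typing import Dict, Set

# Required permissions per location, as listed above.
REQUIRED_PERMISSIONS: Dict[str, Set[str]] = {
    "source": {"Read Blob", "Put Blob", "List Container", "Delete Blob"},
    "archive": {"Read Blob", "Put Blob"},
}


def missing_permissions(granted: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return the permissions still missing for each location (hypothetical helper)."""
    missing: Dict[str, Set[str]] = {}
    for location, required in REQUIRED_PERMISSIONS.items():
        gap = required - granted.get(location, set())
        if gap:
            missing[location] = gap
    return missing
```

An empty result means every listed permission is accounted for in your own records; the actual assignment still needs to be confirmed with Test Connection.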