This article summarizes a workflow for submitting high-resolution files to Merritt in tandem with directly depositing objects from Nuxeo into Merritt.
Use Case
As an example scenario, a campus library has digitized a film collection, resulting in high-resolution video files in DPX format and lower-resolution access copies (e.g., a WAV audio file, a mezzanine file in MOV format, and an access file in MP4 format). The high-resolution files are not intended for public access and will be managed solely as preservation copies.
Recommended Approach
Given the generally large size of the high-resolution files (and the intent to manage them solely as preservation copies), we recommend ingesting those files directly into Merritt and not managing them in Nuxeo.
The lower-resolution access copies can, however, be managed in Nuxeo and deposited from Nuxeo to Merritt to ensure that they are also preserved for the long term.
Steps
Build the objects in Nuxeo, using the lower-resolution access copies only.
Once the objects have been created, submit a request to establish a direct deposit of the collection from Nuxeo to Merritt.
Stage the high-resolution files on a web server so that they have web-addressable URLs. Note that this web server must be accessible to Merritt’s Ingest service (the Merritt team and local IT can help with this). If you are staging the files in Amazon S3, please contact us for details on preparing the URLs.
Use a manifest to submit the high-resolution files to a specified collection in Merritt.
Manifests can be created using the merrittManifest.xls file provided in the Merritt documentation. This Excel workbook includes macros that export worksheet content to a .checkm file, the file type that Merritt recognizes as a manifest.
Each row in the worksheet should represent a specific file to be ingested. The columns hold information about the file, such as the URL where it can be found, a checksum method, a checksum digest, a primary identifier (ARK), a local ID, and additional object-level metadata.
Because every row in this workflow is associated with a single file, start from the “batchOfFiles” sheet in merrittManifest.xls (you may want to duplicate the sheet and delete the sample content). Note that the initial line with column headers must be in bold text.
NB: If you would like to submit zip files or other containers in a manifest, use the batchOfContainers sheet as an example. There is also an associated BatchOfContainers macro for this alternative case (see the macro export step below).
In the primaryIdentifier column of the worksheet, enter the Merritt-assigned ARK of the object that was previously deposited directly from Nuxeo into Merritt. This ensures that the high-resolution file is added to the desired, pre-existing object as a new version.
Optionally, the localIdentifier column can be used to augment the object-level Local ID metadata in Merritt. If text is entered in this column, it will be appended to any Local ID information already recorded by the Nuxeo-to-Merritt direct deposit process.
Caution: When using a Local ID string, make sure it is unique. A non-unique string (for example, one that is being used by an existing object) will cause Merritt to update the object associated with this Local ID.
Although it is optional, we recommend entering a hash algorithm and hash value in the manifest for each line item. Merritt uses the hash value, or digest, to confirm that an exact copy of the file was ingested into the system.
The remaining fields are all optional as well: fileSize, creator, title, date. Given that some or all of this object-level metadata may have already been populated by the direct deposit process, it is best to check whether the metadata already exists in Merritt for any object that will be updated with a high-resolution file using the manifest.
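The hash value recommended above can be computed before the worksheet is filled in. The following is a minimal Python sketch, assuming a copy of each high-resolution file is also readable from local disk and that SHA-256 is an acceptable checksum method for your collection (confirm the supported methods, and how they should be spelled in the worksheet, against the Merritt documentation). The function names, the dictionary keys "url", "hash_algorithm", and "hash_value", and the sample values in the usage note are illustrative only and are not taken from merrittManifest.xls.

```python
import hashlib
from pathlib import Path
from urllib.parse import quote


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for very large files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def worksheet_row(path, base_url, ark, algorithm="sha256"):
    """Gather the values for one worksheet row (one file per row)."""
    path = Path(path)
    return {
        # Illustrative keys; use the column headers in the actual worksheet.
        "url": base_url.rstrip("/") + "/" + quote(path.name),
        "hash_algorithm": algorithm,
        "hash_value": file_digest(path, algorithm),
        "fileSize": path.stat().st_size,
        # ARK of the existing object created by the Nuxeo direct deposit.
        "primaryIdentifier": ark,
    }
```

For example, `worksheet_row("reel_01.dpx", "https://example.org/staging", "ark:/99999/fk4example")` returns the values for a single row, where the file name, staging URL, and ARK are placeholders. The values can then be copied into the corresponding worksheet columns.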
Once entries for all of the desired files are made in the Excel worksheet, use the “BatchOfSingleFiles” macro to export a .checkm file for submission to Merritt.
In Excel, choose View > Macros > View Macros > ThisWorkbook.BatchOfSingleFiles > Run
Name your batch manifest file. A file extension of .checkm will automatically be added to the file.
To submit the manifest to Merritt:
Log in to Merritt. If you work with multiple collections, choose the collection this batch will be submitted to.
Click Add Object.
Click the Browse button and select the batch manifest you just created.
Click Submit. You will receive an email acknowledging that the batch submission has been queued. A subsequent email will be sent when all content in the batch has been ingested.
Step-by-step instructions for creating and submitting a manifest can also be found under the “Workflow: A Batch of Single Files” section on the Merritt Manifests documentation page.
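Before a manifest is exported and submitted, the worksheet values can also be sanity-checked against the requirements described in the steps above: web-addressable URLs, an ARK in the primaryIdentifier column, and unique Local ID strings. The following is a minimal Python sketch under stated assumptions: the rows have been collected as a list of dictionaries, the "url" key is an illustrative label (not the worksheet's column header), and `local_ids_in_use` is a hypothetical list, compiled by you, of Local IDs already assigned to other objects in the collection. It does not contact Merritt or verify that a URL is actually reachable by the Ingest service.

```python
from urllib.parse import urlparse


def check_rows(rows, local_ids_in_use=()):
    """Return a list of problems found in worksheet rows before export."""
    problems = []
    in_use = set(local_ids_in_use)

    # Record which ARKs each Local ID is paired with in this batch.
    arks_by_local_id = {}
    for row in rows:
        local_id = row.get("localIdentifier")
        if local_id:
            arks = arks_by_local_id.setdefault(local_id, set())
            arks.add(row.get("primaryIdentifier"))

    for number, row in enumerate(rows, start=1):
        parsed = urlparse(row.get("url", ""))
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"row {number}: URL is not an http(s) address")
        if not str(row.get("primaryIdentifier", "")).startswith("ark:/"):
            problems.append(
                f"row {number}: primaryIdentifier does not look like an ARK"
            )
        local_id = row.get("localIdentifier")
        # Flag a Local ID used by another object or paired with several ARKs.
        if local_id and (
            local_id in in_use or len(arks_by_local_id[local_id]) > 1
        ):
            problems.append(f"row {number}: local ID {local_id!r} is not unique")
    return problems
```

An empty list means none of these particular checks failed. It is not a guarantee that Merritt will accept the manifest.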