General requirements

Calisphere harvests metadata for digital objects, and displays that metadata (as well as some content files) on the site. A “digital object” consists of one or more content files and the associated metadata record(s) that together represent an individual resource. We can harvest metadata for digital objects that are:

  • Hosted in a digital collections system that displays metadata and allows for the viewing, downloading, and/or streaming (depending on the format) of digital files.
  • Openly and publicly available.

We do not support the harvest of content that is generally out of scope for DPLA, including:

  • Digital objects that are not publicly accessible (i.e., have access restrictions)
  • Metadata-only records (i.e., a record that does not have an associated content file)

Harvest methods

We can harvest metadata from a range of sources, using different methods. We use the term “harvest” loosely to cover a range of approaches, for example:

  • Using the OAI-PMH protocol to harvest metadata from CONTENTdm, Omeka, DSpace, BePress, etc.
  • Using the CMIS protocol to harvest metadata from systems such as Preservica
  • Using APIs to harvest metadata from systems such as Solr, Flickr, and Internet Archive
  • Obtaining a metadata export of MARC21 or MARCXML records
  • Obtaining customized XML exports from systems such as PastPerfect
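
As an illustration of the first approach, the following is a minimal sketch of an OAI-PMH ListRecords request and response, not a description of Calisphere's actual harvester. The endpoint URL, set name, and sample record are hypothetical; the verb, the `metadataPrefix`, `set`, and `resumptionToken` arguments, and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)


# Hypothetical response, trimmed to a single record for illustration.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>abc123</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

records, token = parse_list_records(SAMPLE_RESPONSE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)
```

A harvester would typically repeat the request with each returned resumption token until the response no longer includes one.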

We've already worked with a number of institutions using different systems to expose their metadata for harvesting. In most cases, such as with CONTENTdm, sharing your collections with Calisphere takes only a few simple steps and a few minutes. We can walk you through the process.

Our objective is to find the method that is easiest for you. We'll work with you to discuss your system and figure out what that is.

Metadata formats

There are no restrictions on metadata formats that can be harvested. However, the metadata must be mappable to a set of standardized data elements, which have been based on the DPLA Metadata Application Profile. A small number of data elements must also be present in each record that is harvested, to support use and discovery of the objects. These mappings and requirements are defined in our Metadata Scheme and Crosswalk.
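
To illustrate what "mappable" means in practice, here is a rough sketch of a crosswalk and a required-element check. The source field names, the target element names, and the set of required elements below are assumptions made for this example only; the actual mappings and requirements are the ones defined in the Metadata Scheme and Crosswalk.

```python
# Hypothetical crosswalk: source field name -> standardized element name.
CROSSWALK = {
    "dc:title": "title",
    "dc:creator": "creator",
    "dc:date": "date",
    "dc:rights": "rights",
    "dc:type": "type",
}

# Assumed required elements, for illustration only.
REQUIRED = {"title", "rights", "type"}


def map_record(source):
    """Map a source record onto the standardized elements.

    Unmapped source fields are ignored, single string values are treated
    as one-item lists, and empty values are dropped.
    """
    mapped = {}
    for field, values in source.items():
        target = CROSSWALK.get(field)
        if target is None:
            continue
        if isinstance(values, str):
            values = [values]
        cleaned = [v.strip() for v in values if v and v.strip()]
        if cleaned:
            mapped.setdefault(target, []).extend(cleaned)
    return mapped


def missing_required(mapped):
    """Return the required elements that the mapped record lacks, sorted."""
    return sorted(REQUIRED - set(mapped))


source_record = {
    "dc:title": "Golden Gate Bridge",
    "dc:creator": ["  Smith, J. ", ""],
    "local:note": "not part of the crosswalk",
}
mapped = map_record(source_record)
problems = missing_required(mapped)
```

In this example the record maps cleanly but would still be flagged, because it carries no value for two of the assumed required elements.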

Maintaining stable or persistent URLs

It is important that your digital objects have stable (and ideally, persistent) URLs. When we harvest your digital objects, we create Calisphere-based URLs that are derived from your source digital objects' URLs (e.g., in CONTENTdm, Omeka, DSpace, or BePress). If you change those source URLs (for example, by migrating the collections to a different platform or changing the domain name), the links from Calisphere to your digital objects will break. If users have cited your objects (e.g., in Wikipedia or in articles), those links will also break.

Hence, we recommend using persistent URLs, such as those based on ARKs or DOIs, whenever possible, to minimize broken links. If you do migrate collections from one platform to another or change domain names, you should also consider redirecting the old forms of URLs to the new URLs, so the older links do not break.
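
The redirect logic can be sketched as a simple lookup from old URLs to new ones. The host names and paths below are hypothetical, and in practice redirects are usually configured in the web server or the digital collections platform rather than in application code; the sketch only shows the mapping that such a configuration needs to express.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical old and new hosts after a platform migration.
OLD_HOST = "digital.example.edu"
NEW_HOST = "collections.example.edu"

# Hypothetical mapping from old object paths to new object paths.
PATH_MAP = {
    "/cdm/ref/collection/photos/id/42": "/items/photos-42",
}


def redirect_for(old_url):
    """Return (status, location) for a known old URL, or None otherwise."""
    parts = urlsplit(old_url)
    if parts.netloc != OLD_HOST:
        return None
    new_path = PATH_MAP.get(parts.path)
    if new_path is None:
        return None
    location = urlunsplit(("https", NEW_HOST, new_path, "", ""))
    # 301 (Moved Permanently) signals that the old URL has a lasting replacement.
    return 301, location


result = redirect_for("http://digital.example.edu/cdm/ref/collection/photos/id/42")
```

A permanent redirect keeps existing citations and the links from Calisphere working after the move.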