This page provides an overview of a new Calisphere harvesting system under development and our plans to deploy the updated system; we aim to fully develop the updated system by the end of the 2023 calendar year, with an updated timeline to transition all harvesting operations to the new system in early 2024.

As a next step in development, we will be testing the new infrastructure to verify that the full pipeline, from harvesting to building the Calisphere index, is performing as expected.

If your organization has contributed digital collections to Calisphere, we will be in touch regarding test re-harvests of your collections. We will share the results with you so that you can preview them on our new Calisphere-stage site. The collections will not be published in the public/live Calisphere site until we transition completely over to our new system, in early 2024. No QA work is required on your part.

Testing the full harvesting process is a critical part of developing this new infrastructure. This will also ensure that Calisphere has the most current version of your published collections. Please review this page for additional details.


About the new harvesting system

Re-harvesting digital collections

Harvesting digital collections via the legacy harvester


About the new harvesting system

In 2021, we started an active development project, called Rikolti, to replace our current (“legacy”) Calisphere harvesting system. By the end of 2023, we plan to fully develop the new harvesting system. In early 2024, we plan to transition all harvesting operations to the new system and sunset the legacy harvester. The new system is designed to be modular and fast, using current, well-supported technologies. More detailed information about our approach is available in the Rikolti project GitHub.

What is harvesting?

Calisphere uses a harvesting model to bring your collections into the aggregation. This strategy allows us to programmatically “fetch” collections from your local digital asset management system – specifically, descriptive metadata and thumbnails for items in the collections. Once fetched, we “map” the metadata into a central index underlying Calisphere, to support searching and browsing of the items.
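For readers who want a concrete picture of the “map” step, here is a minimal sketch in Python. It is an illustration only, not Rikolti's actual code: the field names in FIELD_MAP, the map_record function, and the sample record are all hypothetical, and a real mapper would be written per platform.

```python
# Hypothetical mapping from a source platform's field names to
# index field names. These names are assumptions for illustration.
FIELD_MAP = {
    "dc:title": "title",
    "dc:creator": "creator",
    "dc:date": "date",
    "thumbnail_url": "thumbnail",
}


def map_record(source_record):
    """Map one fetched source record into a flat index record.

    Fields that are not listed in FIELD_MAP are dropped, and empty
    values are skipped so they do not appear in the index record.
    """
    mapped = {}
    for source_field, index_field in FIELD_MAP.items():
        value = source_record.get(source_field)
        if value not in (None, "", []):
            mapped[index_field] = value
    return mapped


# A stand-in for records that were already "fetched" from a platform.
fetched = [
    {
        "dc:title": "Harbor view",
        "dc:creator": "Unknown",
        "dc:date": "",
        "thumbnail_url": "https://example.org/thumb/1.jpg",
        "local_note": "internal use only",
    },
]

index_records = [map_record(record) for record in fetched]
```

In this sketch the empty date and the local-only note are left out of the mapped record, while the title, creator, and thumbnail carry over under their index field names.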

Why are you developing a new harvester?

Our existing Calisphere harvester is outdated, and uses deprecated, unsupported technologies adapted from the Digital Public Library of America’s (DPLA) open-source code base from 2013. We’re committed to developing infrastructure that uses current, well-supported systems that can continuously support the statewide aggregation of digital cultural heritage resources. We are planning to sunset our existing outdated harvester and transition fully to Rikolti in early 2024. 

What changes in harvesting will contributors notice? Will there be changes to Calisphere?

We are using an updated technology framework to create a more efficient, flexible harvesting operation. We do not anticipate any changes to the process and steps involved with sharing collections with Calisphere. We also do not anticipate any impacts or changes to how collections and items are searched, browsed, and displayed in Calisphere.

When will development of the new harvester be complete?

We are aiming to develop a fully functional first iteration of the new harvester (a “minimum viable product”) by the end of 2023, and sunset our existing system in early 2024. After this work is complete, we will continue to review our priorities to strategize development of feature enhancements.


Re-harvesting digital collections

My organization’s collections are already in Calisphere. Why do they need to be re-harvested?

Calisphere harvests collections by connecting to the platforms that contributors are using to manage and publish their digital collections. As a way of testing the new infrastructure, we will be re-harvesting your collections to the Calisphere-stage site to ensure a continued connection to your platform. By doing so, we will be able to verify that the full pipeline, from harvesting to building the Calisphere index, is performing as expected. Additionally, re-harvesting is our opportunity to refresh your collections in Calisphere, so that we are in sync with the records currently published in your public platform.

Do we have to do any QA checking of our re-harvested collections?

As part of testing the new harvesting infrastructure, we will be conducting “data validation tests” to ensure we are correctly “fetching” your digital collection data, and “mapping” your data into our underlying Calisphere index. CDL staff will be reviewing the results of these tests during the re-harvesting.

No QA work is required on your part, though we will be in touch if we have any questions related to re-harvesting. Please feel free to preview your re-harvested digital collections in our Calisphere-stage site.

What do you mean by “data validation testing”?

As part of our harvesting development, we are also conducting data validation testing. This testing programmatically compares the existing collection data in Calisphere with the newly re-harvested collection data, and outputs any differences found between the two data sources. As a baseline goal, we are aiming for 100% data fidelity between the two sets for prioritized, core fields. However, we do anticipate that collections may have been updated in your public platforms (new records added, metadata updated, records removed), which our validation tests will surface as data differences. We will review the results first to determine whether our fetching and mapping processes are correctly configured.
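To illustrate the kind of comparison described above, here is a minimal sketch in Python. It is not our actual validation code: the compare_collections function, the CORE_FIELDS list, and the sample records are hypothetical, and both data sets are assumed to be dictionaries keyed by record ID.

```python
# Hypothetical list of prioritized core fields to compare.
CORE_FIELDS = ("title", "date", "creator")


def compare_collections(existing, reharvested, core_fields=CORE_FIELDS):
    """Report differences between two sets of records keyed by ID.

    Returns the IDs only present in the re-harvested data ("added"),
    the IDs only present in the existing data ("removed"), and, for
    IDs present in both, the core fields whose values differ.
    """
    existing_ids = set(existing)
    new_ids = set(reharvested)
    report = {
        "added": sorted(new_ids - existing_ids),
        "removed": sorted(existing_ids - new_ids),
        "changed": {},
    }
    for record_id in sorted(existing_ids & new_ids):
        diffs = {}
        for field in core_fields:
            old = existing[record_id].get(field)
            new = reharvested[record_id].get(field)
            if old != new:
                diffs[field] = (old, new)
        if diffs:
            report["changed"][record_id] = diffs
    return report


# Sample data: one record updated, one removed, one added.
existing = {
    "a1": {"title": "Harbor view", "date": "1920"},
    "a2": {"title": "Orchard", "date": "1935"},
}
reharvested = {
    "a1": {"title": "Harbor view", "date": "circa 1920"},
    "a3": {"title": "Mission", "date": "1901"},
}

report = compare_collections(existing, reharvested)
```

A report like this cannot tell a mapping error apart from a legitimate update made in the contributor's platform, which is why the differences still need to be reviewed by staff.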

How do I preview the results from the re-harvest?

We have a new staging site to preview collections that have been harvested through the new Calisphere harvester. Preview collections at the Calisphere-stage site.

When will these newly re-harvested digital collections be published?

Throughout the rest of the 2023 calendar year, our goal is to completely develop the new harvesting infrastructure. Beginning in early 2024, we will replace our current harvesting infrastructure with the new system; this includes transitioning to the new Calisphere index. Once we switch Calisphere’s underlying index over to the new Calisphere index, the newly re-harvested digital collections will be published on the public/live Calisphere site. We anticipate the newly re-harvested digital collections will be published in early 2024.


Harvesting digital collections via the legacy harvester

My organization has collections that we’d like harvested to Calisphere before the end of 2023. How do I request that?

Our legacy harvester will be sunset only after our new harvester is fully deployed and operational; we will continue to run harvesting requests using our existing operational workflows until we transition to the new harvester, in early 2024. Please submit requests to harvest new collections and/or re-harvest existing collections using our Harvest and Re-harvest Request Form.

Note that our legacy harvester will need to have an established connection with your platform (i.e., Calisphere has previously harvested from your system). If we have not harvested from your system before, we will first need to establish a connection with your platform; we will be able to begin onboarding new contributors and/or platforms once we have fully transitioned to the new harvester, in early 2024. Please feel free to contact us for more information.

My organization has contributed collections to Calisphere, but we have since migrated to a new platform. What will happen to our collections?

Please contact us if you have migrated to another platform, and we will arrange for an initial call to discuss the details. We will need to write a new mapper to work with your new platform, and will be able to continue this service once we have fully transitioned to the new harvester, in early 2024.

My organization worked with California Revealed to contribute collections to Calisphere. What will happen to our collections?

We are coordinating with the California Revealed team; once we are fully on the new harvester, we are planning to re-harvest your collections from California Revealed’s platform.