For any collection published in Calisphere, you can now generate reports that show how completely each metadata field is filled in and how unique the values in that field are. These reports may be helpful for quality control and data cleanup.

You can view the reports by adding /metadata to the end of the URL of any collection on Calisphere. For example:

1. Find the collection's URL:

https://calisphere.org/collections/#####

2. Add /metadata to the end of the URL:

https://calisphere.org/collections/#####/metadata

3. This will bring you to a "Metadata Summary" page that provides an overview of that collection's metadata.
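If you are checking many collections, the URL change in step 2 can be scripted. The snippet below is a minimal sketch: the function name metadata_report_url and the collection ID 12345 are made up for illustration and are not part of Calisphere.

```python
def metadata_report_url(collection_url: str) -> str:
    """Return the "Metadata Summary" URL for a Calisphere collection URL."""
    # Drop any trailing slash first so the result never contains "//metadata".
    return collection_url.rstrip("/") + "/metadata"


# "12345" is a placeholder collection ID used only for this example.
print(metadata_report_url("https://calisphere.org/collections/12345/"))
```

The function only appends /metadata to the address you give it; it does not check that the collection exists.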

The “Metadata Summary” page

The “Metadata Summary” page available for each specific collection is presented as a table, as shown here:

Metadata Summary page

The summary table shows how many records are in the collection and provides two metrics for each of the available metadata fields:

% records with field indicates the metadata completeness, by analyzing how many records have a value in that field. For example, this table indicates that 100% (all) of the records in this collection have a value in the “title” field.

% unique values in field indicates the variety or uniqueness of the values in the field across the records in the specific collection. For example, this table shows that all of the records in this collection have the “title” field completed, and that 98.62% of those values are unique. The remaining 1.38% of the titles appear in more than one record.
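To make the two metrics concrete, here is a small worked sketch. It rests on assumptions that this page does not confirm: that "% unique values in field" is the number of distinct values divided by the number of records that have a value, and that each record holds a single value per field. The sample records and the function name summarize_field are invented for illustration and are not Calisphere's own code.

```python
def summarize_field(records, field):
    """Return (% records with field, % unique values in field) for one field.

    Assumption: "% unique" = distinct values / records that have a value.
    The report's actual calculation may differ.
    """
    if not records:
        return 0.0, 0.0
    # Keep only records where the field is present and not empty.
    values = [r[field] for r in records if r.get(field)]
    pct_with_field = 100 * len(values) / len(records)
    pct_unique = 100 * len(set(values)) / len(values) if values else 0.0
    return round(pct_with_field, 2), round(pct_unique, 2)


# Invented sample data: four records, one of them without a subject.
records = [
    {"title": "Horse Fact Sheet", "subject": "Horses"},
    {"title": "Horse Fact Sheet", "subject": "Horses"},
    {"title": "Cattle Fact Sheet"},
    {"title": "Sheep Fact Sheet", "subject": "Sheep"},
]

print(summarize_field(records, "title"))    # (100.0, 75.0)
print(summarize_field(records, "subject"))  # (75.0, 66.67)
```

In this sample every record has a title, but two records share one, so the title field is 100% complete and 75% unique under the stated assumption.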

Further details available for specific fields

Some of the fields within the “Metadata Summary” table are presented as active links. Clicking on these links allows further investigation of the values in those fields.

The screenshot below shows a list of all “titles” that are used to describe the materials in this collection. The default sort order is by total number of times the value appears.

Title values results page

Each of these titles can be selected for further information. For instance, you may select a link to view all three items with the title “Horse Fact Sheet.”

Specific titles results page

This set of results lists all items within the collection that share this title. Note that although the two items shown in the example above have the same title, they are unique items that were created on different dates.

This type of detailed results page can be generated for any metadata field except description. In the example below, links to each results page appear as active links (in orange) within the “Metadata Summary” page. Results pages are not available, however, for metadata fields that the collection does not use. For instance, this summary page does not include a link to a results page for the “format” field: the summary table shows that 0.0% of records have that field completed.

Metadata Summary page; fields without results pages

Display a list of items that have empty (or missing) metadata values

As you consult these metadata analysis reports for quality control, you may want to pull up a list of the items that do not have a given field completed. For example, the screenshot below shows that 99.79% of the records have the “subject” field completed. This means that 0.21% of items in the collection have an empty (or missing) value in the “subject” field.

While this report does not currently display a list of items with empty values, that set can be gathered on the standard collection search page. The example below will use “subject” to demonstrate the steps:

1. Go to the collection landing page:

https://calisphere.org/collections/#####

2. Find the “Search within collection” box, and type in:

-subject_ss:["" TO *]

3. Click the “Refine” button

This will produce a set of results for this collection that have empty values within the subject field. The screenshot below shows that there are three items without a “subject” value.

Any metadata field can be evaluated this way by simply swapping out “subject” for the field of interest. The table below maps out each field’s search query.

Field               Query to display items with empty/missing values
title               -title_ss:["" TO *]
alternative_title   -alternative_title_ss:["" TO *]
contributor         -contributor_ss:["" TO *]
coverage            -coverage_ss:["" TO *]
creator             -creator_ss:["" TO *]
date                -date_ss:["" TO *]
extent              -extent_ss:["" TO *]
format              -format_ss:["" TO *]
genre               -genre_ss:["" TO *]
identifier          -identifier_ss:["" TO *]
language            -language_ss:["" TO *]
location            -location_ss:["" TO *]
publisher           -publisher_ss:["" TO *]
relation            -relation_ss:["" TO *]
rights              -rights_ss:["" TO *]
rights_holder       -rights_holder_ss:["" TO *]
rights_note         -rights_note_ss:["" TO *]
source              -source_ss:["" TO *]
spatial             -spatial_ss:["" TO *]
subject             -subject_ss:["" TO *]
temporal            -temporal_ss:["" TO *]
type                -type_ss:["" TO *]
description         -description_ss:["" TO *]
provenance          -provenance_ss:["" TO *]
rights_uri          -rights_uri_ss:["" TO *]
transcription       -transcription_ss:["" TO *]
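Every query in the table follows the same pattern, so it can also be produced programmatically. The snippet below is a minimal sketch of that pattern; the function name empty_value_query is made up for illustration, and the output is meant to be pasted into the “Search within collection” box as described above.

```python
def empty_value_query(field: str) -> str:
    """Build the search query that lists items with an empty or missing field."""
    # Pattern taken from the table: -<field>_ss:["" TO *]
    return f'-{field}_ss:["" TO *]'


# Print the queries for a few of the fields listed in the table.
for field in ["title", "subject", "rights_holder"]:
    print(field, empty_value_query(field))
```

The function does not validate field names, so use one of the field names listed in the table.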


We hope these metadata analysis reports provide additional insight into your digital collections. Please contact us if you have any questions or feedback about the reports.