Linked Data Report Addresses Needs of Archives and Special Collections


Items held in archives and special collections have unique descriptive requirements that must be met before local or remote users can readily discover them. To identify the barriers that currently hamper such discovery, OCLC convened a group of 15 professionals from its Research Library Partnership for a series of discussions. The resulting report, Archives and Special Collections Linked Data: Navigating between Notes and Nodes, outlines the challenges and opportunities facing those who care for special collections as they try to take full advantage of linked data.

The excerpts quoted below give a flavor of the current challenges and, in some instances, appropriate next steps, but they do not capture the full discussion. Professionals working in these areas are encouraged to read the full report.

  • Descriptive Data Models Do Not Currently Serve Special Collections

The dominant data encoding standards are not well adapted for special collections in a linked data environment...The result has been the “note-ification” of descriptive information, with a lot of information found in note fields.

  • Discursive Description Presents a Challenge to Entification

Linked data forces you to structure your data. But there are clear limits to being able to clearly express things that are in paragraph form. Archival description makes especially extensive use of note fields...Migrating this data to entities is time-consuming and requires as-yet underdeveloped data models...

  • Potential for Better Discovery

Many discovery-related tasks are challenging or impossible in the current library discovery environment.

  • Prescriptive vs. Permissive Data Modeling 

“Current data models, where they exist, are shaped by disciplinary practice...Some elements of description that are shared or appear to be shared between disciplinary practices may mesh well together…, whereas other elements may not mesh together so easily. Because the data may not be harmonized, there is a need for statements of equivalence. This is a problem of siloed data and the data models for different disciplinary practices…”

  • Ethical Issues and Community Engagement 

“Publishing data can put people and organizations in harm’s way, such as information about human rights organizations, opposition political groups, etc. Making this data available in linked open data structures can be particularly risky because it is easy for this information to be exploited and propagated into other systems.”

  • Multilinguality

“There are serious challenges in sourcing authority records in multiple languages in the current environment; in some cases, sources exist but are unavailable. In other cases, major contributors of standards do not place emphasis on reaching multilingual audiences with the default and only language being English.”

  • Sustainability

“To date, the majority of linked data efforts have been grant-funded or special one-off projects. This has impacted the perception of value and utility, especially for library administrators and has made sustainability of these projects problematic.” 

  • Archival Issues

“There will need to be a shift in mindset to think about how to structure archival descriptions natively as linked data and develop the tools and standards necessary to create data for archives in scalable production environments. One first step might be the ability to include stable, authoritative identifiers in tools and systems already commonly used.”

  • The Need to Express Relationships and Change over Time

“Expressions of relationship is often central to understanding the collection as a research object -- who created it and under what circumstances, who owned or used it and in what ways, how it evolved over time and under whose influence...Current systems don’t support the clear expression of these relationships and only comfortably express a limited set of relationships between agents and resources.”

  • The Long Tail of Authorities/Identifiers in Special Collections

“Because of the rare or unique -- and often local or regional nature of special collections -- they need to represent many people, families, and corporations that are not in authority files like the LC/NACO authority file. The barrier to being involved in NACO and similar is too high (cost, training, etc.) for many archives and special collections.”


The OCLC Research position paper is only 14 pages long; for those interested in applying linked data to their special collections and archives, it should be a quick and thought-provoking read.

Citation: OCLC Research Archives and Special Collections Linked Data Review Group. 2020. Archives and Special Collections Linked Data: Navigating between Notes and Nodes. Dublin, OH: OCLC Research.