Sloan Grant Supports Development of Resource Synchronization Standard

The Alfred P. Sloan Foundation has awarded a grant to NISO and the Open Archives Initiative to develop a new open standard on the real-time synchronization of web resources.

Problem

Increasingly, large-scale digital collections are available from multiple hosting locations and are cached at multiple servers. Examples of such collections include the Internet Archive’s Wayback Machine, Twitter’s collection of tweets, and linked data collections such as Freebase or DBpedia. In addition, high-profile portals rely on resources originating in many distributed repositories. Examples include the Europeana portal, CiteSeer, HathiTrust, and OAIster. This proliferation of replicated copies of works or data on the Internet has made it increasingly difficult to keep the repositories’ holdings, and the services that leverage them, up to date and accurate. As we move from a web of documents to a web of data, synchronization becomes even more important: decisions based on unsynchronized or incoherent scientific or economic data can have seriously harmful consequences.

The OAI Protocol for Metadata Harvesting (PMH) 2.0 specification can be used to effectively synchronize the metadata about the resources, but synchronizing the resources themselves was never specified. Although some resource synchronization methods exist, they are generally ad hoc, arranged by the individuals involved, and cannot be universally deployed.
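As a rough illustration of the metadata side of this picture, the sketch below builds an OAI-PMH ListRecords request that asks a repository only for records changed since a given date. The base URL and the function name are hypothetical, chosen for the example; `verb`, `metadataPrefix`, and `from` are request arguments defined by OAI-PMH 2.0. Note that the response describes metadata records only, not the resources themselves.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, from_date, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL for records changed since from_date.

    base_url is a hypothetical repository endpoint; from_date is a UTC
    date string such as "2012-01-01".
    """
    params = {
        "verb": "ListRecords",              # OAI-PMH request type
        "metadataPrefix": metadata_prefix,  # oai_dc is the Dublin Core format
        "from": from_date,                  # selective harvesting lower bound
    }
    return base_url + "?" + urlencode(params)


# Example with an assumed (non-existent) endpoint.
print(build_list_records_url("https://repository.example.org/oai", "2012-01-01"))
```

A harvester would issue this request periodically, moving the `from` date forward after each successful run.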

Proposed Solution

A standard will be developed for an interoperable, efficient, and lightweight mechanism to support synchronizing web resources at scale. The standard will save repository managers time, effort, and resources by automating the replication and updating process. It will increase the general availability of content from these repositories and will alleviate the variety of problems created by outdated, inaccurate, or superseded content that exists on the Internet.

Use Cases

  • Synchronization of linked data content
  • Recurrent collection of Memento metadata from IIPC web archives into a central aggregator
  • arXiv mirroring

Core Team

Cornell University & OAI:
Bernhard Haslhofer, Carl Lagoze, Simeon Warner

Old Dominion University & OAI:
Michael L. Nelson

Los Alamos National Laboratory & OAI:
Martin Klein, Robert Sanderson, Herbert Van de Sompel

NISO:
Todd Carpenter, Nettie Lagace, Peter Murray

Footnotes

Resource Sync Workroom

www.niso.org/workrooms/resourcesync/