Building A Sustainable Open Research Infrastructure

August 2020

Presenter: Alice Meadows, Director, Community Engagement, NISO

Speaker Notes

Introduction

[Slide One]

[Slide Two]

Before I talk about the research infrastructure, let’s define what we mean by infrastructure more broadly.

[Slide Three]

For the purposes of today’s talk I’m going to use roads as an analogy, as they’re an essential component of the global infrastructure.

[Slide Four]

Research infrastructure must be equitable.

[Slide Five]

Whether you’re a driver, a cyclist, or a pedestrian, the road network is available to you - it provides equitable access for everyone.

[Slide Six]

Likewise, everyone who participates in the research process has access to the research infrastructure - not just the obvious elements like persistent identifiers and standards, but also some less obvious “public goods” like common business practices and common technology solutions. This includes not just researchers, but also research managers and administrators, librarians, funders, and of course publishers and editors.

[Slide Seven]

Research infrastructure must be valued (be seen to add value).

[Slide Eight]

The road infrastructure provides a level playing field that allows organizations to develop all sorts of tools and services that benefit users - from delivery services to transportation options. By cooperating as a community to build and maintain the roads, organizations can then compete with each other based on the quality of their offerings.

[Slide Nine]

Providing equitable access to the research infrastructure works in the same way. Organizations cooperate to build and maintain it, and then compete with each other to develop tools that are valued by researchers - like manuscript submission systems and content platforms, as well as other types of value-added services like specialized databases.

[Slide Ten]

Research infrastructure must be trusted (and trustworthy).

[Slide Eleven]

There’s nothing worse than infrastructure that doesn’t work. In fact, typically users only notice the infrastructure when it fails - good infrastructure is effectively invisible. To continue the analogy, roads should be smooth and free of potholes, safe and reliable for all users.

[Slide Twelve]

Users also expect the research infrastructure to run smoothly and they get frustrated when it doesn’t - whether that’s being confronted with a 404 error, or not being able to submit a manuscript. Researchers are busy people and they don’t want to waste time on burdensome administration, especially if it doesn’t work properly. They want research infrastructure they can trust!

[Slide Thirteen]

Research infrastructure must be interoperable.

[Slide Fourteen]

People use many different forms of transportation, not just roads, and we need to make it easy for them to switch from one to another. These images show the car train between England and France that enables travel between the two countries without ever getting out of your car!

[Slide Fifteen]

Likewise, we want to make sure that the research infrastructure is connected, not fragmented, so that content can be readily indexed and cross-referenced and information can move seamlessly between different systems - for example, from funding systems to manuscript submission systems, so that grant information can be easily added.

[Slide Sixteen]

Research infrastructure must be sustainable.

[Slide Seventeen]

To be trustworthy, roads must be well built and well maintained. Ensuring adequate funds not just for building but for keeping our roads in good order is essential. For the road infrastructure, these costs are typically met by some form of taxation - direct (road tax) or indirect (income tax).

[Slide Eighteen]

And the same applies to the research infrastructure. Whether we are investing in software, hardware, or people, there are upfront and ongoing costs that have to be met. Having a viable and sustainable business model is essential. Many research infrastructure organizations are open - but that doesn’t mean they are cost-free - they are typically supported by some combination of membership dues and/or grant funding. Commercial research infrastructure tools and services are usually expected to be profit-making.

[Slide Nineteen]

Research infrastructure must be community-driven.

[Slide Twenty]

There’s no point in building roads to places that people don’t want to go - they won’t be used. Community-led decision-making ensures that we build roads and paths in the right places, for the benefit of all.

[Slide Twenty-One]

Likewise, research infrastructure is more likely to be used if it is driven by community needs - whether national or global, disciplinary or cross-disciplinary. Finding ways to invite community participation and feedback is essential to developing a successful research infrastructure.

[Slide Twenty-Two]

Research infrastructure must be open.

[Slide Twenty-Three]

Perhaps most importantly of all, there should not be any barriers to the use of the road infrastructure - roads must be open to all.

[Slide Twenty-Four]

And the research infrastructure must also be open to all. Inclusion not exclusion is a key principle. That means open source software, openly available documentation, tools and services that are free to researchers, and more.

[Slide Twenty-Five]

“Everything we have gained by opening content and data will be under threat if we allow the enclosure of scholarly infrastructures. We propose a set of principles by which Open Infrastructures to support the research community could be run and sustained.” Bilder G, Lin J, Neylon C (2015) Principles for Open Scholarly Infrastructure-v1, retrieved June 27, 2020, http://dx.doi.org/10.6084/m9.figshare.1314859

This quotation is from a seminal paper by Geoffrey Bilder and Jennifer Lin of Crossref, and Cameron Neylon, an open access advocate and professor of research communications at Curtin University in Australia. It sums up why it’s so important for the research infrastructure to be open. Their proposed open infrastructure principles overlap with the attributes I’ve been talking about. They focus on governance (community ownership), sustainability, and what they call insurance, which is really all about openness.

[Slide Twenty-Six]

There are many open research infrastructure organizations around the world, though not all of them meet all these requirements. It’s important to note that commercial organizations can be open, if they meet the criteria I’ve just outlined. However, this is pretty rare, as you can imagine. Here are a few examples of fully open, not-for-profit organizations. They include persistent identifier organizations like Crossref, DataCite, ORCID, and ROR (the Research Organization Registry), tools like the annotation service Hypothesis and protocols.io, and support organizations like Invest in Open Infrastructure and Creative Commons.

NISO

[Slide Twenty-Seven]

I'm going to be focusing on two - NISO and ORCID.

[Slide Twenty-Eight]

Why do we need information standards?

[Slide Twenty-Nine]

To continue with the roads analogy, standards ensure that our road infrastructure is safe and accessible to everyone. Standards for curb cuts, street lighting, and emissions controls are just three examples.

[Slide Thirty]

NISO develops standards that make information flow more readily between systems and that make content more accessible to everyone.

[Slide Thirty-One]

Content distribution dates back to the invention of the printing press in the fifteenth century. Shortly afterwards, the first known standards were created, for the layout of book pages.

[Slide Thirty-Two]

Our vision is a world where all can benefit from the unfettered exchange of information.

[Slide Thirty-Three]

Our mission is to build knowledge, foster discussion, and advance authoritative standards development through collaboration among the cultural, scholarly, scientific, and professional communities.

[Slide Thirty-Four]

What is NISO?

NATIONAL — but operates globally
INFORMATION — in academia and beyond
STANDARDS — and best practices
ORGANIZATION — not for profit, community-led

[Slide Thirty-Five]

What Does NISO Do?

Creates, publishes, and maintains standards and best practices
Fosters adoption of existing standards
Educates the community on information and technology-related issues
Incubates thought leadership activities to advance technology

As well as developing, publishing, and maintaining standards and best practices (which we call recommended practices), we also promote the adoption of our standards, we run a program of 40-50 events annually to educate the information community about our work and about information technology more generally, and we encourage the generation of ideas for new ways to improve information technology.

[Slide Thirty-Six]

How Does NISO Work?

Transparent and open
- Written procedures and right to appeal
- All standards are freely available to everyone
Collaborative and consensus-driven
- Members vote on new work items, draft proposals, etc.
Equitable
- We aim for balance between stakeholder groups — libraries, publishers, service providers
Inclusive
- All views are considered, both members and non-members
Community-led
- Elected Board drawn from the membership
- 500+ volunteers on working groups, committees, etc.

You’ll see that our principles are very closely aligned with the criteria for open research infrastructure that I talked about earlier.

[Slide Thirty-Seven]

Who participates?

Libraries — academic, government, professional, public, including consortia
Publishers — commercial and not-for-profit; books, journals, multimedia; academic, government, trade; media; associations
Service providers/vendors — technology organizations serving libraries, publishers, and the wider information community

[Slide Thirty-Eight]

NISO standards include:

ISSN (now an ISO standard)
JATS (journal article tagging)
MECA (manuscript exchange between systems)
Seamless Access (improved remote access to content)
CRediT (contributor role taxonomy), in progress

You’re probably familiar with many NISO standards, though you may not know that we developed them! These are a few examples. The ISSN (International Standard Serial Number) for journals was originally a NISO standard but has since been formalized as an ISO (International Organization for Standardization) standard. The others are all still managed and maintained by NISO. CRediT is one of our newest additions, and work to formalize that is still under way. I’ll talk a bit more about that shortly. NISO is also the Committee Manager for the ISO group on identification and description.

ORCID

[Slide Thirty-Nine]

Why do we need persistent identifiers for researchers?

This is a segue to talking about my former organization, ORCID, which provides persistent identifiers for researchers.

[Slide Forty]

Back to roads one last time! As a road user, you need signposts and maps to tell you where you are and help you find the best route to your destination. Maps incorporate landmarks and coordinates, which on their own are not useful. Who would know that these coordinates refer to Boston, Massachusetts, in the USA, where I am based?

[Slide Forty-One]

Persistent identifiers on their own are not hugely helpful, but in combination they act a bit like a roadmap, identifying and connecting different elements of the research ecosystem - the people (researchers), the outputs (publications, datasets, and more), and their organizations (universities, funders, etc). Different persistent identifiers are used for each of these different elements. As well as ORCID iDs, this slide shows a few other examples. DOIs are used mostly for publications but are also starting to be used for grants. IGSNs are International Geo Sample Numbers, used for identifying geological samples. ROR is a registry of research organizations. ISNI is the International Standard Name Identifier, used for a mix of people and places. RAiDs are Research Activity Identifiers and they can be used to group different identifiers together, for example, into a single project. Interestingly, most of these are standardized in some way, either through ISO or NISO.

[Slide Forty-Two]

This is a nice example of why ORCID is needed. Names are very ambiguous. At the top of this slide you see a hyper-authored article which includes 38 contributors whose last name is Wang. That makes being able to correctly identify the right Dr Wang very challenging - for people and for machines. The example at the bottom is from another multi-authored article, but here everyone has an ORCID iD, which means they are uniquely identified, can be easily and reliably discovered by people or machines, and can then get the recognition they deserve for their work.

[Slide Forty-Three]

ORCID’s vision is a world where all who contribute to research are uniquely identified and connected to their affiliations and works, across disciplines, borders, and time.

[Slide Forty-Four]

What is ORCID?

OPEN — provides open tools and supports open research
RESEARCHER — across all disciplines, organization types, levels, geographies
(&) CONTRIBUTOR — includes everyone who contributes to research in any way
IDENTIFIER — a unique 16-digit number that can be used by machines and humans alike

ORCID stands for Open Researcher & Contributor Identifier
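
To make the identifier format concrete, here is a minimal sketch of how a system might check that an iD is well formed before storing it. It assumes the ISO 7064 MOD 11-2 check character that ORCID uses for the final position (which can be the letter X as well as a digit); the function names are illustrative and this is not ORCID's own code.

```python
import re

# Four hyphen-separated groups of four; the last character may be a digit or X.
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid_id: str) -> bool:
    """Return True if the iD has the expected shape and a matching check character."""
    if not ORCID_PATTERN.match(orcid_id):
        return False
    compact = orcid_id.replace("-", "")
    return orcid_check_character(compact[:15]) == compact[15]


# The iD shown on the closing slide of this talk.
print(is_valid_orcid("0000-0003-2161-3781"))  # True
# Same iD with a mistyped final character.
print(is_valid_orcid("0000-0003-2161-3782"))  # False
```

A check like this only catches typing errors. It does not confirm that the iD is registered or that it belongs to the person who supplied it, which is why authenticating iDs through the API matters.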

[Slide Forty-Five]

What does ORCID Do?

Provides a unique identifier (ORCID iD) for researchers and an open Registry of these iDs
Provides APIs (member and non-member) to allow exchange of information between systems
Enables reliable, researcher-controlled connections between ORCID iDs and other persistent identifiers

As well as providing researchers with their own unique identifier, ORCID also provides Application Programming Interfaces (APIs) that can be used to share information stored in ORCID records with other systems, like manuscript submission systems, profile systems, and more. In combination with other types of persistent identifiers, this enables automatic connections to be made between researchers, their works, and their organizations. This in turn saves time and reduces errors, improves discoverability, and helps researchers get recognition for their work.

[Slide Forty-Six]

How does ORCID work?

Free registration for individual users
- Open documentation, FAQ, and 24/5 online support desk
Non-members
- Open API, open documentation, open source, support via user group
- Authenticate iDs, display with works
Members
- Member API, open documentation, technical/implementation support from ORCID staff
- Authenticate, display AND connect data to and from ORCID records and their systems

Registering for an ORCID iD is free for users, and there is also an open API that anyone can use to collect ORCID identifiers from their own users and access publicly available information in ORCID records. Member organizations pay a fee that entitles them to use a member-only API which enables them to do more - for example, to add information to records, or to access data that is restricted to members only.
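
As a rough sketch of what reading publicly available information from a record might look like, the example below builds a request for a single record. The base URL, the v3.0 version, and the /record path are assumptions about the public API that should be checked against ORCID's current documentation, and the function names are illustrative only.

```python
import json
import urllib.request

# Assumed public API base URL and version; confirm against ORCID's documentation.
PUBLIC_API_BASE = "https://pub.orcid.org/v3.0"


def build_public_record_request(orcid_id: str) -> urllib.request.Request:
    """Build (but do not send) a request for the public view of one record."""
    url = f"{PUBLIC_API_BASE}/{orcid_id}/record"
    # Ask for JSON explicitly rather than relying on the server's default format.
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_public_record(orcid_id: str) -> dict:
    """Send the request and parse the JSON response (requires network access)."""
    request = build_public_record_request(orcid_id)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


request = build_public_record_request("0000-0003-2161-3781")
print(request.full_url)
```

Only information the researcher has made public would be returned this way. Adding information to a record, or reading restricted data, is what the member API is for.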

[Slide Forty-Seven]

Who Participates? (Numbers as of June 2020)

~9 million registered users from every continent
1,200 members from 45 countries
- Research institutions
- Publishers
- Associations
- Funding organizations
- Vendors/service providers

ORCID’s users come from every research community worldwide, and members are based in every continent. The vast majority - about 75% - are research institutions like universities. Around 10% are publishers or associations that use ORCID primarily for their authors and reviewers.

[Slide Forty-Eight]

ORCID Connections

~58 million works
~5 million employment affiliations
4.6 million + education affiliations
2.3 million + peer reviews
850,000+ funding activities
403,000+ membership and service activities
260,000+ invited positions & distinctions
~2,000 research resources

Although publishers are only a relatively small part of the ORCID membership, there are many more publications connected to ORCID records than any other type of information. Most of this information has been added by the users, often by pulling it in from Crossref, Scopus, or another database, but they can also give Crossref permission to update their record automatically whenever they publish an article that has a Crossref DOI. Users have also added most of the other content, with two exceptions. Peer review activities can only be added by a third party - most often Publons - and the same is true of research resources, which are essentially in-kind grants to access, for example, laboratory equipment or a special collection.

[Slide Forty-Nine]

I mentioned the CRediT taxonomy earlier (also known as the Contributor Role Taxonomy). We are currently working with the community to formalize CRediT as a recommended practice (a less technical version of a standard), which ORCID are already planning to implement - a nice example of how the two organizations overlap and work together!

[Slide Fifty]

CRediT (Contributor Role Taxonomy)

Identifies individual contributions to research projects
- 14 roles: Conceptualization; Data curation; Formal analysis; Funding acquisition; Investigation; Methodology; Project administration; Resources; Software; Supervision; Validation; Visualization; Writing – original draft; Writing – review & editing
- Will provide transparency and enable improved attribution, credit, and accountability
Use cases:
- Recognizes research contributions beyond writing and drafting
- Supports wider research and researcher evaluation
- Supports identification of potential reviewers, experts, and other specialists
Implementation:
- Begin allocating terms within research outputs
- Approved as NISO project, end 2019, standardization work in progress
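
To illustrate what allocating terms within research outputs could involve, here is a minimal sketch that records roles per contributor and flags any role name that is not one of the 14 in the taxonomy. The contributor labels and the function name are hypothetical, and this is one possible approach rather than a NISO-specified format.

```python
from typing import Dict, List

# The 14 CRediT roles as listed on the slide.
CREDIT_ROLES = (
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing – original draft",
    "Writing – review & editing",
)


def find_unrecognized_roles(contributions: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return, per contributor, any assigned roles that are not CRediT roles."""
    problems: Dict[str, List[str]] = {}
    for contributor, roles in contributions.items():
        unknown = [role for role in roles if role not in CREDIT_ROLES]
        if unknown:
            problems[contributor] = unknown
    return problems


# Hypothetical contributors; in practice these would be keyed by ORCID iD.
example = {
    "Contributor A": ["Conceptualization", "Writing – original draft"],
    "Contributor B": ["Software", "Proofreading"],
}
print(find_unrecognized_roles(example))  # {'Contributor B': ['Proofreading']}
```

Using a fixed vocabulary like this is what allows contributions to be compared and aggregated across publishers and systems.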

[Slide Fifty-One]

We need research infrastructure that is:

Equitable
Valued
Trusted (and trustworthy)
Interoperable
Sustainable
Community-driven
Open

And we need it to be adopted and used by the research community globally!

In conclusion, these are the elements needed to build and sustain a strong and robust research infrastructure - one that is open to and benefits everyone.

[Slide Fifty-Two]

You can help!

Please support the research infrastructure by implementing it, following best practices, encouraging your community to adopt it, sharing your feedback, identifying gaps, and contributing to its continuous improvement.

[Slide Fifty-Three]

Thank You!

Alice Meadows, https://orcid.org/0000-0003-2161-3781

Abstract

On Wednesday, August 5, 2020, NISO's Director of Community Engagement, Alice Meadows, delivered this talk to attendees of the Future of Scholarly Communication virtual conference, organized by Zhejiang University Press.
