Data Curation - Cultivating Past Research Data for Future Consumption

Virtual Conference

About This Virtual Conference

Research data is an increasingly important component of communicating science. Responsibility for providing curation support for research data is falling on libraries, repositories, and archives. Supporting research data is no small task: it requires expertise in data management, field-specific metadata structures, integration and sharing issues, and potentially access control, rights management, and privacy concerns. Cultivation and curation of this new form of scientific information at scale is a service that many in the scholarly world expect the library to manage and that librarians are well positioned to provide.

This virtual conference will explore the many aspects of data curation, including trusted-repository certification, metadata creation and management specific to data, systems deployment issues, facilitation of data sharing services, and data control issues. Speakers will provide first-hand experience of the unique challenges presented by curating data. The session will close with a panel discussion of future trends in data management and how libraries can prepare now to address them.

All registrants to this virtual conference will receive a login to the associated Training Thursday on Emerging Tools to Improve Management of Data to be held on September 8. (Separate registration to the training event only is also available.)  If you are unable to attend the Training Thursday in person, you can view the recording of the session.

Event Sessions

11:00am – 11:05am -- Welcome and Introduction

11:00am – 11:30am -- Research Data and Services in Academic Libraries – US and Europe

Speaker

Suzie Allard

Associate Dean for Research, College of Communication & Information
University of Tennessee - Knoxville

The recognition of the importance of research data has academic institutions seeking the best path for providing support to researchers that will help preserve this intellectual asset. Academic libraries are often the locus for providing research data services (RDS), which include data management planning, digital curation, and metadata creation and conversion. Our empirical investigations in the US in 2011 and 2014 and in Europe in 2016 illustrate the various levels of RDS provided by academic libraries and identify the obstacles to expansion and growth of services.

11:30am – 12:00pm -- The Data Lifecycle: Curating Partners to Curate Data

Speaker

Jennifer L. Lee

Director of Discovery and Access, University of Texas Libraries
University of Texas - Austin

As academic libraries evolve their approaches to managing the lifecycle of data, they must forge and manage relationships with partners, old and new, in order to effectively curate digital resources. The University of Texas Libraries engages partners across campus to develop and sustain the curation of our digital assets, including productive collaborations with various computing resources on campus.
 

12:00pm – 12:30pm -- How to curate research data: An 8 step guide with incentives to collaborate

Speaker

Lisa Johnston

Research Data Management/Curation Lead and Co-Director of the University Digital Conservancy
University of Minnesota - Twin Cities

As reproducibility and data sharing emerge as key issues for academic researchers, the data management services offered by the library must continue to scale. Building from a 2013 pilot (http://hdl.handle.net/11299/162338), the data curation workflows used at the University of Minnesota Libraries have grown into a robust service involving multiple data curation specialists who curate data deposited into our Data Repository for the University of Minnesota (DRUM) and appropriate subject data repositories. Data curation steps, including quality assurance, file integrity checks, documentation review, metadata creation for discoverability, and file transformations into archival formats, are value-added services that enhance digital data for long-term preservation and reuse.

This talk will explore the data curation workflows in place not only at my institution but also at 20+ disciplinary and institutional data repositories such as Dryad, ICPSR, Yale, and the U of New Mexico. These experiences, collected in a new ACRL book due out in September 2016 titled "Curating Research Data: Practical Strategies for Your Digital Repository," span the sequential actions that you might take to curate a dataset, from receiving the data (Step 1) to eventual reuse (Step 8). Individual institutions putting these key data curation workflows into action is just the first step. We will also discuss a new Sloan-funded project called the Data Curation Network (https://sites.google.com/site/datacurationnetwork/) that aims to create a shared staffing model for providing data curation services across academic institutions, thus allowing our services to scale beyond what a single institution might offer alone.

12:30pm – 1:00pm -- Ethics and Legal Requirements

Speaker

Melissa Levine

Lead Copyright Officer and Librarian
University of Michigan

1:00pm – 1:45pm -- Lunch

1:45pm – 2:15pm -- Case Study I: Level Up!: Building data services at the J. Willard Marriott Library

Speaker

Rebekah Cummings

Research Data Management Librarian, J. Willard Marriott Library
University of Utah

Research data services have become a common fixture in academic libraries, yet many libraries still struggle to develop an appropriate and in-demand mix of services to support their research community. While an elite few offer seemingly endless curatorial assistance, the majority of libraries are building basic to mid-level services such as DMP support, workshops, and consultations. This case study provides a detailed look at the University of Utah Marriott Library’s data services, the rationale behind our current service model, the results of our campus data needs assessment, and how we plan to grow our technical infrastructure into the future. In addition to an overview of our data service mix, we will look closely at one current initiative, the Entertainment, Arts, and Engineering (EAE) Thesis Preservation Project, which highlights curation challenges such as irregular and proprietary file formats, copyright restrictions, long-term preservation, and a lack of appropriate metadata standards. This presentation will highlight the Marriott Library’s data curation accomplishments to date alongside an honest assessment of ongoing challenges.

2:15pm – 2:45pm -- Case Study II: The Metadata is the Message: Assessing, Curating and Publishing Data for the Humanities

Speaker

Ashley Clark

XML Applications Programmer, Digital Scholarship Group
Northeastern University

Humanists and digital humanists may not always be aware that they work with data, but they are particularly conscious of the need to publish the diverse products of their work. When those products are many and complex, publication is not just a matter of displaying them to users, but of providing context and prioritizing findability. Just such a publication was required by the "Cultures of Reception" transcriptions created by the Northeastern University Women Writers Project. Facing inconsistencies in both metadata and data, unforeseen complexities in entity relationships, and a loss of organizational memory, the Women Writers Project staff chose to use data curation techniques, such as metadata rehabilitation, to drive website development and publication. This presentation will sketch out the approach, its institutional prerequisites, its benefits, and its disadvantages.
 

2:45pm – 3:00pm -- Break

3:00pm – 3:30pm -- Case Study III: NYU Data Catalog

Speakers

Nicole Contaxis

Data Catalog Coordinator
New York University Health Sciences Library

Ian Lamb

Project Systems Developer
New York University Health Sciences Library

Sharing research data for reuse and reproducibility has become a growing concern for researchers, particularly as publishers, government bodies, funding organizations, and universities encourage or mandate researchers to share their data. When researchers at the NYU Langone Medical Center (NYULMC) reported that they were having difficulty discovering and navigating licenses for datasets relevant to their work, the NYU Health Sciences Library (NYUHSL) developed the NYU Data Catalog. The NYU Data Catalog increases the visibility and discoverability of datasets created by NYU researchers as well as external datasets relevant to biomedical and public health research. By listing local experts for each dataset, the Library aims to foster collaboration between NYU researchers as they discuss and share different datasets. This presentation will focus on the types of datasets being cataloged, the work associated with outreach and curation, and metadata as a part of data curation.

3:30pm – 4:00pm -- Case Study IV: Data Curation for Quantitative Social Science Research

Speaker

Libbie Stephenson

Data Archivist (Retired), Social Science Data Archive
University of California - Los Angeles (UCLA)

A successful data curation program is based on an infrastructure consisting of well-formed policies; adherence to established standards and best practices; and comprehensive workflows. Curation of data in social science data archives has been carried out since the 1960s. This case study on the Social Science Data Archive at UCLA will cover the standards and practices that have evolved over time to ensure the long-term usability of these materials, including appraisal and data quality review, ingest and metadata creation based on the OAIS model, application of the Data Documentation Initiative metadata schema, and preservation workflow. The goal of this process is to ensure long-term usability of data and enable replication of analyses, regardless of changes in technology, operating systems, software or devices. Background for this presentation can be found in Peer, Green and Stephenson (2014) “Committing to Data Quality Review”, International Journal of Digital Curation doi:10.2218/ijdc.v9i1.317.

4:00pm – 4:30pm -- Case Study V: A Multi-Decade Case: The Evolution of Data Products and Designated Audiences

Speaker

Karen S. Baker

Doctoral Candidate, Graduate School of Information Sciences
University of Illinois at Urbana-Champaign

A three-decade ethnography of data products originating from a single data set illustrates the processes of ongoing description, continuing development, and tailored delivery. This case study demonstrates knowledge mobilization through curation of a set of polar sea ice data products for a multiplicity of audiences.

4:30pm – 5:00pm -- Roundtable Discussion moderated by Todd Carpenter

Additional Information

  • Cancellations made by August 24, 2016 will receive a refund, less a $35 cancellation fee. After that date, there are no refunds.

  • Registrants will receive detailed instructions about accessing the virtual conference via e-mail the Friday prior to the event. (Anyone registering between Monday and the close of registration will receive the message shortly after the registration is received, within normal business hours.) Due to the widespread use of spam blockers, filters, out of office messages, etc., it is your responsibility to contact the NISO office if you do not receive login instructions before the start of the webinar.

  • If you have not received your Login Instruction e-mail by 10 a.m. (ET) on the day before the virtual conference, please contact the NISO office at nisohq@niso.org for immediate assistance.

  • Registration is per site (access for one computer) and includes access to the online recorded archive of the conference. You may have as many people as you like from the registrant's organization view the conference from that one connection. If you need additional connections, you will need to enter a separate registration for each connection needed.

  • If you are registering someone else from your organization, either use that person's e-mail address when registering or contact nisohq@niso.org to provide alternate contact information.

  • Conference presentation slides and Q&A will be posted to this event webpage following the live conference.

  • Registrants will receive an e-mail message containing access information to the archived conference recording within 48 hours after the event. This recording access is only to be used by the registrant's organization.

For Online Events

  • NISO has developed a quick tutorial, How to Participate in a NISO Web Event. Please view the recording, which is an overview of the web conferencing system and will help to answer the most commonly asked questions regarding participating in an online Webex event.
  • You will need a computer for the presentation and Q&A.

  • Audio is available through the computer (broadcast) and by telephone. We recommend that you set up telephone audio as a back-up even if you plan to use the broadcast audio, because voice over Internet isn't always 100% reliable.

Please check your system in advance to make sure it meets the Cisco WebEx requirements. It is your responsibility to ensure that your system is properly set up before each webinar begins.