Library Digitization in Eleven Links
I can remember exactly what I was doing on December 14, 2004. That was the day that Google announced its Library Project (soon to be known as Google Book Search), and the information community was buzzing about the significance. I spent the day processing news accounts regarding the scope of that digitization project, summarizing them as best I could for our members, and then stayed later than usual in the office, waiting for a final clarifying detail to come from a Stanford librarian before hitting the “send” button on a post to the NFAIS listserv.
In 2004, there was still some uncertainty as to which of the various internet giants was going to be dominant. Amazon had announced earlier in the year that its A9 search tool had emerged from its beta phase. The integration of the A9 search tool with the Amazon website allowed students to rapidly uncover content from digitized versions of book titles, stored by Amazon for the purposes of encouraging sales, while also retrieving relevant web content. For example, if an undergraduate used the system to research the notorious Civil War prisons Andersonville and Elmira, the results could include a snippet from a published title (as part of Amazon’s Search Inside the Book feature) alongside relevant web-based content supplied by the National Park Service. Engineered by Udi Manber, A9 was viewed as an exciting innovation in surfacing and selling print products.
Meanwhile, Google had only just gone public in 2004. That year, at the Frankfurt Book Fair, Google announced its Publisher Program, which promised to support the same type of search functionality. Publishers willingly signed up, unaware that the Library Project would be announced two months later. The Library Project was ambitious, digitizing titles acquired for collections held at Harvard, Stanford, the University of Michigan, the Bodleian Library at Oxford University, and the New York Public Library. This was a breathtaking step beyond what Amazon had done, and the information community was thunderstruck as it tried to process the implications of what such an expansion could mean.
This is the story that is told in Along Came Google: A History of Library Digitization by Deanna Marcum and Roger Schonfeld. Note the subtitle. This book documents from a library perspective the implications and long-term impact of Google’s move to make a significant corpus of “offline content searchable online” through optimized means of scanning and digitization. The outcome of Google’s ambitious project would ultimately be diminished, due to constraints resulting from extended legal battles, but key library leadership has managed to create the infrastructure needed to sustain and carry on the massive digitization needed. There were significant barriers to that work, as the authors note, despite the fact that “in this story, there are many actors, all of good intentions. Inevitably, it is also a story of limitations and failures to collaborate.”
The book asks key questions in its conclusions regarding whether it is possible to create a truly national digital collection for the United States as a whole. Can the information community in this country successfully collaborate in developing “a single coordinated program to provide digital access for the entire historical and cultural record that is easy to use and ubiquitously accessible”? If not, are we at least able to sustain one that is “the accumulation of many efforts, all of them incomplete, controlled by an array of different actors”?
That we are more likely defaulting to the second (less satisfactory) result becomes clear as one reads of the foundational efforts behind HathiTrust, the Internet Archive, JSTOR, and a host of other partial aggregations of digital texts. Reading Along Came Google will make clear the tangle of competing interests with which libraries had to wrestle and the priorities that sometimes helped and sometimes stymied the creation of digital resources now taken for granted by students, scholars, and writers. The HathiTrust cooperative is one such case in point. The continued expansion of access to materials held in that collection represents a striking value, and its response to the need for access during the pandemic was well thought out. (Recent work on the reading interface has improved that user experience, as well.)
Mainstream readers, those not familiar with the information industry, are unlikely to ever recognize the names and efforts of Paul Courant, Wendy Lougee, Brewster Kahle, and others. Through interviews with these key figures, Marcum and Schonfeld convey the past work of these individuals, as well as their sense of where we are now.
One such quote appears on page 157: Paul Courant of Michigan is not ready to give up, even though he says “mass digitization is dead.” He adds, “We are back in that place where we do deals with various entities that want to digitize things, where we allow them to give us copies in a few years. We get the miscellaneous grant from Mellon to digitize. We have done well with hidden collections. But all the holes didn’t get filled. We still don’t have good solutions for preservation of current stuff. Now we are in Zeno’s paradox. Nobody wants to pay for it.”
That comment does not seem to promise great things for a national digital collection. It does sum up, however, the biggest ongoing issue faced by libraries. Melding together the wealth of print and digital collections for the public good is an expensive process, and libraries at all levels operate under fluctuating funding commitments.
After reading Along Came Google, I was reminded of Deanna Marcum’s 2016 Miles Conrad Lecture, in which she spoke of the need for digital leadership in libraries:
“Digital leaders are distinguished from non-leaders by their different combinations of skills, attitudes, knowledge, and their professional and personal experiences. Leadership must be driven by unique attitudes appropriate for the distributed, digital age. Digital leaders must be flexible and adaptable and possess wide intellectual curiosity and a hunger for new knowledge. They must be willing to see value in sharply different perspectives and be comfortable with uncertainty, and like all leaders of all times, must possess true passion for what they do. They look globally for solutions and challenges and also hunger for constant learning. They maintain a more egalitarian and results-oriented approach than the leaders who come before them.”
She continued, “The reason we need to concern ourselves with defining digital leadership is that libraries are in a pivotal moment, and a digital mindset is needed at every level of the organization. The utilization of digital technology in making research and teaching and learning easier and more efficient for those they serve is critical. Libraries’ very survival depends upon making the transition from a local institution to a node in a national and international information ecosystem. The skills needed to build a local collection are not sufficient for seeing the challenges and opportunities in a global environment.”
I also like a brief reference included in Marcum’s lecture, a comment made by a librarian then associated with the University of Ghent: “I am a humble librarian; I became humble when I saw what Google could do. And very simply, what Google has done is make information easily accessible. The local library is no longer a collection, but a set of services that connects the user to all information everywhere.”
Along Came Google: A History of Library Digitization does not pretend to answer the question of whether we will ever manage to bring together a cohesive national collection. The authors realize that the impact of the events following Google’s foray may not be properly understood for years to come. They simply document the library community’s story of disruption and adaptation through to the present. Throughout the pandemic, libraries of all sorts have been challenged to maintain service levels in delivering needed materials to students and scholars. We should be cheering those disruptive digital leaders who have reason to expect more and better from providers in assisting them to connect a global community of users to all information everywhere.