
because we chose (unwisely it now turns out) to build the system around a set of proprietary software and hardware products marketed by the Xerox Corporation. Our relationship with Xerox soured when the corporation would not give us the tools we needed to export the image and index data out of the Xerox system into an open, non-proprietary system. About two years ago, we decided not to upgrade the image management system that Xerox built for us. Almost immediately we started having a series of system troubles that resulted in us abandoning (temporarily) our goal of getting the books online…. In the meantime, the images are safe on a quite stable medium (for now anyway).

The medium is magneto-optical disk; the project was paid for in part by the National Endowment for the Humanities.

Newspapers have pages that are about twenty-three by seventeen inches, twice as big as the upper limits Conway gives. The combination of severe reduction ratios, small type, dreadful photography, and image fading in the microfilmed inventory makes scanning from much of it next to impossible; one of the great sorrows of newspaper history is that the most important U.S. papers (the New York Herald Tribune, the New York World, the Chicago Tribune, etc.) were microfilmed earliest and least well, because they would sell best to other libraries. We may in time be able to apply Hubble-telescopic software corrections to mitigate some of microfilm’s focal foibles, but a state-of-the-art full-color multimegabyte digital copy of a big-city daily derived, not from the original but from black-and-white Recordak microfilm, is obviously never going to be a thing of beauty. And no image-enhancement software can know what lies behind a pox of redox, or what was on the page that a harried technician missed.

In the late eighties, the Commission on Preservation and Access wanted an all-in-one machine that would reformat in every direction. It commissioned Xerox to develop specifications for “a special composing reducing camera capable of digitizing 35mm film, producing film in difference reductions (roll and fiche), paper, and creating CD-ROM products.” As with Verner Clapp’s early hardware-development projects at the Council on Library Resources, this one didn’t get very far. The master digitizers — Stuart Lynn, Anne Kenney, and others at Cornell, and the Mellon Foundation’s JSTOR team, for example — realized almost immediately that they shouldn’t waste time with microfilm if they didn’t have to. “The closer you are to the original, the better the quality,” Anne Kenney told me. “So all things being equal, if you have microfilm and the original, you scan from the original.” JSTOR came to the same conclusion:

One interesting discovery that we made in the process of obtaining bids is that working from paper copies of back issues of journals, rather than from microfilm, produces higher quality results and is — to our surprise — considerably cheaper. This conclusion has important implications beyond JSTOR.

It sure does have important implications: it means that most of the things that libraries chopped and chucked in the cause of filming died for nothing, since the new generation of facsimilians may, unless we can make them see reason, demand to do it all over again.

CHAPTER 35. Suibtermanean Convumision

The second major wave of book wastage and mutilation, comparable to the microfilm wave but potentially much more extensive, is just beginning. At the upper echelons of the University of California’s library system, a certain “Task Force on Collection Management Strategies in the Digital Environment” met early in 1999 to begin thinking about scanning and discarding components of its multi-library collections. Two of the librarians “anticipated resistance to the loss of printed resources, especially by faculty in the Humanities, but agreed that the conversation had to begin.” Others prudently pointed out that the “dollar and space savings would likely be minimal for the foreseeable future and should not be used to justify budget reductions or delays in needed building improvements.” Still others wanted to be sure that the organizers of the program arranged things so that the “campuses which discarded their copies would not be disadvantaged.”

For some years, Cornell’s Anne Kenney has been a leader of the scan clan. She knows (as she told the attendees at a Mellon Foundation-sponsored conference in 1997) that “the costs of selecting, converting, and making digital information available can be staggering.” Since it is so horribly expensive, she believes that the only way libraries will be able to pay for it is if “digital collections can alleviate the need to support full traditional libraries at the local level.” Therefore, over the past decade, in its various grant-funded scanning projects, Cornell University has snarfed its way through a banquet of old material, employing the language of earnest preservationism whenever it was expedient. (The books are “deteriorating,” “rapidly self-destructing,” etc.) They have disbound a collection of what are in some cases extremely rare math books of a century ago (and printed up germ-free facsimiles of them on a Xerox DocuTech printer), and books on Peruvian guano and butterflies and forestry. The paper facsimiles are a way of easing the transition to the digital library: “Conceivably, this may at some point allow librarians to propose other service alternatives as a substitute for traditional shelf storage,” says a footnote to the report, meaning that to use a book, you would have it printed out on demand or you would read it on-screen. Of course, if the book had to be printed out, there was an opportunity to generate a little revenue, too: “There may also be opportunities to underwrite some of the costs of preservation through the sale of facsimile editions.”

A few years later, Cornell got Mellon money to scan original runs of nineteenth-century American magazines like Scribner’s, Scientific American, Harper’s, and Atlantic Monthly. A wonderful nineteenth-century monthly magazine, replete with many hundreds of engravings, called The Manufacturer and Builder (already microfilmed in 1989 by the Northeast Document Conservation Center), was unmade by Cornell as one of its contributions to the digital Making of America project. (The Making of America was conceived by Stuart Lynn and others at Cornell in part to alleviate the problem of the “escalating cost of storage and the lack of adjacent building space.”) Ah, but it’s searchable, you may say, and neither the microfilm nor the original issues are: it’s worth destroying an illustrated run of The Manufacturer and Builder to get a fully searchable copy of it up on the Web. Yes, it is searchable, but because the type of the original is small and the resolution of the scanning is only six hundred dots per inch, the image-processing software doesn’t have enough information to chew on. As a result, while the images Cornell offers are legible, the OCR text available for your searches sometimes speaks in a language entirely its own. Here, for example, is Cornell’s searchable text of the beginning of an 1883 article about a subterranean convulsion in Java: