
But it was the massive database itself that became OCLC’s real triumph. For a fee, a library became an OCLC member and got one or more dedicated Beehive terminals (advanced for their time, able to handle the diacritics that catalogers needed, when other computer interfaces generally offered only capital letters), each linked to Ohio. For two dollars per title, a member cataloger could look through OCLC’s records to see whether the book before her had already been cataloged by somebody else — either by the Library of Congress (whose MARC records OCLC bought and loaded into its database) or by another member library. (Each library was identified by a three-letter tag.) If she found a record, and the record looked good, she would request that OCLC print up a set of cards for it and send them to her. In this way, a library could eventually relegate a good deal of the cataloging work that had once been performed by degreed professionals to lower-paid clerks and student assistants.

And the brilliance of Kilgour’s enterprise was that if the cataloger did not find a record, she could undertake to describe the book herself, and contribute her work to the system as a “master record” for that book, for the good of all members. She wrote a sort of poem, following a set of rules more rigorous than a villanelle’s; she sent it off to people in Ohio who published it for her; and then she got paid a few dollars — in the form of a cataloging credit against future OCLC charges. The more fresh “copy” a cataloging department offered OCLC, the cheaper its use of OCLC was, and thus there was plenty of incentive for all libraries, engaged in the creation of a kind of virtual community long before there were such things as Usenet and listservs, to pump up the burgeoning database. What began mainly as a handy, unilateral way of delivering the Library of Congress MARC files to member libraries turned into a highly democratic, omnidirectional collaboration among hundreds of thousands of once-isolated documentalists: currently, there are close to thirty million records in the database, only a quarter of which originally came from the Library of Congress, the majority being the work of nearly seven thousand member libraries.

But amid this public-spirited hubbub there were some signs of trouble. “Distributed computing,” in the recent words of Paul Lindner, one of the architects of Gopherspace on the Internet, “is like driving a wagon pulled by a thousand chickens”—and distributed cataloging, although its principal database is anchored in central Ohio, exhibits a similar noisy, gabbling, drifting quality. Quality was, indeed, a serious problem from the start: predictably, some libraries were much more careful and skillful at describing books than others. Wright State University, out of misguided zeal or a lust for cataloging credits, reportedly pushed thousands of unwholesome records into the OCLC database — at least, Wright State is often dumped on now, perhaps undeservedly, by the folklorists of OCLC history. Libraries began to “blacklist” institutions whose three-letter tags were sure signs of bibliographic corruption. “The scuttlebutt got around fast as to who did sloppy cataloging,” one librarian told me. In truth, though, everyone made mistakes. The interactive, cooperative group authorship of a resource of this complexity was something utterly new, and since OCLC exercised no editorial control over the contributions pouring in from its members, the cumulative perils of Fred Kilgour’s forward-thinking system took perhaps longer than they should have to emerge.

One source of entropy was OCLC’s laissez-faire concept of the “master record.” The very first attempt to catalog a book on the database, no matter how unmasterly, how inadequate it might be to the needs of other libraries, became by default the “master record” for that book. For years — until, in 1984, OCLC granted a small group of libraries enhanced-member status, allowing them to improve upon faulty or skimpy records they encountered on their own — any sort of change to the master record was a laborious manual process. If a cataloger noticed a typo herself a week after she had conclusively pressed the send key at her terminal, she could not (if another library had tagged the record with its initials by then) correct her mistake onscreen; she had to fill out an error report and mail it (not electronically but with a stamp) to OCLC. I have heard librarians and professors of library science mention errors enshrined in the OCLC database that they haven’t bothered to take the time to try to fix — in some cases, serious errors affecting the retrievability of books to which they themselves have contributed.

The other serious weakness of the OCLC database was its lack of “authority control”—librarianship’s grand term for the act of naming entities (people, churches, government departments, periodicals, subject headings, and so on) consistently. Assume, to take a simple example using a university database, that you are assigned the task of cataloging an eminently hummable document by a person named Pjotr Iljics Csajkovszkij. Who is he? Is he perhaps the same individual as P. I. Cajkovskij? And does P. I. Cajkovskij bear some intimate relation to P. Caikovskis? Could it be that Peter Iljitch Tschaikowsky, Peter Iljitch Tchaikowsky, Pjotr Iljc Ciaikovsky, P. I. Cajkovskij, Peter Iljitsj Tsjaikovsky, Piotr Czajkowski, P. I. Chaikovsky, Pjotr Iljics Csajkovszkij, Pjotr Iljietsj Tsjaikovskiej, Pjotr Ilitj Tjajkovskij, P. Caikovskis, Petr Il’ich Chaikovskii, 1840–1893, Peter Illich Tchaikovsky, 1840–1893, Peter Ilych Tchaikovsky, 1840–1893, and Peter Ilyich Tchaikovsky, 1840–1893, are actually all the same man? If so (and this degree of title-page variation is by no means unusual for voluminous authors, many of them less well known than Tchaikovsky), the computer has to be informed of that fact outright; otherwise, symphonies and string serenades will be sprinkled haphazardly over the alphabet and a searcher won’t have any idea what he is missing.
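What "informing the computer outright" amounts to can be sketched as a lookup table of see-references, in which every known variant points to a single preferred heading. The sketch below is a loose illustration in Python, not a description of how OCLC or any library system stores authority data: the choice of preferred form, the table, the function names, and the placeholder titles are all assumptions made for the example.

```python
from collections import defaultdict

# Assumed preferred form, picked from the variants quoted above for illustration.
PREFERRED = "Peter Ilyich Tchaikovsky, 1840–1893"

# Hypothetical see-references: variant heading -> preferred heading.
SEE_REFERENCES = {
    "Pjotr Iljics Csajkovszkij": PREFERRED,
    "P. I. Cajkovskij": PREFERRED,
    "P. Caikovskis": PREFERRED,
    "Piotr Czajkowski": PREFERRED,
}


def authorized_heading(name: str) -> str:
    """Return the preferred heading for a name, or the name itself if unknown."""
    return SEE_REFERENCES.get(name, name)


def file_records(records):
    """Group (heading, title) pairs under their authorized headings."""
    index = defaultdict(list)
    for heading, title in records:
        index[authorized_heading(heading)].append(title)
    return dict(index)


# Placeholder titles, invented for the example.
records = [
    ("Pjotr Iljics Csajkovszkij", "a symphony"),
    ("P. Caikovskis", "a string serenade"),
    ("Peter Ilyich Tchaikovsky, 1840–1893", "another symphony"),
    ("Peter Iljitch Tschaikowsky", "a third symphony"),  # variant nobody recorded
]

index = file_records(records)
for heading, titles in index.items():
    print(heading, "->", titles)
```

The first three records land under one heading. The fourth, whose spelling nobody entered in the table, is filed on its own elsewhere in the alphabet, which is exactly the scattering the searcher never learns about.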

Authority control has always bedeviled the makers of catalogs, and the bigger the catalog, the more eras of publishing history it covers, the hairier things become. For Sirine and Sirin and Nabokoff-Sirin, see Nabokov. For House & Garden, see HG. For Alexander Drawcansir, Petrus Gualterus, Conny Keyber, Scriblerus Secundus, John Trottplaid, and Hercules Vinegar, see Fielding, Henry (1707–1754). For Ogdred Weary and St. John Gorey, see Gorey, Edward (1925–). In the late seventies, the second version of the Anglo-American cataloging rules caused a convulsion of despair in libraries when it demanded that Samuel Clemens be officially called Mark Twain, just because more of his books appeared under his primary pseudonym than under his real name. The whine of power erasers was heard through the land. (In librarianship, “eraser lung” was the seventies equivalent of carpal tunnel syndrome.) It is safe to say, however, that the apostles of St. MARC completely failed to foresee how abysmally poor the computer would be at grasping the concept of human identity. A person, even a fairly inattentive person, paid to file cards in a card catalog all day can tell that “Alexander the Great, 356–323 B.c.” is the same man as “Alexander, the Great, 356–323 b.c.” and “Alexandria the Great, 356–323 B.C.”; we would also expect him to sense the unitary presence behind cards for “Montagu, Lady Mary (Pierrepont) Wortley, 1689–1762” and “Montagu, Mary (Pierrepont) Wortley, Lady” and “Montagu, Mary Pierrepont Wortley, Lady, 1689–1762”—to use examples from one online catalog.
“The card catalog,” as Tom Delsey, of the National Library of Canada, wrote in 1989, “exhibited a relatively high tolerance for deviation from literal and logical norms.… Typographical errors or inconsistencies in headings could be silently corrected in the process of filing the card; added entries that did not match exactly the corresponding main entry on the card to which they were related could nevertheless be placed in their proper sequence in the file.”
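The gap between the card filer's tolerance and the machine's literalism can be made concrete with the three Alexander headings quoted above. The following Python sketch is only an illustration under stated assumptions: the `normalize` function and the idea of measuring similarity with the standard library's `difflib.SequenceMatcher` are choices made for this example, not techniques attributed to any actual catalog software.

```python
import re
from difflib import SequenceMatcher

# The three headings quoted in the text.
HEADINGS = [
    "Alexander the Great, 356–323 B.c.",
    "Alexander, the Great, 356–323 b.c.",
    "Alexandria the Great, 356–323 B.C.",
]


def normalize(heading: str) -> str:
    """Lowercase and replace each run of non-alphanumeric characters with one space."""
    return re.sub(r"[\W_]+", " ", heading.casefold()).strip()


# Literal comparison: every string is a different "person".
exact_forms = set(HEADINGS)

# Ignoring case and punctuation merges the first two, but not the typo.
normalized_forms = {normalize(h) for h in HEADINGS}

# A similarity score shows how close the typo is to the correct form.
similarity = SequenceMatcher(
    None, normalize(HEADINGS[0]), normalize(HEADINGS[2])
).ratio()

print(len(exact_forms), "distinct headings under exact matching")
print(len(normalized_forms), "distinct headings after normalization")
print(f"similarity of 'Alexander' and 'Alexandria' forms: {similarity:.2f}")
```

Normalization recovers the silent correction of punctuation and capitalization, but "Alexandria" still files apart. Merging it would need a similarity cutoff, and any such cutoff is a judgment call that risks conflating headings for genuinely different people, which is one reason the filer's quiet discretion is hard to reproduce.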