Geoff Nunberg over at Language Log gets stuck, justifiably, into the incredibly bad classification of books being scanned at Google Books. Given that this is, as he says, likely the last ever scan of all pre-electronic, pre-Mickey Mouse Amendment copyright books, it means we have given up good libraries for a very bad one that is easily accessed.
But it’s not easily accessed, either. Many books in GB you cannot read, even when they are out of copyright. Sometimes you can download the raw PDF via The Internet Archive if you click on the “HTTP” link at the left, but much of what Google has is not accessible in any form. And as libraries are pressed for space, increasing numbers of old books are simply being thrown away or sold second hand.
In fifty years we may no longer have widely disseminated older books, journals and magazines. Thus, we erase our cultural memories.
There are lots of things that are just utterly mystifying about Google Books, like occasions when you can only get Snippet View of books and journals published in the early nineteenth century. Extraordinarily frustrating: they have the entire book scanned, it’s public domain, GB actually got the year right, it’s right there, and you can’t get anything more than a snippet (and sometimes the snippets have nothing to do with what you searched for, like snippets of blank parts of the page). You also occasionally get books scanned behind other books: completely different books, but scanned together and classified according to the book in front.
My favorite Google Books lapse, though, was when I came across a book of nineteenth-century Trinitarian theology that was authored, according to Google Books, by the Holy Trinity.
In ’08 I worked a temporary job at my campus library withdrawing books and journal series from the collection. Books were saved for the annual used book sale, but journals were just thrown out. Right before I started working, another library technician had withdrawn and thrown into a dumpster years’ worth of the Journal of Paleontology and the Journal of Vertebrate Paleontology. I asked whether either MSU’s geology department or the Museum of the Rockies had been asked if they wanted the nicely bound journals. Nope, that wasn’t the library’s policy!
Well, I withdrew several runs of computer technology journals/reports, including one for Bell Labs. Bozeman is home to the American Computer Museum, and it now has in its collection stuff I withdrew. All it took was a phone call to the museum director and a formal transfer letter (for provenance), and it was done.
My favorite thing in Google Books is when you are viewing a book in its entirety, and going through page by page, and oops! – the scanner’s hand was in the scan. Somebody was working a little too fast.
One obvious thing to do would be to add a button for you to hit if you think a book is likely public domain. They could then have a person go through the flagged books and make it easier to get access to them.
“In fifty years we may no longer have widely disseminated older books, journals and magazines. Thus, we erase our cultural memories.”
Indeed. Plus, relying so heavily on the internet as an information source is as dangerous as it ever was (if not more so). One of Albert-László Barabási’s caveats about the otherwise wonderful scale-free network infrastructure of the web is that one only needs to bugger a single main hub, of which Google is the biggest I believe, and you would witness a high-tech equivalent of the burning of the Library of Alexandria.