GBS: Harshing on the Metadata


I’m sorry I couldn’t make it to the Berkeley conference on Friday. It seems to have been a fascinating day; the panels managed to get away from the legal details and delve into some of the very hard policy issues. The video, I’m told, will be online within a couple of weeks. Until then, for a general conference wrap-up, here’s a collection of links to stories about it.

The most interesting panel of the day appears to have been the “information quality” one. It was moderated by Paul Duguid, who’s written a very thoughtful piece on scanning quality, and it featured an absolutely scathing presentation by Geoffrey Nunberg on the atrocious metadata to be found in Google’s catalog. He’s concerned that Google will have the “last library” (whereas I see my job as making sure that there will be others), and thus that our universal library will be stuck with howlers like an edition of Bonfire of the Vanities dated 1888 and books on the Internet dated 1905. Don’t even get him started on cataloguing Tristram Shandy as “Biography and Autobiography.” His post at LanguageLog, A Metadata Train Wreck, gives the argument in some detail and responds to some of the objections raised by Googlers.

Relatedly: Google Turns Classic Books into Free Gibberish eBooks.