The Laboratorium
December 2010

GBS: Patent Infringement Claim Against Google Books

Paul Allen’s Interval Licensing has filed an amended complaint in its lawsuit against AOL, Apple, Google, and other companies, alleging infringement of U.S. Patent No. 6,263,507, “BROWSER FOR USE IN NAVIGATING A BODY OF INFORMATION, WITH PARTICULAR APPLICATION TO BROWSING INFORMATION REPRESENTED BY AUDIOVISUAL DATA.” Paragraph 29 alleges that Google Books is one of the infringing products:

Defendant Google operates an automated book classification system as part of its Google Books website and service. When new book information is received by Google, the hardware and software associated with the Google Books classification system indexes and categorizes the book. The categorization is based at least in part on a comparison between the new book information and information related to other books that have been indexed and categorized by Google Books. The hardware and software associated with the book classification system have infringed and continue to infringe at least claims 39, 40, 43, 82, 83 and 86 of the ‘507 patent under 35 U.S.C. § 271.

This Is Not Only a Test

On a lark, I decided to upgrade to Movable Type 5. So far, it looks successful, but please let me know if anything seems amiss.

Textual Corruption at Work

In 1852, John Leighton, under the pen name Luke Limner, published a list of twenty-nine “Notes on Books and Bindings,” with charming pieces of advice, such as “Never cut up a book with your finger, or divide a printed sheet if it be ill folded, or one page will rob the other of margin.” He published it in the July 31, 1852 issue of Notes and Queries, a kind of Victorian Ask Metafilter for scholarly questions about literature, language, and history. That issue was later collected in Volume 6 of the first series of Notes and Queries, at pages 94-95. You can read copies of it online through the Internet Archive or Google Books.

In 1870, John Power published A Handy-Book About Books, at pages 128-29 of which he reproduced Limner’s list. You can read copies of it online through the Internet Archive or Google Books. Power’s version has twenty-eight items, not twenty-nine. It cites to volume “v” rather than the correct “vi” of the collected Notes and Queries. And it is a royal mess. Compare, for example, some corresponding entries. Where Limner had:

Never lend a book without some acknowledgement from the borrower; as “I.O.U.—L.S.D.—‘Ten Thousand a Year’—L.L.D.”

Power has, simply:

Never lend a book without an acknowledgment.

Or Limner:

Never brand books in unseemly places, or deface them with inappropriate stamps; for to mar the beautiful is to rob after generations.

Versus Power:

Never brand books in unseemly places, or deface them with inappropriate stamps.

The one that most concerns me is the last item in the list. Limner’s version reads:

Never pull books out of the shelves by the headbands, nor toast them over the fire, or sit upon them; for “Books are kind friends, we benefit by their advice, and they exact no confessions.”

And here is Power’s:

Never pull a book from the shelves by the head-band; do not toast them over the fire, or on them, for “Books are kind friends, we benefit by their advice, and they reveal no confidences.”

Power has drained the vitality from Limner’s sentence; notice, for example, that he has managed to introduce a clash between the singular “book” and the plural “them.” Still, to modern ears, his final phrase is perhaps more powerful: “reveal no confidences” resonates in this age of digital books and privacy concerns. I found this latter version in Henry Petroski’s The Book on the Bookshelf, which cites to Power. But still I wonder. Limner put the final string of clauses in quotation marks, which makes me think perhaps he got it from somewhere else.

GBS: Jones and Janes on Anonymity in a World of Digital Books

Elizabeth A. Jones and Joseph W. Janes of the Information School at the University of Washington have just published Anonymity in a World of Digital Books: Google Books, Privacy, and the Freedom to Read in Policy & Internet. It is the most careful and sustained analysis to date of the privacy issues surrounding the proposed settlement, as well as being an absolutely crackerjack example of how to apply Helen Nissenbaum’s contextual integrity theory of privacy to a specific problem. Here is the abstract:

With its Books project, Google has made an unprecedented effort to aggregate a comprehensive public-access collection of the world’s books. If successful, Google’s collection would become the world’s largest and most broadly accessible public book collection—indeed, project leaders have frequently spoken of their desire to create a “universal library” (Toobin 2007). Still, the Google “library” would differ from established contexts for the provision of free, public access to reading materials—like public libraries—along several policy-related dimensions, of which perhaps the most glaring is its treatment of reader privacy. This paper teases out the specific differences in reader privacy protections between the American public library and Google Books, and what those differences might mean for the values and goals that such contexts have historically embodied. Our analysis is structured by Helen Nissenbaum’s “contextual integrity decision heuristic” (2009), which focuses on revealing changes in informational norms and transmission principles between prevailing and novel settings and practices. Based on this analysis, we recommend a two-pronged approach to alleviating the threats to reader privacy posed by Google Books: both data policy modifications within Google itself and inscription of privacy protections for online reading into federal or international law.

The article is available for free download with registration or at any institution with a site license.

GBS: No Exclusivity in the Scans?

Barbara Casassus, Google will not enforce exclusivity over library scanning, The Bookseller, Dec. 20, 2010:

In a major policy about-turn, Google has said it will not enforce exclusivity clauses in contracts to scan and index library book collection, according to an opinion about online advertising released last week by the French competition watchdog.

Partners of Google Book Search may sign agreements authorising other search engines to access automatically digital copies of books for indexing and search purposes, the Competition Authority quoted Google as saying in a letter last July 19 to Anthony Whelan, head of cabinet of European Commissioner for Digital Agenda Neelie Kroes.

Note that this is simply a statement that Google will not seek to exercise any contractual restrictions over the uses of digital scans it has provided its partners (at least for indexing and search). It doesn’t remove any copyright restrictions on the use of the work.

The actual opinion of the French Competition Authority is available online (in French). The relevant passage is at page 56 of the PDF:

Response given by Google on October 5, 2010 [translated from the French]: Google Book Search partner libraries must limit automated access to the digital copies created by Google. Google Book Search partners may, however, sign contracts with any other search engine authorizing that engine to access the digital copies of the books automatically in order to index and search them. Google officially confirmed its position in a letter, dated July 19, 2010, to Mr. Anthony Whelan, head of cabinet of the Commissioner for the Digital Agenda (Neelie Kroes).

GBS: OCR Issues in the GBS Corpus

Danny Sullivan has been playing around with the new Google Ngram Viewer. He’s found that it has some trouble properly recognizing the medial S. When your corpus includes a great many pre-1800 uses of the word “suck,” this produces some unfortunate search results.

Natalie Binder has a series of posts on related issues:

GBS: Quantitative Cultural Analysis Demonstration Project

The Google Books team and a group of researchers from Harvard published a paper in Science, Quantitative Analysis of Culture Using Millions of Digitized Books. The abstract:

We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of “culturomics”, focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. “Culturomics” extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.

Here is an over-the-top article about it from the Guardian; here is a more restrained article from PC Magazine. And here is something rather more interesting: a tool to see trends in word use over time in the Google Books corpus. Just put in a term or terms, and you can see how the frequency with which they’re used changes over the years.

UPDATE: The New York Times has a well-written story on the paper and tool.

UPDATE: Geoffrey Nunberg has a typically long and thoughtful essay on the paper and tool in the Chronicle of Higher Education.

GBS: Digital Public Library Moves Forward

Library Journal reports:

A new effort is under way to create a blueprint for a comprehensive national digital library that will put the country’s cultural heritage only a mouse click away.

The Berkman Center for Internet and Society, a research program at Harvard Law School, announced December 13 that it will administer a collaborative effort whose goal is to create an overarching and open governing structure under which ongoing digitization projects, such as the HathiTrust and others, would willingly work.

The Alfred P. Sloan Foundation will provide the funding for the exploratory initiative, which is being billed as the “Digital Public Library of America.”

“The idea is to create a big tent where lots of people can work hard toward a public-spirited solution,” John Palfrey, the Faculty Co-Director at the Berkman Center and the Vice Dean of Library and Information Resources at Harvard Law School, told LJ. “It’s not a competitive effort. It’s meant to be complementary to its core.”

GBS: The Government Printing Office Joins the Google eBookstore

Google Books has an interesting new partner for its Google eBookstore: the Government Printing Office. Titles ranging from the federal budget to presidential papers will be available for purchase. The Washington Post article asks, “If the agency is doing less actual printing and providing access to more of its publications online, is GPO still necessary? Or does the Government Printing Office perhaps need a new name?” I would ask it a little differently. What is the GPO doing charging for these books?

Consider, for example, the detailed appendix to the Budget of the United States Government for fiscal year 2011, as prepared by the Office of Management and Budget. One can obtain it:

Since the Budget is a government work, it is not subject to copyright. I can understand charging for the printed version, which weighs over five pounds. But for the ebook? What would happen, I wonder, were someone to take the free PDF and submit it through the Partner Program, with DRM off and a download price of $0.00?

GBS: ASMP Case Extended Again

No surprises here. The parties in the photographers’ lawsuit put off the deadline for Google to reply to the lawsuit for the seventh time. The new deadline is January 23.

GBS: Hemphill on Antitrust Analysis

Columbia Law Professor C. Scott Hemphill gave this year’s Milton Handler lecture to the Antitrust and Trade Regulation Committee of the Association of the Bar of the City of New York in April. His subject was “Collusive and Exclusive Settlements of Intellectual Property Litigation.” The lecture is to be published in the Columbia Business Law Review, and what looks like the final PDF version has been posted to SSRN. The lecture draws a line connecting two recent high-profile issues involving “exclusive” settlements of IP cases, and asks whether antitrust law ought to care about them. His two examples are the Google Books settlement and recent “pay-to-delay” settlements in which a drug-patent owner pays off a generics maker to drop a lawsuit challenging the patent’s validity. Both parts are interesting; I’d like to focus here on the Google Books half.

Hemphill steers a middle course between the settlement’s antitrust skeptics and its antitrust defenders, making three and a half points. His first is that the settlement’s “de facto exclusivity” is by itself not an antitrust issue unless it “makes it harder for later entrants to achieve digital distribution of orphan works.” Here, his reasoning largely parallels Einer Elhauge’s defense of the settlement, which argues that multiple features of the settlement are likely to make it easier, not harder, for others to develop their own digital book platforms and to bring currently orphaned works to the public.

As a good scholar, though, Hemphill immediately asks what the implications would be if he were wrong about these effects. This leads him to consider the “fallback argument” of settlement proponents, which he terms “the ‘one is better than none’ argument.” After reading through the scholarly debate and explaining the ways in which the “one is better than none” formulation oversimplifies nuanced positions on both sides of the debate, Hemphill gives a new reading of antitrust law’s take on the issue:

The one is better than none view is an incomplete statement of antitrust law. Although, as noted above, antitrust enforcers cannot always insist upon a structure that is more competitive than the status quo offered by the parties to an agreement, sometimes they can. When presented with a joint venture that has some procompetitive and some anticompetitive features, an antitrust enforcer must consider whether there is a less restrictive way to achieve the procompetitive effect. As then-Judge Sotomayor put it, “a restraint that is unnecessary to achieve a joint venture’s efficiency-enhancing benefits may not be justified based on those benefits.” The insistence on utilizing a less restrictive alternative, where available, is shared by courts and enforcement agencies. In other words, we are not always forced to accept the bitter with the sweet.

This leads Hemphill to consider whether a judge considering a class-action settlement should reach the antitrust framing that might try to distinguish between bitter and sweet. His answer is “no.” The inquiry under Rule 23, he argues, should focus on class members, not consumers:

Even if consumers are harmed, they are generally outside the concern of Rule 23(e). (On this view, the benefits that the settlement brings to consumers should be ignored too.) Therefore, one could accept the argument that the settlement raises the cost of new entry, and yet approve the settlement on the ground that this effect does no harm to class members.

In a footnote that also offers a useful reading of several other class action settlements, Hemphill explains why keeping the antitrust analysis out of the class-action approval stage doesn’t mean ignoring the antitrust issues entirely.

Moreover, even in an antitrust case, the court may resist engaging in a full antitrust analysis at the settlement stage. For example, in Grunin, the court concluded that although “a court cannot lend its approval to any contract or agreement that violates the antitrust laws,” it would decline to approve the settlement on antitrust grounds only if the alleged illegality were “a legal certainty” or “illegal per se,” conditions not present there. Id. at 123–24. To undertake a full antitrust analysis in the settlement of a non-antitrust class action seems even further afield. The point here is not that the Department of Justice is wrong to raise antitrust objections—its authority under 28 U.S.C. § 517 to file a statement of interest is broad— but that the district court’s review is comparatively narrow.

This is a short paper—Hemphill gives the Google Books settlement a total of fifteen pages—but I found it interesting and helpful.

GBS: Google eBookstore Terms and Conditions

Here are some of the legal documents connected to Google’s new eBookstore:

Terms of Service (for readers):

Privacy Policy (for readers):

Terms and Conditions (for copyright owners):

Please let me know if there are any I should add.

D Is for Digitize Symposium Issue Published

I’m very happy to announce that the symposium issue of essays from last fall’s D is for Digitize conference has just been published. I wrote the introduction for this collection of seven essays on the settlement by some of the leading scholars to study it. In light of today’s (entirely coincidental) launch of Google’s eBookstore, this volume is even more relevant. The authors and the editors of the New York Law School Law Review worked extremely hard to bring these essays to you; I hope you enjoy and learn from them.

The issue is available from the Law Review’s website. Here’s the table of contents: