HathiTrust Single-Handedly Sinks Orphan Works Reform


In a series of blog posts yesterday whose tone can only be described as “gleeful,” the Authors Guild has been showing that specific books aren’t orphans. So far, they’ve found copyright owners or literary agents for J.R. Salamanca’s The Lost Country, Albert Bandura’s Adolescent Aggression, and James Gould Cozzens’s Confusion. They didn’t track down Walter Lippmann’s The Communist World and Ours, but it appears that someone else did. The legwork involved wasn’t particularly intensive: some Google searches, some queries of standard copyright-related databases, and some phone calls.

This would be a dog-bites-man story, except for the fact that all of these books were on HathiTrust’s list of orphan works candidates. Oops. All of these books had gone through HathiTrust’s workflow, which was supposed to carry out “due diligence” to determine whether these works were likely to be orphans.

Once is a mistake, twice bad luck, and three times is a sign of a broken process. The Authors Guild’s experiment demonstrates that HathiTrust’s orphan-tagging workflow cannot be relied on to identify genuinely orphan works with sufficient confidence to be usable. Out of 166 books originally on the list, at least four have been identified as non-orphans. A false positive rate of at least 2.4% isn’t going to be acceptable.
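To make the arithmetic explicit: four confirmed non-orphans out of 166 listed books comes to a little under 2.5%, and because the count is "at least four," that is a floor rather than an estimate. A minimal sketch follows; the function name and its arguments are illustrative, not anything HathiTrust or the Authors Guild uses.

```python
def false_positive_rate(confirmed_non_orphans: int, list_size: int) -> float:
    """Share of listed orphan candidates later shown to have a findable owner."""
    if list_size <= 0:
        raise ValueError("list_size must be positive")
    if not 0 <= confirmed_non_orphans <= list_size:
        raise ValueError("confirmed_non_orphans must be between 0 and list_size")
    return confirmed_non_orphans / list_size


# Figures taken from the post: 4 known non-orphans among 166 candidates.
rate = false_positive_rate(4, 166)
print(f"at least {rate:.1%} of the list was misclassified")
```

Every additional book traced to a rights holder raises the numerator, so the true rate can only be higher than this.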

The workflow itself isn’t described in much detail, despite HathiTrust’s promise to “post as much of the project’s internal documentation as appropriate on this page.” It calls for:

  • A check that the book is not available on Amazon or Bookfinder.
  • A check that the author isn’t on the “live list.”
  • “Look for copyright holder contact information.”
  • “Attempt email contact.”
  • “Attempt phone contact.”

Whatever those last three steps comprise, they aren’t working. Whatever databases HathiTrust is checking for contact information aren’t sufficient.
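Read as a procedure, the five listed steps amount to a sequential filter. The sketch below is one hypothetical reading of that checklist, not HathiTrust's actual implementation (which has not been published). The `Candidate` fields, the function name, and in particular the assumption that a book with no contact information found stays on the candidate list are all mine.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    for_sale_online: bool      # found on Amazon or Bookfinder
    author_on_live_list: bool  # author appears on the "live list"
    contact_found: bool        # some contact information was located
    email_answered: bool       # email attempt reached a rights holder
    phone_answered: bool       # phone attempt reached a rights holder


def is_orphan_candidate(book: Candidate) -> bool:
    """Apply the five published checks in order (hypothetical reading)."""
    if book.for_sale_online:
        return False
    if book.author_on_live_list:
        return False
    if not book.contact_found:
        # Assumption: with nobody to contact, the book stays a candidate.
        return True
    return not (book.email_answered or book.phone_answered)


# A book whose owner exists but was never looked up in the right database
# passes straight through every check.
missed = Candidate("Example Title", False, False, False, False, False)
print(is_orphan_candidate(missed))
```

On this reading, the whole result hinges on `contact_found`: if the lookup behind it consults too few sources, a findable owner is indistinguishable from a missing one.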

On Twitter, Justin Grimes referred to these findings as “The ‘one example’ rule for invalidating arguments.” It’s true that these are individual books, not necessarily representative of the broader corpus of books scanned by Google and held by HathiTrust libraries. But this was also a sample chosen by HathiTrust itself. This was the libraries’ chance to put their best foot forward, to show that their process could be trusted, to show that there are real orphans out there. The results were not reassuring.

Legally, there are reasons why these non-orphans may not matter much in this case. Paul Aiken, Executive Director of the Authors Guild, has said that the lawsuit is primarily about the large-scale digitization (millions of books), not the much smaller Orphan Works Project (hundreds). The Authors Guild may have a hard time making legal claims specifically about the Project, for procedural reasons I’ll get into in future posts. Still, these discoveries are, as Eric Hellman said in a comment, “Major egg on the elephant’s face!”

And, looking to the broader picture, these revelations will discredit other efforts to make genuine orphan works more accessible. No one will ever be able to make the orphan works argument again without opponents bringing up the HathiTrust orphans that weren’t. Copyright owners will always regard such efforts with suspicion, as a pretext just for distributing the books, copyright be damned. And the idea of a “diligent search” sounds a lot less reassuring now that HathiTrust’s initial searches have been shown to be ineffective in multiple cases. The title of this post may be an exaggeration, but not by much.

I hope to update this post to deal with any responses from HathiTrust and the libraries, and with further developments.


James, as you know, the University of Wisconsin, in what I consider a commercial enterprise with Google, made unauthorized copies of my work in 2008. The university’s illicit copy was given to the HathiTrust. I insisted that my work be removed from their databases, and on May 25, 2009, in an email to me, John Wilkin of the HathiTrust wrote:

“We have received a request from the University of Wisconsin, which authorized the digitization of your work for preservation purposes, that they wish to remove your work from HathiTrust” (emphasis mine)

I am sure that if Google & Company had more respect for the copyrights of others, the HathiTrust would have been more successful in their search for copyright owners of these so-called “orphan works”.


Due diligence does not equal infallibility. Insofar as the stated purpose of the Orphan Candidate List is “to help and encourage possible copyright holders to identify themselves so that we can identify them in record,” then the system already appears to be working. Far from demonstration of the HathiTrust’s ineptitude or bad faith, this is only further demonstration of the HathiTrust and its partners’ willingness to respect the intellectual property rights of authors and legitimate copyright holders.


This is a fair point. Opening up the lists has corrected some misclassifications before the books actually went live. But the public scrutiny the current list is receiving is unrepresentative. I don’t think we can expect the attention given by outsiders to a list of 166 books in the week a lawsuit called attention to it to scale to tens of thousands at other times. And that’s what really matters: the accuracy and sustainability of the system.

On Twitter, David Sanger suggested using crowdsourcing to identify and find copyright owners. This is an interesting possibility, one that could help build better registries for clearing rights. That said, some of the issues the Authors Guild alleges with these specific books could have been answered by turning to existing registries.


If random members of the “crowd” can find agents, heirs, Amazon for-sale listings, and other readily and publicly available information, surely the Hathi Trust can do it. Their whole point is, they just don’t want to take the trouble to comply with copyright law. However, it is not fair for them to expect other people to do their due diligence for them. AFAIK, according to US copyright law, the Hathi Trust bears the burden and is legally liable for their own copyright violations.


What is unclear to me at this point is whether the Guild has succeeded in identifying rights holders. The examples they’ve listed are of finding the authors of some of the works in question. But the books in question were written at the beginnings of the authors’ careers, at a time when it was not uncommon for authors to turn over all rights to their publisher. If those publishers have gone out of business and there is no clear trail of where the copyrights went, does the still-living author have any rights over those books? Is it the case that a particular work may still be an orphan even if the author is easily identified?


T Scott is exactly right. AG and James are both making way too much of these putative reunions. More to come.


A source at a major research library tells me that “The Lost Country” has not circulated there since at least 1993, when their records went digital. It has collected dust on the shelf for almost two decades (at least). The book is out of print. This is not exactly a leak of the latest Kanye record.


Out of 166 books originally on the list, at least four have been identified as non-orphans.

I can add another.

Yesterday, in a comment, I pointed out that contact details for the representatives of the estate of Eleanor Farjeon could be easily and quickly traced using the WATCH database managed by the Universities of Reading and Texas.

Following this, steps were taken to notify the primary agency. Her book Portrait of a Family has now been removed from the list.

As to what the Orphan Works Project is doing by way of ‘checks’: well, they certainly are not using the WATCH database. In addition to the Farjeon estate, and the estate of Pulitzer Prize-winning novelist James Gould Cozzens (mentioned above), it also has a contact address for the estate of Frederick C. Copleston, SJ, a distinguished historian of philosophy, whose book A History of Philosophy is on the list. His estate is now controlled by a Jesuit charity. The WATCH database may well have contact details relating to other authors on the list; I have only looked up some of those whose names I immediately recognise.

I am not, as it happens, totally opposed to properly managed arrangements for republishing printed works whose copyright holders have not been located following a thorough and transparent search. But what we are seeing here demonstrates only too clearly one of the major reasons why many authors are utterly opposed to any such scheme. The University of Michigan and the HathiTrust are not being transparent about the searches they have conducted, and it is clear that they have not laid down an effective methodology.

T. Scott suggests that in the case of some of these books, the rights may have been transferred to a publisher who cannot now be traced. If this is indeed the point at issue with certain works on the list, then the HathiTrust should be reporting this in the individual records. Incidentally, there is a companion to the WATCH database called FOB: ‘Firms out of Business’.

I understand that in the UK, if a company is dissolved (as opposed to being taken over by or amalgamated with another company), it falls to the Treasury Solicitor to manage any intellectual property that may have belonged to it. I assume there are similar arrangements in the US.

With regard to Tom Bruno’s point: it is emphatically not reasonable that copyright holders and their agents should be burdened with making up for the inadequacies of a crew of overreaching librarians.

(And no, I am not opposed to libraries. I love them. It is the behaviour of certain librarians that I increasingly find disturbing.)

Incidentally, the Google cache indicates that there were 167 books on the HathiTrust list on 11 September. So six have since been removed. Two of the works whose authors have been traced by the Authors Guild are at present still on the list.


A source at a major research library tells me that “The Lost Country” has not circulated there since at least 1993

That is beside the point, as you should realise. The point is that the HathiTrust is supposed to have carried out a diligent search for the rights holders, and that this process has been shown to be inadequate.

Here is a quite startling demonstration: A History of Philosophy (11 volumes), by Frederick C. Copleston, is currently available, new, on amazon.com.


I am, offhand, not aware of any point in publishing when it was typical to sell all rights to a book publisher. It certainly did and does happen; but generally, it’s not perceived as a good financial deal by professionals or their literary agents.

Writing as a staff member for a corporation is a different issue. I have written a number of computer manuals whose copyrights belong to the corporations who developed the software, where I worked as a full-time employee. On the other hand, I do not think the Hathi Trust list presented includes this kind of corporate work.

I am fairly ignorant of practices regarding the dissolution of US businesses, but I will make a shot at a comment. (A) The business can be bought by, or merge with, another business. I would assume intellectual property is acquired along with the business’s other property. Or (B) The business goes bankrupt, and its property is divided among the creditors. If a publishing business owns intellectual property, I imagine its creditors try to use that somehow to get some of their money back, quite possibly by selling rights to other publishers if they do not publish books themselves.

Another issue: Photographs, drawings, book cover illustrations, maps, etc., are often created by someone other than the person(s) who wrote the text. Rights to illustrations in one book may well belong to a number of different people. For example, art books very commonly license rights from the many different museums who own the works of art discussed in the book. (The museums would rather take photos and license them to others on request than haul valuable original artworks out of storage every time someone wants to take a photo.)

While it is yet more trouble for the Hathi Trust (or any other entity that wants to declare works “orphaned”), to locate an illustrator other than the author—let alone a number of illustrators—this needs to be undertaken before the copyright can be considered researched.

Also, the introduction, and other supplementary material such as appendices, are sometimes not only written but copyrighted by someone other than the author of the main body of the work.

Then, of course, there are collections of essays, short stories, poems, etc., written and copyrighted by numerous different authors. The publisher typically holds the copyright to the selection and organization of the work for this anthology; but often, does not actually own each essay, story, or poem.

In short, all the multiple-copyright-owner issues present for the versions of the Google Settlement presented, also apply to the works proposed as so-called orphans by the Hathi Trust. However, I strongly believe that anyone who considers it too much trouble to research copyright status and locate copyright owners should just take the route of not using a work they have not gained the legal right to use.


Another way property of a bankrupt company can be dealt with is by auction. A trustee sells it to whoever attends the auction, and then the proceeds of the auction are divided among the creditors.

As far as I know, the unclaimed property databases posted by US states list financial assets in accounts the owner has neglected for a sufficient length of time (and which assets are absorbed by the state if the owner fails to claim them), but not intellectual property.

Fran


On a related point, HathiTrust and Google have digitized a number of 19th century works using late 20th century reprints as their source rather than the original volumes. Whilst they are perfectly entitled to copy the latter, I doubt they have any legal authority to make copies of the republished works.


Some further points:

First, regarding J. R. Salamanca’s The Lost Country.

i) Salamanca has two books currently in print and available on Amazon. One is a reprint of a book first published in 1961. If The Lost Country were to be erroneously identified as an orphan, that would weaken protection for Salamanca’s other books; it is reasonable to fear that he might be tagged in future as a missing author, and his books considered fair game.

ii) Did the HathiTrust contact the publisher of the volumes that are in print? One presumes not. Did they even notice that he had books in print? They evidently failed to notice the Copleston reprints. This raises again the question: how is the HathiTrust conducting their searches for copyright holders? Why are they so vague about the details?

At some point this morning, Salamanca’s The Lost Country was removed from the list.

Which brings me to another point. The fact that the Authors Guild and some other parties have already had notable successes in tracking down holders of rights in works erroneously tagged as ‘orphans’ does not constitute any sort of evidence that the ‘system’ is ‘working’, as some librarians have been saying on the web. The system is a train wreck. HathiTrust bears responsibility for this.

Let me spell this out: authors, their representatives and friends have plenty of work to do of their own, productive work. It is not reasonable to expect them to sort out messes created by salaried library bureaucrats.

And James is right, of course: scale is one of the issues. Right now, one can run one’s eye down the list. But if the HathiTrust continues in this course, what is going to happen in a few months’ time? Will every literary agency in the world have to spend a couple of hours a month running the names of their authors through the HathiTrust’s database? Will every author, and every literary executor, be expected to check in on a regular basis? What a total waste of time.

Yesterday I said that I was not entirely opposed to properly managed orphan works schemes. But as I contemplate what is happening here, I am becoming increasingly mistrustful.


Sebastian, under United States law, that will depend on whether the reprinter has added anything copyrightable to the work. For a photographic reprint, the answer may well be “no.”


Michigan responds to problems in the orphan works determination process: bit.ly/oxqj1h


And as a result of the design of our process, our mistakes have not resulted in the exposure of even one page of in-copyright material. University of Michigan (Emphasis mine)

Such arrogance!!


They are not admitting that the authors and publishers who identified non-“orphan” works are not, in fact, the authors and publishers of the specific works identified. Does that mean the University of Michigan explicitly expects to “crowd source” copyright research?

As for the exposure of “even one page of material,” they haven’t posted the entire books yet, right? Meaning at this point copyrighted material would not have been exposed even if no one had done any copyright research for them.


BTW, the University of Michigan’s post raises one issue:

Suppose they post another 150 or so books on their next list, for which they have actually done such exhaustive research that not a single book with a readily locatable author, publisher, or agent is on the list. And not a single copy is available anywhere online.

So what? That does not mean the University will bother to research the next 20,000 books. They can call it “testing the process” all they want, but there is no guarantee of any consistency.

I am opposed to so-called “orphan works” projects because their whole point is that the entities who want to use the works, don’t want to spend the time and money to research the copyrights and copyright holders. If they did that, they could actually go the final step and ask for permission. Of course, they don’t want to be refused … another reason to just use the works and hope the copyright holder doesn’t notice.


Actually, it is much more common for academic authors to sign over copyrights or broad exclusive licenses than it is for authors of popular fiction. It is also more common for academic authors to have very little understanding of the rights they have given or retained, and hence to leave behind orphaned works. Membership in the Authors Guild is limited to authors who retain the copyrights in their works, and who clearly are very watchful over their rights, which means the Guild doesn’t necessarily represent the ‘typical’ authors of the scholarly works that make up the vast majority of research library collections.


Brandon,

You’re just making blanket assertions as to how academics and authors “feel,” and using this as a basis for why their copyrights should be violated. Even your assertion that university library books are limited to the scholarly is invalid. And if it were true, it is no excuse for violating the copyrights of some authors on the grounds that other authors may not care.

In fact, even your assertion that those making the owner-finding efforts regarding the books on the HathiTrust list are all members of the Authors Guild is invalid.

These are legal issues. They are not issues of how you “feel” about copyrights and how you assert others “feel.”


My goodness, Frances. Considering this issue isn’t about how people “feel,” your postings and Douglas Fevan’s certainly read as remarkably emotional.

May I recommend a review of a well-regarded, but in this case too rarely cited, law:

Article I, Section 8, Clause 8 [The Congress shall have Power…] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

I’m intrigued by the suggestion that the villains are the universities and libraries trying to promote the progress of science and arts and not the guilds advocating the loss of so much information for the sake of a theoretical loss of pennies. I say “theoretical loss” because no one would ever seek permissions for those non-orphaned materials if they weren’t in a repository because no one could ever find them.


Considering this issue isn’t about how people “feel,”…

Do I feel anger? You betcha!! When a large American university uses my work as currency to secure pirated copies of my work from a large American Corporation and distributes those copies to its pal, yes I feel anger.


And those limited times already exist under copyright law. Those so-called “orphans” will fall into the public domain. Why the haste?

I am an author and publisher. The work of authors, illustrators, and other creators of works is what is promoting the progress of the sciences and arts. The labor and financial investments of publishers, and other such entities such as music companies, enable authors and other creators of works to focus on creating the works, rather than spend large amounts of time and money producing and marketing those works.

Libraries are just disseminators. Along with wholesalers and retailers, who will still be around even if libraries disappear. The progress of the sciences and arts can continue without libraries, but not without the creators and producers of works being paid for their substantial investments of labor and money.

In plain English, culture can’t do without us creators of works. The libraries are merely trying to appropriate language that applies to the people who are, frankly, doing more than just scanning and posting other people’s work. We are putting real labor and money into it, far more collectively than they are.

And, a work being available in print form in a library and/or on the used book market does not constitute “unavailable,” or a work no one will ever hear about. Ever hear of interlibrary loan of print books—which is quite different from the Hathi Trust’s publishing efforts? Was culture dead, or the progress of the sciences and arts in any way hindered, before there were e-books or an Internet? Was no one reading books before the net, in libraries or elsewhere?

I’m not arguing that all those libraries bought print books just to put them in a sealed vault where “no one could ever find them,” which is preposterous. I am arguing that the Hathi Trust, in scanning print books without permission and proposing to distribute the scans, is violating copyright. Which quite literally means “the right to make copies.”

There is no such thing as an author’s “guild” in the sense of a Renaissance guild that controls access to trade secrets, the number of people who can enter a trade, and their qualifications for doing so. Anyone can become an author, and many people do.

But, as someone who has spent well over 20 years and hundreds of thousands of dollars of my own money creating books, I need to get paid for my work. And yes, my books are in libraries that are part of the Hathi Trust—even though they are not scholarly.

I am sure you feel much more contented about the prospect of getting (even more) free books (even more) easily from libraries than I am by the prospect of losing my livelihood and therefore, my profession. But, exactly how is it a flaw on the part of writers to earn a living, or even want to be paid for years of work? The exercise of finding rights holders for the first Hathi Trust list has already proved that not all these works are abandoned and that someone does care about them.

And if you argue that these works are so obscure and undesirable that no one would ever want them, then why is it so important to violate thousands of copyrights to distribute them in e-form instead of print form? After all, the libraries already have the books; it’s not as if they are actually supplying anything new.


I will add that all market losses are theoretical until they have occurred, but that does not mean they will not occur. Bear in mind that authors and publishers of out-of-print works are now widely looking to e-book and print-on-demand technologies to reissue older works, or to keep them in print if they are still in print, with the purpose of making money from them. And yes, they often hope to sell those books to libraries as well as to consumers. The Hathi Trust using one e-copy and transmitting it to all its members (and as far as I know, there is no limitation on the number of new members who may join the Hathi Trust), instead of the publisher or author selling each a new copy, can cut into sales quite a lot. And then, any one student uploading a copy to a torrent site might well kill sales.

People are taking the attitude of “If the neighbors leave their lawnmower out for a couple of weeks, they must not be using it, I can use it, therefore I am entitled to take it.” This is theft, with intellectual as well as physical property.


must not feed the trolls…must not feed the trolls…


I have been digging into the Google cache.

This contains evidence that early this week the University of Michigan removed from the list of so-called ‘Orphan Candidates’ the following work:

The story of Babar, the little elephant by Jean de Brunhoff (1899-1937), translated from the French by Merle Haas. (New York, H. Smith and R. Haas, 1933)

This well-loved children’s book is still in print and available from amazon.com. The translator Merle Haas died in 1985. It seems likely that the translation is still in copyright. But in any case, Babar is a registered trademark. Lucky escape, HathiTrust!


And as a result of the design of our process, our mistakes have not resulted in the exposure of even one page of in-copyright material. (U-M Library statement on the Orphan Works Project, 16 September)

About this too the University of Michigan Library is in error. It is currently possible to access through the HathiTrust orphan works database the full text of at least one book which is in copyright in the UK. Not only is it in copyright, but it is in print and being actively exploited by the copyright holders. (It is in print in the US as well.) This book is from the University of Michigan Library and was digitised by Google.

HathiTrust has marked it as in the public domain. Perhaps it is – in the US. Though after the last few days, I don’t have much confidence in their ability to get things right. Establishing copyright under US law often seems to be a complicated business. In Britain, the matter is more straightforward. The book is unquestionably in copyright. It is in print and available for purchase. By making the text freely available to users outside the US, HathiTrust are, to say the very least, intruding on the rights of the copyright holders and interfering with their exploitation of the work.


And if they put it on the net it’s available to anyone with a connection. Any attempt at national restrictions is meaningless, given how quick and easy it is for even the most technically clueless to use a proxy server.

I will bring up yet another point: These are the Google scans. If the public-domain scans Google posted are any indication, the scans of so-called orphan works are replete with missing and blurred pages, foldout pages not folded out, shots of human fingers on pages, and error-filled, unproofed OCR text. If the aim of the libraries is to aid scholars, this kind of quality doesn’t suffice.

On the other hand, if the aim of the libraries is really to save considerable shelf space, and therefore money, by throwing out a bunch of print books in favor of easily storable scans, maybe they think this level of quality does suffice.


Brandon, it is not trolling.

In economic terms, academic authors are relatively well-paid employees of large corporate structures. It is therefore natural that academic authors will see measures that increase the wealth and market dominance of the large corporate structures that employ them as a good thing.

And authors who are individual sole traders or small businesses will see that same increase in the power of large corporate structures as a threat to their trade, a bad thing.


Actually, academic authors often write books that are used as course textbooks, which can be quite lucrative.

But the point of an author maintaining control of the copyrights of his or her work is that the author has the opportunity to do whatever he or she chooses. For example:

  • To write to make money from the work itself, to write to support an academic or corporate career, to write a book to accompany and be sold with a non-book product, to write from a desire to inform or educate, or to write from a desire for creative self-expression. Or several of these motives at once: They are not mutually exclusive.

  • To choose from among publishers, the one the author judges best able to produce and promote the work well. Or to self-publish.

  • To keep a book in print, or not, depending on the author’s economic and personal motives. The author may have produced a revised, better edition that will make more money, the author may have repudiated beliefs expressed in the book, the author may have been sued for alleged libelous statements in the book.

If an author wants a book not to be circulated, he or she may not be able to stop the circulation of used print copies, but at least he or she can refrain from actively producing and promoting the book. On the other hand, the author and/or his or her heirs may wish to keep the book in print for the length of the copyright term, to make more money (even if not much at a time), or to reissue an out-of-print work when its subject becomes more timely, or new publishing technology becomes available.

  • To grant rights to the book to other parties, for money or not, up to and including voluntarily choosing to put the book into the public domain before the copyright expires.

Every author has his or her own goals (usually a mixture of them). Every author experiences a different degree of success in achieving his or her goals. Every book is different too, in terms of its editorial and production needs, the labor involved in writing and producing it, the expense, the sales of the book, the sales of subsidiary rights such as movie rights or sales of print extracts of the book, and many other things.

One problem with the proposed Google Settlements is that they assumed all books and authors are the same. The Hathi Trust scheme presents the same problems.


Forgot to say, the sales trajectories of books also vary a great deal. Some books go out of print in a few years and are never reprinted. Some are what is sometimes called “evergreens”: They sell year after year after year, perhaps modestly in any given year, but all the sales add up. Some go out of print, or only sell at a very modest level, then become trendy, even bestsellers, many years after their first publication.

The current publishing trend is to keep books available in print-on-demand form and/or ebook form for long periods, and to make older titles available for sale again. It is no longer necessary to pay for a large print run and warehouse the books, though of course there are still other costs. And, although there have always been self-published books, authors are rethinking self-publication now that new technology for not only producing but marketing the books is available, and now that self-publishing is starting to lose its stigma.


Sorry for my silly “trolling” remark. Long day. Long week!


James, it seems to me this is just a burden-shifting issue. For whom is it most expedient to identify a copyright owner? Should the burden be just on the hopeful copyright user, or on the copyright owner? Perhaps it would be best if it were some combination of the two, as seems to be the case here, where HathiTrust made what appears to be a reasonable,* albeit imperfect, search, which then shifted the burden to the copyright owner to error check it based on the particularized knowledge it had (but the HathiTrust or similarly situated party necessarily would not). It seems to me that’s exactly how the process should work.

Someone upthread asked if it were fair that a copyright owner have to police these orphan-work databases, but the answer probably is yes. Especially in the age of RSS and bots and such, why shouldn’t a concerned copyright owner be able to use such a tool to alert them if one of their works ends up in an orphan work database? It would be much simpler, cheaper, and more effective for them to do that than for the libraries to have to go to inefficiently exhaustive lengths to verify beforehand. (* I guess we can fight about what’s reasonable, but I would hope it would not be something prohibitively expensive, something which would have the effect of scuttling the entire orphan-works preservation effort. But I fear that’s what HathiTrust’s critics are calling for.)


How is distributing scans of these books acting to “preserve” them? Just scanning them might, although the Google scans are of poor quality. But not distributing the scans among libraries in the Hathi Trust (some of whom never paid for even one copy of the print book), then those libraries distributing the scans to patrons.

Seems to me this is all about libraries saving money: On purchasing print books and e-books, on storing print books, on scanning print books (because Google did it), on assuring scan quality (which apparently nobody has done), on seeing whether scanned works are actually even rare and now, on finding out whether copyrights have been renewed and on locating copyright owners.

The Hathi Trust does not have to actually use the Google scans. Not being a lawyer, I cannot comment on whether it was legal to scan the books to begin with. (And I gather Google is still scanning away; why is the Authors Guild not seeking an injunction to halt scanning of copyrighted works?) But I’d guess it would be less problematic to actually store the scans in a “dark archive” (as the University of Michigan once tried to reassure copyright holders it would do, though look how its mind has changed) than to be distributing copies.

Put the scans on a server and don’t distribute them. Voila, no burden of chasing copyright holders.

I place no reliance whatsoever on the idea that the only books the libraries will distribute without prior permission will be old works, that is, “not mine.” Certainly Google is scanning, and the libraries accepting the scans, without regard to copyright or in-print status. In any case, I have some obligation to defend my colleagues.

Why should I bear the burden of chasing every library, and every other entity, that wants to distribute my work without my permission and without paying me? It’s not even remotely in my interest to facilitate libraries violating my copyrights, or anyone else’s, by letting them create a precedent of “just use it and see if they withdraw it from the list, after we’ve distributed a bunch of copies and those have been pirated all over the planet anyway.” As I understand it, copyright law puts the burden of asking for permission on the asker. And that’s where it should stay.


Cathy, a problem: if a file is made available on the web, it is often ‘mirrored’ and cached in seconds; putting it back in the box is virtually impossible. For authors, a requirement to search and keep a lookout could be a requirement to be too late to do anything more than shutting the stable door.


FYI: The Issue of Orphan Works (NPR -runs 6:20)

Interesting picture


Re the above comment.

The picture, I believe, is of a social worker with a group of her orphan charges. With this in mind:

A social worker by the name of “Hathi” visits a shopping mall in a large American city. There she discovers a youth sitting on a bench minding his own business and goes up to him and asks, “What is your name, little one?”

“Book”, replies the youth.

“Come with me” demands Hathi, “and I will look after you.”

“But I have parents,” insists Book.

“You are in this busy mall all by yourself and that is not right. Come with me now” Hathi demands again.

“But, if you would only look at this piece of paper with my parents’ name on it, you would see that I am not alone. I don’t want to go to the orphanage!” a frightened Book pleads as Hathi pulls him to his feet to escort him to his new life.

(Note: I have a deep respect for social workers, and hope I have not offended them)


One question about the Hathi Trust process:

Have they consolidated their listings of book titles?

Google scanned each work as it came in from a library, without checking whether it had been scanned before. Google posted multiple scans of the same editions of many public-domain works on their own books site.

The previous proposed Settlements allowed copyright holders to opt into the Settlement, but opt specified works out of Google display. As I understand it, from authors who underwent this process, Google presented a list of their scanned titles for them to choose from. However, as books were (and are) still being scanned, the authors discovered that these lists kept changing. They could opt out a work, only to have it reappear later.

The Google scans went to the Hathi Trust. Therefore, if a copyright holder “opts out” a book from the Hathi Trust display, is that book opted out once and for all, or can it reappear?

Also, is the Hathi Trust storing contact information from authors, publishers, or agents who come forward and present it, so they can look and say, “Jane Author gave us her contact data when she opted out another book, so now we can just email her?” Or do they plan to just post her next book and see if she keeps checking the new lists?


But in any case, Babar is a registered trademark. Lucky escape, HathiTrust!

I may be wrong, not being a lawyer, but it seems to me that there is no legal problem whatsoever for a library to distribute copies of works solely because they contain characters which are registered trademarks, like Babar or Mickey Mouse (other legal problems might prevent this, of course). Assuming of course, they do not do so in a way which would confuse the public into believing that the trademarks were theirs and not the trademark owners (e.g., renaming a branch of their library system the “Babar Archives” with a big exterior video display of a running Babar cartoon).

Just another example that lay people often are confused by complex legal issues (frankly, I wouldn’t be that surprised if it becomes apparent in the further discussion that I’m in the wrong and Gillian is actually correct). Unfortunately, it seems to me that these issues are cropping up in everyday life more and more. Does anyone here believe that even 1% of lay people read (and understand!) the legalese “Terms of Service” of each and every web site they look at, or even that of Facebook?


Sorry, but another point:

This posting system assumes everyone is on the net. But, where authors are alive, we are currently looking at old authors. (I fully expect that to change if and when any legal precedent is set of “use first, and ask later.”) Authors who grew up without the net, and who by this time may well have vision problems, arthritis, or other physical ailments that make it difficult or impossible for them to embark on the Internet now.

If Jane Author is, by this time, just barely getting by in managing her daily life or needs the assistance of others to do it, it is not fair to expect her to lose her copyright protection just because she is not jumping on the net every day to see if her books have been posted—let alone learning how to use bots.


FYI: U. of Michigan Copyright Sleuths Start New Project to Investigate Orphan Works, The Chronicle of Higher Education, May 16, 2011


Mr. Courant [Paul Courant, dean of libraries at the University of Michigan] bristled at the characterization of his library’s effort. “I plead not guilty of elfin whimsy,” he said, noting that the library has set up a careful, time-consuming, and expensive procedure to determine whether a book is truly orphaned—an effort it announced in May. — U. of Michigan Tests Murky Waters of Copyright Law by Offering Digital Access to Some ‘Orphan’ Books, The Chronicle of Higher Education, September 17, 2011


Here are the US legal requirements the Hathi Trust is supposed to be following (for US works only):

http://www.copyright.gov/docs/nla.html

http://www.loc.gov/cgi-bin/formprocessor/copyright/cfr.pl?&urlmiddle=1.0.2.6.1.0.173.37&part=201&section=39&prev=38&next=40

And here is the discovery process recommended by the Online Books Page, including useful links:

http://onlinebooks.library.upenn.edu/okbooks.html


The “rescue” and recovery of “Orphan Works” is far more important than any stumbling blocks that the University has encountered so far.

Their process had some holes in it and it can be tweaked, simple as that. The Authors Guild is so afraid someone is going to step on someone else’s rights that they take a defensive attitude rather than one of WORKING WITH THE UNIVERSITY to make sure “mistakes” in searching “Orphan Works” can be avoided.

“Orphan Works” of books, sound recordings, and, most notably, films are slowly deteriorating, and with regard to film, many are lost!


Their process had some holes in it and it can be tweaked, simple as that…

Herb, the goal of this project was not for the HathiTrust to unite rights holders with their works, but for the HathiTrust to lay claim to as many works as possible. This is demonstrated by the ease with which some rights holders were found by others. This project is not about, as you put it, the “rescue” and recovery of “Orphan Works,” which I agree is important work; this is about HathiTrust trying to define the law to suit their wants.


Is theft theft or have I moved into an alternate reality?


Has the HathiTrust released the number of copyrighted works that they processed with their so called “Work Flow” to arrive at the hundred-some they put on their “Orphan List”? For the rights holders they did contact, if any, I wonder what the response was to their request to use the work?


it seems to me that there is no legal problem whatsoever for a library to distribute copies of works solely because they contain characters which are registered trademarks, like Babar or Mickey Mouse (other legal problems might prevent this, of course)

Babar is not just a character in a book; he is a character in a series of children’s picture books. The image of Babar is distinctive and widely recognised. There is an interesting piece on trademarks here; it relates specifically to iconic characters in well-known comic books, but the same rules presumably apply to certain illustrated children’s books. It is hard to see that they would not. I am open to correction by a lawyer, of course.

Back in June when the University of Michigan announced its intention to make digital scans of ‘orphan works’ available to users of its library, Paul Courant, the university librarian, asserted: “The work we’re talking about is not commercial, and most of it never was … It’s scholarly work…” Some scholarly work has commercial value, even high commercial value, and Professor Courant surely should be aware of this. He is, after all, a Professor of Economics. That issue apart, the appearance of Babar, the Little Elephant on the initial list of ‘orphan candidates’ is an especially striking demonstration that the HathiTrust ‘Orphan Works Project’ has by no means confined itself to works of scholarship.

Meanwhile, Ron, you talk about libraries distributing copies of works. But it is not ordinarily the role of libraries to distribute books. Publishers distribute books. In the case of trade publishing of books that are in copyright, they do this under license from the author or other rights-holder. Libraries lend or provide access to published copies that they have purchased, or sometimes copies that have been donated to them. Either way, someone has paid for those copies, and their issue in that form has been authorized: which is not the case with the digital scans of copyright works that the HathiTrust ‘Orphan Works Project’ has been proposing to circulate.


Gillian, I’m sorry to be the source of correction, but Ron’s understanding of trademark law is closer to the truth. It does not matter how famous the trademark is.

On the question of distribution, Ron is correct as well, at least as the term “distribution” is used in U.S. copyright law. The copyright owner’s distribution right is the right “to distribute copies or phonorecords of the copyrighted work to the public by sale or other transfer of ownership, or by rental, lease, or lending.” Thus, when a library lends a book to a customer, that is considered a distribution.


“Orphan Works” of books, sound recordings, and, most notably, films are slowly deteriorating, and with regard to film, many are lost!

I understand that there are special conservation problems relating to film and I sympathize with your concerns. With regard to books and recordings, I believe that the libraries already have the right to make copies for conservation purposes, under Section 108 of the US Copyright Act, which James cited in a previous post. I do not think it is helpful to blur the differences between books and films in this respect, nor should ‘orphan works’ digitization schemes be confused with conservation programmes.


James, I am always grateful to be corrected by someone who knows more than I do. Thank you.

I am interested in this trademark thing. Babar has been registered as a trade mark, and the registrations - a whole series of them - were renewed earlier this year. Does that not protect images of him that would otherwise be out of copyright? Or doesn’t it work quite like that?


It does protect the BABAR name and probably some images of Babar, but in a much more limited sense than copyright does. It doesn’t restrict the creation or sale of items with Babar on them. What it does prevent is someone else claiming to be the trademark owner and falsely passing off goods as though they came from or were authorized by the owner. Thus, if there are (hypothetically) Babar-branded pots and pans for children, someone else couldn’t sell similar-looking pots and pans in a way likely to make consumers mistake them for the official ones. With books, this principle has less application, because Babar is not a trademark standing for something else; Babar is itself the product.


‘I thought it was abandoned property’ or ‘I thought it was public property’ is not much of a defense against trespassing, break and enter, and theft charges. I imagine that the music industry, an industry that seeks and often gets punitive fines of 100 thousand dollars placed on small-time MP3 file distributors, would not welcome a wide adoption of the libraries’ idea of a valid test for ‘vacant possession’.


John, actually a good-faith belief that the property was abandoned is a complete defense to theft charges, even if the belief is incorrect.


I do not think any entity other than the copyright holder, or a representative of the copyright holder (such as a publisher or literary agent) typically has much idea what the future or even current commercial value of the work is.

I’ve seen, on the net, public debates as to whether my own books are scholarly. I’d say no: In the US, being a scholarly publication often largely depends on an author’s and/or publisher’s association with a university, or at least a PhD in the field, none of which I have. However, “scholarly” is a marketing genre/label, just like any other, and all marketing genres are somewhat fluid.

I’ve also seen many people speculate on how much money I have invested in my books, how many copies of a given title I have sold to date, and what my revenues are. Absolutely all of them have been wildly off the mark, and off the mark in both directions. Furthermore, I have also seen discussions of my audiences, which do not take into account that my books have several audiences. For example, it never occurs to theatrical costumers that a book of historic clothing patterns also sells to makers of doll clothing.

And even I, as publisher and author, don’t know what my future sales will be. As any publisher or author can tell you, titles often sell much better or worse than expected and it is impossible to tell why; or sales can drop off or increase and again, it is impossible to tell why.

That is why such decisions should not be left to libraries, or to large businesses such as Google, who not only seriously don’t have a clue what the commercial potential of any given book is, but also have an interest in declaring there is none. Of course, once they massively violate its copyright, that becomes a self-fulfilling prophecy, at least as far as the copyright owner is concerned.


For example, Frances says:

The Hathi Trust using one e-copy and transmitting it to all its members (and as far as I know, there is no limitation on the number of new members who may join the Hathi Trust), instead of the publisher or author selling each a new copy, can cut into sales quite a lot

This is not happening. This will not happen.


Brandon,

One purpose of the Hathi Trust that I believe I have seen stated, is “pooling” scans. This means that if, for example, one library has a copy of Babar the Elephant and it is scanned, that scan goes to all libraries. See:

http://www.hathitrust.org/mission_goals

and

http://www.hathitrust.org/access

And yes, library “consortiums” are already cutting into book sales. Even the increase in interlibrary loan has cut into book sales over the years, but physical books wear out and e-books don’t. Of course, the technical format may well become obsolete, something preservationists should consider. Exactly how many ten-year-old e-files can you read on your computer without a painful and probably messy conversion process? On the other hand, I have print books over 200 years old in my personal library and I can read them just fine. Anyone who really wants to preserve books should be thinking about printing on 100% rag paper instead of e-files.

If a library consortium can buy one e-book—let’s forget about whether this has anything to do with the Google project or Hathi, or whether the book is still copyrighted—and that e-book is then available to all members of the consortium, of course the other members won’t buy any e-copies. That is why some publishers are now looking at pay-per-number-of-views licensing arrangements.


Frances: This means that if, for example, one library has a copy of Babar the Elephant and it is scanned, that scan goes to all libraries.

Brandon is correct. HathiTrust pools resources in the sense that it has a single search engine and one set of data centers, but the access controls for each digital book are restricted library by library. Under the Orphan Works Project as announced, a digital book would be available only within the system of the library that supplied the physical copy.


Whether it is a preemptive strike or not by the Authors Guild I do not know … but given the international copyright Treaty for libraries that the IFLA (of which the ALA is a member) intends to introduce at the WIPO SCCR 23 conference this November 2011, the limitations and exceptions to copyright that the Hathi Trust is claiming under US Copyright Law Sections 107 and 108 will look like afternoon tea.

http://www.ifla.org/files/clm/publications/tlib.pdf


but the access controls for each digital book are restricted library by library.

If the Authors Guild and you, James, had not called out the HathiTrust on the orphan works issue, I am sure that in a year or two these large corporations would be demanding that all their partner libraries have a copy of the books that appear on their “orphan list”, copyrights be damned.


I am confused by the comments from critics of HathiTrust and its partner libraries. The uses that Hathi and its partners have made of this digital collection, and that they contemplate making, are extraordinarily modest. Folks are throwing around bold claims about undermining livelihoods, stealing books, posting books online, and sharing books with libraries that don’t already have them. As far as I know, there is no basis for these strange claims.

I’ve spoken to folks at Hathi at length about these projects, and they’ve described them in public in some detail. The facts are these:

  • Anyone can run full text searches across the corpus, but with results that are quite bare bones - just titles and page numbers, not even snippets, like Google. The effect of this on authors and other rights holders is at worst neutral, but is more likely to be positive, as it helps readers find their works.

  • The scans are mirrored in multiple, secure locations, ensuring that these works will endure long after print copies are lost, forgotten, destroyed, or simply deteriorate over time. Again, the mere existence of digital editions on a server somewhere has no effect on an author or rightsholder’s market. Instead, Hathi is doing a great service, as libraries have always done, preserving culture long after the popular, commercial market has moved on.

  • Under the proposed orphan works program, Library partners that hold a physical book in their collections could make the scan available to authenticated students and faculty online in a secure browser session. The number of simultaneous online viewers would be limited to the number of physical copies held by the user’s library. When the user logs out, the book is gone. This is analogous to browsing in the stacks at a physical library.

  • Those same users, in a secure session, can download one page at a time as PDFs for personal research use. This is analogous to making photocopies in a physical library. Anyone who has done research knows this function is invaluable, and at the same time, that it is no substitute for buying the whole book.
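The simultaneous-viewer rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name, method names, and sample users are hypothetical, and nothing here reflects HathiTrust’s actual software.

```python
class ViewerLimit:
    """Hypothetical tracker: caps concurrent viewers of one scan
    at the number of physical copies the user's library holds."""

    def __init__(self, physical_copies):
        self.physical_copies = physical_copies
        self.active = set()  # users currently viewing the scan

    def open_session(self, user):
        # A user who is already viewing keeps the existing session.
        if user in self.active:
            return True
        # Refuse a new viewer once every "copy" is in use.
        if len(self.active) >= self.physical_copies:
            return False
        self.active.add(user)
        return True

    def close_session(self, user):
        # Logging out frees the copy for the next reader.
        self.active.discard(user)


limit = ViewerLimit(physical_copies=2)
results = [limit.open_session(u) for u in ("ann", "bob", "cat")]
limit.close_session("ann")
results.append(limit.open_session("cat"))
print(results)
```

With two copies held, the third reader is turned away until one of the first two logs out, which is the same constraint a physical shelf imposes.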

Libraries are emphatically NOT:

  • Sharing copies of these scans with libraries who do not already hold the title in their physical collections

  • Distributing complete digital works in a downloadable format to users

  • Posting orphan works on the open web

  • Selling anything

I am at a loss to see how any of these uses will have a significant negative effect on authors or their heirs. A lot of the rhetoric in comments above seems to be based on misinformation and confusion about what’s really going on. I’d be curious to hear from folks who are concerned about library use of digital scans:

  • Given the facts I’ve just described about these uses, where is the real harm to authors or their heirs?

  • If you think I’ve misrepresented the facts about these uses, I’d be curious to hear what you think libraries are really doing, and the basis for your claim.

Thanks.


James,

Thanks for the clarification. However, Hathi Trust could change its rules at any time, and I expect them to.


James, what is the test for ‘good faith’ in the case of abandoned property? Surely it is a bit more than asking ‘is there anybody home’.


“would be available only within the system of the library that supplied the physical copy.”
Meaning it would not be on the web, it would be on a closed local net?


Douglas: these libraries are not corporations.

John: the test for “good faith” in this context is typically subjective belief: what did the person actually believe? And yes, it would be available based on authenticated access to a closed university network.


I’d argue that any large organization with strong financial needs and goals—like a library consortium—is just as focused on them as any profit-making corporation.


To follow up on Frances’ last point — about large organizations focusing on meeting their financial needs — I’d like to point out an obvious correlation to the Google settlement.

Namely, that if Google gets to claim the orphans and make money off them, it would be goofy to anticipate sustained, truly diligent search for missing authors.

Of course, we’ve heard that particular ditty sung here before. Still, how entertaining it was last week to see the Authors Guild castigating Google’s more recent buddies, the Hathi folks, for their clumsy, failed searches!

Perhaps the AG and Google are no longer BFFs? Hope springs eternal in this human’s breast. (Both of them, actually.)


Is it possible that if the Hathi Trust suit opens the way for the libraries to distribute the so-called orphans, it will also open the way for Google to sell advertising within those orphans, distributed through the libraries or directly through Google?


@ Salley Shannon : that if Google gets to claim the orphans and make money off them

@ Frances Grimble : it will also open the way for Google to sell advertising within those orphans, distributed directly … through Google

My impression is that the libraries intend to try to defend their limited use of the (alleged) orphan works as “fair use”, and the fact that they are not profiting from their intended use substantially strengthens their case (a fair use defense is traditionally evaluated in the US courts using 4 points and that is one of them). In the case that the court decision is favorable for the libraries, I’d be really, really surprised if this would cause Google to attempt to directly distribute these works itself, especially if it is for profit, even via advertising on the side of the display.

As for FG’s interestingly contorted idea that the libraries would agree with Google to display Google ads while displaying the orphan works internally, well, as far as I know there is nothing (legally) to prevent the libraries from making such a deal with any commercial advertising entity, or even trying to do so with more than one ad provider. And I see no reason why it would necessarily be limited to the display of the orphan works; it could include things like catalog searches. Or even enormous on-campus live video displays running 24/7. (Possibly, any/all of these deals might require some kind of unbiased public “request for tender”.) If the libraries/universities are as strapped for funds as people are depicting them, it might even be a good way to gain some extra side income.

To be perfectly frank, however, I rather doubt that most of the universities involved would look at selling out in this way as something ethically/morally desirable. And given the brouhaha over all of this, I’d even be surprised if the display of the orphan works would include a short credit reading “scanned for yyy library by Google”. But then, it’s been a long time since I attended college. Have they become that commercialized since then?


Regarding the public-domain scans, the libraries are already selling print-on-demand copies of the Google scans. (Note, due to setup costs and comparable offset costs, print-on-demand is typically not one copy but a run of up to 500 copies.) See:

http://www.cleveland.com/business/index.ssf/2009/07/universityofmichigan_amazon.html

I will give some search figures for which I am indebted to Gillian Spraggs. On Amazon, she recently found 481,217 results for University of Michigan titles, 63,618 results for Cornell University Library titles, and 15,744 results for University of California Libraries. I have bought several of the University of Michigan titles, and they are the Google scans, all errors included, no value added other than printing and binding.

Of course, it is perfectly legal for the universities to reprint public-domain works. But, this does show that they are interested in profiting from scanned books, and no strangers to printing and selling POD books.


Ouch. Now that I think about it, if the universities go for “fair use” it would not be a good idea for them to gain income by selling ad space next to a display of the orphan works. So, the answer Frances is looking for, with respect to that, is probably simply “no”, and not “yes, but why limit this scenario to Google?”.


she recently found 481,217 results for University of Michigan titles
When I search for “university of michigan” (in quotes) as publisher in Advanced Search, I get 496,755 results. I find it hard to believe that 97% of all titles published by the University of Michigan are public domain reprints. This leads me to believe that Gillian’s numbers are for general works published by these libraries, and not for Google-scanned works in particular. Am I wrong?
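The 97% figure follows from the two counts quoted in this thread. A minimal Python check, using only those two numbers (the variable names are illustrative, and the calculation says nothing about what the search results actually contain):

```python
# Counts as quoted in the comments above.
quoted_results = 481_217      # results reported for University of Michigan titles
publisher_results = 496_755   # Advanced Search results for the publisher query

# Share of the publisher-query total represented by the quoted figure.
share = quoted_results / publisher_results
print(f"{share:.1%}")
```

The ratio comes out just under 97%, which is where the rounded figure in the question comes from.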


The debate on whether e-books should contain ads (other than the traditional ones for a small number of the publisher’s other books) and if so, whether significant ad revenues can be obtained, is a very hot topic in current publishing. My opinion is that any one book except a current bestseller or long-time classic, and even the entire lists of most publishers except the largest, is not going to earn any significant ad revenues.

But selling ads in quantity—whether it is done by Google, or the Hathi Trust, or any other large entity—is a different issue. Someone who has tens of thousands of e-book titles (or more) is much more desirable to advertisers. They can agree with advertisers to put an ad in every copy of every title. Very small per-book ad revenues (and quite possibly, several advertisers per book) can add up to large total ad revenues.

As for Google—it’s already been mentioned that there is no firm legal definition of what constitutes a “library.” Google could just, if necessary, spin off some subcorporation and call it a “library.”


As for Gillian’s figures, I suggest that you, Ron, do some further digging as to how many titles the University of Michigan publishes per year, and how many are in their current active catalog. It would be a welcome contribution to our research.


Forgive me, I was asking you how many new and recent titles the University of Michigan has in print, versus their public-domain reprints. I am sure they can also enlighten you as to whether they are reprinting public-domain scans from libraries other than their own. A catalog and, if necessary, direct inquiries to the University should give you the reliable information you need.


Getting an estimate of how many titles UoM publishes per year shouldn’t be that hard. However, this doesn’t have anything to do with the accepted convention that a person posting research results in a discussion at a high academic level, like what the discussion in this blog should try to emulate, has the responsibility to reveal the method by which he obtained the results, if it is requested.


Ron,

I made no claims as to how many new titles per year the University of Michigan publishes. I also made no claims as to where I, or Gillian, got the results other than what I said. As for different Amazon results, Amazon constantly and automatically uploads from many different databases, and the results of any given search can easily change from day to day. If you have any doubt as to this, do the same search on a regular basis and compare results. (In my experience, you won’t have much luck getting info from humans at Amazon as to how the Amazon system works.)

You were the person who initially asked what proportion of the University of Michigan’s titles are public domain. From my knowledge of the publishing industry, with numbers like this, most of them are not new titles.

However, as far as I have been able to tell, you personally have never worked as an author, as a publisher, for a library, or for any other publishing-related entity. Therefore, I can understand why you often need education in how the industry works, including its practices, potentials, and current debates.

I do not, however, feel obligated to do 100% of your research for you or to provide all your education. And when I present information and someone says they find it hard to believe, I expect them to present some factual basis for their belief rather than merely casting doubt. I do not expect you to regard me (or anyone else) as an oracle, but if you want to know, then go to the source. Which in this case is the University of Michigan.

Besides, I don’t know whether the University is publishing POD books of scans from other universities as well as their own. There isn’t much reason to—a POD business is easy for just about anyone to set up. Still, you have raised an interesting question, and I look forward to your answer to it. My opinion of online forums that have serious debates, rather than chat, is that all members should contribute information. It is not fair for a handful of posters to contribute all the research and information, and for others to then constantly disagree with and doubt it without providing any real data.


The University of Michigan Press is currently advertising 244 new books for Fall 2011. This number was obtained by a semi-manual analysis of their web site starting at URL http://press.umich.edu/portals.jsp. Based on that, they publish ~1K books per year.
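The step from 244 titles in one season to roughly 1K per year is not spelled out above. One plausible reading, and it is only an assumption, is that the Fall count was multiplied by four seasons. A small Python sketch of that projection, with the number of seasons left as a parameter (the function name `annual_estimate` is hypothetical):

```python
# Count reported above for a single catalog season.
FALL_2011_TITLES = 244


def annual_estimate(titles_per_season: int, seasons_per_year: int) -> int:
    """Naive projection: assume every season lists as many titles as the one counted."""
    if titles_per_season < 0 or seasons_per_year <= 0:
        raise ValueError("counts must be non-negative and seasons positive")
    return titles_per_season * seasons_per_year


# The number of seasons is an assumption, so try a few values.
for seasons in (2, 3, 4):
    print(f"{seasons} seasons/year -> {annual_estimate(FALL_2011_TITLES, seasons)} titles")
```

With four equal seasons the projection is 976, which rounds to the ~1K quoted. Fewer seasons, or a Fall list that is larger than the others, would give a lower annual figure.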


Ron,

Publishers have “seasons,” and not necessarily four a year. Fall is usually the season when the most books are published, especially for university presses. However, the University will almost certainly be happy to tell you directly how many new titles they publish per year, and probably to keep you on their catalog mailing list for the rest of your life. This is just standard marketing info.


You were the person who initially asked what proportion of the University of Michigan’s titles are public domain.
No, I asked how Gillian did her searches. I cannot see how she could search in particular for UoM publications which were scanned by Google, or which are public domain.

Amazon search results are always sorted, and Amazon refuses to return more than 100 pages of results for a search, no matter how many “hits” there are. This more or less makes it impossible to get a good estimate of the percentage of public domain works in the search results. But based on the ~1K/year estimate, I agree that it’s almost certain that at least 85% of the UoM titles on Amazon are public domain.


Google could just, if necessary, spin off some subcorporation and call it a “library.”
Er, Frances, I already answered your question about this possibility and the answer was “no”. Unless the libraries make their case (and win!) in a totally surprising direction which doesn’t rely on “fair use”, no library will be selling ads next to displays of orphan works. That includes hypothetical Google “shell-libraries”.

Really, just because Google is supposedly trying not to be evil, it’s silly to be obsessed with imagining all the possible ways they might be evil.


Frances Grimble: Is it possible that if the Hathi Trust suit opens the way for the libraries to distribute the so-called orphans, it will also open the way for Google to sell advertising within those orphans, distributed through the libraries or directly through Google?

This is not likely. The library exemptions in Section 108 of the Copyright Act only apply to activities done “without any purpose of direct or indirect commercial advantage.” Fair use, in Section 107, considers whether a “use is of a commercial nature or is for nonprofit educational purposes.” The libraries have strong advantages in the HathiTrust suit, advantages which flow from their non-profit status and their non-commercial uses. There is a reason that the Authors Guild sued Google first, and the libraries are not likely to throw those advantages away.


Ron - the figure I gave Fran was for books published by University of Michigan Libraries, and I found it by searching under “University of Michigan Libraries” in the Publisher box of the advanced search form.

Today it is returning 481,215 results, a discrepancy of two from when I last ran this search.

As to why you get a higher figure when you enter “University of Michigan”: I imagine it will be adding in publications by the University of Michigan Press, a well-established university publishing house. The publishing enterprise of “University of Michigan Libraries” seems to be distinct from the University Press.

So far as I can see, the publications by “University of Michigan Libraries” are all reprints of works in the public domain in the US. I have not, however, checked them all … The University of Michigan Press publishes new scholarly work.


Sorry, Gillian!

These days I try to avoid posting actual research on the net, or doing it if asked, for fear of inviting trolls who don’t really want it. Should not have posted your figures without proofing the text.


The EU on “digitizing and making publicly available out-of-print books and journals”:

http://www.ip-watch.org/weblog/2011/09/20/breakthrough-gives-eu-principles-for-digitising-out-of-print-books/?utm_source=daily&utm_medium=email&utm_campaign=alerts

I have some advice for all copyright holders of out-of-print books:

  1. Scan the book.

  2. Take the scans to a print-on-demand printer. Not a print-on-demand publisher (which is synonymous with a vanity press) and not one owned by a book retailer such as Amazon or Barnes & Noble (also vanity presses). They will gain some influence over how and where you market the work, and you don’t want that. Just go to one of the many POD printers that does nothing at all but print the book.

  3. Invest a couple of hundred dollars in getting copies printed.

  4. Get a seller account with at least one large online retailer such as Amazon. Their publishing service for micropresses is called Amazon Advantage, but you can also sell through Amazon Marketplace. Don’t get tied up in any complicated contracts. Leave yourself full flexibility as to what you want to do with the work later. The point is to get your book listed in at least one public database as in print. If you are in the US, you should also get it listed by R. R. Bowker, the publisher of Books in Print. Bowker’s listings for this are free and all you do to get one is fill out an online form.

You may sell copies or not. Probably not, if you don’t market the book in addition to printing it. You may decide to self-publish it later by some more large-scale and thought-out method. You may even voluntarily decide to let the Hathi Trust, Google, or some other party digitize the work.

But meanwhile, you now have your book publicly listed as commercially available, for as long as you choose, at little cost to yourself. No one can legitimately declare it an “orphan,” to do with as they please and without consulting you. You have as much time as you like with your book under your full control, and hopefully, without chasing after umpteen parties threatening to digitize it without your consent.


If there ever was a yard in search of chickens, then ‘orphan authors’ has to be it. As best as I can see, orphans means: ‘any author that cannot, at this moment, see me’.


Brandon,

The harm to authors and their heirs is that a paid-for book cannot compete with a free book, thus gutting current and/or potential future sales.

It is perfectly possible to download PDFs one page at a time, assemble a complete set, and pass copies of that set around wherever the downloader wishes, including uploading it to torrent sites. (I once had shareware “site ripper” software to download all pages of a book automatically—for public-domain works!—and such software is still available; try www.tucows.com.) Students have long been hand-assembling complete books from places like Amazon’s Search Inside, where the files are posted with the publisher’s permission, but the number of pages any individual can download is limited by Amazon’s software. The students merely have a group of people each download pages up to the limit, put together a complete set of scans, and then give a copy of the complete set to everyone in the group. This rather well-known technique is called distributed downloading.

With copyrighted books being “lent” by the Internet Archive, the numerous POD publishers who for years have been downloading scans of public-domain material, printing POD editions, and listing these for sale on Amazon and elsewhere, have extended this practice to older but copyrighted books. I doubt they even check whether the book is still copyrighted or not.

Therefore, I fail to see how the Hathi Trust can keep students from transferring copies of the scans they download, completely outside the university.

I have already mentioned the problems with preserving such poor-quality scans as the Google ones for posterity, and with preserving e-files at all. Have you, personally, ever TRIED upgrading a book file only about eight years old to a new format after several editions of the software had come and gone? I have and talk about pain … Seriously, do you really think any e-format will stay current for the next couple of hundred years? If so, you’re just not paying attention to the software world. Or do you envision rounds and rounds, and years and years, of public funding as libraries have to “preserve” the same books again, and again, and again, and again, as file formats become outdated over and over?

And, one thing I have never seen is any evidence whatever that the specific books scanned were, in fact, in poor condition. That was not even mentioned in what librarians told me about the scanning project. They said Google wanted volume, the libraries guaranteed it, therefore they were packing up books shelf-full by shelf-full without the slightest regard to the copyright status or condition of the books.

As a copyright holder, I am bound to distrust any and every entity that asserts that its rights trump the copyright holder’s. That will never change.

But my distrust is further increased by the fact that for several years, throughout this process, copyright holders have been reassured and petted, constantly told that the uses Google and the libraries would make of the scans were just so tiny, so harmless, so positively benevolent that they could never harm copyright holders.

  • Google was going to create a universal library. Except, when it came to the Settlement, they also wanted a publishing house, a bookstore, an ad sales venue, and fodder for various AI and search-engine-improvement projects.

  • Google gave copyright holders a nice, reassuring database where they could enter full bibliographic data of books not to be scanned. It later turned out that Google scanned a lot of those books anyway.

  • The University of Michigan was going to put the scans of all copyrighted books in a “dark archive.” Except, then they changed their minds and decided to release scans of what they determined to be “orphan works.” This is in spite of the fact that the Sonny Bono law actually has a provision for libraries scanning copyrighted works—with rules the Hathi Trust apparently doesn’t want to follow.

  • The Hathi Trust released their test list of “orphans,” which supposedly did not include books that were currently in print, books with locatable copyright holders, and books where used copies were readily available for sale. Except, it turned out their “process” had not been used on a fair number of the books. Of course, next time the Hathi Trust will likely have the sense to release a very carefully screened sample … which does not in the slightest guarantee that the bulk of the works released will be carefully screened.

It is just not in the interest of any entity that wants to digitize everything and distribute huge numbers of files in digital form to actually research renewal status, copyright holders, in-print status, or anything else. That’s simply completely counter to the quick-cheap-and-massive attitude to these projects. Furthermore, all the entities involved change their own rules whenever they feel like it, and continually try to expand the number of uses and, by extension, if not their own profit, at least their own cost savings.

What the Hathi Trust says today means absolutely nothing in terms of what they will say or do tomorrow … so why should I trust them?


Brandon,

In case this has not sunk in: copyright holders can, and do, often change their minds about whether to put a book back in print, or issue an e-form of their own. That is especially true with today’s technologies. Out of print does not mean no longer copyrighted. And being the heir of the creator of the work does not mean no one will ever be interested in it.

Why, exactly, doesn’t the Hathi Trust just stick to “preserving” all those public-domain scans? The quality of the scans s**s, but they’re legal.


from 20SEP2011 above:

Students have long been hand-assembling complete books from places like Amazon’s Search Inside, where the files are posted with the publisher’s permission, but the number of pages any individual can download is limited by Amazon’s software. The students merely have a group of people each download pages up to the limit, put together a complete set of scans, and then give a copy of the complete set to everyone in the group. This rather well-known technique is called distributed downloading.

Good one, Frances! Of course, all it takes is multiple computers each with multiple browsers and you can practically do it by yourself.

Also, my tropical horticulturist friend says the Google scans and OCR of public domain historical botanical works are so bad as to be (at times) practically worthless, and not just in the technical/Latin jargon.


Frances,

No fair user is required to structure its project or product so as to make it impossible for bad faith actors to exploit it for downstream infringement. VCRs made it possible for pirates to record and sell broadcast programs. Cassette recorders made it possible for music fans to dub and share music releases. Copying machines make it possible to create physical copies of books without permission, and computers are just giant, super-powerful copying machines. Letting physical books circulate makes it possible for an industrious kid with a DIY book scanner to create a PDF and post it on the web, but I don’t think anyone would shut down traditional libraries for that reason.

The question is not whether a bad actor could exploit this project (with some effort, let’s acknowledge), but rather whether the libraries’ use is reasonably tailored to legitimate uses and doesn’t actually encourage infringement.

Beyond these basic legal and moral points about multi-use technologies and platforms, there’s a real policy flaw with targeting Hathi. If your goal is to eradicate infringing file-sharing, Hathi is the least of your concerns. The torrents are already out there. Preventing access for students, scholars, and the print-disabled isn’t going to put that particular toothpaste back in the tube. So the harm of shutting down Hathi is wildly out of proportion to the very slight benefit authors might see in terms of stopping one or two pirate editions from being added to the pirate ecosystem.

Also, I’m quite certain that if a bona fide rights holder comes forward and publishes their own edition of their previously out-of-print work, UM and Hathi will cease using the digital scan for access purposes. I’m also fairly sure (though I can’t say for certain) that UM is not selling POD editions of public domain material to make a hefty profit. That said, there is no legal (or moral) reason they (or anyone else) could not do so. That’s what the public domain means - anyone can do anything they want with these materials without asking permission. If I had to guess, though, I’d guess that these materials are being made available on a cost-recovery basis to increase access to them, period.


That said, there is no legal (or moral) reason they (or anyone else) could not do so.

Frances has already said that in her original comment. It is not a disputed point.


I was just answering this question:

Why, exactly, doesn’t the Hathi Trust just stick to “preserving” all those public-domain scans? The quality of the scans s**s, but they’re legal.


I see - Frances meant “Why don’t they stick to preserving PD works, rather than also preserving in-copyright works?” I thought she meant, “Why not just preserve them; why also make them available?” Sorry for the misunderstanding.

To answer her actual question, first, of course, they disagree about the law. But to dig deeper, there are obvious reasons why a library would want to preserve as much material as possible, regardless of copyright status. Plenty of in-copyright works have been lost forever due to poor stewardship - see, e.g., the multiple seasons of the epic Doctor Who tv series that were just destroyed outright by the BBC in the 60s and 70s. If those works had been held in library collections, and, better yet, transferred to a robust digital format, they would still be around today. Instead, they’re gone. The risk is particularly acute for out-of-print works and orphans, which are by definition no longer being proliferated in copies by anyone other than libraries. If robust, secure backups are not made, these works could disappear forever.


Indeed, in-copyright works are at more risk than public domain works insofar as copyright poses an additional barrier to their use and dissemination by interested third parties.


Brandon,

It is true that someone who applies enough effort can hack into a server, let alone copy and distribute files they download. However, people typically steal what is easy to steal. Someone can undoubtedly pick the lock on your front door, but you probably lock it anyway. Someone can undoubtedly saw through any bars you put on your lower windows, but many householders have bars anyway. Someone can undoubtedly disable your car alarm, but you probably have one anyway. If you store your lawnmower in your garage instead of on your front lawn, someone can probably steal it anyway, but you still probably prefer the garage.

And I’d say almost certainly, you reduce your chances of theft significantly by making it more difficult. You eliminate casual thieves. Therefore, I’d bet you do bother to lock your front door instead of saying, “It’s useless, because a professional burglar who worked hard at it could get through all my protections anyway.” And I suspect, even if you live a modest lifestyle, you own things that are precious to you personally and that you don’t want exposed to thieves or vandals; most people aren’t saying, “My stuff is totally worthless anyway so let everyone steal it.”

Therefore, yes indeed, I do believe that someone storing and disseminating files whose copyrights belong to others does have an obligation to protect those files. In the case of Hathi, it’s even worse, because those copyright holders never gave Hathi permission to create those files. If Hathi had asked for permission, they could first have inquired into Hathi’s procedures before entering into an agreement with Hathi. I don’t, for example, choose a warehouse for my print books without first inquiring into their security and insurance.

On another issue, as a long-time book collector I know that it is invalid to assume that all books are in imminent danger of fatal deterioration. The so-called orphan works from the US were, by definition, published after 1923. That’s really quite recent historically, and well-stored copies usually survive quite nicely. If not, the book was often printed in a large enough print run that another copy can be purchased.

And, re Doctor Who, bear in mind that film is a significantly more fragile medium than paper—this is not a valid comparison. Though it is relevant to consider that the last preservation movement was to put books on microfilm—which, oops, turned out to deteriorate faster than paper.

Yes, all books are deteriorating in the sense that every human being is headed for the grave. That does not mean the danger to either all copies of a book or all human beings is imminent and a crisis. It does not justify violation of tens of thousands of copyrights.


Re preservation issues, I highly recommend this book:

Double Fold: Libraries and the Assault on Paper, by Nicholson Baker


My wife is a preservation librarian, and we have the Baker book at home. I’ll check it out if I get a chance.


I don’t think we’re going to get anywhere on the security issue.

Hathi uses industry standard practices to control access to the scans and to protect them from hacking. If you think libraries should be responsible for a pack of students sitting around downloading one page at a time until they get the whole book, you’re asking much much more of libraries in this context than has ever been asked of them in the physical world. Again, they could do essentially the same thing with physical volumes and a scanner.

You’re also asking more of libraries than we ask of consumer electronics companies, search engines, and on and on. It’s fairly clear that the law takes exactly the opposite view - libraries get more privileges to use and preserve copyrighted material than the average entity, not less.


Brandon,

Hathi should ask the copyright holders’ permission before digitizing copyrighted books.

I have never had my books in Amazon’s Search Inside, because of people doing distributed downloading of those files. But Amazon has made my participation in Search Inside an entirely voluntary, opt-in arrangement.

Likewise, I have never produced, and likely never will produce, any electronic editions of my books. (I am aware they could be pirated anyway, but pirating paper editions is much more trouble, therefore much less common.) I simply do not have to ever worry about how electronic editions of my books are used or stored by libraries, distributors, or any other such parties.

That is, as long as I retain the right I do have as a copyright holder to decide what editions of my works will be published—and scanning is publishing an electronic edition.

As for preservation, I have studied storage and preservation issues to some extent. More of textiles than paper. But I am managing a personal library of about 5,000 books, most at least several decades old and many 19th century. It simply is not true that books automatically reach a certain age and become unreadable. In most cases, post-1923 books should survive pretty well to the end of the copyright term. And I simply don’t believe in preservation arguments where no criteria were defined other than sweeping books of all ages and conditions off library shelves. For the microfilm projects—which the libraries actually had to get funding for, as opposed to free scans from Google—whether you agree with their selection criteria or not, they did define criteria.


from the Eldred v. Ashcroft US Supreme Court case:

Report of the Librarian of Congress, Film Preservation 1993, pp. 3—4 (Half of all pre-1950 feature films and more than 80% of all such pre-1929 films have already been lost)

And they tell us that copyright extension will impede preservation by forbidding the reproduction of films within their own or within other public collections. Brief for Hal Roach Studios et al. as Amici Curiae 10—21;

http://www.law.cornell.edu/supct/html/01-618.ZD1.html

…Still, this was not argument enough for allowing these films to enter the public domain or be exempted from the ‘Sonny Bono’ copyright extension.


The ease with which anyone can create a database of materials is pointed out in this thread. If “Archives” are granted the rights to copy, store and use copyright protected Works without consent or permission, who then decides which “Archive(s)” is granted such rights? After reading the draft treaty (http://www.ifla.org/files/clm/publications/tlib.pdf), it appears that libraries and archives want more rights to Works than the owners themselves. Fair? Reasonable?


Dan, “libraries” and “archives” already are given exemptions under current U.S. copyright law not available to others. For more extensive discussion of the definitional issues, I recommend Part II of the Section 108 Study Group Report.

Also, your statement that under the treaty, libraries and archives would have greater rights than “the owners themselves” is incorrect, if by “owners” you mean copyright owners. No one has greater rights to make use of a copyrighted work than the copyright owner, because she can authorize any use whatsoever.


Hi James, I’m aware of “fair use” in the US and “fair dealing” in CA; I work as a DMCA agent. The exemptions under the current Acts are very limited compared to what is being asked just in Article 12 of the treaty:

Article 12: Right to Access Retracted and Withdrawn Works Published in Databases or on Websites

1) It shall be permitted for libraries and archives to reproduce, preserve and make available in any format any work, and any material protected by related rights, which has been retracted or withdrawn from public access, but which has previously been communicated to the public or made available to the public by the author or other right holder.


Three points. First, this portion of the proposed treaty is completely optional. The next paragraph states:

Any Contracting Party may, in a notification deposited with the Director General of WIPO, declare that it will apply the provisions of paragraph (1) only in respect of certain uses, or that it will limit their application in some other way, or that it will not apply these provisions at all.

Second, as there is no general right of retraction under U.S. law (i.e. a right to insist on the return or destruction of all published copies), most of this provision covers acts that are already legal under U.S. law.

Third, the principal intention of this section appears to be to deal with cases in which the work is published (with authorization) online, and then taken down. The underlying theory is that in a case of retraction (i.e. an attempt to withdraw the work completely from the public) there is no longer an economic interest in the work that the copyright owner is interested in exploiting, and that the historical record should reflect accurately what was actually said. To the extent that the text of paragraph (1) goes beyond the exceptions necessary for that use, this is a fair reason to argue for narrowing revisions in a future draft.


“The underlying theory is that in a case of retraction — i.e. an attempt to withdraw the work completely from the public — there is no longer an economic interest in the work that copyright owner is interested in exploiting … “

Well, no. These days, there is not a hard-and-fast line between free and paid-for content among professional writers and publishers.

It could just as easily be that an author decided to turn his or her free blog or website into a book (or part of a book), and withdrew the free material so that it did not compete with the paid-for book, possibly as a requirement in the author’s contract with the publisher. Authors often do this.

Or it could be a book (or part of a book) distributed as a free e-book download for a limited time only by the author, the publisher, or sometimes a third-party site (with permission). This is a fairly common marketing method.

There are certainly differing opinions as to whether it is a good publishing strategy to distribute content free and then start charging for it. But once the author decides to go commercial with the content, that business decision should be respected by other parties.


Explanatory Note – Article 12: Libraries and archives have an important duty to preserve the public record for posterity, including any modifications or retractions made to it, and to make it available to users. This includes providing access to copyright works, and material protected by related rights, which are no longer available to the public because they have been withdrawn from databases or websites.

James, our clients remove images and text from their websites for a reason. They are the rightful owners of the copyrights and any related rights thereon. Why should libraries and archives be granted any rights to decide something different?


Another issue is that the website displaying content may

(a) Not be the lawful copyright holder, and may have obeyed a request by the copyright holder to take down the content

or

(b) Be a periodical that only licensed certain uses of the material from its creator, and may have taken down the material so as not to exceed the uses the periodical paid for


Great points, Frances.


The risk is particularly acute for out-of-print works and orphans, which are by definition no longer being proliferated in copies by anyone other than libraries. If robust, secure backups are not made, these works could disappear forever.

Round and round and around we go. As I observed in a comment higher up this thread, ‘With regard to books and recordings, I believe that the libraries already have the right to make copies for conservation purposes, under Section 108 of the US Copyright Act’ …

Further, the Google/Univ of Michigan scans are not of archival quality. This is not my opinion but the view of Kalev Leetaru of the University of Illinois. (But you only need to know a little bit about image files to see that he is, of course, correct.)

Even as access copies, many of them are pretty poor, as Fran observes above. The deficiencies in the scanning of many of these works have been documented by Robert B. Townsend of the American Historical Association and Geoffrey Nunberg of Berkeley.

Fran has also made the point above that there is very little reason to believe that converting a book into a digital file format is likely to be an effective means of preserving it long term. I have met professional archivists who would agree with that assessment.

Like Fran, I own thousands of old books. Many of them date from the second half of the nineteenth century and the early part of the twentieth and are printed on acidic wood-pulp paper. I am a careful owner, but I do not keep them in archival conditions; can’t afford to. They are a scholar’s working collection, kept for use. Some which I have had for many years are browner than they used to be. None have deteriorated to the point where they cannot be used. In more than forty years of using libraries and browsing secondhand bookshops, I have only three or four times handled a book where the pages were splitting as I turned them. It is clear to me that at present this problem affects a small minority of books only, and some of those, at any rate, were also published in more expensive editions on better paper. I do not dispute the need for book-preservation programmes, but it is common for the advocates of mass-digitisation schemes to talk as though every old book is on the point of crumbling to dust. This is simply not the case. Books are pretty tough. And most of the titles printed in the age of mass production exist in multiple copies.

The HathiTrust ‘orphans’ project is not about book preservation. It has rather more to do with what Paul Courant has been quoted as saying in a University of Michigan press release: ‘students, faculty, and researchers … these days … expect to be able to access much of their research material digitally, and from locations other than the library.’ And they, of course, must have what they want, or what we are told they want.

It may also be partly about reducing the costs of storing and retrieving books that are not used very much.


Frances Grimble: It could just as easily be that an author decided to turn his or her free blog or website into a book (or part of a book), and withdrew the free material so that it did not compete with the paid-for book, possibly as a requirement in the author’s contract with the publisher. Authors often do this.

This is not a case of retraction of the work. Certain copies may have been removed from circulation; and one means of accessing the work may have been disabled. But retraction of the work (the original, copyrighted “work of authorship”) would mean retraction of them all. The author who publishes a paid-for book containing the material previously on the website has not retracted the work, and so this provision would not kick in, and the library would not be allowed to reproduce and distribute the website version.

Dan Williams: James, our clients remove images and text from their websites for a reason. They are the rightful owners of the copyrights and any related rights thereon. Why should libraries and archives be granted any rights to decide something different?

You are begging the question, which is, “What rights should society give to authors?” Your clients do not have the right to prevent reviewers from quoting sentences from their works. They do have the right to prevent commercial publishers from running off exact duplicates and selling them through the same marketing channels. Society grants some rights and not others to encourage the creation of original works, to promote their distribution to the public, to enable future authors, to encourage education and learning, and a variety of other reasons. How and where to strike that balance is the issue that will determine what rights a copyright “owner” holds, not the other way around.

Frances Grimble: Another issue is that the website displaying content may (a) Not be the lawful copyright holder, and has obeyed a request by the copyright holder to take down the content or (b) Be a periodical that only licensed certain uses of the material from its creator, and took down the material so as not to exceed the uses the periodical paid for

Your (a) is not a case in which the work “has previously been communicated to the public or made available to the public by the author or other right holder,” so, by its terms, Article 12 would not apply. Your (b) is a case in which the portion has been “made available to the public by the author” but the full work has not. Thus, Article 12 would apply to the portion, but not to the rest of the work.


You are begging the question, which is, “What rights should society give to authors?” I was referring to our clients’ copyrights already in place, James.

Your clients do not have the right to prevent reviewers from quoting sentences from their works. I wasn’t referring to mere sentences, nor does the new treaty.

How and where to strike that balance is the issue that will determine what rights a copyright “owner” holds, not the other way around. I thought the copyright protection act(s) already had?


Dan Williams: I was referring to our clients’ copyrights already in place, James. I thought the copyright protection act(s) already had?

Congress can and does change the law to take account of changing conditions. Over the years, there have been many other changes in the copyright laws, some increasing authors’ rights, others decreasing them.

I do not think you would be happy if the rule were that the scope of copyright protection, once set, could not be changed. The Copyright Act of 1790 gave a single term of 14 years, renewable once. It required registration before it gave any rights. And it did not protect authors against translations, public performances, or even quite substantial abridgments.


The changes being discussed will give libraries and archives the legal rights to make and distribute full digital copies of works and to publish copyright-protected works (in fact all works subject to copyright, digital and non-digital) in (free?) publicly accessible online libraries and archives on the World Wide Web. James, I personally don’t think those are positive changes for all parties involved.


I am a careful owner, but I do not keep them in archival conditions; can’t afford to.

The average 19th- or 20th-century book will last well under the same conditions under which human beings prefer to live (at least in the US). By conditions, I mean temperatures and their extremes, humidity, and the presence/absence of rodents and insects. If books are not stored in a hot attic or damp basement, and they are far enough away from obvious sources of problems such as dripping water pipes, they generally do fine. I believe that libraries generally manage to achieve such conditions.

Also, there are different kinds of library tape that librarians commonly use to reinforce bindings, the hinge created by the endpapers, even tears on pages. A book can be rebound and a paperback can be given a hardcover binding. The pages are the main concern and even with high-acid paper (which is no longer the main printing trend in the US), they last much longer than most people seem to think.

See http://www.thelibrarystore.com/category.jsp?path=-1|23268|85343&id=98963 for a catalog of book repair supplies.

Nicholson Baker’s (http://en.wikipedia.org/wiki/Nicholson_Baker) research on the library microfilming preservation projects indicated that what libraries really wanted was to discard those bulky, space-eating paper books once they had been microfilmed. Something to ponder, if you are interested in preservation issues. He also found out that the libraries found it easier to get funding for what were at the time trendy, high-tech microfilming projects than for the pedestrian expense of renting warehouses. Of course, in this case the Google scans were free.


On ifla tlib I was way ahead of you guys:

from Twitter:

travel_brl john e miller (#ifla re: 3.0 Draft Treaty — From script of “Other People’s Money”: Where I come from, you always say ‘Amen’ after you hear a prayer. 18 JULY


I posted this a day or so ago - it seems to have gone missing

Public Lending Right (PLR) and Educational Lending Right (ELR) are Australian Government cultural programs which make payments to eligible Australian creators and publishers in recognition that income is lost through the free multiple use of their books in public and educational lending libraries. PLR and ELR also support the enrichment of Australian culture by encouraging the growth and development of Australian writing. If you are a book publisher or creator—author, editor, illustrator, translator or compiler—you may be eligible for a payment under the PLR and ELR schemes. If you are new to PLR and ELR read Schemes and guidelines before making a claim.

http://www.arts.gov.au/literature/lending_rights

In Australia some/much of the ‘educational uses’ being discussed are covered by schemes run by the Commonwealth Government of Australia either directly or indirectly through a statutory authority. However these schemes are not ‘free’ and the rules are made by The Parliament. It is not for libraries to rearrange the complex balance that is the law of a whole society, to suit libraries. Needless to say, nobody is completely happy with the resulting system; it is a compromise. But it is ‘near enough’. The problems and costs are real but the public benefit is believed to be greater.

The difficulty with the leaky nature of closed systems in the web world is probably being looked at as part of the remit of the parliamentary committee looking at the issues (often definitional) that are raised by the blurring of legal boundaries that convergence/connectivity creates. (I could check if it matters enough.)
The problems created for traditional ‘averaging-sampling’ systems of payment, in a world where the sheer volume of publishing of sub-median stuff threatens to completely drown excellence in a sea of general activity, are also being looked at.


I know this is a bit off from the Hathi case, but while the international consumer and/or disadvantaged IP rights communities probably have legitimate cases in increasing their rights to access copyrighted materials, they seem — especially during the WIPO SCCR negotiations that I follow most closely — to almost pathologically over-reach their legitimate concerns and thus end up with nothing.

The IFLA tlib proposed Treaty is most likely another case in point… and maybe (at least from the Authors Guild POV) that is the case with Hathi as well.


“almost pathologically over-reach their legitimate concerns and thus end up with nothing.”

There is something of the violent myopia of the Hedgehog’s Solution to the myriad problems of history about them, no?

The father of the poet Les Murray used to say of a nearby shopkeeper: “He is too close. He can’t say ‘near enough.’”

Any idea why they are so unable to change?


How and where to strike that balance is the issue that will determine what rights a copyright “owner” holds, not the other way around.

I simply do not see that the technical capability of producing an e-book and accessing it on the net means that copyright law needs to be changed to give greater rights to libraries.

Publishers are already producing e-books and POD books. Authors and heirs of older works are already producing e-books and POD books.

Publishers are already selling e-books to libraries, and signing e-book licensing contracts with libraries. Wholesalers who sell print books to libraries are already selling e-books as well.

Reprint publishers are already reviving older works. Some “preservationists” talk as if an out-of-print book will never be reprinted or issued in a new edition, which is patently untrue.

As far as I know, existing copyright law has pretty good provisions to protect the authors and publishers of works, while enabling libraries to lend books, and even to make preservation copies.

The only “problem” I see is that libraries are complaining because not every book is available in e-form, because not every e-book is cheap, and because some publishers want pay-per-view licenses comparable to the number of times a print book is typically circulated. And readers are complaining that not every book is available free, and instantly, on the net.

Why should libraries and readers be entitled to every single thing they want? Having to pay for a book does not render it unavailable, a book being in print form does not render it unavailable, and having to mail order a book, drive to a bookstore, or place an interlibrary loan does not render the book unavailable.


To answer Mr. Walker’s query above, a simple one-reason answer as to why the visually impaired and otherwise reading-disabled community over-reaches is that they feel that various UN human rights declarations, especially the UN Convention on the Rights of Persons with Disabilities (and a non-binding vote of support from the EU Parliament), trump any objection the IP rights owners community might posit.

However, the proposals by the consumer/disability rights communities often in themselves jeopardize rights previously granted to the IP rights-holders in various treaties (Berne, TRIPS, WCT, etc.) by each signatory country.

My T-I-C solution — as posted elsewhere on IP-watch.org — is pretty simple: Just change the Berne 9(1) clause that grants ‘exclusive’ rights of authorizing reproduction in any manner or form to the copyright owner to ‘semi-exclusive’.

Note: http://www.un.org/disabilities/documents/convention/convoptprot-e.pdf at Article 30

Also (if I may) re Treaty Proposals at WIPO SCCR23 including the IFLA tlib — http://www.scribd.com/doc/65717564/WIPO-SCCR23-Wishin-and-Hopin


Just change the Berne 9(1) clause that grants ‘exclusive’ rights of authorizing reproduction in any manner or form to the copyright owner to ‘semi-exclusive’.

‘No dictionary entries found for “semi-exclusive”.’ - Oxford English Dictionary online

Can you see why, John? It is because ‘exclusive’ cannot be modified. Just a point of information.

Yes, I know many thousand instances of ‘semi-exclusive’ are turned up on a Google search. It is nonetheless an illogical barbarism, and certainly has no place in a legal document.

In Britain we already have a clause in our copyright law granting special rights to the visually impaired:

If a visually impaired person has lawful possession or lawful use of a copy (“the master copy”) of the whole or part of - (a) a literary, dramatic, musical or artistic work; or (b) a published edition, which is not accessible to him because of the impairment, it is not an infringement of copyright in the work, or in the typographical arrangement of the published edition, for an accessible copy of the master copy to be made for his personal use. (Copyright, Designs and Patents Act 1988, 31A (1), which was added by the Copyright (Visually Impaired Persons) Act 2002)

It causes no problems that I know of, and no one has suggested that it is incompatible with the Berne Convention.


John E. Miller: My T-I-C solution — as posted elsewhere on IP-watch.org — is pretty simple: Just change the Berne 9(1) clause that grants ‘exclusive’ rights of authorizing reproduction in any manner or form to the copyright owner to ‘semi-exclusive’.

I am also skeptical about this one. First, “exclusive right” has a well-understood technical meaning: the right to stop others from engaging in the specified activity. In contrast, it’s not obvious what “semi-exclusive right” would mean. That would need to be spelled out explicitly, at which point we might as well just replace “semi-exclusive right” with whatever that more detailed definition said.

And second, given the existence of the immediately following clause in Berne — the 9(2) “three-step-test” — it’s also not clear why 9(1) would need to be changed:

It shall be a matter for legislation in the countries of the Union to permit the reproduction of such works in certain special cases, provided that such reproduction does not conflict with a normal exploitation of the work and does not unreasonably prejudice the legitimate interests of the author.

That leaves a lot of flexibility, which many of the proposed treaty provisions would fit easily within. And even if 9(2) is too restrictive for what you would like, why not change that language instead?


JG: How and where to strike that balance is the issue that will determine what rights a copyright “owner” holds, not the other way around.

FG: I simply do not see that the technical capability of producing an e-book and accessing it on the net means that copyright law needs to be changed to give greater rights to libraries.

Frances, your reply seems to me to be a non sequitur. James was commenting on copyright law in general, not in the specific context of how it should be changed due to “the technical capability of producing an e-book and accessing it on the net”. One clearly sees this if one looks at the following discussion.

FG: Having to pay for a book does not render it unavailable, a book being in print form does not render it unavailable, and having to mail order a book, drive to a bookstore, or place an interlibrary loan does not render the book unavailable.

This is a false dichotomy. Availability is not a yes/no attribute as you are implying but rather lies on a wide spectrum, and it is interesting to consider the possible benefits or detriments society might incur by making a book (or a whole class of books) more or less easily available. (It’s interesting to consider, but in practice impossible to come to any agreements about this, since the benefits and detriments are subjective rather than objective, and even if they were objective, there is no way to run repeated experiments to make totally fair comparisons between various scenarios.)


I said T-I-C, ‘Tongue in Cheek’, just to show how absurd some of the recommendations for wholesale increases in the limitations and exceptions to current International Copyright Treaties are.

Relax.


Sorry, John. It’s an abbreviation I have never come across before. I take your point now.


Thank you, Gillian … Sorry for the confusion. And I am very familiar with the UK Copyright (Visually Impaired Persons) Act of 2002.

I have used some of the language in Section 31A (one-for-one copies) to make world-class, Nobel Prize-caliber literature available around the world because, contrary to the accepted opinion (including that of the Royal National Institute for the Blind, RNIB), that section nowhere states that the qualified recipient of Braille materials must be a citizen or resident of the UK.

‘World class’ literature as these are the types of books that would be available in English language at any National/major university library anywhere in the world and thus the recipient would have ‘lawful possession of a lawful copy’ as specified in the UK Act.


John, the myopia I was referring to was a mental state, not a physical condition. (I did not ‘see’ your background.)

I think the correct term is restrictions of exclusive rights, for clear reasons of net community benefit.


I don’t know about hedgehogs … I am just disappointed that the legitimate rights of disadvantaged communities to access copyrighted materials are being thwarted by the over-reach of those who are supposedly championing those rights. What purpose is served by introducing Treaty proposals that you KNOW will be opposed by the IP owner rights communities and multiple WIPO Member delegations?

How many exceptions and limitations to the ‘exclusive’ right of copyright owners can be introduced before the word ‘exclusive’ becomes a joke?

I have in fact offered a WIPO Treaty proposal with limited if any required definitions that has ample precedent in the copyright law of multiple WIPO member countries especially Japan. I do not anticipate much chance of its acceptance but it allows me to sleep well at night.


The fox and the hedgehog is a fable by Aesop. (The theme is also a relative of the ‘modern paradox’ and of Swift’s parable of the debate between the spider and the honey bee in his ‘Battle of the Books’.)

I do not know much about the specific issues you are referring to, but judging on visible form, I suggest that there is a reversal of means and ends that is the reason for the overreach; the restriction of rights is the vital end purpose and any benefit to the ‘disabled’ (or ... whatever) is simply a means and non-vital.


Well I’ll admit to not being up on my Aesop’s fables. As for the world-wide publishing industry — where a stall is as good as a win — I’ll respond with the punch line from a 1970s Lily Tomlin comedy sketch:

“We don’t care. We don’t have to. We’re the Phone Company.”


‘As for the world-wide publishing industry’

The catch is, the professional representative groups facilitating the disabled’s relationship with right-holders are the same groups that profitably facilitated the Publishing Industry’s relationship with right-holders.


Well if you mean IPA, AAP, UKPA, IFRRO, etc. … Yes it is the same alphabet soup. They all like to use the word ‘balance’ without ever saying what is really balancing what.


You might enjoy this one:

The Grampian Hills: An Empirical Test for Rent-Seeking Behaviour in the Arts

Andrew Pinnock Online Publication Date: 01 September 2007

URL: http://dx.doi.org/10.1080/09548960701479508


P.S. The Hedgehog knows “one big thing”; usually it’s because it is the only thing he can see.


Although I’m fascinated by the title of the paper John Walker posted, it’s behind a paywall (one can make a special request for a copy from the author, but I’m not an academic).

A recently published paper which is right on topic, however, is Seeking New Landscapes - a rights clearance study in the context of mass digitisation of 140 books published between 1870 and 2010, published by the British Library under a CC license (on 14-Sep-2011, exactly 2 days after the filing of the suit against HathiTrust).

From the Conclusions section on page 5:

Mass digitisation potentially involves the copying and making available of millions of copyright works. At 4 hours per book it would take one researcher over 1,000 years to clear the rights in just 500,000 books – a drop in the ocean when compared to the rich collections of Europe’s cultural institutions.
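The arithmetic behind that figure can be checked with a short sketch. The 4 hours per book and the 500,000 books come from the quoted passage; the number of working hours in a researcher's year is an assumption of this sketch, not something the passage states, and the function name is purely illustrative.

```python
def years_to_clear(books, hours_per_book, working_hours_per_year):
    """Years one researcher would need to clear rights in `books` titles."""
    total_hours = books * hours_per_book
    return total_hours / working_hours_per_year


# Figures from the quoted passage: 4 hours per book, 500,000 books.
# ASSUMPTION: 2,000 working hours per year (about 50 weeks of 40 hours).
print(years_to_clear(500_000, 4, 2_000))  # 1000.0

# ASSUMPTION: a shorter working year of 1,800 hours.
# Any working year below 2,000 hours gives "over 1,000 years".
print(years_to_clear(500_000, 4, 1_800))
```

On these assumptions the quoted "over 1,000 years" is consistent with a working year of a little under 2,000 hours.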


JG: A 2.5% false positive rate isn’t going to be acceptable.
Isn’t trying to second-guess the decision of the judiciary a bit dangerous? Or is this statement somehow solidly based in legal precedent?

Since I’ve become interested in law, I’ve changed my view of the courts as tending to try to come to pragmatic decisions, rather than dogmatic ones. I would like to believe that the court’s decision about the allowable false positive rate would also depend upon its effects on society in general.


I have seen no proof that there are any significant benefits for the general public in digitizing older books.

It’s true that scholars can and do study just about anything. From my observation they are typically very determined. They will go to a great deal of effort to find their source material and where necessary, get permission to use it.

That leaves us with the reading habits of the general public. We have test samples: Public-domain material posted by Google, the Internet Archive, and Project Gutenberg. How much of this material is the public actively reading? Not downloading: Reading. On various forums, any number of book pirates have told me they do not feel they are violating copyrights because they never read most of what they download. It takes far less time to push a button than read a book.

Therefore, what are the large benefits to the general public that should by now be not only in the future, but evident? Are there any studies of them?


FG: no proof that there are any significant benefits for the general public in digitizing older books
As I posted previously (see end of this post):

  1. “Benefit” is obviously subjective here.
  2. It is impossible to “prove” anything about this since we cannot run two competing versions of reality, one in which the older books are digitized and another in which they are not, in order to fairly compare the results.

FG: How much of this material is the public actively reading?
If this is so interesting to you, Frances, perhaps you should go do your own research?


Ron,

I’m not the one claiming there are any benefits to mass digitization. I thought you were. Project Gutenberg has been around since 1971, Google began scanning in 2004, and the Open Library/Internet Archive has been around since 2006. We’re now well into 2011. If there were large public benefits to scanning works and delivering them free over the net, I think we’d have seen some inkling of those benefits by now. I don’t think, to see whether there are benefits from mass scanning and free e-delivery, it matters at all that these are public-domain works.

If you are not declaring that these so-called orphan works delivery projects will deliver sufficient benefits to justify their labor, expense, and the violation of numerous copyrights, please forgive me for misunderstanding you.

To my mind, the “public” asking to benefit from “universal” scanning projects is a group of people who declare themselves too lazy to go to a library and check out a book, and too cheap to pay anything at all for a book, new or used, paper or electronic. Therefore, I doubt they actually want to read these books much.

Also, many of the books in question being obscure is, essentially, a marketing issue. If people do not hear about a book, they do not look for it and if they run across it, they are less likely to pay attention. Almost all bestsellers are made bestsellers by vigorous marketing efforts by the publisher and/or the author. They are very seldom spontaneous choices by the general public that aggregate and push the book to the top. In other words, a so-called orphan work can be delivered electronically and free of charge, but if no one hears about it they will not read it.

Re book prices, in marketing you actually have to tell readers they are getting a bargain and spell out how much. They are often not price sensitive enough to do it themselves. This is why Amazon presents their discounting like this (from a listing of a random book):

Price: $10.17 & eligible for FREE Super Saver Shipping on orders over $25. You Save: $4.78 (32%)

So I’m not sure having free books makes anyone read more.


Thanks for the reference to the paper on rent-seeking, John. I found it very interesting.

Ron - to me the passage that stands out in that British Library paper is the following:

7 Permission to digitise was sought for 73% of the books in the sample. Of these:

  • rightsholders gave permission for just 17% of the books to be digitised;
  • permission was not granted for 26% of the titles;
  • for 26% of the titles no response was received;
  • rightsholder contact details for the remaining 31% of the titles could not be located. (p. 5)

A substantial percentage of traceable rights-holders definitely do not want their work digitised as part of a mass-digitisation library programme. This thoroughly undermines the facile ‘authors want to be read’ assurance that we hear so often from the enthusiasts for ‘orphan works’ and compulsory mass-digitisation projects.

While we are recommending papers to each other, let me draw attention to Six Provocations for Big Data by danah boyd and Kate Crawford.

One point they make that is very apposite is that ‘Just because content is publicly accessible doesn’t mean that it was meant to be consumed by just anyone.’ Further on they say ‘Data may be public (or semi-public) but this does not simplistically equate with full permission being given for all uses. There is a considerable difference between being in public and being public, which is rarely acknowledged by Big Data researchers.’

They are talking primarily about social media, but this can also apply to printed materials.


Gillian,

My view is that libraries and other entities (such as Google) are in a hurry to mass digitize copyrighted material now and, furthermore, to promote laws allowing them to do so without gaining permissions, precisely because some copyright holders may refuse permission. Even some publishers too-aggressively want to digitize older books they produced, but to which they did not buy e-rights and/or where rights have reverted to the authors.

But also, the e-book market is growing very rapidly. Whoever mass digitizes now has a better chance of controlling a larger share of the market. The longer Google or the libraries wait, the greater the chance that the publisher or author will produce an e-edition of the book, and charge libraries for single copies or pay-per-view.


GS: This thoroughly undermines the facile ‘authors want to be read’ assurance that we hear so often from the enthusiasts for ‘orphan works’ and compulsory mass-digitisation projects.
Not really. We’d have to know that the contacted rights holders were actually the authors themselves for this to be true.

If you look at Figure 18 on page 47, you learn that overall, out of the actual responders (which were only 44%), 38.6% agreed (if we include the 27% which passively didn’t respond, the agreement percentage drops to 23.9%). This seems remarkably high, considering that the negotiations were for permission “to digitise their whole work(s) for placing on the Internet for open and free public access including downloading and printing” (see page 30 of the report, and Appendix 4). The same figure shows that there were only 10 self-publishers contacted, and out of those 10, 7 responded and 3 agreed (= 42.9% of those responding and 30% of those contacted).
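The percentages in the paragraph above can be re-derived with a few lines. This is only a sketch: the input figures are the ones quoted in this comment (not re-read from the report), and the helper name `pct` is illustrative.

```python
def pct(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


# Figures as quoted in the comment above (percentages of those contacted).
responded = 44.0        # actual responders
no_response = 27.0      # passively did not respond
agreed_of_responders = 38.6

# Agreements expressed as a percentage of those contacted.
agreed = responded * agreed_of_responders / 100

# Agreement rate once the non-responders are added to the denominator.
print(pct(agreed, responded + no_response))  # 23.9

# Self-publishers: 10 contacted, 7 responded, 3 agreed.
print(pct(3, 7))   # 42.9
print(pct(3, 10))  # 30.0
```

The 23.9% figure comes out only if the denominator is responders plus non-responders (44% + 27%), which is how the comment describes it.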

In addition, if the negotiations were still open at the end of the allotted time for the study, which sometimes happened because of third-party rights holders, not being able to contact all of the heirs, or other complications, this was considered to be a refusal.


Ron,

On what grounds are you contending that 23.9% agreeing to digitization is a large percentage? Furthermore, why do you believe (if you do) that the active consent of those who provided it should govern what happens to the works of those who refused consent? And why do you believe (if you do; I’m not clear) that the consent or refusal of a copyright holder other than the author is less valid than that of a copyright holder who is the author?

Please explain your position.


Ron

If you were employed or a student in the right sector, access to the publication I linked to would appear to be ‘free’; you might even misrecognise this conditional privilege for a right to free access.


Frances, because the Laboratorium only supports linear commenting, not hierarchical, whenever I wish to comment on a specific comment (or often, only a small part of a specific comment), I prefix my comment(s) with a quoted section (using <blockquote> and </blockquote>). The comment you are asking about was a comment only on the quoted section of Gillian’s comment, and not on the topic of this thread in general.

The line of your questioning, and in particular your last question, lead me to believe that you didn’t read my comment in the correct context (as a reply to Gillian’s opinion).

A more cogent rebuttal to her argument (but not necessarily to her conclusion) is that she is basing her statement about authors of orphaned works on a study which gathered statistics on rights holders in general (not authors in particular) and which based its results on non-orphaned works (obviously, since they couldn’t contact the rights holders of the orphaned works, by definition).

Now that I think about it, there probably exists a very small set of “contactable authors of orphaned works”, where the rights holders of their works are non-contactable entities. I wonder what the responses would be from this population to a “what if you could authorize” question.


FG: I’m not the one claiming there are any benefits to mass digitization. I thought you were.
You don’t seem to be following my arguments, so I summarize.

I believe there will be overall benefits. But this shouldn’t bother you at all, since what I think are benefits could very well appear to you to be deficits, and I don’t believe I can ground my belief with any objective evidence. Since you have already posted many eloquent posts about how you don’t believe there will be benefits, it is unnecessary (in my eyes) that you make yet another one on that particular topic, merely because I disagree with you.

Do you believe that mankind was impoverished by the loss of the second book of Aristotle’s Poetics? Do you believe I could prove to you that it was?


Ron,

I’ve very seldom seen you claim anything clear and definite in terms of your own views, which is why I asked. It seems only fair to question you directly, rather than making wrong assumptions as to your views, or writing long rebuttals to arguments you did not make and positions that you did not take. I don’t want to address little nits picked in the format and phrasing of people’s messages instead of their main arguments.

I am not asking you, for example, to prove to me that the loss of the second book of Aristotle’s Poetics did (or did not) impoverish [hu]mankind. I merely want to know if, and if so why, you believe this has anything to do with the post-1923 copyrighted works digitized in a non-archival-quality way, which are certainly available from the library that lent them to be digitized and probably, in most cases, many other places as well. Also, if, and if so why, you believe distribution of these files to (even a portion of) the public is necessary for their preservation on a server.

I would also like to know if, and if so why, you think it matters whether the copyright holder is the author or not. Most authors already prove they are interested in making money from their work by not just posting 100% of it free on the net or putting it all in the public domain. Instead, most are seeking the best deals they can find with publishers (often with the aid of a literary agent), or arduously setting up a business. The fact that some post some work free for marketing reasons does not change this.

And, the fact that the heirs of deceased authors, or publishers or other entities to whom the author may have previously sold rights, are also interested in protecting their older works, does not lessen any argument that the owners of so-called orphan works are interested in the use of these works.

Re the preservation of current books, I have permission to quote from publisher, author, former owner of a print shop Pete Masterson. This is advice to a new publisher on selecting paper:

“Printers of books have long ago switched away from high-acid papers and most printing papers (for any purpose) are now acid free… . Alkaline paper has a life expectancy of over 1000 years for the best grades and about 500 years for the average grades. However the switch to alkaline papers is mostly due to the cost efficiencies in modernized manufacturing processes of wood pulp papers. With the reduction of acid, there is much less wear and tear on the equipment used in the paper production process (etc.). The ‘acid free’ processes also tend to allow paper mills to better comply with environmental regulations. Due to these factors, the nasty, old-style, ‘acid’ paper has ceased to be manufactured.”


James, in this news briefing from the Australian Copyright Council you get a nice Guernsey: “One legal expert who has been prepared to share his speculations online is New York Law School academic James Grimmelman.”

Authors create a new sub-plot in the quest to digitise the world’s books


Do you believe that mankind was impoverished by the loss of the second book of Aristotle’s Poetics?

Purely on a point of information here: Richard Janko has revived an old theory that it isn’t lost, or not entirely; that it has survived in part through a fragmentary adaptation called the Tractatus Coislinianus. (See Janko, Aristotle on Comedy)


Would the second book of Aristotle’s Poetics have gone out of print?


As I understand it, the Poetics was written in the 4th century BC. Gutenberg started using movable type printing in the mid-15th century. In other words, the Poetics would never have been mass-produced anywhen remotely near its original date of writing.

The so-called “orphan works” published in the US were all published after 1923, which is an entirely different situation regarding the availability of copies of out-of-print books. However much we may regret the vanishing of various works of antiquity (the one I want most is the complete Satyricon) the situation is not even remotely the same and the comparison makes no sense.


I would say that the comparison makes sense: this is a case in which we can learn from the differences as well as the similarities. In the manuscript era, in a sense, all books were almost always “out of print” and it was the active circulation of copies among readers who made their own copies that best ensured a book’s survival. That situation changed dramatically with the arrival of print, which has very different stability properties, but in some ways digital media are much more like manuscripts than like printed volumes. I’m thinking particularly of the fragility of digital storage media and format obsolescence: digital artifacts require more active curation and preservation than printed books do.


In the manuscript era, there were very few copies of most works to begin with. Once a book has been through a modern print run, there are typically many more copies floating around than when every copy was produced by hand. Enough, in most cases, so that digitization can wait till the end of the copyright term. There’s no need to panic. There’s no need to say, “This is our last chance, we have to digitize everything now or it will all be gone tomorrow!” The Hathi Trust can just make lists of books and their copyright expirations, and wait till they fall into the public domain to digitize them. Meanwhile, I’m sure they have plenty of pre-1923 works to keep them busy.

There’s also no need to *distribute* copyrighted works without permission—quite a different issue from scanning them.

(As a side note, today I received in the mail a book I heard about from the Hathi Trust proposed “orphan works” list: Agnes Marsh and Lucile Marsh’s 1932 Text Book of Social Dancing. I merely went to www.addall.com/used, found several copies on offer, bought one at a modest price, and it’s in very nice, usable condition too. And since different books are listed on addall all the time, if I had not found any copies of this one I merely would have checked again in a few weeks.)

I do agree that digital copies are likely to last a much shorter time than paper books. If World War 3 happens and destroys almost everything on earth within a few days, the survivors in the ruins will have a much easier time reading paper books than digital ones. I really don’t see how Hathi can regard these scans as permanent.


By the way, James, if what you are worried about is survival of new books published only in e-form:

What I would suggest is, passing a law that says each publisher prints two or three copies on paper, if nothing else in print-on-demand format. This could be done fairly inexpensively. The publisher would then mail the copies to the Library of Congress, or some such repository, which would warehouse them under good, if not necessarily museum quality, conditions. Even the print-on-demand printers are using alkaline paper. Therefore these copies should last at least a couple of hundred years—till after the books fall into the public domain. When they pass into the public domain, and assuming they have not been kept in active circulation by the copyright owner, everyone, new publishers, libraries, or whoever, could take advantage of their public-domain status. These books could be republished, a few more archival copies could be produced, whatever.

And no upgrading many thousands of scans every few years when Adobe or whoever decides to release a new file format. It’s a much cheaper and more robust solution than scanning.


Re: deposit of works published only electronically, the Copyright Office is working on the problem.


Sorry, James, but I should add: Every upgrade typically has to be proofread. Every day, on self-publishers’ lists, I see e-book publishers wailing and gnashing their teeth because their automatic format converter, wasn’t. They’re producing a different file for every hardware device and each version has to be laboriously hand proofed and corrected.

I can’t see Google doing this work, I can’t see the Hathi Trust doing it, and I can’t see J. Random e-book reader doing it every time he or she changes or upgrades his/her hardware. It’s going to be more like, J. Random Reader says, “I already read the book, so I’ll just delete the file.”

I don’t actually see print books going away. And one thing I do see some e-publishers doing is small print-on-demand runs for reviewers. Reviewers may or may not need paper books, depending on your point of view. They do, however, love to see a nice marketing package that shows the publisher spent money to impress them. And professional reviews can make significant sales, so it’s better to improve your chances by making reviewers happy.


James. I will read the report. Meanwhile, according to the research of publishers I know, the two mandatory copies of the “best edition” sent to the Copyright Office for the purpose of registering the copyright are given to the Library of Congress after the Copyright Office is through registering them, giving the LOC a total of three copies of each book.


I realized that not everyone knows that publishers must send the Library of Congress one copy of the best edition directly for the privilege of receiving either full cataloging data or merely a cataloging number. That is why the LOC ends up with a total of three copies of most books.


By the way, since issues of preservation can be entirely separated from issues of distribution:

Washington, DC itself is not the best place to preserve books for a couple of hundred years/till the copyrights expire. Obviously, the Library of Congress has done this for a long time. But, unfortunately, Washington DC is a prime target for attacks by terrorists and hostile foreign powers. Other major urban areas are not far behind. Some less populated area of the US would be much better, and would have the additional advantage of cheaper warehousing and labor.


I’ve very seldom seen you claim anything clear and definite in terms of your own views, which is why I asked.
For one, my views are much less definite than, shall we say, some other posters? And I don’t believe that my personal views about orphan works in general are on-topic for this thread, which I believe should focus on the particulars of the lawsuit.

On the other hand, I have no qualms about trying to refute factual claims made by others which I feel are inaccurate. Or posting references to information which I feel is relevant to the topic of the thread.

Your request, however, has spurred me to try to solidify my stand on some of the issues you ask about, and hopefully soon I will send you a link to a discussion page where we can discuss our personal views as long as you like, without being off-topic here.


I am not asking you, for example, to prove to me that the loss of the second book of Aristotle’s Poetics did (or did not) impoverish [hu]mankind. I merely want to know if, and if so why, you believe this has anything to do with the post-1923 copyrighted works
It was an attempt at a simple analogy. In a similar way to Aristotle, I cannot prove to you that even the extreme case, in which all currently orphaned works in all of the world’s libraries would be immediately lost, would actually impoverish mankind.

I do find it fascinating how my attempted analogy elicited such interesting comments from others. I feel obliged, however, since the discussion has wandered into my field of expertise, to correct the inaccurate impression that digital image formats are constantly revised in a way similar to Microsoft’s file formats.

It is highly unlikely that the scans are in any kind of proprietary format which is changed at the whim of a commercial entity. All image formats which are commonly used in industry are standardized and have open source libraries which are capable of reading them, and one can be fairly certain that Google, of all entities, would prefer using such formats. Even Adobe PDF has a specialized, standardized subset called PDF/A, which is designed especially for long-term archiving.


In the manuscript era, in a sense, all books were almost always “out of print” and it was the active circulation of copies among readers who made their own copies that best ensured a book’s survival.
Interestingly, one of the major differences here between digitized works and manuscripts is that the physical effort to make a copy of digitized works is trivial compared to that of copying a manuscript, but during the manuscript era there were no legal impediments to this copying.
but in some ways digital media are much more like manuscripts than like printed volumes. I’m thinking particularly of the fragility of digital storage media and format obsolescence: digital artifacts require more active curation and preservation than printed books do.
Another difference is that the effort to maintain digital artifacts is almost certain to constantly decrease as time progresses (barring, of course, the kinds of catastrophic scenarios which Frances has raised), and the effort to copy manuscripts remained relatively constant throughout the entire era they were used, before the invention of the printing press.


Ron, in the early days of printing in Britain, printing was a monopoly of the printers’ guild, and prior to that ‘copying’ was largely a ‘monopoly’ of monasteries. That printers’ monopoly collapsed in about 1695 and there was a brief period of free-for-all, before the Queen Anne act restored rule of law to the area.


“but during the manuscript era there were no legal impediments to this copying.”

Economically speaking, this is because the labor (cost) involved in making a hand-made copy is as much as the labor (cost) involved in making the ‘original’ hand-made copy.

It is exactly because it requires little labor to make and distribute digital copies when compared to the, often years of, labor involved in making the original copy that copyright matters… from a community benefit perspective. Getting the results of others’ labor for nothing (or removing from some citizens the right to refuse to supply labor) degrades all, not just the people on the short end of the stick.

It is pretty plain that the digitizing libraries plan is only economically possible if it pays little or no attention to individual rights, i.e., it is not a project that can be lawful, moral and economic at the same time.

Equally it is pretty plain that the extraordinary lengthening of the term of copyright that began in the UK in about 1850 is a problem that needs addressing.


Should have said: ‘digitizing libraries and wide distribution plan’. If it can really be restricted to ‘inside’ the library and every copy made appropriately paid for, no problem.


John,

If you are talking about ancient Rome, literary works tended to be produced via the patron-client system and copying of them was done by scribes, who were often slaves, but sometimes paid. In other words, there were methods of paying for the labor of creating works and of copying them. But these are not part of, and not appropriate for, our society.


Frances, I was speaking of the situation in Europe prior to the wide uptake of printing presses, say about c. 1400.

Physical copying is a fairly skilled (and neat) sort of activity; sort of doubt that really unhappy slaves would have been much use?


PS: Have you read the short novel by Melville, title escapes me, about a scrivener who would ‘prefer not’ to copy?


John,

Ancient Greece and Rome were slave economies. Many skilled tasks were carried out by slaves. And it was a lot better than being an agricultural slave.


Frances

This has gotten a long way from the issues around ‘libraries’ blurring into ‘publishing houses’.

Modern western economies date to about the time when the first Medici banker connected up with mathematicians such as Fibonacci and created the beginnings of modern finance, markets (and ‘Art’ for that matter). Obviously, Ancient Rome was not a modern economy and Ancient Greece could be said to be pre-‘economic’, in any modern sense of the word.

In the 20 years prior to the ongoing global financial crisis, we were living in a world where much/most economic activity in the West was just ‘activity’ (velocity). And during that time many lost sight of the fact that if you want people to do skilled creative work on things (value adding as opposed to inflation) you must pay for that work, and that to set the price and make that payment you need a market; and a market that does not include the various forms of economic property rights that underpin the systems of symbolic exchanges that is an economy is not a functioning market (or economy).

If the Libraries (or anybody else) were to flood the market with un-authorized ‘free’ copies, it would destroy the market in a way similar to the way that a flood of counterfeit or sub-prime bank notes would do. From a community wide perspective it is not a good idea, whatever the apparent short term benefits.


John,

I absolutely agree with you. I am just pointing out that even in pre-modern economies with no copyright laws, there were methods of paying writers and artists for work, such as the patron-client system. I would argue that a system where a patron supports an artist or writer in (usually) return for some level of control over the work is in fact an economic system; it is just not our current one.

I would argue that every system where people exchange labor or goods for something else, even barter/no currency, is an economic system.

But, if there were no copyright protection, we cannot expect our modern economy to bring back any older methods of paying writers. Currently, writers and artists are responsible for making their work pay; few are subsidized. Therefore we have to give them the exclusive rights that enable them to sell their work.


Actually the old methods of payment for creative acts are mostly still around; some arrangements suit some situations better.

There is little new in things human; “‘it’ is different this time” is a very old belief. The idea of an endless self-replenishing magic pudding is a wonderful children’s story, but we all have to grow up sometime.

Re ‘economies’: some of us grew up with sealed clay pots and some of us could only afford a ‘hole in ground’ to group things by. I used ‘Modern Economic systems’ to contain/represent economic systems capable of representations of representations.


Thomas Macaulay, in the House of Commons in 1841, argued against the extending of copyright to 60 years from the death of the author. The speech ended:

“At present the holder of copyright has the public feeling on his side. Those who invade copyright are regarded as knaves who take the bread out of the mouths of deserving men. Everybody is well pleased to see them restrained by the law, and compelled to refund their ill-gotten gains. No tradesman of good repute will have anything to do with such disgraceful transactions. Pass this law: and that feeling is at an end. Men very different from the present race of piratical booksellers will soon infringe this intolerable monopoly. Great masses of capital will be constantly employed in the violation of the law. Every art will be employed to evade legal pursuit; and the whole nation will be in the plot. On which side indeed should the public sympathy be when the question is whether some book as popular as Robinson Crusoe, or the Pilgrim’s Progress, shall be in every cottage, or whether it shall be confined to the libraries of the rich for the advantage of the great-grandson of a bookseller who, a hundred years before, drove a hard bargain for the copyright with the author when in great distress? Remember too that, when once it ceases to be considered as wrong and discreditable to invade literary property, no person can say where the invasion will stop. The public seldom makes nice distinctions. The wholesome copyright which now exists will share in the disgrace and danger of the new copyright which you are about to create. And you will find that, in attempting to impose unreasonable restraints on the reprinting of the works of the dead, you have, to a great extent, annulled those restraints which now prevent men from pillaging and defrauding the living…”

For what it’s worth I see ‘some’ interconnected problems with copyright that will probably need sorting out over the next decades.

• Copyright has become over-extended, both in how long it applies for and in what it applies to; it has often been conflated into varieties of hypothecated taxes.

• Copyright in the, now ended, age of Mechanical Reproduction lost much of its direct linking to individual creators; it became blurred into a right of ‘special’ groups to receive a ‘rent’ from property rights that they had little connection to and had not labored on. Arbitrary rights to receive ‘rent’ are a problem in representative democracies: they create special privileged groups with spare money and lots of spare time, that can be profitably occupied in lobbying for more rights for the special group to receive even more rent. And it results in wide general disrespect for individual property rights: law becomes but “A fig for those by law protected”; the massive property crime problems of the Regency period in the UK are a good example.

• The problems created for copyright legislation by the convergence of media(s): the printing press, the television station, radio station, web browser, the library (and the kitchen sink) are all now ‘the same but different’.

• And last but not least, the problems created when reading something is apparently the same as printing a new copy.


Frances Grimble says:

“And then, any one student uploading a copy to a torrent site might well kill sales.

People are taking the attitude of “If the neighbors leave their lawnmower out for a couple of weeks, they must not be using it, I can use it, therefore I am entitled to take it.” This is theft, with intellectual as well as physical property.”

Honestly. I am sickened by this comment. “Intellectual property”? True intelligence is measured by the wisdom it carries. And wisdom should cost nothing at all. It is a God-given obligation to help educate all fellow men in knowing. By worldly standards, a torrent may be against the law. But it is unethical to sell a book written in 1894. And HathiTrust is basically doing just that by hoarding the books for special people with login credentials. Make them free like Google does. They are here for all to read. If not, then they aren’t worth the paper they were printed on.


Joe, HathiTrust makes public-domain books (and books published in 1894 are in the public domain) available free online for anyone to read. (Here’s an example.) And the reason these books aren’t simply available en masse as PDF downloads is that Google forbade it when it gave the libraries the scans. So it seems like the target of your criticism is more properly Google.


Thank you, Joe M. The difference between permitting a researcher to use a book nobody has printed or sold in 25 years and making off with your neighbor’s lawnmower should be obvious to anyone who cares to honestly engage in the debate. In many societies, if your neighbor leaves something lying around for 25 years (or a lot less) and using it cannot possibly reduce its value (in fact, the use of an “orphaned” out-of-print work can only increase the work’s future value), it would certainly be fair game to treat it as community property.