Advances in Book Scanning

Google engineer Dany Qumsiyeh has been using his 20% time to build a better mousetrap, where by “mousetrap” I mean book scanner. He’s focused on the problem of page-turning, which typically requires either a human to superintend the scanning process or expensive robotic technology. His innovation is to slide the book back and forth over a prism connected to an ordinary vacuum cleaner, gently sucking one page at a time from right to left. He describes the resulting scanner in a Google tech talk video. I wouldn’t put my money on this particular technology being one for the ages, but it’s another data point on the remarkable level of innovation being poured into making book scanning faster and cheaper.

With services like 1DollarScan digitizing books for between 2 and 3 dollars, how much cheaper do you want to go? Scanning is clearly something better done by service bureaus, in volume, rather than with DIY equipment.

I don’t see why the average person would want to scan books, especially since Google is doing it for them at no cost or trouble to the average person, and bearing the cost of the resulting legal battles. And publishers wanting to do reissues just go to service bureaus.

Stating absolute falsehoods such as “Google is [scanning books] for them at no cost or trouble to the average person” does nothing to advance the discussion. It merely destroys one’s credibility as an intelligent commentator on anything related to Google Books.

The bottom line: Google scanning does not make the full text of books accessible to the average person. All Google’s indexing does is make it easier for the average person to identify titles that users then have to acquire elsewhere.

If you want to format-shift your personal library, you have to rely upon a commercial service such as 1DollarScan or do it yourself with commercial or DIY book scanning equipment such as the kind James highlights. Google has nothing to do with it.

After I made my last comment, I learned that a company called Mocavo is offering free scanning services through the end of the year. Books do have to be out of copyright.

And I fear that my language may have been intemperate and could cause offense. I don’t want that, and I apologize if I have overstepped.


I’m not offended, but it is a fact that Google has scanned millions of books, copyrighted as well as out of copyright. And Google has in fact made the full text of many public domain books, at least, available.

Personally, I would not scan a copyrighted book and I would not subject my collection of rare public-domain books to a vacuum cleaner.

I will add that publishers and authors not only have the choice of using a scanning service for books they legitimately wish to reprint, but that many books have been in some kind of digital form since the 1980s. Authors often save their digital files, so reuse of those files can be an excellent choice, even if the files require some massaging. After all, a well-scanned book should also be proofread and the images cleaned up.