The Laboratorium
May 2007

The Band of Theseus


I’m a little troubled by truth in music bills, which would generally prohibit using the name of a musical group to advertise a concert not featuring any of the original members of that group. There are plenty of musical organizations whose membership changes organically over time. Making a fetish out of the original membership misses the point. Are concertgoers who see the New York Philharmonic being misled because there aren’t any “original” members still in the group? And what if the newer members are better? Any examples of bands that only became great after undergoing a near-total change in membership?

Mehr Als Sudoku


On my recent day of eighteen straight hours of multi-modal travel, the one thing standing between me and insanity was a German puzzle magazine by the name of “Mehr Als Sudoku.” I skipped the initial sudoku, the puzzle-world equivalent of square dancing: geometric and wholesome but mechanical and deeply lame. There followed, however, a variety of puzzle genres that kept me entertained for the many hours of my trip.

I was already familiar with Slitherlink and Hashi, both of which hit the sweet spot between too unconstrained and too automatic. There were also some puzzles I’d never seen before. I spent about an hour trying to figure out the rules to Hitori based on the German instructions and the solutions in the back.

The most entertaining genre, and one on which, I swear, I spent something like four hours, was called “Labyrinth.” The goal is to link a start square to an end square with a single path that goes through every square in the grid. Every edge in the grid is either a wall or is crossed by the path. You’re given clues about the location of the walls in paint-by-numbers style: next to each line across and down is a sequence of numbers telling you how long the various consecutive wall segments in that row or column are, in order.
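For the programmers in the audience, the clue rule can be sketched in a few lines. This is only my reading of the rule described above, not anything from the magazine: I'm assuming a single grid line can be represented as a list of booleans (True where an edge is a wall), and the name wall_clues is made up for illustration.

```python
from itertools import groupby


def wall_clues(line):
    """Return the lengths of consecutive wall runs along one grid line, in order.

    `line` is an assumed representation: one boolean per edge along a row
    or column, True meaning "wall" and False meaning "crossed by the path".
    """
    # groupby collapses the line into runs of equal values; keep the wall runs.
    return [sum(1 for _ in run) for is_wall, run in groupby(line) if is_wall]


if __name__ == "__main__":
    example = [True, True, False, True, False, False, True, True, True]
    print(wall_clues(example))
```

Solving goes the other way, of course: you start from the numbers and work out which edges are walls.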

I found filling in this genre of puzzle to be particularly satisfying. There’s a lot of switching back and forth between reasoning about the path and reasoning about the walls. Every small deduction helps; there are sometimes surprisingly non-local consequences of incremental bits of progress. The four or five Labyrinths supplied with the magazine were just enough for me to start piecing together a small mental library of common patterns. I found it an extremely good puzzle for those times when my concentration was shot but I needed something to take my mind off the boredom of travel.

I would be very grateful to anyone who could help me track down (a) the official name for this puzzle in English or in Japanese, and (b) a source for more of them.

China Crafts Cyberweapons


Is the headline “China Crafts Cyberweapons” from

(A) A Terranova post about cut-rate Chinese power-levelers cornering the market for exotic loot in MMORPGs;

(B) A Slashdot post about Chinese hackers developing viruses for use against enemy computer systems; or

(C) A BoingBoing post about how to make cool-looking futuristic zap guns out of porcelain?

Answer here.

Why I Have Not Blogged Yesterday or Today, a Partial Explanation


Riddle me this: why did the mofo who took out a Kohl’s charge card using my name (misspelled) and bought $500 of housewares also sign up for “Account Ease” protection, which would cancel his (read: my) balance if he (read: I) were hit by a bus?

Mathematicians’ Revenge


It’s Search Engine Week here at the Batcave. Now, to be honest, since I started writing on the subject, pretty much every week is Search Engine Week. But this week is Search Engine Week in the sense that I’m reading multiple books on search engines. After the Jeanneney, I’m now into Google’s PageRank and Beyond, a moderately rigorous mathematical discussion of PageRank and Kleinberg’s HITS. The question that it brings to mind is:

Until a few years ago, would you ever in your wildest dreams have thought that one of the most lucrative businesses in the world would be based on computing the eigenvectors of linear operators on billion-dimensional vector spaces?
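To make the eigenvector business a little more concrete, here is a toy sketch of the idea at three dimensions rather than a billion. It uses power iteration, which repeatedly applies the link operator until the ranks settle on its dominant eigenvector. It is emphatically not Google's implementation; the function name, the dictionary representation of the link graph, and the handling of pages with no outgoing links are my own assumptions, and 0.85 is just the commonly cited damping value.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by power iteration.

    `links` (an assumed representation) maps each page to the list of
    pages it links to. Returns a dict mapping page -> rank, summing to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outlinks spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_web = {"a": ["b"], "b": ["a"], "c": ["a"]}
    for page, score in sorted(pagerank(toy_web).items()):
        print(page, round(score, 4))
```

In this toy web, page a collects links from both b and c, so it ends up ranked highest; c, which nobody links to, ends up lowest.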

Scan Them All and Let Google Sort Them Out


Towards the end of Jean-Noël Jeanneney’s Google and the Myth of Universal Knowledge, he writes:

In practical terms, what criteria will govern the decision to digitize certain works? With respect to the vast legacy of works now in the public domain … we should favor the great founding texts of our civilization, drawing from each of our countries: encyclopedias; journals of scholarly societies; major writings that have contributed to the rise of democracy, to human rights, and to the recent unification of the Continent; writings that have fostered the development of literary, scientific, legal, and economic knowledge, as well as artistic creation. We should add to these, as I’ve already suggested, works that have appeared in numerous translations, thus attesting to their influence. The same guidelines, probably with less rigid specifications, can be followed for the more recent period. (p. 78)

A little earlier, he explains why this rigorous process of selection is necessary:

A final observation: maybe one of the reasons that the top managers of Google never seriously broach the question of how works are to be digitized is that they maintain the conviction—or rather the illusion—that they can digitize all the books that have ever been printed since the time of Gutenberg. In this fantasy world, there would be no need to worry about selection, and the performance of the digital library would depend only on the quality of the search engine (or engines). But since this perspective is beyond what we can reasonably envision (Is this a bad thing?), we must find the means not only to furnish Internet users with organized knowledge but to indicate its limitations. (p. 73)

This is, in a word, oldthink. Jeanneney repeatedly assumes that comprehensive digitization of our print archives is a pipe dream, from which it follows that the selection processes governing digitization acquire enormous cultural and political importance. I certainly agree with him that the selection is critical, particularly in the shorter run. It is, for example, of great importance that Google’s Book Search scanning project be accompanied by equally ambitious projects for non-Anglophone collections. But that’s as far as this particular observation ought to go.

We are going to have the capability of scanning everything, and we should. Scan it, OCR it, check it, stick it online, and open it up to lots of search tools. (Not just one: he and I agree on this, as well.) The initial Google announcement involved some fifteen million books. The total number of titles printed in the West since Gutenberg is somewhere upward of a hundred million. Google’s proposal is ambitious but clearly realizable; aiming for a hundred million is somewhat more ambitious but not unreasonably so. It will seem more and more plausible with time, as scanning and indexing technology continue to improve.

Things will be chaotic, certainly. There will be duplicated scans, scans of different editions of the same book, scans of translations and pirated foreign editions, scans of books missing pages, and so on and so forth. But these are not dealbreakers. These are exactly the sort of gnarly semistructured data analysis that drove the last few rounds of stunning innovation in Web search. Get the corpus out there and search algorithms will arrive to take advantage of it. Call it a Say’s law of data: put an interesting dataset online and someone will find something interesting to do with it. The point is that massive scanning helps create raw material on which the complexity-increasing dynamism of the Internet feeds. Committees of experts can help us decide what to scan first, but they should not have to decide what to scan at all.

Jeanneney, president of the Bibliothèque nationale de France, clearly loves the potential of digital archiving and appreciates the value of search. But he doesn’t get search. Again and again he complains: “An indeterminate, disorganized, unclassified, uninventoried profusion is of little interest.” (p. 7) “Under these conditions, an undertaking of this kind, attractive as it appears, can hardly be pursued effectively other than within a restricted community capable of ensuring quality under cooperative control.” (p. 51) “The fantasy of exhaustiveness dissipates in the need for choices.” (p. 71) “Hasty classification of a list, following obscure criteria of classification, must be replaced by a whole range of modes, classification modes for responses and presentation modes for results, to allow for many different uses.” (p. 72) “And we must help their teachers by protecting them from disorganized information.” (p. 87)

Exactly, I would say. That’s exactly what good search does. It turns a profusion of scattered information into accessible, organized forms. Jeanneney is right to demand diversity both in the information accessible and in the tools used to access it. But he doesn’t seem to get the idea that the best way to create useful order online is to embrace the chaos. Wikipedia’s lack of a “restricted community” helps it produce more reliable information, not less. Google works because it indexes everything, rather than picking and indexing a subset of high-quality sites. Jeanneney sees a cluttered desk and assumes a disorganized mind.

There is much else to say about this fascinating, baffling, brilliant, confused, maddening, and thoroughly Gallic sliver of a book, but this is the thought that stuck most in my mind as I read it.

The Yiddish Policemen’s Union I give it 5 stars


Some years ago, after reading and loving Michael Chabon’s The Amazing Adventures of Kavalier and Clay, I poked around his web site and came across a remarkable essay by the name of “Say it in Yiddish.” It starts as a set of wistful reflections on a Yiddish phrasebook for travelers, and then spins out into a fantasia on the idea of a Yiddish-speaking Jewish homeland in Alaska.

Chabon came through Seattle for a reading later that month. When I got to the head of the receiving line, I used my few seconds of author time to tell him how much I had liked “Say it in Yiddish.” He was visibly surprised. Only after a moment of you-didn’t-really-just-say-that bafflement did his expression turn to gratitude. It was as though his natural emotional reaction had been filtered through a disbelief filter. He thanked me, saying that he didn’t often hear that. Then he signed my book with a flourish and a sketch of a key (the symbol of the Escapist from the comics in Kavalier and Clay).

At the time, I chalked his reaction up to the obscurity of the essay. Only later did I learn that the essay had attracted a fair amount of controversy. Several leading scholars of Yiddish thought that it was an attempt to make fun of them or of the language. I think now that what surprised him about my praise for the essay was not that I was mentioning it at all, but that I was praising it.

Well, Chabon has now turned the essay into a novel, for which my praise knows no bounds. The Yiddish Policemen’s Union turns his conceit of an Alaskan Jewish homeland into the setting for a detective story. In this alternate universe, the Federal District of Sitka was carved out of Alaska as a temporary resettlement area for European Jews during the Second World War; now, sixty years later, it is a few months away from “Reversion,” and everyone is nervously awaiting the unknown next stage in the ongoing exile of the Jews. Against this backdrop, down-on-his-luck homicide detective Meyer Landsman washes up in a cheap residence hotel, where one of his neighbors turns up dead. More out of a sense of personal affront than anything else, Landsman starts poking his nose around, discovers that powerful unknowns want him off the case, turns up some unexpected connections to an insular messianic Hasidic sect, and deals with the usual assortment of beatings and surprises any detective protagonist must endure.

Chabon’s Sitka is a gloomy, cantankerous place. An atmosphere of decay and depression pervades the novel, a sense of desperation as this dark and cold homeland is running out its days. He has a talent for tossing off scene-setting details casually, as though they are simply a part of the background knowledge that everyone shares: a snack of pickles dipped in sour cream, the Big Macher department store, a leftover landmark from the 1977 Sitka World’s Fair now locally known as the “Safety Pin.” It is such a perfectly realized place that both the characters and the plot grow naturally in its frigidly alien soil.

The writing is also spectacular. Of course there are Yiddishisms everywhere, from colorful words like noz and shtarker to phrases that are clearly English renderings of Yiddish originals: “sweetness” (from bubeleh) and “bang me a kettle” (from hak mir nisht ken tshaynik). The dialogue is florid and insult-laden, and Chabon is good enough at the rhythms of Yiddish complaint that you can tell the genuine invective from the disgruntled banter that his characters speak as a matter of idiom. He intends for the whole novel to read as though it were a loose translation from a Yiddish original, and it does.

Add to these virtues of atmosphere and language the usual qualities one expects from a Chabon novel: a compelling plot, sympathy for all of his characters, moving reflections interlaced here and there, a memorable sentence at least once a page, an instinct for universal human weaknesses and surprising strengths. The Yiddish Policemen’s Union is neither better nor worse than Kavalier and Clay. Both are as good as one could hope for in a novel, each in its own way. I didn’t read this one in one sitting, but I wish that I could have.

Wanted: A Bookmark Button Standard


Chase posted a ridiculous screenshot of “share this post” buttons. You know how some blog posts end with a little icon inviting you to “Digg this” post, or to bookmark it with Del.icio.us? Chase found a blog that provides buttons for some twenty-three different bookmarking, link-sharing, and other Web 2.0 services. (It reminds me a little of Jason Kottke’s Metadazzle overfizzle.)

I can understand how displaying these buttons can be in the karma-whoring interests of bloggers. Being easily Digg-able helps your chances of being Digged. The same goes for being easily Furl-able, and so on down the line to the more obscure forms of popularity: Scuttle-ishness, DZone-hood, Fleck-itude, and so forth. You might have some readers using Jim-Bob’s Bookmark Service, so why not add a Jim-Bob-This button? The trouble is that the Jim-Bob Bookmark users don’t use SquidShare, and vice-versa. Each individual reader might care about one or three of the buttons, but not the rest.

The problem, however, would be easy to fix with a bookmark button standard. A simple three-step dance among bloggers, bookmark services, and browsers would allow readers to see those and only those bookmarking buttons of interest to them. Here’s a quick sketch of how it would work.

In step one, bloggers would add a bit of metadata to their blog posts. Each entry displayed on a page would include a little bit of extra data indicating the permalink of the entry. This would be easy to automate with blog software.

In step two, users would tell their browsers what their favorite bookmark services are. This would involve a Firefox plugin; I’m sorry that it would be harder to do in IE, but if you’re using IE, your browser sucks. The telling mechanism could be almost completely automated. Once you had the plugin installed, it could be configured to recognize “add this bookmarking service” buttons from the appropriate bookmarking service sites. All the sites would need to do would be to create a small XML payload of their own telling the plugin what the format of their bookmarking buttons was: an icon, a name, and instructions for consing up a URL from the metadata supplied by bloggers.

In step three, browsers would recognize the bookmarklet metadata and automagically replace it with the appropriate button for the user’s bookmark service of choice. If you’ve told your browser that you’re a Digg and SquidShare user, you see Digg and SquidShare buttons. If you’re a Jim-Bob Bookmark fan, you see a Jim-Bob Bookmark button.
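The three steps above can be sketched roughly as follows. This is a minimal illustration rather than a proposed spec: the descriptor fields, the URL templates, the example.com-style addresses, and the function name build_buttons are all hypothetical, and a plain dictionary stands in for the XML payload a real service would publish. A real plugin would also live in the browser, not in Python.

```python
from urllib.parse import quote

# Step two: hypothetical service descriptors (an icon, a name, and a URL
# template), standing in for the XML payload each service would publish.
SERVICES = {
    "jimbob": {
        "name": "Jim-Bob's Bookmark Service",
        "icon": "https://jimbob.example/icon.png",
        "url_template": "https://jimbob.example/add?url={permalink}&title={title}",
    },
    "squidshare": {
        "name": "SquidShare",
        "icon": "https://squidshare.example/icon.png",
        "url_template": "https://squidshare.example/share?u={permalink}",
    },
}


def build_buttons(entry, user_services, services=SERVICES):
    """Step three: build buttons only for the services this reader uses.

    `entry` is the step-one metadata a blogger exposes for a post: a dict
    with a "permalink" and, optionally, a "title" (assumed field names).
    """
    buttons = []
    for key in user_services:
        service = services.get(key)
        if service is None:
            continue  # The reader listed a service we have no descriptor for.
        url = service["url_template"].format(
            permalink=quote(entry["permalink"], safe=""),
            title=quote(entry.get("title", ""), safe=""),
        )
        buttons.append({"name": service["name"], "icon": service["icon"], "url": url})
    return buttons


if __name__ == "__main__":
    post = {"permalink": "http://blog.example/2007/05/post?x=1", "title": "Hello World"}
    for button in build_buttons(post, ["squidshare", "jimbob"]):
        print(button["name"], "->", button["url"])
```

Note that the blogger's side of the bargain is just the entry dictionary; everything service-specific lives in the descriptors.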

The great thing about such a system, in addition to the decrease in screen clutter, is that bloggers wouldn’t need to know about each and every bookmarking service out there. Merely by exposing a little data in a standard format, they would enable any bookmarking service, present or future, to interoperate with their posts. It’s a more Semantic-Web-y way of doing things, and it better respects modularity.

What about the users who haven’t installed the plugin, you may ask? Aha! I have an answer for you. (Surprise, surprise.) It’s okay to leave the current button soup in place. Just wrap it in another piece of metadata that tells the browser where to find the current welter of buttons. Users who don’t have the plugin just see the existing mess. But for users who do, the plugin hides the mess of buttons at the same time as it displays the specific buttons the user wants to see.

This staging strategy makes the plugin the essential piece of technology. Once the plugin exists, it creates a de facto standard. Bloggers code their pages to match what the plugin expects; so do bookmarking services. Neither bloggers nor bookmarking services need to abandon their current techniques; they can just add support for the plugin. It might or might not catch on like wildfire, but I don’t see it doing any harm.

Any coders out there interested in cleaning up some Web 2.0 litter?

ICANN HAS CHEEZBURGER.PL


This joke will be funny to an extremely small number of people. I would say possibly zero, except that I find it hilarious.


Expiration Date I give it 4 stars


This is, unquestionably, a Tim Powers novel. That means that it features a moderately large cast, ranged along a continuum from the mostly heroic to the wholly villainous, a slightly under-motivated romance, a lead character who makes some serious early mistakes that nearly get him killed and leave him in a difficult predicament, and some stunning reinterpretations of the world as we know it in terms of the supernatural.

This time around, the conceit is that the bums who wander the streets of Los Angeles are in fact ghosts who have taken on substance, and that the real drug scene in L.A. involves inhaling ghosts for a rush of their memories. Here as in his other novels, Powers takes his conceits seriously, spinning out a wealth of subplots and details in a demented and yet utterly believable fashion. Addicts attract ghosts with palindromes and bottle them for later use? Sure. Thomas Edison invented a device for talking to ghosts? Of course. My only caveat is that the ending is a bit more of a rolling stop than a bang.

You Don’t Love Me Yet I give it 2 stars


Quite a step down from Lethem’s spectacular Fortress of Solitude. I suppose it’s a novel about the lives and loves of an ultra-indie band in L.A. But there’s also some bizarre installation art and a kidnapped kangaroo. The farce feels forced, and characters’ attraction to each other should be motivated by something more convincing than the narrator’s say-so.

PrawfsBlawg And Me


Another reason things have been a little quiet here is that I’m guest-blogging over at PrawfsBlawg, the most frequently misspelled blog in all of legal academia.

The Virtues of Moderation, Version 0.1


I’ve put my slides from my presentation at the Commons Theory Workshop online. This was my first serious experiment in giving a presentation without bullet points. I was strongly influenced by Matt Haughey’s stunning Making Money Blogging presentation. (He cites Beyond Bullet Points as the source of his style, but I found it unhelpfully rigid. Better just to look at Matt’s presentation and ask yourself why it works.) For art, I used pictures from Flickr, mostly those under Creative Commons licenses. I hope soon to write up my experiences in clearing the photo permissions.

Be warned first that the file is 7.8 megabytes because of all the pretty pictures, and second that due to my being a blockhead and leaving my video dongle at home, I wound up not having my notes in front of me as I gave the presentation, so that the words on the screen bear only a distant familial resemblance to what I actually said.

Revising the draft paper to which the presentation pertains is one of my projects for the summer. The paper itself is still only in private alpha release, but if you’re intrigued and would like to be added to the alpha test group, please squirt me an email and I’d be happy to send it along.

The Children of Húrin I give it 4 stars


The tale of Túrin Turambar, told more briefly in The Silmarillion, is a tragic epic in the old-fashioned Germanic tradition. Think of the Siegfried components of Wagner’s Ring Cycle, but with the gods well offstage. As with much of the Silmarillion, it doesn’t read well if you expect a narrative with modern pacing, economy of plot, or dialogue. It succeeds quite well as Tolkien probably intended it: a convincing imitation of a fragment from an enormous and partially lost corpus of myth and history. Someone ought to turn it into an opera.

Planes, Trains, und Automobiles


I’ve just returned from a Commons Theory Workshop at the Max Planck Institute for the Study of Collective Goods. It was great, and so was Bonn, but I just flew in from Germany, and boy are my arms tired. Yesterday, I traveled on two trains, a monorail, two taxis, two subways, an airplane, and a bus. I woke up just now with no clear idea of what time it was or where I was. Realizing that it was “morning” at “home” was a very pleasant surprise.

From the Annals of Bad Architecture


I saw the following design travesties while looking at apartments this week:

  • Of a carpeted apartment, “The current occupant is a smoker. Don’t worry; we fully clean all the apartments before new tenants move in.”

  • A ninth-floor apartment with eastern exposure, overlooking a train yard, and beyond that, the Hudson. The apartment has large glass exterior windows in the living room. The bedroom, further inside, opens onto the living room with a pair of French (i.e. mostly glass) doors.

  • Being told not to worry about the guy sleeping on a futon in the living room.

  • A duplex with the living room and kitchen upstairs on the entry level, and the bedroom downstairs on the garden level. There’s a bathroom on each floor. The bathtub is in the upstairs one.

  • A building-wide wireless network; no Ethernet jacks.

  • Of an apartment listed as two-bedroom, “There’s a second door there because the state of New Jersey says that it can’t be a bedroom unless it has a window, so the door is there to make it a den.”

  • French doors opening out onto a two-foot-wide strip of grass surrounding the building. The strip is fenced about with a mostly-open ironwork lattice; the sidewalk is about a foot or two below the strip.

If you can see what’s wrong with these pictures, then you’re doing better than the owners.

A Cheer for KSR?


I’m swamped with other stuff right now (What kind of dolt looks at half a dozen apartments and doesn’t check the water pressure in any of them? Yes, that would be me.), but it strikes me that the Supreme Court’s patent decision in KSR v. Teleflex may be a brilliant move to trim back software patenting. The key phrase may be that an invention is obvious and thus unpatentable if it requires only “ordinary innovation” and “does no more than yield predictable results.”

That second phrase could drive a huge wedge between software patents and other kinds. Lots of things in programming are predictable once you have the idea. If you know how to construct a sorted list and you know how to construct a quizmarunk, then the code to construct a sorted list of quizmarunks would be a first-year programming assignment. Not necessarily easy to get exactly right, but predictable. Obvious. Consider, on the other hand, oh, say, drug patents. There, when you start analyzing a thousand potential molecules, their actual folding and interactions may be completely unknown initially. Only extensive simulation and lab work will tell you what the candidate chemicals actually do. Not necessarily easy to get right. Not predictable. Not obvious.
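To show just how predictable the software side is, here is roughly what that first-year assignment might look like. Since a quizmarunk is whatever you want it to be, everything here is invented for illustration: the Quizmarunk class, its size and name fields, and the SortedQuizmarunks container. The only real ingredient is Python's standard bisect module, which keeps a list sorted as items are inserted.

```python
import bisect
from dataclasses import dataclass, field


@dataclass(order=True)
class Quizmarunk:
    """A made-up thing, ordered by its (equally made-up) size."""
    size: int
    name: str = field(default="", compare=False)


class SortedQuizmarunks:
    """A list of quizmarunks kept in ascending order of size."""

    def __init__(self):
        self._items = []

    def add(self, quizmarunk):
        # insort finds the right slot by binary search and inserts there.
        bisect.insort(self._items, quizmarunk)

    def __iter__(self):
        return iter(self._items)


if __name__ == "__main__":
    collection = SortedQuizmarunks()
    for size, name in [(3, "large"), (1, "small"), (2, "medium")]:
        collection.add(Quizmarunk(size, name))
    print([q.name for q in collection])
```

Combine two known ideas, get the result anyone would expect.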

This is just a preliminary thought. I need to read the opinion (and not just the crib sheet at the start), and who knows how other courts will interpret it. But this could be a way of trimming back on dumb software patents without getting into the morass of defining what software is or isn’t. I’m optimistic.