First-Class Objects


A new essay of mine, First-Class Objects, has just been published in the Journal on Telecommunications and High Technology Law. It’s an extension of presentations I gave last fall at the Yale ISP Privacy and Innovation Symposium and the Colorado Silicon Flatirons Privacy and the Press conference. My core argument is that something significant changes for privacy purposes when computer systems become “about” people, and that this transition takes place when those systems represent people using unique identifiers. Starting with Twitter as an example, I explain the computer science behind unique identifiers in databases, and then show how using unique identifiers for people has technical, social, and humanistic consequences. I conclude with a brief musing on the relevance of another suggestive phrase from computer science, “first-class object.”


James,

One small correction. Twitter counts characters, not bytes. You get 140 characters, not 140 bytes; this has to do with the gory details of SMS. If you tweet in 2-byte languages such as Chinese or Japanese, you get 280 bytes!
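
To make the arithmetic concrete, here is a minimal Python sketch of my own (not anything Twitter documents) comparing the character count and the byte count of an all-Chinese tweet at the 140-character limit, using UTF-16-BE as a stand-in for the 2-byte UCS-2 encoding SMS uses for CJK text:

    # Hypothetical maximum-length tweet made of a single repeated Chinese character.
    tweet = "中" * 140

    print(len(tweet))                      # 140 -- what Twitter counts: characters
    print(len(tweet.encode("utf-16-be")))  # 280 -- bytes in a 2-byte-per-character encoding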


James, the essay feels like it finishes with a ‘to be continued.’ Can we expect another installment? New Scientist’s Feedback column has a lot of fun with nested acronyms. To what degree are unique identifiers unique? And to what degree are they nested? Thank you, it is a nice bit of writing.


I re-read the article, and see that everything written there is correct. The sentence that prompted me to comment was “In the standard UTF-8 encoding used by Twitter, they would take up 14 and 9 bytes, that is, 112 and 72 individual ones and zeros.”

While it’s true that Twitter uses UTF-8 encoding in its web-service APIs, the SMS system it’s built around doesn’t. It’s a bit of a non sequitur to jump from a sentence about the number of bytes taken up by a name to the premium on space in Twitter, which is measured in characters. A Chinese character such as 中, for example, takes 3 bytes in UTF-8 and 2 bytes in SMS, but counts as just one character in a tweet.
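
For the curious, those per-character numbers are easy to check; here is a quick Python sketch of my own, again using UTF-16-BE as a stand-in for UCS-2 (the two coincide for a character like this one):

    ch = "中"                          # a single Chinese character

    print(len(ch.encode("utf-8")))     # 3 -- bytes in UTF-8
    print(len(ch.encode("utf-16-be"))) # 2 -- bytes in UCS-2/UTF-16, as in SMS
    print(len(ch))                     # 1 -- characters, which is all a tweet counts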

All this is irrelevant to the issues of identity (which really are quite magical), but I’m a Unicode wonk and I can’t help myself sometimes!