The Statistics of Love


Another interesting news item I found through Blue Lines, this one a BBC article reporting the not-all-that-surprising "discovery" that being bombarded with images of movie stars and other excessively good-looking celebrities is unhealthy. What sets this particular study apart is its methodology -- they used a mathematical model to estimate how "happy" "people" would be if "paired up" with other "people" from their rank-ordered list of ideal "partners." [Imagine some sort of "assume a spherical cow of uniform density" definition for each of those terms in quotation marks]. When people's sense of attractiveness was randomly determined and wholly arbitrary, pretty much everyone was pretty happy. But when their standards of attractiveness had even a slight societal component, everyone started chasing after the same few "supermodels" of the mathematical model, and the collective happiness dropped like a rock in response.
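
I don't know what model the researchers actually used, so the sketch below is a toy reconstruction of my own, not theirs: each person scores every potential partner as a blend of a shared "consensus" attractiveness and purely personal taste, we pair people off greedily by mutual rank, and then we measure how far down their own preference list the assigned partner sits. Crank up the shared component and the average person's partner slides way down their list.

    import random

    def simulate(shared_weight, n=200, seed=0):
        # Toy model (not the study's actual one): person i's score for person j
        # blends a shared "consensus" quality for j with i's private taste.
        rng = random.Random(seed)
        quality = [rng.random() for _ in range(n)]
        score = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i != j:
                    score[i][j] = (shared_weight * quality[j]
                                   + (1 - shared_weight) * rng.random())
        # rank[i][j] = where j sits on i's preference list (0 = i's favourite)
        rank = [[0] * n for _ in range(n)]
        for i in range(n):
            order = sorted((j for j in range(n) if j != i),
                           key=lambda j: -score[i][j])
            for position, j in enumerate(order):
                rank[i][j] = position
        # Pair people off greedily, best combined mutual rank first.
        unpaired = set(range(n))
        total_rank = 0
        while len(unpaired) > 1:
            i, j = min(((a, b) for a in unpaired for b in unpaired if a < b),
                       key=lambda pair: rank[pair[0]][pair[1]] + rank[pair[1]][pair[0]])
            total_rank += rank[i][j] + rank[j][i]
            unpaired -= {i, j}
        return total_rank / n  # average rank of the partner you end up with

    print("purely personal taste:", simulate(0.0))  # partners come from near the top of each list
    print("mostly shared taste:  ", simulate(0.9))  # everyone chases the same few; average rank tanks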

I mention this because I've had first-hand experience with how the iron laws of statistics can screw up romantic happiness. Back in college, I worked on the Valentine's Day Datamatch project for a couple of years. The way it worked was that each participant would fill out a multiple-choice questionnaire, we'd feed all the answers into a computer, and the computer would spit out a list of your "top ten matches." Hookups and happiness would then ensue.

Well, the X factor in this equation is that foggy terra incognita between when you feed the raw data into the computer and when it feeds back the sorted and collated matchup lists. My first year on the project, the folks in charge of the matching code sat down and thought up a pretty sophisticated algorithm. The idea was that each question would define some axis of answers -- question 23, for example, might measure extroversion, so answer A would be the extreme-introvert answer and E would be the extreme-extrovert answer, with B, C, and D ranged roughly equally in between. We'd take your answers and the answers of a potential match, and compute the standard Euclidean distance between them.

Quick example for the non-mathematically inclined. If I put down A, B, E for my answers to the first three questions, and you chose C, D, and D, then on question 1, we're two answers apart, and we're two answers apart on question 2, and one answer apart on question 3. We want to weight all the questions equally, so we take your usual map distance (Pythagorean theorem time, kids), which is the square root of the sum of squares of the distances on each individual axis. Well, 2 squared (which is 4), plus 2 squared (4 again), plus 1 squared (that's 1, folks), is 9. And the square root of that is 3, so the two of us are about "3" apart overall. If you put our answers inside a big cube, four inches to a side (one inch per step between answers, ranging from A to E), then we'd be three inches apart. So imagine a version of this cube with the 1500 Datamatch participants all plotted inside it. We just looked for the ten closest people to each person, wrote them down, and that was that. Of course, with 25 questions the "cube" really had 25 axes rather than three, but Euclidean distance generalizes just fine -- we were simply using the natural notion of distance for 25 dimensions.
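
For the programmers in the audience, here's roughly what that first-year scheme boiled down to. This is a from-memory sketch, not the original code, and the sample data is made up; `answers` is assumed to map each participant's name to their 25-character answer string.

    from math import sqrt

    def distance(answers_1, answers_2):
        # Answers A-E become 0-4, one coordinate per question; "closeness"
        # is ordinary Euclidean distance between the two answer strings.
        return sqrt(sum((ord(a) - ord(b)) ** 2 for a, b in zip(answers_1, answers_2)))

    def top_matches(person, answers, k=10):
        # The k participants closest to `person` -- exactly the list we mailed out.
        others = (name for name in answers if name != person)
        return sorted(others, key=lambda other: distance(answers[person], answers[other]))[:k]

    # The worked example above: sqrt(2**2 + 2**2 + 1**2) = 3.0
    print(distance("ABE", "CDD"))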

Okay. Any mathematicians in the audience should take a moment to try and predict what actually happened. [Hint one: compare the volume of a 25-dimensional cube with that of the 25-dimensional sphere inscribed in that cube. Hint two: what's the average distance between two randomly-chosen points in the 25-dimensional cube? Between one randomly-chosen point and the center of the cube?]
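
If you'd rather cheat, a quick Monte Carlo check of the two hints (assuming, for the sake of argument, that answers spread uniformly over a 25-dimensional cube four units on a side) looks something like this:

    import random
    from math import sqrt, pi, gamma

    D = 25
    rng = random.Random(1)

    def random_point():
        # A random answer sheet, if answers were spread uniformly from A (0) to E (4).
        return [rng.uniform(0, 4) for _ in range(D)]

    def dist(p, q):
        return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    center = [2.0] * D  # the person of perfectly middling answers
    pair_dists = [dist(random_point(), random_point()) for _ in range(2000)]
    center_dists = [dist(random_point(), center) for _ in range(2000)]
    print(sum(pair_dists) / len(pair_dists))      # roughly 8.2: random person to random person
    print(sum(center_dists) / len(center_dists))  # roughly 5.8: random person to the center

    # Hint one: the ball inscribed in the cube (radius 2) is a vanishing sliver of it.
    ball_volume = pi ** (D / 2) / gamma(D / 2 + 1) * 2 ** D
    print(ball_volume / 4 ** D)                   # about 3e-11

In other words, the middle of the cube is a good deal closer to a typical participant than two typical participants are to each other, and essentially nobody actually lives anywhere near the middle.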

What happened in practice was that after we'd stuffed the envelopes and sent out the emails, some people on the Datamatch team started sharing their own lists with each other. The same names popped up on a bunch of lists, and at first, we figured, hey, we did a good job, it's pairing geeks up with each other. Then someone violated their professional ethics a bit, went into the database of results, and determined that one Mr. X., despite appearing on the lists of several girls on the team, saw none of them listed on his own. The same for Mr. Y. Miss A, Miss B, and Miss C, who were common to the lists of all the guys on the team -- same deal. None of the guys for whom Miss A was their number-one matchup even made her top ten.

I ran some statistical analyses at this point -- grep, wc, sort, and uniq, basically -- and uncovered the full horrifying truth. A full third of the hundreds of girls taking part were told that Mr. X. was one of their top ten matches. A small population of superstars grabbed dozens of spots for themselves, far more than the ten times you'd expect each person to show up. On the other end of things, there were over two hundred people who appeared on nobody else's list, or on only one list. The realization was shocking -- we were encouraging people to call up their computer-assigned romantic prospects, but those partners wouldn't know the unfortunate caller from a hole in the ground. The best possible result for us would be that people would just crumple up their Datamatch results and throw them out. This was when I understood that the whole thing was some sort of cosmic joke.
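
(For the record, the "analysis" was really just a tally of how many top-ten lists each name landed on. The Unix tools did the job, but in Python it would be something like the sketch below, where `matches` maps each participant to the list of ten names we mailed them.)

    from collections import Counter

    def popularity(matches):
        # How many other people's top-ten lists does each name appear on?
        appearances = Counter(name for top_ten in matches.values() for name in top_ten)
        superstars = appearances.most_common(10)                   # the Mr. X and Miss A crowd
        wallflowers = [p for p in matches if appearances[p] <= 1]  # on one list, or none at all
        return superstars, wallflowers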

Okay. Even you non-mathematicians should have a crack at this one: what went wrong? Why was it that Jane Doe could be one of Joe Schmoe's ten closest matches, while a hundred or more people were better matches for her than poor Schmoe? It wasn't even an issue of gender imbalance: that year the number of males and females involved was approximately equal, and there were about as many girls as boys with computer-assigned unrequited crushes.

Well, technically, the "problem" is that our intuition about 25-dimensional space sucks. Think about it this way: every question we added provided a new way to be a freak. Each A or E answer moves you toward the "outside" of the space. It's only on one question out of 25, but if most of your answers are kind of average, the few questions where you gave strong answers still tend to push you well away from the norm, because that whole squaring process exaggerates differences. On a 25-question form, almost everyone is a freak about something or other. There were a few people who gave almost completely innocuous answers, who were very close to the "average" across the campus. These people were reasonable matchups for absolutely everyone else participating. Not great, but kind of reasonable. Two people who were freaky in exactly the same way would get each other, but if you and I are even slightly different kinds of freaks, our two freakinesses add rather than cancel out. Result: both of us get Miss A (or Mr. X.) on our forms sooner than we get people like each other. There are several hundred of us and only ten spaces on her list; it's just our dumb luck that she never sees us. All this can be formalized to a ridiculous degree, but the basic problem is that Euclidean ("map") distance makes a lot of sense in two and three dimensions, yet it's a horrible notion of similarity for this kind of matching problem, because it treats one big difference far more seriously than it treats a dozen small ones.
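
If that sounds hand-wavy, two little arithmetic checks (made-up numbers, coded up below) make the point: two people who are freaky on different questions are each closer to Miss Perfectly Average than they are to each other, and a pair who split A-versus-E on a single question gets scored as a worse match than a pair who disagree by one notch on twelve different questions.

    from math import sqrt

    def dist(p, q):
        return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    # Answers coded 0-4, with 2 as the bland middle answer on every question.
    average = [2] * 25

    # Middling answers everywhere, except each goes to an extreme on one
    # (different) question: slightly different kinds of freak.
    freak_1 = [2] * 25; freak_1[0] = 0
    freak_2 = [2] * 25; freak_2[1] = 4

    print(dist(freak_1, average))   # 2.0  -- each freak is closer to Miss Average...
    print(dist(freak_1, freak_2))   # 2.83 -- ...than the two freaks are to each other

    # One big disagreement versus a dozen small ones:
    agree_except_one = ([0] + [2] * 24, [4] + [2] * 24)  # A vs E on one question, identical elsewhere
    one_notch_twelve = ([2] * 25, [3] * 12 + [2] * 13)   # one notch apart on twelve questions
    print(dist(*agree_except_one))   # 4.0  -- scored as a worse match...
    print(dist(*one_notch_twelve))   # 3.46 -- ...than twelve mild disagreements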

The next year, Dave and I volunteered right up front to do the programming, basically so we could rip out the Algorithm of Injustice and put something sensible in its place. In the end, we cheated. We took those useless Euclidean numbers and instituted a program of strict rationing. We forced reciprocity above all else: if X is on Y's list, then Y should be on X's list. We took those precious matchings with the highly-desirable people and handed them out as though we were running the NBA draft: everyone picks their top choice from those still available, then repeat as necessary, so that nobody could fill up their ten slots all at once with the campus hotties. Think of it as Communist matchmaking; we were determined that there was to be no hoarding, even if there was grain rotting in the fields.
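
The real code is long gone, so take this as a sketch of the draft in spirit rather than in letter: `prefs[p]` holds p's full Euclidean ranking of everyone else, best first, and each round every person claims their best remaining candidate who still has an open slot, with the match written onto both lists.

    def draft_matches(prefs, slots=10):
        # prefs[p]: p's ranking of all other participants, best match first.
        lists = {p: [] for p in prefs}
        for _ in range(slots):
            for p in prefs:                    # the draft order; one pick per person per round
                if len(lists[p]) >= slots:
                    continue                   # p's card is already full
                for candidate in prefs[p]:
                    if candidate not in lists[p] and len(lists[candidate]) < slots:
                        lists[p].append(candidate)   # reciprocity: the match goes
                        lists[candidate].append(p)   # on both people's lists
                        break
        return lists

Nobody can appear more than ten times, and every appearance is mutual, which was the whole point -- even if it meant handing some people a match the raw numbers rated as mediocre.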

The punch line of all this is that it wasn't even some abstract notion of attractiveness or some sort of socially-constructed notion of beauty that was causing these pack-mentality situations. It wasn't the jocks and the players and the tall-and-leggy who were getting all the attention; it was the people who were the most "average" as reflected by the mid-point answers of the questions written by the survey team, questions that reflected the quirkiness of the writers themselves. In other words, we might as well have thrown darts at a campus directory to pick out the Datamatch King and Queen, with probably around the same results. In the end, it was the mathematical structure of high-dimensional inner-product spaces that ruined Datamatch as a happiness-producing institution, rather than the beauty myth or the celebrity-obsessed media.

It's not every day you get to pin romantic problems on inner-product spaces.