Seven Wikipedia Fallacies

Please pardon the tone of this philippic. I’ve read too many ignorant complaints about Wikipedia recently, and I’m of a mind to set some things straight. Without further ado, please allow me to present rejoinders to seven common but fallacious claims about Wikipedia.

Wikipedia Modifies Entries

We begin with a category error. Wikipedia is “the free encyclopedia that anyone can edit,” not “the free encyclopedia that edits itself.” Every addition, change, or deletion is carried out by some individual Wikipedia contributor. To say that it was Wikipedia that made the modification is to confuse the encyclopedia with the editor. It’s like saying that New York City mugged you.

In other contexts, this kind of verbal slippage is harmless. There’s nothing particularly fallacious about saying that “the New York Times reported so-and-so” rather than naming the reporter, editor, and printing-press operator. If something shows up in the Times, then N times out of N+1, the Times as a bureaucracy has made a conscious decision that it ought to be printed. Under these circumstances, it makes sense to treat the actions of the individual as the actions of the entity.

Wikipedia doesn’t work this way. It’s open. Anyone can edit it. It does not necessarily follow that because some contributor made a particular modification, it must be the case that the modification reflects an official position of the Wikimedia Foundation, a consensus among the Wikipedia community, absolute truth, or anything else. It might. Frequently, it does not. Asking whether it does is the beginning of wisdom, because now you are engaged with the often messy processes by which Wikipedia evolves. But as long as you speak of Wikipedia itself as the source of the change, you are hiding the ball from yourself.

It is fine to black-box Wikipedia if all you are doing is looking up information in it. But if you wish to make statements about how it does or doesn’t work, it is necessary to look under the hood.

The Latest Word is the Last Word

More times than I would like to remember, I have seen someone complain that Wikipedia gets something wrong, as though that were the end of the matter. In the time it took to write your mournful post about the brokenness of the Wikipedia model, you could have gone in and made the necessary change. Indeed, in every case I have ever seen, the mere fact that someone took the time to complain has caused the Wikipedian platelets to swing into action and fix the mistake themselves. Pointing to specific examples of wrongness in Wikipedia is self-negating; in a month’s time, it is more likely that Wikipedia will have corrected the issue than that you will have acknowledged the correction.

This phenomenon is an instance of a more general point: Wikipedia is as much a process as it is a product. The process is designed to produce an encyclopedia of constantly improving quality. The encyclopedia is a constantly moving target. Thus, while it is reasonable to say that at present Wikipedia is defective for X purpose and in its coverage of Y, one should be cautious with attempts to extrapolate these failings too far into the future. (After all, if it’s extrapolation we’re engaged in, any fair assessment of how quickly Wikipedia has become as useful as it is would suggest that within a decade it will easily be the most comprehensive and useful reference work of all time.)

This fallacy is closely related to the first, in that both treat Wikipedia as monolithic and wholly consistent in all it does. It is not. Any change can always later be undone; many are. Entries change course as editors smooth them over; sudden outbreaks of attention cause entries that have gone astray to be sharply reworked by more experienced hands; experts who discover factual muddles in their fields clean up some of the muddles. Sometimes the reverse happens, too; Wikipedia’s improvement is hardly monotonic.

You cannot make sweeping claims about Wikipedia’s behavior in the limit merely by looking at the latest change. Or, at least, you cannot make such claims with much hope of correctness.

Wikipedia is Chaotic

The freedom inherent in the Wikipedia model is confusing and frightening. If assuming that Wikipedia will always and forever say what it says now is a prevalent mistake, the opposite mistake also claims many victims. They assert that because of its openness, Wikipedia must be a roiling sea, caught in a neverending process of constant flux. They see a million monkeys and a million typewriters. That something as ordered and stable as an encyclopedia could emerge from such tumult seems inconceivable.

This error is endemic to popular understandings of evolutionary processes. The same argument would “prove” that biological evolution is impossible, that free markets cannot work, and that the human brain is no more capable of thought than a bowl of oatmeal. Wikipedia, like other complex adaptive systems, exhibits different properties overall than it does at the micro scale. Yes, any given article may swing back and forth between two equally wrong claims. Yes, a random new user may make one change and muck up the grammar of the entry he touches. But the average edit, all in all, improves Wikipedia’s quality, and there are a lot of edits.

One of the things you quickly realize if you spend any significant time editing Wikipedia, or reading about Wikipedia, is that it has a rich and quite structured editing community. From its high-level editorial policies down through the nits of article naming conventions and linking styles, Wikipedians have an extensive library of best practices and wisdom-pooling processes to draw upon. The community makes decisions by persuasion, by consensus, by voting, and, if necessary, by fiat; but never forget that it makes decisions. That its decision-making takes place mostly bottom-up and on a self-directed as-needed basis does not mean that it doesn’t happen. Wikipedia is not an atomistic universe of monkeys each at its own typewriter; its contributors share, converse, debate, cajole, shout, and much much more. This surfeit of collective (and occasionally dictatorial) decision-making may irk some (and has led to some high-profile defections over the years), but it has also enabled Wikipedia to set any number of policies for itself. None are perfectly observed (nor could they be in such an open editing model), but again, on average they add order and direction.

To talk about Wikipedia as an encyclopedia and ignore the community is to miss much of the point.

The Vandals Will Have Their Way

It is also tempting to look at Wikipedia’s openness and assume that it cannot work. (That it has worked, and remarkably well, should have been proof against such temptation. The flesh is weak, it would appear.) If anyone can edit it, well then, what’s to stop the jerks from coming in and trying to trash the place? There are, after all, an awful lot of jerks out there.

Well, there are a lot of jerks out there, and they do try a lot of fairly antisocial tricks, but they don’t make much headway, all in all. Why not? First, because it is exceedingly hard to mess up Wikipedia or any individual entry in a way that cannot easily—trivially, even—be fixed. Keeping complete histories of every page (an underappreciated characteristic of Wikipedia to which we shall return) means that actual destruction is out of the question; the worst your average intruder can do is mess up the current state of a page. But in the revert war that will soon follow, Wikipedia enjoys a second advantage. There are far more people who are motivated to help Wikipedia than who are motivated to hurt it. The griefers, vandals, and script kiddies are up against committed and conscientious monitors who look both for widespread damage and for questionable edits. Third, the good guys have interior lines and the high ground. From IP banning to recent change monitoring, they have a toolkit that has been designed, in substantial part, to help them preserve the encyclopedia from its foes.

If you look at Wikipedia as it actually is, these factors together make it largely vandalism-free. While the number of malicious edits and the number of edits to fix the damage may be comparable, the average time that articles spend in damaged states is much smaller than the average time that they spend in fixed states. Large attacks are quickly detected and fixed; while small and malevolent changes may last longer, they are comparatively few and far between. (Anything more systematic would draw enough attention to itself that it would be quickly rooted out.) Thus, while it is always possible that a vandal, a propagandist, or a prankster will have come through recently, it is usually highly unlikely.
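
A back-of-the-envelope sketch may make the point about comparable edit counts concrete. It assumes a page simply alternates between a clean stretch and a damaged stretch, with one bad edit and one fixing edit per cycle; the function name and the numbers are made up for illustration and are not measurements of Wikipedia.

```python
def damaged_fraction(mean_hours_clean: float, mean_hours_damaged: float) -> float:
    """Long-run share of time a page spends damaged.

    Assumes a simple alternating cycle: the page stays clean for
    mean_hours_clean on average, is vandalized, and is then repaired
    after mean_hours_damaged on average. Each cycle contains exactly
    one malicious edit and one fixing edit.
    """
    cycle_length = mean_hours_clean + mean_hours_damaged
    return mean_hours_damaged / cycle_length


# Hypothetical numbers: vandalized about once every 30 days (720 hours),
# reverted about an hour later.
share = damaged_fraction(720.0, 1.0)
print(f"Share of time spent damaged: {share:.4%}")
```

With these invented figures the bad edits and the fixes are exactly equal in number, yet the page is in a damaged state only about 0.14% of the time. What matters is how long each state lasts, not how many edits of each kind there are.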

Once again, Wikipedia has good statistical properties. The correctness of any given claim it makes is not guaranteed; it is merely likely. And, as we have noted, that likelihood is growing with time.

Wikipedia is Unaccountable

Perhaps the favorite anti-Wikipedia talking point of those who have spent significant time in journalism is that Wikipedia cannot be authoritative because its editing model cannot properly vouch for the assertions it makes. This claim says less about Wikipedia than it does about the mental blocks of those who make it.

Is the problem that Wikipedia is anonymous, that each article does not prominently bear the byline of an author willing to stand behind it? The Economist is anonymous, and so is the Oxford English Dictionary. Indeed, Wikipedia is less anonymous than many stalwart fact-transmitting institutions. Just have a look through the edit history of your favorite page. You can see exactly who added which words, and with a few more clicks, what other articles they’ve contributed to. That’s a level of transparency that few other institutions achieve. Most of the time, you know exactly who made a particular assertion—and whether others have questioned or qualified it.

Is the problem that Wikipedia’s contributors aren’t credentialled experts? Scholars understand that transparency matters more than credentials. Wikipedia may not cite sources often, but it cites them more often than most of the competition.

Is the problem that Wikipedia doesn’t have institutional credibility, the way that a newspaper or publisher would? If so, then the argument is circular. If Wikipedia’s ability to return mostly-reliable facts quickly doesn’t help it build institutional credibility, what exactly would? It’s not always right, but it’s usually right.

Or is the problem simply that Wikipedia is a free encyclopedia that anyone can edit, rather than a large media corporation or a professional working for one? The last few years have not been good for scholars and journalists who say, “Trust me; I’m a professional.” Perhaps the Wikipedia model might have something to offer, here. Not so much “Trust me; I’m an amateur” as “Trust us, we’re a whole huge bunch of amateurs, and our biases and blind spots mostly cancel out.”

Wikipedia Pays Too Much Attention to Trivial Topics

It is sometimes noted that Wikipedia’s coverage is disproportionately heavy on pop culture and Internet phenomena: the sort of stuff, it is frequently claimed, that Wikipedia contributors would be likely to care about. That is as may be; there’s no denying that Star Trek, say, receives far more extensive coverage in Wikipedia than in any traditional encyclopedia ever printed. It remains to be shown, however, why there’s anything wrong with extensive Star Trek coverage.

In the first place, it is not as though the entry on the Rules of Acquisition is taking up precious space that could have gone to expanding the entry on Ignatius Loyola Donnelly. They coexist in complete harmony; one’s gain need not come at the other’s expense. Such are the virtues of online publication.

Thus, perhaps the complaint is that all the time spent cataloguing the sayings of the Ferengi could have been better spent on the link between Donnelly’s Populism and his belief in Atlantis. Perhaps. But it is not as though Wikipedia has a budget that it squandered on the Grand Nagus. I sincerely doubt that there is a significant relationship between effort devoted to the one and effort devoted to the other. If Wikipedia were somehow to shut down its Star Trek section, it is not as though the editing pace in its other sections would pick up the slack. The limiting factor on revisions to Ignatius Loyola Donnelly’s entry is probably, well, the anemic level of interest in Ignatius Loyola Donnelly. Once again, the allegedly trivial entries neither help nor hurt the supposedly serious ones.

So maybe the argument is that the fluff somehow degrades the tone of the encyclopedia. But that can’t be much of a concern either. Are we really going to discredit something useful because it also chooses to be fun? No one ever forces you to read about Star Trek. If you want to know how a Geneva drive works, what matters to you is that Wikipedia have a damn good entry on it. That the same web site also contains a multi-part list of fictional cities is neither here nor there.

In the end, I suspect that this complaint is really based in a sense that certain topics are unworthy of serious attention, and that Wikipedia makes the world worse by giving them such attention. Put another way, certain stuff just doesn’t belong in an encyclopedia. To which I (and the thousands of Wikipedia contributors responsible for that stuff) say “Ack Thbbbt.” If this many people care about it, and care about it enough to curate extensive and well-organized expositions of it, who is to say that they are wrong? The argument that these topics degrade the quality of Wikipedia amounts to an argument not just that the plebs is wrong to care about the things it cares about, but that it should not be given the resources to learn about them. While I can be as snooty about my media diet as the next guy, you won’t find me saying that Entertainment Tonight should be banned, or that the Star Trek section of Wikipedia should be deleted.

Wikipedia is Perfect

Having just debunked six attacks on the possibility of Wikipedia, I may appear something of a booster. I had better reestablish my credibility as a realist with some pointed observations about the genuine issues that Wikipedia does face.

Wikipedia as it now exists has many problems. Many entries have small factual errors; many more are terribly organized. There are duplications and inconsistencies and strange taxonomies galore. Its coverage of many subjects is thin; its references to good further reading are often highly spotty. The great majority of entries could use a good copy-editing.

Wikipedia’s editing conventions are not particularly legible to outsiders or to new contributors. You can quickly figure out Wiki syntax, but it is less easy to figure out how you should categorize and cross-link a new entry. Do you need to figure out how to participate in a deletion debate? Good luck understanding all the relevant conventions your first time through. Many of the hurt feelings and crossed lines that contribute to the confusions above are side effects of this steep learning curve.

The Wikipedia community is very much a work in progress. Many of its purported policies are observed mostly in the breach. It has developed any number of healthy habits and productive practices, but it also has a fair number of difficult personalities and frustrating tics. Some of these unfortunate tendencies may be cleaned up as its norms evolve and solidify, but that same process of regularization may squeeze out some of the flexibility that has allowed Wikipedia to grow so quickly. It has a suspicion of expertise that is not just unwarranted but actively counter-productive. And, oh man, if I had a dollar for every time a Wikipedian has jumped to conclusions or been too abrupt in dealing with an outsider’s attempt to engage with Wikipedia.

I’m optimistic about Wikipedia’s future. It’s done great things in an astonishingly brief time. Every time I turn around, it’s just gotten better and better. I see no fundamental obstacles that would keep it from being the universal encyclopedia it aspires to be. But it also faces some significant challenges. There are security, social, legal, academic, cultural, technical, and financial shoals ahead. It is quite possible that any one of these could sink it.

But perhaps the best way to appreciate how remarkable the Wikipedia model and the Wikipedia community are is to ask what would happen if something did go catastrophically wrong and Wikipedia became unusable. We would still have the encyclopedia that Wikipedia is today. Thanks to the open license under which Wikipedia has been developed and made available, anyone with a snapshot of it—and there are many people with snapshots, even if most of them are link farmers—could continue to serve it up to the world. In the last half decade, almost entirely with volunteer labor, Wikipedia has created a quite credible first cut at an encyclopedia. That’s no mean feat. That the community is thriving and shows every sign of producing credible second, third, and further cuts … well, that’s just our extraordinary good fortune.

Wikipedia is great, and I find it both useful and entertaining, and I contribute myself when I have the chance. However, I think you’ve glossed over the ‘fallacy’ of its unreliability. It’s a real problem if you treat it as a serious reference source. If you have other expectations, it’s not so important, and perhaps the tradeoff (the possibility of unreliability in return for explosive growth and breadth) is acceptable. (I think there’s a place in the world for both a smaller and more scholarly Britannica and a sprawling and less scholarly Wikipedia.)

Is the problem that Wikipedia’s contributors aren’t credentialled experts? Scholars understand that transparency matters more than credentials. Wikipedia may not cite sources often, but it cites them more often than most of the competition.

Is the competition random web sites (in which case I agree that Wikipedia wins) or Britannica and other encyclopedias or specialized reference books (in which case Wikipedia loses)?

A lack of careful citations matters more with non-experts. Although experts can make mistakes, they’re more likely to be more accurate and have a better sense of bias and importance than a layman. Scholars can also write skewed articles—my wife once had a revert battle with a credentialled expert who kept rewriting a very general article to focus on his own fringe contributions—but the editorial process of a good reference book keeps the bias from seeing print.

Not so much “Trust me; I’m an amateur” as “Trust us, we’re a whole huge bunch of amateurs, and our biases and blind spots mostly cancel out.”

That’s true, but it just underlines my point—even if the biases, etc. cancel out, and even if the amount of accurate information dwarfs the total information in any other reference, any individual entry at any given time is more likely to have a serious problem than the corresponding article in a good scholarly reference book.

On “the latest word is the last word”: Yes, I can fix any error I know about. The trouble is that when I’m reading Wikipedia for information I don’t already know, I don’t know what to trust. Any individual error will be fixed eventually, but at any given moment, there are likely to be unfixed errors. The fact that there is no “final word” isn’t an asset of the Wikipedia model; it’s a liability, because there is no authoritative version.

If we really were going to talk about Wikipedia’s behavior “in the limit” in the mathematical sense, the limit turns out to be undefined, because the edits do not get smaller and smaller as time goes by. There isn’t one “final” version that the article will tend to approach more and more closely. Instead, there will always be some level of noise (from vandalism if nothing else) that is constantly being added and edited out.

Other encyclopedias contain errors, too. This issue doesn’t necessarily make other encyclopedias preferable to Wikipedia. But it does make them different from Wikipedia, because other encyclopedias stop their fact-collecting at some point before they stop their fact-checking; much like software companies having a pre-release code freeze. Readers have some assurance that every statement in an article has been reviewed. Wikipedia doesn’t work like that. New material is added at the same time old material is reviewed, and at any given moment, there’s almost always some unreviewed material on the page.

DMF and Matthew: you both raise, in different ways, a very important issue: the relationship between growth and accuracy. Wikipedia has shown that loosening controls can greatly open up the overall growth rate without making the overall accuracy catastrophic. But there is a serious role for disciplined, careful accuracy in more mature pages. Matthew’s point about fact collecting and fact checking suggests one way that institutions can drive towards accuracy: make the reverse change and (temporarily) lock down innovation while fixing the bugs. Analogous processes certainly work for many open source projects that use (new) code freezes in advance of preparing a release.

The media literacy aspects of using Wikipedia are something else I gave short shrift to, and DMF appropriately calls me on it. In real life, not everyone using Wikipedia (especially those finding it through search engines) will understand its model or how best to check on assertions it makes.

My rant, I’ll admit, was mostly directed at claims of the form “Wikipedia can never work because of X.” When one gets into more modulated terrain, the questions of when Wikipedia works well and when it doesn’t have more subtle answers.

And oh, one more thing. I neglected to mention an example of what I consider to be more thoughtful and careful writing about Wikipedia. Roy Rosenzweig’s essay about it from an academic historian’s perspective is balanced and intelligent, and grapples appropriately with the social dynamics that drive Wikipedia’s editing process.

I found your post via Jessamyn’s blog, so I want to give her credit for leading me here.

You make some very interesting and mostly valid points about Wikipedia, but I think that you gloss over the many problems that afflict Wikipedia. Wikipedia is a useful reference tool for many different topics, but it has some fundamental problems.

I come to Wikipedia as a person with some experience as a reference librarian. I started one of the first digital libraries back in 1992 and for the past 11 years I’ve run Infoshop, an alternative information resource. Infoshop currently operates four wikis and I run several others for internal projects. The Infoshop wikis rely heavily on seeded content from Wikipedia, so I’ve spent many hours looking at Wikipedia pages. I’ve also contributed original edits to Wikipedia, as well as substantial additions to a few pages.

Wikipedia has implemented many changes this year which address obvious problems with the project. They’ve made it harder for new users to edit pages. The project has tougher policies about sourcing of content. Users are subject to more policies and guidelines and Wikipedia culture. Many of these changes are important and make the project better.

People know that I have a love-hate relationship with Wikipedia. Most of my negative feelings are colored by interacting with people who post content to entries about me and my projects which is malicious and borderline libellous. Your take on Wikipedia would have it that the Wikipedia community would quickly fix these changes to maintain an accurate entry, but that just isn’t the way it works when it comes to contested content. Last year I had my editing privileges revoked after trying to prevent people from using the entry on me to spread rumors about me and violate my privacy. More recently, several folks have attacked me for political reasons and they’ve tried to use Wikipedia policies to keep their changes visible. On a positive note, I finally found several hardcore Wikipedians who helped me fend off these attacks.

One of these users enlisted a friend who is another Wikipedia editor to attack me by proxy. The primary attacker is exploiting Wikipedia’s anonymity feature to engage in unaccountable behavior towards me and towards the project. Lack of accountability is a serious issue with Wikipedia. Serious Wikipedians are more accountable because they have more time invested in the project and they enjoy their work on the project. Other people use anonymous accounts to vandalize entries and cause headaches for other Wikipedians. You argue that vandalism is quickly found and fixed, but in reality this is not the case for less noticed pages. Contested and popular pages have many eyes looking at them, so vandalism is confronted more quickly.

Wikipedia is more chaotic than people realise. The project does have many standard templates, formats, styles and so on, but these things break down. For example, in recent days I was adding to the bibliography on the Infoshop OpenWiki and discovered that the “Further Reading” section on Wikipedia pages uses several different bibliographic styles. This may not be important to most readers, but the problem does exist. It’s probably a case of people starting sections and not bothering to look up Wikipedia’s established style for bibliographies.

I found another curious problem on Wikipedia earlier this year, although this problem is the type most likely to be picked up by people who are writers and editors. While going through dozens of pages on science fiction films, I found more than a few entries which started out by talking about the film in the past tense. Films should be written about in the present tense. They may have been produced years ago, but they are intellectual products which still exist. This may seem minor, but I found this just by poking around for hours.

Wikipedia does pay too much attention to trivial topics. Wikipedia is very good when it comes to popular culture, but lacks substance when it goes off the beaten TV-pop culture track. Entries on video games are usually long and detailed, whereas entries on famous dead people are often pretty short. Even when it comes to pop culture, I found that entries on movies were generally longer than entries on books in the same genre. Of course, Wikipedia includes many excellent entries on important subjects, but it is generally known that Wikipedia has a bias towards pop culture and contemporary topics.

I’m not as optimistic as you are about Wikipedia’s future. The project as it stands now is pretty useful and exciting, but the project will eventually run into stagnation, bureaucracy and timidity due to legal issues. Tonight I ran into a problem that illustrates the troubled future that Wikipedia is facing. I uploaded an image of a record cover to provide a graphic to go on an entry about a band. Wikipedia is getting more restrictive about copyright, even if it shows more options. From what I know about copyright law, it shouldn’t matter if I get permission to use that graphic. The image is of a record cover. Newspapers and media outlets commonly use pictures of album and book covers without getting permission.

Anyway, a few of my thoughts.