They Only Need to Be Good on Average


I like to think that I have good statistical horse sense, but that’s no substitute for knowing the subject properly. I’m seeking suggestions on good statistics texts. Longtime readers know that I’m not afraid of math, but in this case, I’m not especially interested in reading proofs of the most general case or derivations of the tightest possible bounds. Instead, I’d like to learn:

  • Essential theory and notation of random variables and distributions with a reasonable degree of rigor.
  • The most significant distributions in a statistician’s toolkit, with good coverage of their distinguishing properties.
  • A well-explained treatment of statistical inference that discusses how to choose among tests of significance and which assumptions each one requires.
  • Some queueing theory, with an emphasis on the decisions that inform one’s choice of model.
  • Across all of the above, a satisfyingly elegant treatment of the common mathematical techniques that statisticians use when trying to understand the world, predict outcomes, simplify messy calculations, and minimize error terms.

I could go to the library and start pulling books off the shelves, or go online and read lots of reviews, but I figure that the readership here may have some better suggestions of particularly nice books on these topics.


I cut my teeth on Morris DeGroot’s “Probability and Statistics”; it has a strongly Bayesian flavor, because DeGroot was strongly Bayesian; divide through as necessary. Also: get the earlier edition, not the edition edited by Mark Schervish. You will thereby save yourself a nontrivial amount of money.

Feller’s 2-volume work is, as you may know, canonical, but it is also remarkably rigorous and very British; I’ve never been able to make it far into Feller.

My CMU professor, Larry Wasserman, has published a couple of books about which I’ve heard good things: “All of Statistics” and “All of Nonparametric Statistics”; I’ve not checked them out. CMU’s best stats professor (at least while I was there; Cosma Shalizi is there now, and I’d have to bet that he kicks butt as a teacher), Chris Genovese, has had a book in production for a while. He had a draft PDF up on his site for a while, but it seems to have come down now. I TAed for him; his material was always extremely useful, starting right away with non-toy problems geared to engineers. (E.g., use statistical methods to figure out the running time of a compiler, based on some models of how frequently new tokens appear in the source-code stream.)

A couple other canonical ones, which may or may not be outdated by now: the two classic works by Erich Lehmann, late of UC Berkeley. They’re entitled “Testing Statistical Hypotheses” and “Theory of Point Estimation.”

That’s probably more information than you need. The quick takeaway is: DeGroot. You’ll get all the distributions you need, all the statistical testing, and so forth. You will be able to explain why the F-statistic looks the way it does, you will be able to describe the probability distribution of F-statistics, and you will learn a whole set of methods for deriving the distributions of other statistics. It’s a really great book.


Oh, I meant to include one that is somewhere between DeGroot and Lehmann: Casella/Berger’s “Statistical Inference.” Great text. Advanced undergrad or beginning graduate level.

Also, check out anything by Andrew Gelman, whose book “Red State, Blue State, Rich State, Poor State” just came out. “Bayesian Data Analysis,” which he wrote with Carlin, Stern, and Rubin, is a classic, though specialized.

Finally, since we’re talking specialization: you might enjoy “Markov Chain Monte Carlo in Practice,” if you’re interested in that kind of thing. It might tickle the computer scientist in you. The idea is that you want to simulate a complicated random variable whose distribution you can’t sample from directly, so you build a Markov chain that has that distribution as its steady state and run the chain for a long time. That’s quite specialized, but also quite fun. And if you’ve ever generated random variables on a computer, you’d certainly appreciate MCMC. (Historical bonus: the initial theory came out of the Manhattan Project. They needed to do complicated integrals, and it turns out that you can approximate a large class of integrals as the expected value of a specially chosen random variable.)
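For the curious, here is a rough Python sketch of both ideas in that paragraph. It is an illustration under my own assumptions, not anything taken from the book: the function names `mc_integral` and `metropolis` are made up, the integrand (x squared on [0, 1]) and the target (a standard normal, known only up to a constant) are toy choices, and the sampler is the plain random-walk Metropolis variant, which is only one of many MCMC schemes.

```python
import math
import random


def mc_integral(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] as (b - a) * E[f(U)], U ~ Uniform(a, b)."""
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n


def metropolis(log_density, start, step, n, rng):
    """Random-walk Metropolis chain.

    log_density is the log of the target density up to an additive constant.
    The chain's stationary distribution is the target, so late samples
    behave approximately like (correlated) draws from it.
    """
    x = start
    samples = []
    for _ in range(n):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_density(proposal) - log_density(x)
        # Accept with probability min(1, target(proposal) / target(x)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


rng = random.Random(0)  # fixed seed so the run is repeatable

# Toy integral: the true value of the integral of x^2 over [0, 1] is 1/3.
integral_estimate = mc_integral(lambda x: x * x, 0.0, 1.0, 100_000, rng)

# Toy target: standard normal, specified only up to its normalizing constant.
chain = metropolis(lambda x: -0.5 * x * x, 0.0, 1.0, 50_000, rng)
kept = chain[5_000:]  # discard early samples as burn-in
chain_mean = sum(kept) / len(kept)
chain_var = sum((x - chain_mean) ** 2 for x in kept) / len(kept)

print(integral_estimate, chain_mean, chain_var)
```

The integral estimate should land near 1/3, and the chain’s mean and variance near 0 and 1. Note that the sampler never needs the normalizing constant of the target, which is a large part of why the method is useful when the distribution is too messy to sample from directly.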


More lightheartedly, how about the Cartoon Guide to Statistics, by Larry Gonick?


I’ve got the Cartoon Guide to Statistics; I’m looking to kick it up a few notches from there.