Skeptophilia (skep-to-fil-i-a) (n.) - the love of logical thought, skepticism, and thinking critically. Being an exploration of the applications of skeptical thinking to the world at large, with periodic excursions into linguistics, music, politics, cryptozoology, and why people keep seeing the face of Jesus on grilled cheese sandwiches.

Thursday, November 11, 2010

The law of small numbers

Yesterday, I had a perfectly dreadful day.

The events varied from the truly tragic (receiving news that a former student had been killed in an automobile accident) to the awful but mundane (finding out that my son was not getting $2000 of financial aid we counted on, because the financial aid office screwed up and sent the papers to the wrong address) to the "I'll-probably-laugh-about-this-later-but-right-now-I'm-not" (finding out that my dog, Grendel, has figured out how to climb our chain-link fence, and so now has to be escorted outside on a leash every time he wants to go potty) to the completely banal (a school meeting that left me feeling like I'm ready to find another career).

All of this brought to mind the idea of streaks of bad (or good) luck -- a notion people are so thoroughly convinced of that it's nearly impossible to talk them out of it.  We've all had days when everything seems to go wrong -- when we have what my dad used to call "the reverse Midas touch -- everything you touch turns to crap."  There are also days, regrettably fewer, when we seem to have inordinately good fortune.  My question of the day is:  is there something to this?

Of course, regular readers of this blog are already anticipating that I'll answer "no."  There are actually three reasons to discount this phenomenon.  Two have already been the subjects of previous blog posts, so I'll only mention them in brief.

One is the fact that the human brain is wired to detect patterns.  We tend to take whatever we perceive and try to fit it into an understandable whole.  So when several things go wrong in a row -- even when, as with my experiences yesterday, they are entirely unrelated occurrences -- we try to make them into a pattern.

The second is confirmation bias -- the tendency of humans to use insignificant pieces of evidence to support what we already believe to be true, and to ignore much bigger pieces of evidence to the contrary.  I had four bad things, of varying degrees of unpleasantness, occur yesterday.  By mid-day I had already decided, "this is going to be a bad day."  So any further events -- the school meeting, for example -- only reinforced my assessment that "this day is going to suck."  Good things -- like the fact that my classes actually went rather well, and the fact that my lovely wife brought me a glass of red wine last night after dinner -- get submerged under the unshakable conviction that the day was a lost cause.

It's the third one I want to consider more carefully.

I call it the Law of Small Numbers.  Simply put:  in any sufficiently small data sample, you will find anomalous, and completely meaningless, patterns.

To take a simple model:  let's consider flipping a fair coin.  You would expect that if you flip said coin 1000 times, you will find somewhere near 500 heads and 500 tails.  On the other hand, what if you look at any particular run of, say, six flips?

In any six-flip run, the statisticians tell us, all possible sequences are equally likely; a pattern of HTTHTH has exactly the same likelihood of showing up as does HHHHHH -- namely, 1/64.  The problem is that the second looks like a pattern, and the first doesn't.  And so if the second sequence is the one that actually emerges, we become progressively more amazed as head after head turns up -- because somehow, it doesn't fit our concept of the way statistics should work.  In reality, if the second pattern amazes us, the first should as well -- when the fifth coin comes up tails, we should be shouting, "omigod, this is so weird" -- but of course, the human mind doesn't work that way, so it's only the second run that seems odd.
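
To see the 1/64 figure fall out of the arithmetic, here is a minimal Python sketch.  It assumes a fair coin and independent flips, and the function name sequence_probability is my own invention for illustration, not anything standard.

```python
from fractions import Fraction
from itertools import product


def sequence_probability(seq, p_heads=Fraction(1, 2)):
    """Probability of seeing exactly this sequence of flips, e.g. 'HTTHTH'.

    Assumes each flip is independent; p_heads is the chance of heads
    on any single flip (1/2 for a fair coin).
    """
    prob = Fraction(1)
    for flip in seq:
        prob *= p_heads if flip == "H" else 1 - p_heads
    return prob


# Every possible six-flip run, from HHHHHH through TTTTTT.
all_runs = ["".join(flips) for flips in product("HT", repeat=6)]

print(len(all_runs))                   # 64
print(sequence_probability("HTTHTH"))  # 1/64
print(sequence_probability("HHHHHH"))  # 1/64
```

Both sequences come out to 1/64, and the 64 possible six-flip runs between them account for all of the probability.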

All of this brings up how surprisingly hard it is for statisticians to model true randomness.  If a sequence of numbers (for example) is truly random, all possible combinations of two numbers, three numbers, four numbers, and so on should be equally likely.  So, if you have a truly random list of (say) ten million one-digit numbers, there is a possibility that somewhere on that list there are ten zeroes in a row.  It would look like a meaningful pattern -- but it isn't.
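
For a rough sense of how likely such a run is, here is a small sketch that counts the expected number of places in the list where a run of ten zeroes begins.  It assumes each digit is independent and equally likely to be any of 0 through 9; the function name expected_run_starts is hypothetical.

```python
from fractions import Fraction


def expected_run_starts(n_digits, run_length, n_symbols=10):
    """Expected number of positions at which one particular digit
    (say, zero) appears run_length times in a row.

    Assumes independent, uniformly distributed digits.  Overlapping
    runs are counted separately, so eleven zeroes in a row count as
    two starting positions for a run of ten.
    """
    if run_length > n_digits:
        return Fraction(0)
    starts = n_digits - run_length + 1        # possible starting positions
    return starts * Fraction(1, n_symbols) ** run_length


ten_zeroes = expected_run_starts(10_000_000, 10)
six_zeroes = expected_run_starts(10_000_000, 6)

print(float(ten_zeroes))
print(float(six_zeroes))
```

For ten million digits the first figure works out to about 0.001, so a run of ten zeroes is a long shot in a list that size, though nothing forbids it.  Shorter runs are another matter: the expected number of places where six zeroes in a row begin is about ten.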

This is part of what makes it hard to create truly randomized multiple-choice tests.  As a science teacher, I frequently give my classes multiple-choice quizzes, and I try to make sure that the correct answers are placed fairly randomly.  But apparently, there's a tendency for test writers to stick the correct answer in the middle of the list -- thus the high school student's rule of thumb, which is, "if you don't know the answer, guess 'c'."

Randomness, it would seem, is harder to detect (and create) than most people think.  And given our tendency to see patterns where there are none, we should be hesitant to decide that the stars are against us on certain days.  In fact, we should expect days where there are strings of bad (or unusually good) occurrences.  It's bound to happen.  It's just that we notice it when several bad things happen on the same day, and don't tend to notice when they're spread out, because that, somehow, "seems more random" -- when, in reality, both distributions are random.
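
A quick simulation makes the point concrete.  The sketch below scatters a year's worth of bad events across the calendar purely at random and then counts how many days ended up with a pile-up.  The numbers are made up for illustration (365 events over 365 days, an average of one per day), and the function names and the seed are my own choices.

```python
import random
from collections import Counter


def scatter_events(n_events, n_days, seed=None):
    """Drop n_events onto n_days uniformly at random.

    Returns a list with the number of events that landed on each day.
    """
    rng = random.Random(seed)
    per_day = Counter(rng.randrange(n_days) for _ in range(n_events))
    return [per_day.get(day, 0) for day in range(n_days)]


def days_with_at_least(counts, threshold):
    """How many days had at least `threshold` events."""
    return sum(1 for c in counts if c >= threshold)


# Hypothetical numbers: one bad event per day on average.
counts = scatter_events(n_events=365, n_days=365, seed=2010)

print("days with no bad events:", counts.count(0))
print("days with 4 or more:    ", days_with_at_least(counts, 4))
```

Nothing in the code knows about luck, and the events have no connection to each other, yet some days come up empty while others collect several.  Change the seed and the particular bad days move around, but the clumping is always there in some form.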

I keep telling myself that.  But it is hard to quell what my mind keeps responding -- "thank heaven it's a new day -- it's bound to be better than yesterday was."

Well, maybe.  I do agree with what my dad used to tell me: "I'd rather be an optimist who is wrong than a pessimist who is right."  I'm just hoping that the statisticians don't show up and burst my bubble.

2 comments:

  1. Very nicely explained; truly a difficult point to make clear
    And all this without mention of the amazingly counter-intuitive Benford's Law. That the first digit of most data-fields lists of numbers is much more likely to be a '1', with the probabilities approaching a known percentage as the base becomes larger. I'm not doing it justice here, of course. I only recall that I did understand, after hours of brow-wrinkling, why this should be so. Back years ago on a great hair day.

  2. I don't know about Benford's Law -- I'll have to look that one up. I love those statistical effects -- I'm no expert in statistics, but that sort of thing explains so much in the way that we interpret what we experience.

    It also brings to mind what a statistician once said -- that the lottery was a "tax on people who don't understand statistics."

    Thanks so much for your thoughtful & interesting comments on my posts -- they make my day (and I've missed them!).
