Terms like “margin of error” and “standard deviation” are less known to the general public than, say, “average” or “median.” But based on controversies related to close election results in the U.S. over the last decade, these are terms we should start becoming familiar with.

To understand the fallibility of election results, we have to understand pure numbers. What is a pure number? When I say “one,” that’s a pure number. When I measure 1 cup of broth for a soup I’m making, that’s not a pure number. Why? Maybe I didn’t fill the measuring cup all the way to the top and ended up with 0.97 cup. Maybe the measuring cup manufacturer messed up and the container actually holds 1.05 cups.

If you have a child and you take him/her to multiple doctors, you might be familiar with this concept. Doctor #1 measures and weighs your child and tells you one thing, but then Doctor #2 performs the exact same measurements and gets slightly different results. That’s because measurements of any kind will always be imperfect numbers.

But with elections it’s “one voice, one vote,” you might argue. True. You might fill out your ballot and say “That’s one vote for X.” But then you drop that vote into a box or in the mail and that’s when the trouble starts. ^{[1]}

In a chapter of his 2010 book “Proofiness: The Dark Arts of Mathematical Deception,” Charles Seife details the recounts in Minnesota’s contested 2008 Senate race between Republican Norm Coleman and Democrat Al Franken, as well as the more widely known issues in the 2000 presidential election. If you are still chanting “one voice, one vote,” you might want to sit down.

In 2008, Franken squeaked to a narrow victory partly because absentee ballots that had been tossed aside for lacking dates on their envelopes were ultimately allowed; under state law, a missing date does not make a vote inadmissible. This was not contested until the recount. Imagine: some poor Minnesotan filled out an absentee ballot without a date on the envelope, thinking “That’s one vote for X,” not knowing that his/her vote wasn’t originally going to be counted at all.

And recounts are far from perfect. A 2012 study from Rice University and Clemson University found that recounts themselves can have an error rate of 1 or 2 percent. Why? Because vote counts, like measuring cups and weighing toddlers, are imperfect measurements prone to human error. It’s 2016, and boxes of votes still get accidentally misplaced in elections all around the world.

If election results and recounts are prone to error, why aren’t we using a margin of error to describe them? Well, first you have to figure out how to calculate that margin.

Going back to the doctor example, let’s say you have five height measurements of your kid: in inches, 35.7, 34.8, 35.4, 36 and 35.2. So how tall is your kid? Your best bet is to take the average of those five numbers because the truth is somewhere around that. That number is about 35.4 ^{[2]}. But that’s still not the truth. It’s an educated guess. Notice a couple of sentences back I wrote “somewhere around that.” This is where a margin of error comes in.

To figure out the margin of error, we need the standard deviation. Here’s what that looks like:

$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^N (x_i - \overline{x})^2}$$

It’s not as overwhelming as it looks. Starting from the far right, we subtract the mean (\(\overline{x}\)) from each number in the set (\(x_i\)), square each result, add them all up, then divide by the number of observations minus one ^{[3]}. That gives you the variance, or \(s^2\). You can also think of variance as the “spread.” Take the square root of that and you have the standard deviation.
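As a minimal sketch of that recipe in Python, here is the arithmetic applied to the five height measurements from the doctor example. The variable names are only for illustration, and the standard library's `statistics.stdev`, which also divides by N - 1, is used as a cross-check.

```python
import math
import statistics

# The five height measurements (in inches) from the doctor example.
heights = [35.7, 34.8, 35.4, 36, 35.2]

n = len(heights)
mean = sum(heights) / n

# Subtract the mean from each measurement, square the result, and add them all up.
squared_deviations = sum((x - mean) ** 2 for x in heights)

# Divide by N - 1 to get the variance, then take the square root.
variance = squared_deviations / (n - 1)
std_dev = math.sqrt(variance)

print(f"mean = {mean:.2f}")
print(f"variance = {variance:.3f}")
print(f"standard deviation = {std_dev:.3f}")

# Cross-check against the standard library's sample standard deviation.
assert math.isclose(std_dev, statistics.stdev(heights))
```

Run on the measurements exactly as listed, this gives a standard deviation of about 0.460. Rounding each measurement to the nearest whole inch first would give about 0.548 instead, which shows how sensitive a five-number sample is to small changes.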

In the height example, that gives us about 0.46.

Now we need to know how confident we can be of your child’s true height. In statistics, there is something called the “68-95-99.7 rule” regarding confidence intervals and standard deviation. Basically, 68% of your observations will fall within one standard deviation of the mean, 95% will fall within two standard deviations, and 99.7% will fall within three. In other words, about 95% of measurements like these should land between 34.5 and 36.3 inches, so you can be fairly confident that your child’s true height is somewhere in that range. That’s your margin of error.
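A rough sketch of that range calculation, assuming the same five measurements and the two-standard-deviation rule of thumb (the names `margin`, `low` and `high` are just for illustration):

```python
import statistics

# The five height measurements (in inches) from the doctor example.
heights = [35.7, 34.8, 35.4, 36, 35.2]

mean = statistics.mean(heights)
std_dev = statistics.stdev(heights)  # sample standard deviation (divides by N - 1)

# Rule of thumb: about 95% of measurements land within
# two standard deviations of the mean.
margin = 2 * std_dev
low, high = mean - margin, mean + margin

print(f"{mean:.1f} inches, give or take {margin:.1f}")
print(f"range: {low:.1f} to {high:.1f} inches")
```

Swapping the 2 for a 1 or a 3 gives the 68% and 99.7% versions of the same range.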

Now some caveats. Everything I described above is based on a reliable average and a good sample size. The average is not reliable if your data doesn’t follow what is called a normal distribution — more commonly known as the bell curve. For the height example, I chose numbers that approximate a normal distribution.

If your distribution is skewed — imagine if one of the height observations were 38 inches — then your average is no longer reliable, which means your standard deviation is no longer reliable. It messes up everything! This actually happens a lot with small sample sizes. ^{[4]}
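To see how much a single stray measurement can matter, here is a small sketch comparing the original five heights with a version containing a 38-inch reading. Which measurement gets replaced is my own assumption for illustration; here the 36-inch one is swapped out.

```python
import statistics

original = [35.7, 34.8, 35.4, 36, 35.2]
# Assumption for illustration: the 36-inch reading is replaced by a 38-inch one.
with_outlier = [35.7, 34.8, 35.4, 38, 35.2]

for label, data in [("original", original), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)
    std_dev = statistics.stdev(data)  # sample standard deviation (divides by N - 1)
    print(f"{label}: mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
```

With only five observations, one odd reading pulls the average up by almost half an inch and nearly triples the standard deviation.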

What does this mean for election results? The more times the votes are counted (a larger sample size), the more accurate our measurement will be. Elements of the vote counting would also have to be randomized; for example, you could never have the same group of people counting every time.

In an ideal world, we could run 100 recounts and be satisfied with the results. But we’re in the real world and that’s a very expensive proposition. (However, it would be worth it for identifying elections where the margins of error for the two candidates overlap, potentially triggering a new election. Or it could raise serious concerns if there is high variability in the results, as with the absentee ballots that were tossed out in Minnesota.)
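Here is a hedged sketch of what checking for overlapping margins of error might look like. The recount totals are made-up numbers, not real election data, and the helper functions `two_sd_range` and `ranges_overlap` are hypothetical names of my own; the two-standard-deviation range is the same rule of thumb used for the height example.

```python
import statistics

def two_sd_range(counts):
    """Return (low, high): the mean of the counts plus or minus two sample standard deviations."""
    mean = statistics.mean(counts)
    margin = 2 * statistics.stdev(counts)
    return mean - margin, mean + margin

def ranges_overlap(a, b):
    """Return True if two (low, high) ranges share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical totals from five recounts of the same race (made-up numbers).
candidate_x = [10_120, 10_095, 10_140, 10_110, 10_085]
candidate_y = [10_090, 10_130, 10_075, 10_105, 10_100]

range_x = two_sd_range(candidate_x)
range_y = two_sd_range(candidate_y)

print(f"X: {range_x[0]:.0f} to {range_x[1]:.0f}")
print(f"Y: {range_y[0]:.0f} to {range_y[1]:.0f}")
print("too close to call" if ranges_overlap(range_x, range_y) else "clear winner")
```

In this invented example, candidate X averages 10 more votes than candidate Y, but the two ranges overlap, so the recounts alone could not separate them.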

The questions are: How accurate do we want our election results to be and how much money are we willing to spend to achieve that? If the United States seriously believes in “one voice, one vote,” we apparently have to try a little harder. Right now, we’re basing incredibly important decisions on approximations with zero confidence.

[1] Unless you’re old enough to remember the 2000 presidential election and what a “hanging chad” is. The trouble may have started before you even slipped your ballot into a box.

[2] Sum of the heights divided by the number of observations, in case you’ve forgotten.

[3] I didn’t want to go even further into the weeds, but \(N - 1\) is a way of accounting for the fact that you don’t know the true population mean. Because math.

[4] See the law of large numbers.