### October 10th, 2018

# ↳ Significance-testing

## Who cares about stopping rules?

## Can you bias a coin?

*Challenge:* Take a coin out of your pocket. Unless you own some exotic currency, your coin is fair: it's equally likely to land heads as tails when flipped. Your challenge is to modify the coin somehow—by sticking putty on one side, say, or bending it—so that the coin becomes biased, one way or the other. Try it!

How should you check whether you managed to bias your coin? Well, it will surely involve flipping it repeatedly and observing the outcome, a sequence of h's and t's. That much is obvious. But what's not obvious is where to go from there. For one thing, any outcome whatsoever is consistent both with the coin's being fair and with its being biased. (After all, it's possible, even if not probable, for a fair coin to land heads every time you flip it, or a biased coin to land heads just as often as tails.) So no outcome is decisive. Worse than that, on the assumption that the coin is fair any two sequences of h's and t's (of the same length) are equally likely. So how could one sequence tell against the coin's being fair and another not?

We face problems like these whenever we need to evaluate a probabilistic hypothesis. Since probabilistic hypotheses come up everywhere—from polling to genetics, from climate change to drug testing, from sports analytics to statistical mechanics—the problems are pressing.

Enter significance testing, an extremely popular method of evaluating probabilistic hypotheses. Scientific journals are littered with reports of significance tests; almost any introductory statistics course will teach the method. It's so popular that the jargon of significance testing—null hypothesis, $p$-value, statistical significance—has entered common parlance.