Copyright © 2015 jsd

Tack Tossing
An Exercise in Probability

John Denker

1  A Simple Experiment: Running Average

1.1  Procedure

Here’s an exercise in basic probability. I’ve seen it used at every level from 5th grade through undergraduate. As you might imagine, the level of analysis goes up with the older students, but the basic idea is the same.

Get some upholstery tacks. I think they’re properly called “nails” but everybody I know calls them tacks. They’re cheap and readily obtainable from the local fabric store. Anything similar to the ones shown in figure 1 will work fine.

Each student gets 5 tacks and a small box to toss them in. The box is important; otherwise you get tacks escaping all over the place. For example, you could use the cardboard box that a loaf of cheese comes in. The possible orientations for each tack are named in figure 1.

Figure 1: Tack Orientations

The procedure is:

  1. Put a batch of 5 tacks in the box.
  2. Shake the box to randomize the orientation.
  3. Count the number of heads, and enter it into the spreadsheet.
  4. Return to step 2 and repeat until you have tossed 50 batches (250 independent tack-tosses).
  5. Plot the running average. It should look roughly like figure 2.

The first part of the spreadsheet looks like this:

          batch   batch   heads   total   total   running
          #       size    obsrvd  tossed  heads   average
                                  0       0
          1       5       2       5       2       40.0%
          2       5       3       10      5       50.0%
          3       5       3       15      8       53.3%
          4       5       1       20      9       45.0%
          5       5       3       25      12      48.0%
          6       5       2       30      14      46.7%
          7       5       2       35      16      45.7%
          9       5       3       45      21      46.7%
          8       5       2       40      18      45.0%
Figure 2: Tack Toss Statistics

You can also plot a histogram of what the batches are doing. With only 50 batches, the histogram itself is noisy. See figure 3. For the spreadsheet used to produce this figure, see reference 1

Figure 3: Tack Toss Histogram

1.2  Discussion

Remark: As you might expect, given the gross asymmetry, the probability of heads is not 50% ... but it may turn out to be closer to 50% than you might have guessed.

Remark: It is a feature (not a bug) in this experiment that nobody knows the "right" answer. Nobody knows the probability of heads until they do the experiment. This is a refreshing contrast from the usual end-of-chapter exercises where the goal is to get the "right" answer by hook or by crook.

Remark: Students are always amazed at how noise the data is, and how slowly the running average converges to its asymptote. FWIW I’ve seen data like this a gazillion times, and I’m still glad to have a reminder, every so often, of how slowly it converges. 1/√x is a very slow function.

Remark: This is real data. This is a real experiment that students can do for themselves.

Remark: A very great deal of what we call “modern physics” aka “20th century1 physics” depends on probability. This includes thermodynamics aka statistical mechanics, quantum mechanics, et cetera.

Even if it had nothing to do with physics, it would still be important. Bad guys are continually trying to bamboozle voters, jurors, et cetera using various all-too-common statistical fallacies. This is far more relevant to the average citizen than (say) anything on the Force Concept Inventory.

Detail: Using five tacks per batch is about right. With more than that, they become hard to count. A batch-size of 1 gives more information, but it is a waste of time, it makes the data even noisier, and it causes confusion. (There are more than a few students who have only a shaky grasp of the concept of zero, and they get confused if the first toss gives them zero heads.)

Remark: This experiment does not teach you everything you need to know. It is only a small first step. In particular, it doesn’t say anything about on conditional probability. See reference 3.

Fine point, for relatively advanced students: The yellow shaded region in figure 2 is a model of the uncertainty. It is a rather crude model (based on Gaussian statistics) because I was too lazy to do the proper binomial statistics. The model serves to make an important point: The uncertainty comes from the model. There is an uncertainty band associated with the model. The data points are correctly shown with no error bars whatsoever. This is discussed in reference 4.

Raw data points do not have error bars.

2  Random Walk

2.1  Procedure

The general procedure is the same as in section 1.1. We just interpret the data differently, as explained in section 2.2.

2.2  An Example

Figure 4 shows another way of using the tack-toss data. Imagine a random walk, where every time the tack comes up tails, the walker takes a step to the right, and every time it comes up heads, he takes a step to the left. (Every step has unit length.) Since for our tacks, tails is slightly more likely, the walker will exhibit a steady trend to the right, on average. However, because the outcome of any particular toss is random, there will be a lot of noise on the data.

Figure 4: Trend and Fluctuations

Figure 4 shows the outcome of 8 separate experiments. Each experiment is shown with a different color. The vertical axis shows the number of tosses. The horizontal axis shows the position after that many tosses. The probability of heads is 35%. Theory tells us that after 60 tosses, the distribution is centered at 18. You can see that the actual outcomes are more-or-less centered around 18, but there is a lot of scatter.

In more detail: Let’s talk about the distribution, rather than any particular outcome. The center of the distribution is trending to the right at a rate of 0.3 steps per toss. This is the average behavior. It is linear in the number of tosses. Meanwhile, the width of the distribution is growing in proportion to the square root of the number of tosses. For large X, we know that √X is very much smaller than X. So in the long run, if the number of tosses is large, eventually the trend will outrun the fluctuations. However, in the short run, the fluctuations can be bigger than the trend. In the figure, you can see that one of the traces spends considerable time in negative territory.

These ideas have many applications in the real world:

Here are some additional ways of looking at the data. Suppose we repeat the tack-tossing experiment 200 times, with 60 tosses per experiment, and examine only the final position of the walker. Theory says that the results should be distributed according to a binomial distribution. This is shown in figure 5. The blue curve represents the ideal binomial distribution. There are 200 data points, shown in red, indicating the outcome of the experiments, formatted as a diaspogram. As explained in reference 3, the horizontal axis indicates the outcome of the experiment. The data is spread out in the vertical direction to make it easier to see what’s going on. The amount of spread is proportional to the theoretical probability, so there should be a uniform random distribution of points under the curve.

bino-dist-35-diaspo   bino-dist-35-scatter
Figure 5: Binomial Distribution : Diaspogram   Figure 6: Binomial Distribution : XY Scatter Plot

Meanwhile, figure 5 is a plain old scatter plot. Each black point represents the outcome of two experiments: The horizontal position represents one experiment, and the vertical position represents another. All the outcomes are of course integers (indeed even integers), but in figure 5 I offset some of the points slightly to make it easier to detect when multiple points are sitting on top of each other. When you see such points, imagine them all rounded to the nearest even integer.

In both diagrams, you can see that the distribution is centered around 18 units. For the spreadsheet used to prepare these figures, see the binomial-distribution page in reference 5.

2.3  Varying the Length of the Walk

The black curve in figure 7 shows the distribution of lengths after a 6-step walk. The probability of heads is 35%, as before. The distribution is centered around 1.8 units, so it peaks at 2 units. Notice that even though the walker is moving to the right on average, there is a significant fraction of the probability to the left of the origin.

Meanwhile, the blue curve in figure 7 shows the distribution of lengths after a 60-step walk. This is the same as the blue curve in figure 5. The distribution is centered around 18 units. There is only a tiny percentage of the probability to the left of the origin.

bino-dist-6-60   bino-dist-6-60-600
Figure 7: Random Walks of Two Different Lengths   Figure 8: Random Walks of Three Different Lengths

Figure 8 is a zoomed-out version of figure 7. The black curve and the blue curve are the same as before. The red curve shows the distribution of lengths after a 600-step walk. The distribution is centered around 180. There is an utterly negligible, infinitesimal amount of probability to the left of the origin.

Note that the absolute width of the distribution grows in proportion to the square root of the length of the walk, whereas the center of the distribution moves steadily in direct proportion to the length. As a consequence, if the walk is long enough, the trend will always swamp the fluctuations. To say the same thing another way, the relative width (as a fraction of the mean) gets smaller as the walk becomes longer. It may seem odd that the relative width is getting smaller even as the absolute width is getter larger, but that’s the way it is.

The random walk is related to the running average (as discussed in section 1) as follows:

2.4  Discussion

The idea of “trend plus fluctuations” has immensely important practical ramifications. A vast amount of real-world data behaves this way, to a good approximation.

2.5  Terminology

As usual, experts have their own terminology.

  1. The motion we have been calling a steady “trend” is called drift, in the technical sense. This is a systematic, steady drift. This stands in contrast with the homespun usage of the word, where “drift” sometimes means aimless, erratic drift.
  2. The pattern we have been calling “trend plus fluctuations” is technically called a biased random walk. In this context, the bias is what produces the trend, i.e. the steady drift. A biased coin is one where the odds are not 50/50. This stands in contrast to the homespun usage of the word, where bias refers to injecting personal opinion into some place where it doesn’t belong.

3  Pedagogical Considerations

4  References

John Denker,
“Spreadsheet to Calculate Tack-Tossing Probabilities”

Ludwig Boltzmann,
“Vorlesungen über Gastheorie”

John Denker,
“Introduction to Probability”

John Denker,
“Uncertainty as Applied to Measurements and Calculations”

John Denker,
“Spreadsheet to Calculate Simple Probability Distributions”

I define the 20th century to start in 1898 (reference 2).
Copyright © 2015 jsd