D. The Chi-Square Test

About the Chi-Square Test

Generally speaking, the chi-square test is a statistical test used to examine differences with categorical variables. There are a number of features of the social world we characterize through categorical variables - religion, political preference, etc. To examine hypotheses using such variables, use the chi-square test.

The chi-square test is used in two similar but distinct circumstances:

  1. for estimating how closely an observed distribution matches an expected distribution - we'll refer to this as the goodness-of-fit test
  2. for estimating whether two random variables are independent.

The Goodness-of-Fit Test

One of the more interesting goodness-of-fit applications of the chi-square test is to examine issues of fairness and cheating in games of chance, such as cards, dice, and roulette. Since such games usually involve wagering, there is significant incentive for people to try to rig the games, and allegations of missing cards, "loaded" dice, and "sticky" roulette wheels are all too common.

So how can the goodness-of-fit test be used to examine cheating in gambling? It is easier to describe the process through an example. Take the example of dice. Most dice used in wagering have six sides, with each side having a value of one, two, three, four, five, or six. If the die being used is fair, then the chance of any particular number coming up is the same: 1 in 6. However, if the die is loaded, then certain numbers will have a greater likelihood of appearing, while others will have a lower likelihood.

One night at the Tunisian Nights Casino, renowned gambler Jeremy Turner (a.k.a. The Missouri Master) is having a fantastic night at the craps table. In two hours of playing, he's racked up $30,000 in winnings and is showing no sign of stopping. Crowds are gathering around him to watch his streak - and The Missouri Master is telling anyone within earshot that his good luck is due to the fact that he's using the casino's lucky pair of "bruiser dice," so named because one is black and the other blue.

[Figure: the black die and the blue die]

Unbeknownst to Turner, however, a casino statistician has been quietly watching his rolls and marking down the values of each roll, noting the values of the black and blue dice separately. After 60 rolls, the statistician has become convinced that the blue die is loaded.

Value on Blue Die   Observed Frequency   Expected Frequency
1                   16                   10
2                   5                    10
3                   9                    10
4                   7                    10
5                   6                    10
6                   17                   10
Total               60                   60

At first glance, this table would appear to be strong evidence that the blue die was, indeed, loaded. There are more 1's and 6's than expected, and fewer of the other numbers. However, it's possible that such differences occurred by chance. The chi-square statistic can be used to estimate the likelihood that the values observed on the blue die occurred by chance.

The key idea of the chi-square test is a comparison of observed and expected values. How many of something were expected and how many were observed in some process? In this case, we would expect each number to have appeared 10 times, and we observed the counts listed in the Observed Frequency column.

With these sets of figures, we calculate the chi-square statistic as follows:

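$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where O_i is the observed frequency for category i, E_i is the expected frequency for category i, and k is the number of categories (here, the six faces of the die).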

Using this formula with the values in the table above gives us a value of 13.6.
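The arithmetic behind that value can be checked with a few lines of code. The following is a minimal sketch in Python using the counts from the blue-die table; the variable names are ours, chosen for illustration.

```python
# Observed counts for faces 1..6 of the blue die, from the table above.
observed = [16, 5, 9, 7, 6, 17]

# Under the null hypothesis of a fair die, each face is expected
# to come up (total rolls / 6) times.
total_rolls = sum(observed)
expected = [total_rolls / len(observed)] * len(observed)

# Sum of (observed - expected)^2 / expected over all six faces.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"chi-square = {chi_square:.1f}")  # prints "chi-square = 13.6"
```

Note that the 1's and 6's alone contribute 3.6 + 4.9 = 8.5 of the total, which is where most of the evidence against a fair die comes from.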

Lastly, to determine the significance level we need to know the "degrees of freedom." In the case of the chi-square goodness-of-fit test, the number of degrees of freedom is equal to the number of terms used in calculating chi-square minus one. There were six terms in the chi-square for this problem - therefore, the number of degrees of freedom is five.

We then compare the value calculated in the formula above to a standard table of chi-square values. With five degrees of freedom, the value returned from the table is 1.8%. We interpret this as meaning that if the die were fair (that is, not loaded), then the chance of getting a χ² statistic as large as or larger than the one calculated above is only 1.8%. In other words, rolls this lopsided would be very unlikely to come from a fair die. The Missouri Master is in serious trouble.
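If a printed table is not at hand, the same probability can be computed directly. The sketch below is one way to do it with only the Python standard library, using the finite-sum expressions for the chi-square upper tail that hold when the degrees of freedom are a whole number. The function name chi_square_p_value is hypothetical; where SciPy is available, scipy.stats.chi2.sf is the usual choice.

```python
import math

def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square
    distribution with a positive integer number of degrees of freedom."""
    if df < 1 or statistic < 0:
        raise ValueError("need df >= 1 and statistic >= 0")
    half = statistic / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    root = math.sqrt(statistic)
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term  # term = root^(2r-1) / (1 * 3 * ... * (2r-1))
        term *= statistic / (2 * r + 1)
    return math.erfc(math.sqrt(half)) + 2.0 * density * total

# Blue die: chi-square of 13.6 with 6 - 1 = 5 degrees of freedom.
p = chi_square_p_value(13.6, 5)
print(f"p-value = {p:.3f}")  # prints "p-value = 0.018"
```

The result agrees with the 1.8% read from the table.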

Recap

To recap the steps used in calculating a goodness-of-fit test with chi-square:

  1. Establish hypotheses.
  2. Calculate chi-square statistic. Doing so requires knowing:
    • The number of observations
    • Expected values
    • Observed values
  3. Assess significance level. Doing so requires knowing the number of degrees of freedom.
  4. Finally, decide whether to reject or fail to reject the null hypothesis.

Testing Independence

The other primary use of the chi-square test is to examine whether two variables are independent or not. What does it mean to be independent, in this sense? It means that the two factors are not related. Typically in social science research, we're interested in finding factors that are related - education and income, occupation and prestige, age and voting behavior. In this case, the chi-square can be used to assess whether two variables are independent or not.

More generally, we say that variable Y is "not correlated with" or "independent of" the variable X if more of one is not associated with more of the other. If two variables are correlated, their values tend to move together, either in the same direction or in opposite directions.

Example

Return to the example discussed at the introduction to chi-square, in which we want to know whether boys or girls get into trouble more often in school. Below is the table documenting the number of boys and girls who got into trouble in school:

         Got in Trouble   No Trouble   Total
Boys     46               71           117
Girls    37               83           120
Total    83               154          237

To examine statistically whether boys got in trouble in school more often, we need to frame the question in terms of hypotheses.

1. Establish Hypotheses

As in the goodness-of-fit chi-square test, the first step of the chi-square test for independence is to establish hypotheses. The null hypothesis is that the two variables are independent - or, in this particular case, that the likelihood of getting in trouble is the same for boys and girls. The alternative hypothesis to be tested is that the likelihood of getting in trouble is not the same for boys and girls.

Cautionary Note

It is important to keep in mind that the chi-square test only tests whether two variables are independent. It cannot address questions of which is greater or less. Using the chi-square test, we cannot evaluate directly the hypothesis that boys get in trouble more than girls; rather, the test (strictly speaking) can only test whether the two variables are independent or not.

2. Calculate the expected value for each cell of the table

As with the goodness-of-fit example described earlier, the key idea of the chi-square test for independence is a comparison of observed and expected values. How many of something were expected and how many were observed in some process? In the case of tabular data, however, we usually do not know what the distribution should look like (as we did with rolls of dice). Rather, in this use of the chi-square test, expected values are calculated based on the row and column totals from the table.

The expected value for each cell of the table can be calculated using the following formula:

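$$E_{ij} = \frac{R_i \times C_j}{N}$$

where E_ij is the expected count for the cell in row i and column j, R_i is the total for row i, C_j is the total for column j, and N is the grand total of all observations.

For example, in the table of boys and girls who got in trouble, the expected count for the number of boys who got in trouble is:

$$E_{\text{boys, trouble}} = \frac{117 \times 83}{237} \approx 40.97$$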

The first step, then, in calculating the chi-square statistic in a test for independence is generating the expected value for each cell of the table. Presented in the table below are the expected values (in parentheses) for each cell:

         Got in Trouble   No Trouble    Total
Boys     46 (40.97)       71 (76.03)    117
Girls    37 (42.03)       83 (77.97)    120
Total    83               154           237
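The expected counts can be reproduced from the row and column totals alone. Here is a minimal sketch in Python, using the observed counts from the boys-and-girls table; the variable names are illustrative.

```python
# Observed counts: rows are Boys, Girls; columns are
# "Got in Trouble", "No Trouble".
observed = [[46, 71],
            [37, 83]]

row_totals = [sum(row) for row in observed]        # [117, 120]
col_totals = [sum(col) for col in zip(*observed)]  # [83, 154]
grand_total = sum(row_totals)                      # 237

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for label, row in zip(["Boys", "Girls"], expected):
    print(label, [round(value, 2) for value in row])
# prints:
# Boys [40.97, 76.03]
# Girls [42.03, 77.97]
```

A useful check is that the expected counts in each row and each column add up to the same totals as the observed counts.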

3. Calculate Chi-square statistic

With these sets of figures, we calculate the chi-square statistic as follows:

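$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where O_ij is the observed count and E_ij is the expected count for the cell in row i and column j, and the sum runs over every cell of the table.

In the example above, we get a chi-square statistic equal to:

$$\chi^2 = \frac{(46 - 40.97)^2}{40.97} + \frac{(71 - 76.03)^2}{76.03} + \frac{(37 - 42.03)^2}{42.03} + \frac{(83 - 77.97)^2}{77.97} \approx 1.87$$

(The expected counts are shown rounded to two decimal places; the total is computed from their unrounded values.)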
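The same calculation can be sketched in Python. The code below recomputes the expected counts from the row and column totals and then sums the contribution of each cell; the variable names are ours.

```python
# Observed counts: rows are Boys, Girls; columns are
# "Got in Trouble", "No Trouble".
observed = [[46, 71],
            [37, 83]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, observed_count in enumerate(row):
        # Expected count under independence for this cell.
        expected_count = row_totals[i] * col_totals[j] / grand_total
        chi_square += (observed_count - expected_count) ** 2 / expected_count

print(f"chi-square = {chi_square:.2f}")  # prints "chi-square = 1.87"
```

One caution when comparing against statistical software: some routines, such as scipy.stats.chi2_contingency, apply a continuity correction to 2 x 2 tables by default, so their result will be somewhat smaller than the uncorrected statistic computed here.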

4. Assess significance level

Lastly, to determine the significance level we need to know the "degrees of freedom." In the case of the chi-square test of independence, the number of degrees of freedom is equal to the number of columns in the table minus one multiplied by the number of rows in the table minus one.

In this table, there were two rows and two columns. Therefore, the number of degrees of freedom is:

$$df = (2 - 1) \times (2 - 1) = 1$$

We then compare the value calculated in the formula above to a standard table of chi-square values. The table gives a p-value between 10% and 20%, well above the conventional 5% threshold. Thus, we cannot reject the null hypothesis: the data do not show a statistically significant difference between boys and girls in the likelihood of getting in trouble in school.
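For a table with one degree of freedom, the p-value can also be computed without a lookup table, because a chi-square variable with one degree of freedom is the square of a standard normal variable. The sketch below, in Python, works through the degrees of freedom, the p-value, and the decision for the boys-and-girls table. The 5% threshold (alpha) is an assumption on our part, chosen because it is the conventional cutoff.

```python
import math

# Observed counts: rows are Boys, Girls; columns are
# "Got in Trouble", "No Trouble".
observed = [[46, 71],
            [37, 83]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / grand_total) ** 2
    / (row_totals[i] * col_totals[j] / grand_total)
    for i in range(len(row_totals))
    for j in range(len(col_totals))
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# This shortcut is valid only when df == 1:
# P(chi-square >= x) = erfc(sqrt(x / 2)).
assert df == 1
p_value = math.erfc(math.sqrt(chi_square / 2.0))

alpha = 0.05  # assumed significance threshold, not taken from the text
reject_null = p_value < alpha

print(df, round(p_value, 2), reject_null)  # prints "1 0.17 False"
```

Because the p-value is larger than alpha, the null hypothesis of independence is not rejected.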

Recap

To recap the steps used in calculating a test for independence with chi-square:

  1. Establish hypotheses
  2. Calculate expected values for each cell of the table.
  3. Calculate chi-square statistic. Doing so requires knowing:
    1. The number of observations
    2. Observed values
  4. Assess significance level. Doing so requires knowing the number of degrees of freedom
  5. Finally, decide whether to reject or fail to reject the null hypothesis.