The chi-square distribution is used in many statistical tests, including the chi-square test for goodness of fit of observed sample data to theoretical models, analysis of variance tests, and several others. It is also commonly used in the analysis of contingency tables. A chi-square test is any statistical hypothesis test in which, if the null hypothesis is true, the sampling distribution of the test statistic is a chi-square distribution, or approximates one for large enough samples. When a chi-square test is mentioned without any modifier, it usually refers to Pearson's chi-square test, which is discussed further in this application.
Pearson's chi-square test for goodness of fit assesses whether an observed frequency distribution differs from a theoretical distribution. It tests the null hypothesis that there is no statistically significant difference between the observed and expected frequencies against the alternative hypothesis that at least one observed frequency differs significantly from its expected frequency.
The chi-square test statistic can be calculated using the formula:
\[ \chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i} \]
where O_i represents the observed frequency and E_i represents the expected frequency of the i-th observation, and the sum runs over all observations.
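The formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original application: the function name chi_square_statistic and the three-category counts are hypothetical and chosen only to show the arithmetic.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over all categories of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustration only).
observed = [18, 22, 20]
expected = [20, 20, 20]

# (18-20)^2/20 + (22-20)^2/20 + (20-20)^2/20 = 0.2 + 0.2 + 0.0
print(chi_square_statistic(observed, expected))
```

A statistic of zero would mean the observed frequencies match the expected ones exactly; larger values indicate a larger discrepancy.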
At School A, 5 students are enrolled in Arts and 5 students are enrolled in Science. At School B, 4 students are enrolled in Arts and 6 students are enrolled in Science. To test the null hypothesis that the distribution of students between Arts and Science is the same at the two schools, let:
a = # of students at School A enrolled in Arts
b = # of students at School A enrolled in Science
c = # of students at School B enrolled in Arts
d = # of students at School B enrolled in Science
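To begin calculating the test statistic, the expected frequencies for a, b, c and d are calculated first. Each expected frequency is the column total multiplied by the row total, divided by the grand total: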
Expected a = (9⋅10)/20 = 4.5
Expected b = (11⋅10)/20 = 5.5
Expected c = (9⋅10)/20 = 4.5
Expected d = (11⋅10)/20 = 5.5
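The same expected values can be reproduced programmatically. The sketch below assumes the table is stored as a list of rows (School A, School B) with columns (Arts, Science); the helper name expected_frequencies is hypothetical and introduced only for this illustration.

```python
def expected_frequencies(table):
    """Expected cell counts: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [
        [row_total * column_total / grand_total for column_total in column_totals]
        for row_total in row_totals
    ]


# Observed counts from the example: rows are School A and School B,
# columns are Arts and Science.
observed = [[5, 5], [4, 6]]

print(expected_frequencies(observed))  # [[4.5, 5.5], [4.5, 5.5]]
```

Because both schools have the same row total (10 students), the expected values for the two rows are identical.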
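Now that both the observed and expected frequencies for a, b, c and d have been determined, the chi-square test statistic can be calculated using the formula:
\[ \chi^2 = \frac{(5 - 4.5)^2}{4.5} + \frac{(5 - 5.5)^2}{5.5} + \frac{(4 - 4.5)^2}{4.5} + \frac{(6 - 5.5)^2}{5.5} \approx 0.202 \]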
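As a check on the arithmetic, the statistic for this example can be computed directly from the observed and expected frequencies. The sketch below also evaluates the standard closed-form expression for a 2x2 table, n(ad − bc)² / ((a + b)(c + d)(a + c)(b + d)), which is not used in the original text and is included here only as a cross-check.

```python
# Observed counts a, b, c, d and their expected frequencies from the example.
observed = [5, 5, 4, 6]
expected = [4.5, 5.5, 4.5, 5.5]

# Sum of (O - E)^2 / E over the four cells.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Cross-check with the closed-form expression for a 2x2 table.
a, b, c, d = observed
n = a + b + c + d
shortcut = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

print(round(chi_square, 4))  # 0.202
print(round(shortcut, 4))    # 0.202
```

Both routes give 20/99, or approximately 0.202.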
Example: 2x2 Contingency Table
A 2x2 contingency table has the following form:
            Arts     Science   Total
School A    a        b         a + b
School B    c        d         c + d
Total       a + c    b + d     a + b + c + d
To determine whether this test statistic is significant at the 5% significance level, the number of degrees of freedom must be calculated (note that the number of rows and the number of columns do not include the totals):
Degrees of freedom = (# of rows − 1)(# of columns − 1)
Degrees of freedom = (2 − 1)(2 − 1) = 1
The critical chi-square value for 1 degree of freedom at the 5% level of significance is:
3.841
which is greater than the test statistic. Thus, for the example given above, the null hypothesis that there is no statistically significant difference between the enrollment distributions at the two schools cannot be rejected.
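The decision step can be sketched in Python using only the standard library. For exactly 1 degree of freedom, the upper-tail probability of the chi-square distribution equals erfc(sqrt(x / 2)), so no statistics package is needed here; this shortcut does not apply to other degrees of freedom. The function names are hypothetical, and the critical value 3.841 is taken as the usual tabulated value rather than computed.

```python
import math


def degrees_of_freedom(n_rows, n_columns):
    """(rows - 1) * (columns - 1), with totals excluded from both counts."""
    return (n_rows - 1) * (n_columns - 1)


def p_value_one_df(statistic):
    """Upper-tail probability for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Test statistic for the school example, recomputed from the four cells.
observed = [5, 5, 4, 6]
expected = [4.5, 5.5, 4.5, 5.5]
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_VALUE = 3.841  # tabulated value for 1 degree of freedom at the 5% level

df = degrees_of_freedom(2, 2)
p_value = p_value_one_df(statistic)
reject_null = statistic > CRITICAL_VALUE

print(df)           # 1
print(reject_null)  # False
print(p_value > 0.05)  # True
```

Comparing the statistic with the critical value and comparing the p-value with 0.05 are equivalent ways of reaching the same decision.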