Student/Statistics/ChiSquareIndependenceTest/overview - Maple Help

Student[Statistics][ChiSquareIndependenceTest] Overview

overview of the chi-square independence test

Description

 • The Chi-Squared Independence Test is used when two categorical variables are recorded for a single population; it tests whether the two variables are independent.
 • Requirements for using the Chi-Squared Independence Test:
 1 The goal is to test whether two attributes within a population are independent of one another.
 2 The data provided are formatted as a Matrix with at least two rows and two columns. The rows represent the levels of one attribute and the columns represent the levels of the other attribute; the entries in the Matrix are counts of observations with the given combination of levels.
 3 This test is performed within a single population.
 • The test statistic is: ${X}^{2}=\sum _{i=1}^{N}\sum _{j=1}^{K}\frac{{\left({M}_{i,j}-{E}_{i,j}\right)}^{2}}{{E}_{i,j}}$, where $M$ is the $N\times K$ matrix of observations, and $E$ is the matrix of expected counts, which is computed as: ${E}_{i,j}=\frac{{\mathrm{rowsum}}_{i}{\mathrm{columnsum}}_{j}}{\mathrm{matrixsum}}$.
 In turn, ${\mathrm{rowsum}}_{i}$ is computed as $\sum _{j=1}^{K}{M}_{i,j}$; ${\mathrm{columnsum}}_{j}$ is computed as $\sum _{i=1}^{N}{M}_{i,j}$; and $\mathrm{matrixsum}$ is computed as $\sum _{i=1}^{N}\sum _{j=1}^{K}{M}_{i,j}$.
 Here $N$ is the number of rows and $K$ is the number of columns of $M$, and ${X}^{2}$ follows a Chi-Squared distribution with $\left(N-1\right)\left(K-1\right)$ degrees of freedom.
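
The computation described above can be sketched in a few lines of Python. This is only an illustration of the formulas, not Maple's implementation; the function names `expected_counts` and `chi_square_statistic` and the small 2-by-2 table are hypothetical. The degrees of freedom are taken as (rows - 1)(columns - 1), the standard value for an independence test on a matrix of counts.

```python
def expected_counts(observed):
    """Return E with E[i][j] = rowsum_i * columnsum_j / matrixsum."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    total = sum(row_sums)
    return [[r * c / total for c in col_sums] for r in row_sums]


def chi_square_statistic(observed):
    """Return (X^2, degrees of freedom) for an N-by-K matrix of counts."""
    expected = expected_counts(observed)
    # Sum (M[i][j] - E[i][j])^2 / E[i][j] over every cell of the matrix.
    statistic = sum(
        (m - e) ** 2 / e
        for m_row, e_row in zip(observed, expected)
        for m, e in zip(m_row, e_row)
    )
    n_rows, n_cols = len(observed), len(observed[0])
    return statistic, (n_rows - 1) * (n_cols - 1)


# Hypothetical 2-by-2 table of counts, used only to exercise the functions.
counts = [[10, 20], [30, 40]]
statistic, dof = chi_square_statistic(counts)
print(statistic, dof)
```

Passing any matrix of counts with at least two rows and two columns, given as a list of rows, returns the statistic together with its degrees of freedom.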

Example

The numbers of students enrolled in the Math Faculty, Art Faculty, and Environment Faculty of a university are shown below:

                 Math    Art    Environment    Row total
 Male             250    120            180          550
 Female           150    300            150          600
 Column total     400    420            330         1150

We want to test whether male and female students differ in their preferences among these three faculties.

Note: The matrix to build for the test in this case contains only the observed counts, without the row and column totals:

 $\left[\begin{array}{rrr}250& 120& 180\\ 150& 300& 150\end{array}\right]$
 1 Determine the null hypothesis:
 Null Hypothesis: Gender and faculty preference are independent.
 2 Compare the expected data and the observed data:

 Observed          Expected
 O[1,1] = 250      E[1,1] = $\frac{550\cdot 400}{1150}$ = 191.30435
 O[1,2] = 120      E[1,2] = $\frac{550\cdot 420}{1150}$ = 200.86957
 O[1,3] = 180      E[1,3] = $\frac{550\cdot 330}{1150}$ = 157.82609
 O[2,1] = 150      E[2,1] = $\frac{600\cdot 400}{1150}$ = 208.69565
 O[2,2] = 300      E[2,2] = $\frac{600\cdot 420}{1150}$ = 219.13043
 O[2,3] = 150      E[2,3] = $\frac{600\cdot 330}{1150}$ = 172.17391

 3 Substitute the information into the formula:
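 ${X}^{2}$ = $\frac{{\left(250-191.30435\right)}^{2}}{191.30435}+\frac{{\left(120-200.86957\right)}^{2}}{200.86957}+\frac{{\left(180-157.82609\right)}^{2}}{157.82609}+\frac{{\left(150-208.69565\right)}^{2}}{208.69565}+\frac{{\left(300-219.13043\right)}^{2}}{219.13043}+\frac{{\left(150-172.17391\right)}^{2}}{172.17391}$ = 102.891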
 4 Compute the p-value:
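 p-value = $\mathrm{Probability}\left({X}^{2}>102.891\right)$ ≈ 0   (a small positive value very close to 0)
 ${X}^{2}\sim \mathrm{ChiSquare}\left(2\right)$, because a matrix with 2 rows and 3 columns gives $\left(2-1\right)\left(3-1\right)=2$ degrees of freedom.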
 5 Draw the conclusion:
 This statistical test provides evidence that the null hypothesis is false, so we reject the null hypothesis.
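
Steps 4 and 5 can be checked numerically with a short Python sketch. It is an illustration under stated assumptions, not Maple's implementation: it takes the statistic 102.891 from step 3, uses the (rows - 1)(columns - 1) = 2 degrees of freedom of a 2-by-3 matrix, and compares the p-value against a significance level of 0.05, which is an assumed value that the example itself does not state. With 2 degrees of freedom the upper-tail probability of a chi-squared distribution has the closed form $\mathrm{exp}\left(-x/2\right)$, so no statistics library is needed.

```python
import math

x2 = 102.891             # test statistic from step 3
dof = (2 - 1) * (3 - 1)  # 2 rows and 3 columns give 2 degrees of freedom

# For exactly 2 degrees of freedom, P(X^2 > x) = exp(-x / 2).
# This shortcut does not hold for other degrees of freedom.
p_value = math.exp(-x2 / 2)

alpha = 0.05             # assumed significance level, not given in the example
reject_null = p_value < alpha

print(p_value, reject_null)
```

The printed p-value is positive but many orders of magnitude below the assumed level, which matches the conclusion that the null hypothesis is rejected.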