brettscaife.net

χ² tests

You can access a LibreOffice spreadsheet that calculates the results for a χ² test for various sizes of contingency table by clicking on the link below. (For a 2 x 2 table it will also calculate odds and risk ratios.)

Doing a χ² test by hand

If we have the following set of results (the observed values):

                      Caries
                   Yes     No    Total
Fluoridated         77     29      106
Non-fluoridated     95     31      126
Total              172     60      232

We want to compare these with the set of values that would have been most likely if the null hypothesis were true - the expected values. The calculation of the expected value in each cell is quite straightforward:

Row total x Column total ÷ Overall total

So for the top-left cell we have:

Expected value = 106 x 172 ÷ 232 = 78·59

Carrying out the same procedure for the other three cells gives us the following table of expected values:

                      Caries
                   Yes       No      Total
Fluoridated       78·59    27·41      106
Non-fluoridated   93·41    32·59      126
Total            172       60         232
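The expected values above can be checked with a few lines of Python. This is only a minimal sketch: the variable names (observed, row_totals, col_totals, overall_total, expected) are my own choices for illustration and do not come from the spreadsheet mentioned earlier.

```python
# Observed counts: rows are Fluoridated / Non-fluoridated,
# columns are Caries Yes / Caries No.
observed = [[77, 29],
            [95, 31]]

row_totals = [sum(row) for row in observed]          # [106, 126]
col_totals = [sum(col) for col in zip(*observed)]    # [172, 60]
overall_total = sum(row_totals)                      # 232

# Expected value for each cell = row total x column total / overall total
expected = [[r * c / overall_total for c in col_totals] for r in row_totals]

for row in expected:
    print([round(value, 2) for value in row])
# [78.59, 27.41]
# [93.41, 32.59]
```

The values are rounded only when printed, so the unrounded figures remain available for the later steps.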

The χ² statistic is a measure of how far the observed values are from the expected values. Each cell contributes the following amount towards the total value of χ²:

(Observed - Expected)² ÷ Expected

So, for the top-left cell we get:

(77 - 78·59)² ÷ 78·59 = 0·032

After doing the same for the other three cells we get the following four values:

0·032     0·092
0·027     0·077

χ² is simply the sum of these values. So, in this case:

χ² = 0·228

Note that although I wrote rounded figures in the tables above I did not round off my calculations until the final step - this will account for any observed discrepancies.
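The whole calculation can be sketched in Python as a check on the arithmetic. The function name chi_squared and its return format are hypothetical choices for this example. It keeps full precision throughout and rounds only when printing, in line with the note above.

```python
def chi_squared(observed):
    """Return (statistic, per-cell contributions) for a contingency table.

    `observed` is a list of rows, each row a list of counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall_total = sum(row_totals)

    contributions = []
    for row_total, row in zip(row_totals, observed):
        row_contributions = []
        for obs, col_total in zip(row, col_totals):
            # Expected value = row total x column total / overall total
            exp = row_total * col_total / overall_total
            # Contribution of this cell = (Observed - Expected)^2 / Expected
            row_contributions.append((obs - exp) ** 2 / exp)
        contributions.append(row_contributions)

    statistic = sum(sum(row) for row in contributions)
    return statistic, contributions


statistic, contributions = chi_squared([[77, 29], [95, 31]])
print(round(statistic, 3))  # 0.228
```

Because the function works from row and column totals, the same sketch should also handle tables larger than 2 x 2.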

We now compare our value of χ² with critical values from statistical tables. If you look at a set of tables for χ² tests you will see a different set of critical values for each number of degrees of freedom (similar to the t test tables we encountered in an earlier unit). The number of degrees of freedom associated with a χ² test is given by:

(Rows - 1) x (Columns - 1)

In this case:

d.o.f. = (2-1) x (2-1) = 1

As usual we set our level of significance: α = 0·05. Looking in the tables for 1 d.o.f. and α = 0·05 gives us a critical value of 3·841. We compare our calculated value of χ² with this critical value. Our value (0·228) is less than the critical value, so P is greater than 0·05:

P>0·05

We have a non-significant result
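The comparison with the critical value can also be sketched in Python. The helper names degrees_of_freedom and p_value_1_dof are hypothetical. The P calculation uses a shortcut that is valid only for 1 d.o.f.: a χ² variable with 1 d.o.f. is the square of a standard normal variable, so its upper-tail probability can be obtained from the complementary error function in Python's standard math module. For larger tables a statistics library or the printed tables would be needed instead.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """d.o.f. = (Rows - 1) x (Columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def p_value_1_dof(chi2):
    """Upper-tail P for a chi-squared statistic with exactly 1 d.o.f.

    Assumption: this identity holds only for 1 d.o.f.
    """
    return math.erfc(math.sqrt(chi2 / 2))


dof = degrees_of_freedom(2, 2)
chi2 = 0.228             # value calculated by hand above
critical_value = 3.841   # from tables: 1 d.o.f., alpha = 0.05

print(dof)                    # 1
print(chi2 < critical_value)  # True: below the critical value
print(p_value_1_dof(chi2) > 0.05)  # True: non-significant
```

As a sanity check on the shortcut, p_value_1_dof(3.841) comes out very close to 0·05, matching the tabulated critical value.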

Yates's correction

(Also called the 'continuity correction'.)

For small sample sizes the χ² test is not quite accurate enough - it tends to spot differences where none really exist. Fortunately there is a simple correction that removes this tendency - Yates's correction. We simply amend the formula above so that the contribution of each cell towards χ² is now:

(ABS(Observed - Expected) - 0·5)² ÷ Expected

ABS(Observed - Expected) means subtract the expected from the observed and drop the minus sign if there is one.

So for the first cell:

Observed - Expected = 77 - 78·59 = -1·59
ABS(Observed - Expected) = 1·59
ABS(Observed - Expected) - 0·5 = 1·59 - 0·5 = 1·09
(ABS(Observed - Expected) - 0·5)² = 1·19
(ABS(Observed - Expected) - 0·5)² ÷ Expected = 0·015

After doing the same for the other three cells we get the following four values:

0·015     0·043
0·013     0·036

χ² is now the sum of these values:

χ² = 0·107

Comparing this with the appropriate critical value still gives us a non-significant result.
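The corrected calculation can be sketched in Python in the same way. The function name chi_squared_yates is a hypothetical choice. The code applies the amended formula exactly as written above, subtracting 0·5 from the absolute difference in every cell, and rounds only the final result.

```python
def chi_squared_yates(observed):
    """Chi-squared statistic with Yates's correction applied to every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall_total = sum(row_totals)

    statistic = 0.0
    for row_total, row in zip(row_totals, observed):
        for obs, col_total in zip(row, col_totals):
            exp = row_total * col_total / overall_total
            # (ABS(Observed - Expected) - 0.5)^2 / Expected
            statistic += (abs(obs - exp) - 0.5) ** 2 / exp
    return statistic


corrected = chi_squared_yates([[77, 29], [95, 31]])
print(round(corrected, 3))  # 0.107
print(corrected < 3.841)    # True: still below the critical value for 1 d.o.f.
```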

Although Yates's correction is absolutely necessary for small sample sizes, I recommend using it for all χ² tests. It will not make a test on a large sample size less valid and it saves us having to worry about what counts as 'small'.

Note that using Yates's correction does not remove the requirement to meet the conditions for the χ² test: