Solution to question 2


Expected values:

                           Dentist 2
                       Halitosis   Healthy   Total
Dentist 1  Halitosis      6·21      20·79      27
           Healthy       16·79      56·21      73
           Total         23         77        100

Actual proportion of agreement = (15 + 65)÷100 = 0·8

Expected proportion of agreement = (6·21 + 56·21)÷100 = 0·6242

κ = (0·8 - 0·6242)÷(1 - 0·6242) = 0·47

There is moderate agreement between the two dentists.
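The arithmetic above can be checked with a short sketch. The diagonal counts (15 and 65) and the row and column totals are taken from the working above; the off-diagonal counts (12 and 8) are not quoted in this solution and are inferred here from those totals, so treat them as an assumption.

```python
# Observed counts: rows are dentist 1, columns are dentist 2,
# in the order (halitosis, healthy).
# The diagonal (15, 65) and the totals come from the solution above;
# the off-diagonal counts (12, 8) are inferred from those totals.
observed = [[15, 12],
            [8, 65]]

n = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Expected count in each cell = row total x column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Agreement is measured on the diagonal of each table.
p_observed = sum(observed[i][i] for i in range(2)) / n
p_expected = sum(expected[i][i] for i in range(2)) / n

kappa = (p_observed - p_expected) / (1 - p_expected)

print(round(p_observed, 4), round(p_expected, 4), round(kappa, 2))
# prints: 0.8 0.6242 0.47
```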


By itself the result does not prove that the treatment prevents halitosis; what has been established is that it is highly likely that there is an association between halitosis experience and whether or not treatment is received. In particular, it appears that there is an association between receiving treatment and not suffering from halitosis.

With such a small P value we can be confident that the association is a real one. There are, however, other factors that we would need to take into account to establish a causal relationship. The fact that the cure seems to have followed on from the treatment is one factor that leads us towards believing in a causal link. We might also look at similar studies with similar treatments, if they exist, to see if our results are broadly in agreement; this would strengthen our belief in a causal link between treatment and cure.

We would like a biologically plausible explanation of how the treatment could effect the cure; whilst it is possible for a treatment to work in a way we don't understand, we will have more confidence in a treatment that can be explained using current medical knowledge. Again, if similar sorts of treatment have had success in curing halitosis in the past, then we would have more confidence in declaring a causal link in this instance.

Finally, we might wonder if the fact that dentist 1 carried out all the measurements had an effect on the results. Although there was moderate agreement between the two dentists and the P value was small, we might prefer to examine the raw data before coming to a firm conclusion.


This would not have been a good idea. Essentially the problem here is that the 500 measurements are not independent of each other. Firstly, five measurements are carried out on each patient, so the measurements on any one patient are not independent. Secondly, there is a time-dependent factor: if a patient was cured of halitosis in week 1 then they would be likely to remain cured in weeks 2 to 4; if a patient was still suffering in week 3 it is likely that they suffered in weeks 1 and 2 as well.

There may be some value in measuring the patients repeatedly, but we would have to analyse the results differently. We would need a summary measure that gives us useful information: perhaps the time it took for half the patients in the group to be cured, or the area under the graph of 'Week' against 'Proportion of patients halitosis free'. In this particular study, however, it seems likely that the best summary measure would be the final number of patients halitosis-free at the end of the intervention, which is what was analysed in the χ² test in part (b).
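As a rough illustration of the three summary measures mentioned above, the sketch below computes each one for a single treatment group. The weekly proportions are invented purely for illustration and are not data from the study; the week numbering (0 to 4, giving five measurements) and the function names are also assumptions.

```python
# Hypothetical data only: proportion of one group that is halitosis-free
# at each weekly measurement. These figures are NOT from the study.
weeks = [0, 1, 2, 3, 4]
proportion_free = [0.0, 0.2, 0.45, 0.6, 0.7]


def area_under_curve(weeks, proportions):
    """Trapezium-rule area under the proportion-against-week graph."""
    return sum(
        (weeks[i + 1] - weeks[i]) * (proportions[i] + proportions[i + 1]) / 2
        for i in range(len(weeks) - 1)
    )


def first_week_at_least(weeks, proportions, threshold=0.5):
    """First measured week at which the proportion reaches the threshold,
    or None if it never does."""
    for week, p in zip(weeks, proportions):
        if p >= threshold:
            return week
    return None


auc = area_under_curve(weeks, proportion_free)
week_half_cured = first_week_at_least(weeks, proportion_free)
final_proportion = proportion_free[-1]

print(auc, week_half_cured, final_proportion)
```

Each measure reduces the repeated measurements to one number per group (or, computed patient by patient, one number per patient), so the values being compared are no longer repeated measurements on the same people.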
