### Solution to question 4

#### (a)

Percentage agreement takes no account of random agreement. Sometimes the clinicians will agree on a diagnosis by chance. We need to use a measure of agreement that makes an allowance for chance agreements. Kappa is an appropriate measure of agreement here.

Expected values:

| | Dentist B: No disease or low level of disease | Dentist B: Moderate or severe level of disease | Total |
|---|---|---|---|
| Dentist A: No disease or low level of disease | 9 | 15 | 24 |
| Dentist A: Moderate or severe level of disease | 15 | 25 | 40 |
| Total | 24 | 40 | 64 |

Expected proportion of agreement = (9 + 25) ÷ 64 = 0·53125

Observed proportion of agreement = (16 + 32) ÷ 64 = 0·75

Kappa = (observed - expected) ÷ (1 - expected) = (0·75 - 0·53125) ÷ (1 - 0·53125) = 0·47

(NB. rounding to 2 decimal places)

This represents moderate agreement (a kappa between 0·41 and 0·60 is conventionally described as moderate).
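The arithmetic above can be checked with a short Python sketch. The observed off-diagonal counts (8 and 8) are not stated in this solution; they are inferred here from the observed diagonal counts (16 and 32) and the marginal totals (24 and 40), so treat them as an assumption. The function name `cohen_kappa` is purely illustrative.

```python
def cohen_kappa(table):
    """Return (observed, expected, kappa) for a square agreement table.

    table[i][j] = number of patients placed in category i by rater A
    and category j by rater B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    # Observed agreement: proportion of patients on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: sum of (row total x column total) / n^2 on the diagonal.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2

    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# Assumed observed table: diagonal from the solution, off-diagonal inferred.
observed_table = [
    [16, 8],   # Dentist A: no disease or low level of disease
    [8, 32],   # Dentist A: moderate or severe level of disease
]

observed, expected, kappa = cohen_kappa(observed_table)
print(f"observed = {observed}, expected = {expected}, kappa = {kappa:.2f}")
```

Under these assumed counts the sketch reproduces the observed proportion of 0·75, the expected proportion of 0·53125 and a kappa of 0·47 to two decimal places.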

#### (b)

• The difference in pocket reduction of 0·23mm should have had a confidence interval (normally 95%) reported with it to aid clinical interpretation. For example, we would interpret "0·23mm (95% CI from 0·01 to 0·45)" differently to "0·23mm (95% CI from 0·21 to 0·25)". A result can be statistically significant (with a P value less than 0·05) yet have no real clinical significance if the lower bound of the 95% CI is too close to zero.
• The P value should not have been reported to so many decimal places; it gives a spurious air of accuracy. In this case two decimal places would have sufficed, and giving more than three is not good practice.
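To see why the two example intervals in the first point read so differently, the following Python sketch backs out the standard error each one implies. It assumes the intervals were built with a normal approximation (estimate ± 1·96 × standard error); the study's actual method is not stated here, and the function name is hypothetical.

```python
Z_95 = 1.96  # assumed multiplier for a 95% CI under a normal approximation


def implied_standard_error(lower, upper, z=Z_95):
    """Standard error implied by a symmetric CI of the form estimate +/- z * SE."""
    return (upper - lower) / (2 * z)


# The two example intervals quoted above, both centred on 0.23 mm.
for lower, upper in [(0.01, 0.45), (0.21, 0.25)]:
    se = implied_standard_error(lower, upper)
    print(f"95% CI {lower} to {upper} mm: width {upper - lower:.2f} mm, implied SE {se:.3f} mm")
```

The first interval is about eleven times wider than the second, so it is compatible with a true difference anywhere from almost nothing to nearly half a millimetre, whereas the second pins the difference down closely.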

#### (c)

Bias can be defined as the distortion of the estimated effects caused by a systematic difference between the groups being compared.

If you don't know whether an effect is caused by the variable you are interested in (e.g. a drug or smoking) or by another variable (e.g. age or sex) then the other variable is called a confounder and it is said to cause confounding. To be a confounder a variable must be associated both with the exposure you are interested in and the outcome.

Confounding can be a source of bias, but there are other sources of bias. For example, the study could have been biased if patients were recruited through a poster campaign in the dental surgery, so that only volunteers for the study were measured. This might mean that only the more motivated sections of both the smoking and non-smoking groups were recruited. This could, potentially, mean that differences in the underlying populations were obscured. (The best 10% in each population may not differ but the other 90% may.)

In this study, if the measuring device used for the smokers after the training program gave readings that were 0·2mm too large, this would cause bias.

The study could have been confounded if all of the smokers ate lots of sticky, starchy foods and the non-smokers didn't. We wouldn't know if the differences we saw between the groups were due to smoking or to the eating of sticky foods.
