### Solution to question 3

#### (a)

- Standard deviation = 0·75mm
- Difference we want to detect = 0·25mm

We assume:

- That pocket depths are normally distributed in both groups
- The standard deviations in both groups are similar

- We set the significance level, α, to 5%. (5% is chosen as being the least stringent significance level that is generally accepted.)
- We set the power at 80%, which is the minimum power that is generally acceptable. (It might have been better to use 90% or, preferably, 95%, but 80% would be acceptable to most journals and ethics committees for most studies.)

This gives a 'magic number' of **7·8**

m = 2 x 0·75 x 0·75 x 7·8 ÷ 0·25 ÷ 0·25

m = 140·4

m = 141, rounding up to the next whole number

Total sample for whole study = 2 x 141 = 282

We would round up this figure to 290 or 300 to be on the safe side. We might also make explicit allowance for dropouts, if we suspect this might happen, by making the sample size even larger. The size of the increase for dropouts would normally depend on our experience with previous similar studies.
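The calculation above can be checked with a short script. This is only a sketch: the function names `magic_number` and `per_group_sample_size` are illustrative, and computing the 'magic number' as the squared sum of two standard normal quantiles is an assumption about how the tabulated value of 7·8 was obtained.

```python
import math
from statistics import NormalDist


def magic_number(alpha=0.05, power=0.80):
    """Squared sum of the normal quantiles for a two-sided test.

    Assumption: this is how the tabulated 'magic number' is derived.
    """
    z = NormalDist()  # standard normal distribution
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


def per_group_sample_size(sd, difference, alpha=0.05, power=0.80, k=None):
    """Patients per group: m = 2 * sd^2 * k / difference^2, rounded up."""
    if k is None:
        k = magic_number(alpha, power)
    return math.ceil(2 * sd ** 2 * k / difference ** 2)


# Using the tabulated magic number of 7.8, as in the worked solution.
m = per_group_sample_size(sd=0.75, difference=0.25, k=7.8)
print("per group:", m, "total:", 2 * m)

# Using the unrounded magic number instead of the tabulated 7.8.
print("unrounded magic number:", round(magic_number(), 2))
print("per group (unrounded):", per_group_sample_size(sd=0.75, difference=0.25))
```

With the unrounded magic number the result per group is slightly larger than with 7·8, which is one more reason to round the final figure up generously.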

#### (b)

The four main factors affecting sample size are:

- Variability of the samples
  - If the standard deviation is higher, then the sample size is bigger
  - If the standard deviation of pocket depth had been 1mm, then we would have needed more patients
- What difference we want to detect
  - If we need to detect a smaller difference, then we need a bigger sample
  - If we decided that 0·1mm would be an important clinical difference in pocket depth, then we would have needed more patients
- What level of α we choose
  - A smaller α requires more subjects
  - If we wanted to be more certain of not getting a false positive result above, we could have altered the significance level to α = 0·01. This would have given us a bigger sample size
- What power we choose
  - A more powerful comparison requires more subjects
  - If we wanted to be more certain of detecting any difference which really exists between the groups, we could have used a power of 95% rather than 80%
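
The effect of each of the four factors can be illustrated by changing one input at a time. This is a sketch under the assumption that the sample size follows the usual normal-approximation formula for comparing two means; the function name `per_group_sample_size` and the `scenarios` dictionary are illustrative only.

```python
import math
from statistics import NormalDist


def per_group_sample_size(sd, difference, alpha=0.05, power=0.80):
    """Patients per group for a two-sided comparison of two means.

    Assumption: normal-approximation formula, rounded up.
    """
    z = NormalDist()  # standard normal distribution
    k = (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2
    return math.ceil(2 * sd ** 2 * k / difference ** 2)


# Baseline values from part (a), then one factor changed at a time.
baseline = dict(sd=0.75, difference=0.25, alpha=0.05, power=0.80)
scenarios = {
    "baseline": {},
    "standard deviation 1 mm": {"sd": 1.0},
    "difference 0.1 mm": {"difference": 0.1},
    "alpha 0.01": {"alpha": 0.01},
    "power 95%": {"power": 0.95},
}

results = {}
for name, change in scenarios.items():
    results[name] = per_group_sample_size(**{**baseline, **change})
    print(f"{name}: {results[name]} per group")
```

Every change listed in the bullets above gives a larger sample size per group than the baseline.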

#### (c)

Pockets on the same person are not independent. Pockets measured on the same person are more likely to be similar to each other than they are to pockets on another person. To *some* extent we are repeatedly measuring the same quantity when we measure several pockets on the same person. If we measured 10 pockets on each of 30 patients we would, typically, see less variation than if we measured 1 pocket on each of 300 patients.

One way of dealing with this problem would be to take a summary measure for each patient, perhaps the mean pocket depth for each patient, and analyse these summary measurements.
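
A minimal sketch of the summary-measure approach is shown below. The pocket depths are hypothetical values made up for illustration, and the names `pocket_depths` and `patient_means` are not from the original exercise.

```python
from statistics import mean

# Hypothetical pocket depths in mm, several pockets per patient.
pocket_depths = {
    "patient_1": [3.0, 3.5, 4.0],
    "patient_2": [5.0, 5.5],
    "patient_3": [2.0, 2.5, 3.0, 2.5],
}


def patient_means(depths_by_patient):
    """Reduce each patient's pockets to a single mean pocket depth."""
    return {
        patient: mean(depths)
        for patient, depths in depths_by_patient.items()
    }


summary = patient_means(pocket_depths)
for patient, depth in summary.items():
    print(f"{patient}: mean pocket depth {depth:.2f} mm")
```

The analysis would then use one value per patient, so the unit of analysis is the patient rather than the pocket.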

A better, but more difficult, method of analysis would be to take some sort of *modelling* approach, where we could look at *within* patient and *between* patient variability at the same time.