brettscaife.net

Solution to question 4.6

Using the formula:

s.e.(proportion) = √( proportion × (1 − proportion) / sample size )

We get the following results for the confidence intervals:

95% confidence interval
Material   Proportion   From    To
M          0·164        0·066   0·261
C          0·154        0·056   0·252
P          0·093        0·015   0·170

Strictly speaking, our confidence interval for material P is invalid. The formula we have used for the standard error is an approximation. It is a good approximation in almost all circumstances, but it should not really be applied if the number of failures is 5 or less (or if the number of non-failures, i.e. sample size minus the number of failures, is 5 or less). In this case we'll carry on as if it doesn't matter. If you come across this problem in your own research you will have to seek further advice.
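As a sketch, the standard error calculation and the rule of thumb above might look like this in Python. The raw failure counts and sample sizes are not given in this excerpt, so the counts below (9 failures out of 55) are illustrative assumptions, not the study's actual data:

```python
from math import sqrt

def proportion_ci(failures, n, z=1.96):
    """Return (proportion, lower, upper) for an approximate 95% CI."""
    p = failures / n
    se = sqrt(p * (1 - p) / n)   # s.e. = sqrt(p(1 - p) / n)
    return p, p - z * se, p + z * se

def approximation_ok(failures, n):
    """The normal approximation is dubious if either count is 5 or less."""
    return failures > 5 and (n - failures) > 5

p, lower, upper = proportion_ci(9, 55)   # assumed counts, for illustration
print(f"proportion {p:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
print("approximation valid:", approximation_ok(9, 55))
```

Note that with only 5 failures, as for material P here, `approximation_ok` would return False, which is exactly the caveat raised above.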

Now we need a diagram which describes our results in an honest and useful manner. You will sometimes see the results for a proportion (or a mean) presented in the style of a bar chart, like this:

Wrong diagram for presenting proportions

This is an incorrect way to present this sort of data. Firstly, it gives no indication of how precise the estimates of the proportions are: there are no confidence intervals. Secondly, bar charts are not well suited to presenting the values of single quantities such as proportions or means. They are best suited to representing counts, and the eye's first instinct is to read the height of a bar as a count.

It is also difficult to represent the precision of an estimate on a bar chart. Sometimes charts such as this are seen:

Wrong diagram for presenting proportions with confidence intervals

(These sorts of plots are sometimes known as detonator plots, after their resemblance to a classic plunger-type explosives detonator.) There is an attempt to represent the confidence interval here, but it only goes one way! In yet another version a mirror image of the plunger extends down below the top of the column. By this point the graph is getting cluttered, and the effort of reading it starts to defeat the prime purpose of a graph: to simplify presentation.

A much better way of presenting the results is this:

Diagram presenting proportions with confidence intervals

Here we see a clear point representing the observed proportion and lines representing the confidence interval. This presents the data in a way that is easy to use and aids our interpretation of the figures calculated above.

It is clear that we cannot tell whether there is any real difference between the three types of material. There is a wide range of values (from about 0·07 to about 0·17) that could plausibly be the proportion of failures for all three materials.

Note that there is a formal statistical test that can be carried out to confirm whether or not the difference seen between two proportions is significant.
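One common choice for such a formal test is the two-proportion z-test. A minimal sketch is below; since the raw counts are not given in this excerpt, the failure counts used (9 of 55 and 5 of 54) are illustrative assumptions:

```python
from math import sqrt, erf

def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: the two proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)   # pooled proportion under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

z, p_value = two_proportion_z_test(9, 55, 5, 54)   # assumed counts
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With counts of this size the p-value comes out well above 0·05, consistent with the conclusion above that no significant difference can be seen.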

This result is not particularly surprising. To detect a difference between two proportions, either the difference has to be very large or our sample sizes have to be very large.

In this case we might wonder if the difference we saw in failure rates between material P (about 9%) and the other two materials (about 15%) was a real difference, and the only reason we did not see it as significant was the small sample size. We may want to recommend a further study to see if there really is a 6% difference in failure rates between material P and the other two materials. I calculated that we would need about 450 to 500 patients in each group to stand a reasonable chance of detecting a difference of this size. Obviously, we would have to be convinced of the potential clinical benefits of a 6% improvement in failure rates before starting such a large study. We will be looking at the problem of how big a sample needs to be in a future unit.
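A figure in that range can be reproduced with the standard sample-size formula for comparing two proportions. The sketch below assumes a two-sided 5% significance level (z = 1·96) and 80% power (z = 0·84); these are conventional choices, not values stated in the text:

```python
from math import ceil

def n_per_group(p1, p2, z_alpha=1.96, z_beta=0.8416):
    """Sample size per group to detect a difference between proportions
    p1 and p2: n = (z_alpha + z_beta)^2 (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

print(n_per_group(0.15, 0.09))   # → 457, within the 450 to 500 quoted above
```

Under these assumptions the formula gives about 457 patients per group, which falls inside the 450-to-500 range mentioned.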
