
Critical appraisal


By the end of the series of discussion seminars (having prepared for and participated in the discussion) students should be able to:

  1. Demonstrate awareness of the quality of published research
  2. Explain the importance of critical appraisal in dental research
  3. Identify the main components of a critical approach to reading dental research papers
  4. Contribute to constructive discussion of research papers
  5. Identify the strengths and weaknesses of studies
  6. Evaluate the relative importance of different aspects of a study
  7. Reach a balanced conclusion on the basis of the evidence


Altman, D.G., 1982. Interpreting results, in S.M. Gore & D.G. Altman (eds), Statistics in Practice: Articles Published in the British Medical Journal, pp. 18-20. British Medical Association, London.

A brief discussion of the main points in critical appraisal.

Altman, D.G., 1991. Practical Statistics for Medical Research, pp. 477-499. Chapman & Hall, London.

A good overview of how to interpret the medical literature critically.

Campbell, M.J. & D. Machin, 1993. Medical Statistics: A Commonsense Approach, pp. 1-31. John Wiley & Sons, Chichester.

A good introduction to the use and misuse of statistics and to study design. Most chapters in the book include things to look out for when reading the literature.

Tufte, E.R., 1983. The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut.

The classic account of how graphics should and should not be used to present quantitative data.

Misuse of statistics in the medical literature

Number of errors in use of statistics in papers published in Arthritis and Rheumatism:

Error                                                        (n = 47)   (n = 74)
Undefined method                                             14 (30%)    7 (9%)
Inadequate description of measures of location or dispersion  6 (13%)    7 (9%)
Repeated observations treated as independent                  1 (2%)     4 (5%)
Two groups compared on more than 10 variables at 5% level     3 (6%)    28 (38%)
Multiple t tests instead of analysis of variance              2 (4%)    18 (24%)
Chi-squared tests used when expected frequencies too small    3 (6%)     4 (5%)
At least one of the above errors                             28 (60%)   49 (66%)

Felson, D.T., L.A. Cupples & R.F. Meenan, 1984. Misuse of statistical methods in Arthritis and Rheumatism: 1982 versus 1967-68, Arthritis and Rheumatism 27: 1018-1022.
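The "more than 10 variables at the 5% level" entry in the table above reflects simple arithmetic: with many independent tests, the chance of at least one spurious "significant" result grows quickly. A minimal sketch (the function name is illustrative, not from any of the cited papers):

```python
# Family-wise error rate: probability of at least one false positive
# when k independent tests are each run at significance level alpha.
def family_wise_error_rate(k, alpha=0.05):
    return 1 - (1 - alpha) ** k

# With 10 comparisons at the 5% level, the chance of at least one
# spurious "significant" finding is about 40%.
print(round(family_wise_error_rate(10), 3))   # → 0.401
```

This is why comparing two groups on more than ten variables, each at the 5% level, is counted as an error above: some "positive" findings are to be expected by chance alone.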

Summary of review of 86 therapeutic trials in perinatal medicine (% of studies fulfilling criteria):

Criterion                                      Yes   Unclear   No
Statement of purpose                            94      6       0
Clearly defined outcome variables               74      1      25
Planned prospective data collection             48     30      22
Predetermined sample size                        3     16      71
Sample size specified                           93      6       1
Disease/health status of subjects specified     51     20      29
Exclusion criteria specified                    46      9      45
Randomisation appropriately performed            9     12      79
Blinding used, if necessary                     49     47       4
Adequate sample size                            15     44      41
Statistical methods used appropriately          26      0      74
Conclusions justified                           10     71      19

Tyson, J.E., J.A. Furzan, J.S. Reisch & S.G. Mize, 1983. An evaluation of the quality of therapeutic studies in perinatal medicine, J. Pediatr. 102: 10-13.

Where would we look for errors?


Design

Bad design can lead to biased studies with over-optimistic results. A retrospective study, using data collected for another purpose, often ends up with a bad design.

Symptoms of bad design may include: variations in methods of evaluation; unequal numbers of observations for different subjects; many missing observations; a vagueness about the rationale for the study.


Execution

The study protocol has not been followed properly: perhaps the randomisation has failed, or there are too many missing values.


Analysis

The statistical techniques have been used improperly; see the other notes for details.
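One common misuse, named in the Felson table above, is applying a chi-squared test when the expected frequencies are too small (a usual rule of thumb: all expected counts should be at least 5). A sketch of the check, using made-up counts purely for illustration:

```python
# Expected cell counts for a contingency table, to check before
# applying a chi-squared test (rule of thumb: all expected counts >= 5).
def expected_counts(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

observed = [[3, 7], [5, 15]]          # illustrative counts only
expected = expected_counts(observed)
too_small = any(e < 5 for row in expected for e in row)
print(round(expected[0][0], 2), too_small)   # → 2.67 True
```

Here the smallest expected count is well below 5, so an ordinary chi-squared test would be inappropriate (an exact test would usually be preferred).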


Presentation

The wrong level of numerical precision can imply a study is more accurate than it is. (P = 0.053673, 96.567894% of cases and r = 0.9999569 are almost certainly examples of spurious precision.)

Giving the wrong information: means without indications of variability; standard errors for descriptive information; only giving the P value.

Graphs can get the message of an analysis across very well but can also distort the meaning. Tufte (1983) gives an excellent overview of the problems. If the numerical data are not given alongside a graphic, it is often impossible for readers to confirm the analysis for themselves.
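The complaint above about "standard errors for descriptive information" rests on a distinction worth making concrete: the standard deviation describes the spread of the observations, while the standard error of the mean describes the precision of the sample mean and shrinks as the sample grows. A small sketch using Python's standard library (the data are illustrative only):

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]       # illustrative data only
sd = statistics.stdev(data)           # sample standard deviation: spread of the data
se = sd / math.sqrt(len(data))        # standard error: precision of the mean
print(round(sd, 2), round(se, 2))     # → 2.14 0.76
```

Reporting the much smaller standard error where a reader expects a description of variability makes the data look less variable than they are, which is why papers should state clearly which is being given.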


Interpretation

Ask, amongst other things: Have the P values been given the correct interpretation? Has a false causal link been implied from a statistical association? Does the sample analysed relate properly to the population under study?


Omission

Have all the techniques used been specified?

Has the design been adequately described?

Is it clear if they are using standard deviations or standard errors?

Has all the information used to reach the conclusion been presented?

Appraising a paper

Be constructive

Note what is good as well as what is bad - remember, research is not easy!

Look for other explanations

Did the results occur by chance? (low power? multiple testing?) Is there bias or confounding? Is there another cause which could explain the effect?
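"Low power" can be made concrete with a standard normal-approximation power calculation for comparing two means (the function name and the numbers are illustrative, not drawn from the cited studies):

```python
from statistics import NormalDist

# Approximate power of a two-sided, two-sample comparison of means,
# for standardised effect size d and n subjects per group
# (normal approximation; ignores the small-sample t correction).
def approx_power(d, n_per_group, alpha=0.05):
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = d * (n_per_group / 2) ** 0.5
    return NormalDist().cdf(noncentrality - z_crit)

# A "medium" effect (d = 0.5) with 64 per group gives roughly 80% power;
# with only 20 per group the same effect is detected far less often.
print(round(approx_power(0.5, 64), 2))   # ≈ 0.81
print(round(approx_power(0.5, 20), 2))   # ≈ 0.35
```

A "negative" result from the underpowered study tells us very little: failing to find an effect is the most likely outcome even when the effect is real.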

How could the study be improved?

Better design? Different statistical techniques? Better reporting?

How important are the errors?

Are the flaws so bad that you don't believe the result, or are they merely 'cosmetic' and don't shake your faith in the overall conclusions?

A checklist can help

Two are given in Altman (1991: 495-497).

Use the checklists as a guide; going down the list ticking off the relevant sections does not in itself constitute a critical appraisal. The checklists are an aid to critical appraisal, not a substitute for it.

Ethical considerations

Statistical errors can have serious effects on patients and other research.