Critical appraisal

Objectives

By the end of the series of discussion seminars (having prepared for and participated in the discussion) students should be able to:

Demonstrate awareness of the quality of published research
Explain the importance of critical appraisal in dental research
Identify the main components of a critical approach to reading dental research papers
Contribute to constructive discussion of research papers
Identify the strengths and weaknesses of studies
Evaluate the relative importance of different aspects of a study
Reach a balanced conclusion on the basis of the evidence

Bibliography

Altman, D.G., 1982. Interpreting results, in eds. S.M. Gore & D.G. Altman, Statistics in practice: articles published in the British Medical Journal, pp. 18-20. British Medical Association, London.

A brief discussion of the main points in critical appraisal.

Altman, D.G., 1991. Practical Statistics for Medical Research, pp. 477-499. Chapman & Hall, London.

A good overview of how to interpret the medical literature critically.

Campbell, M.J. & D. Machin, 1993. Medical Statistics: A Commonsense Approach, pp. 1-31. John Wiley & Sons, Chichester.

A good introduction to the use and misuse of statistics and to study design. Most chapters in the book include things to look out for when reading the literature.

Tufte, E.R., 1983. The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut.

The classic account of how graphics should and should not be used to present quantitative data.

Misuse of statistics in the medical literature

Number of errors in use of statistics in papers published in Arthritis and Rheumatism:

Error	1967-68 (n = 47)	1982 (n = 74)
Undefined method	14 (30%)	7 (9%)
Inadequate description of measures of location or dispersion	6 (13%)	7 (9%)
Repeated observations treated as independent	1 (2%)	4 (5%)
Two groups compared on more than 10 variables at 5% level	3 (6%)	28 (38%)
Multiple t tests instead of analysis of variance	2 (4%)	18 (24%)
χ² tests used when expected frequencies too small	3 (6%)	4 (5%)
At least one of the above errors	28 (60%)	49 (66%)

Felson, D.T., L.A. Cupples & R.F. Meenan, 1984. Misuse of statistical methods in Arthritis and Rheumatism. 1982 versus 1967-68, Arthritis and Rheumatism 27: 1018-1022.
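The row on comparing two groups on more than 10 variables at the 5% level can be illustrated with a short calculation. The sketch below is not taken from Felson et al.; it assumes the tests are independent (outcome variables in a real study rarely are), so the figures are indicative only, and the function name is purely illustrative.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among n_tests independent
    tests, each carried out at significance level alpha (assumption:
    every null hypothesis is true and the tests are independent)."""
    return 1.0 - (1.0 - alpha) ** n_tests


for k in (1, 5, 10, 20):
    rate = familywise_error_rate(k)
    print(f"{k:2d} tests at the 5% level: P(at least one 'significant') = {rate:.2f}")
```

Under these assumptions, ten tests at the 5% level give roughly a 40% chance of at least one spuriously significant result, even when the two groups do not differ at all.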

Summary of review of 86 therapeutic trials in perinatal medicine (% of studies fulfilling criteria):

	Yes	Unclear	No
Statement of purpose	94	6	0
Clearly defined outcome variables	74	1	25
Planned prospective data collection	48	30	22
Predetermined sample size	3	16	71
Sample size specified	93	6	1
Disease/health status of subjects specified	51	20	29
Exclusion criteria specified	46	9	45
Randomisation appropriately performed	9	12	79
Blinding used, if necessary	49	47	4
Adequate sample size	15	44	41
Statistical methods used appropriately	26	0	74
Conclusions justified	10	71	19

Tyson, J.E., J.A. Furzan, J.S. Reisch & S.G. Mize, 1983. An evaluation of the quality of therapeutic studies in perinatal medicine, J. Pediatr. 102: 10-13.

Where would we look for errors?

Design

Bad design can lead to biased studies with over-optimistic results. Retrospective studies that rely on data collected for another purpose are especially prone to bad design.

Symptoms of bad design may include: variations in methods of evaluation; unequal numbers of observations for different subjects; many missing observations; a vagueness about the rationale for the study.

Execution

The study protocol has not been followed properly. Perhaps the randomisation has failed, or perhaps there are too many missing values.

Analysis

The statistical techniques have been used improperly; see other notes for details.
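As one illustration, the Felson table earlier lists χ² tests used when expected frequencies are too small. The sketch below computes the expected frequencies for a contingency table and counts the cells falling below a threshold. The threshold of 5 is a commonly quoted rule of thumb rather than something stated in these notes, and the function names and example counts are hypothetical.

```python
def expected_frequencies(table):
    """Expected cell counts under independence:
    (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def count_small_expected(table, threshold=5.0):
    """Number of cells whose expected count is below the threshold
    (threshold of 5 is an assumed rule of thumb)."""
    return sum(
        1
        for row in expected_frequencies(table)
        for expected in row
        if expected < threshold
    )


# Hypothetical 2 x 2 table of counts (treatment by outcome).
observed = [[8, 2],
            [1, 5]]

for row in expected_frequencies(observed):
    print([round(e, 2) for e in row])
print("cells with expected count below 5:", count_small_expected(observed))
```

When several cells fall below the threshold, a reader might reasonably ask whether an exact test would have been more appropriate than χ².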

Presentation

The wrong level of numerical precision can imply a study is more accurate than it is. (P = 0·053673, 96·567894% of cases, r = 0·9999569 are almost certainly examples of spurious precision.)
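A minimal sketch of how the figures above might be reported more soberly is given below. The helper `round_sig` is illustrative, and the choice of two or three significant figures is an assumption made for the example, not a rule from these notes; the appropriate precision depends on the sample size and the measurement.

```python
import math


def round_sig(x: float, sig: int) -> float:
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    # Position of the leading digit decides how many decimals to keep.
    decimals = sig - 1 - math.floor(math.log10(abs(x)))
    return round(x, decimals)


# The values quoted in the text as examples of spurious precision.
p_value = 0.053673
percentage = 96.567894

print("P =", round_sig(p_value, 2))            # two significant figures (assumed)
print(f"{round_sig(percentage, 3)}% of cases")  # three significant figures (assumed)
```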

Papers may also give the wrong information: means without any indication of variability; standard errors where descriptive information (standard deviations) is needed; P values alone, with no estimate of the size of the effect.
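The difference between a standard deviation and a standard error can be made concrete with a small calculation. The sketch below uses invented measurements and an illustrative function name: the standard deviation describes the spread of the observations themselves, while the standard error describes the uncertainty in the estimated mean and shrinks as the sample grows.

```python
import math
import statistics


def describe(values):
    """Return (mean, standard deviation, standard error of the mean)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD, divisor n - 1
    se = sd / math.sqrt(n)          # standard error of the mean
    return mean, sd, se


# Hypothetical measurements, invented for illustration only.
measurements = [2, 4, 4, 4, 5, 5, 7, 9]

mean, sd, se = describe(measurements)
print(f"mean = {mean:.1f}, SD = {sd:.2f}, SE = {se:.2f}")
```

Because the standard error is always smaller than the standard deviation, quoting it in place of the standard deviation makes the data look less variable than they are.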

Graphs can convey the message of an analysis very well but can also distort its meaning. Tufte (1983) gives an excellent overview of the problems. If the numerical data are not given alongside a graphic, it is often impossible for readers to confirm the analysis for themselves.

Interpretation

Ask, amongst other things: Have the P values been given the correct interpretation? Has a false causal link been implied from a statistical association? Does the sample analysed relate properly to the population under study?

Omission

Have all the techniques used been specified?

Has the design been adequately described?

Is it clear whether the authors are using standard deviations or standard errors?

Has all the information used to reach the conclusion been presented?

Appraising a paper

Be constructive

Note what is good as well as what is bad - remember, research is not easy!

Look for other explanations

Did the results occur by chance? (low power? multiple testing?) Is there bias or confounding? Is there another cause which could explain the effect?
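The question of low power can be explored with a rough calculation. The sketch below uses a normal approximation for a two-sided comparison of two means with equal group sizes and a known common standard deviation; these are simplifying assumptions, the function and parameter names are hypothetical, and the example values are invented. It is meant to give a feel for the numbers rather than to replace a proper sample size calculation.

```python
import math
from statistics import NormalDist


def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power to detect a true difference `delta` between
    two group means (normal approximation, equal group sizes, common
    known SD; the negligible opposite tail is ignored)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se_diff = sd * math.sqrt(2 / n_per_group)
    return NormalDist().cdf(abs(delta) / se_diff - z_crit)


# Hypothetical scenario: true difference of half a standard deviation.
for n in (10, 20, 64, 100):
    print(f"n = {n:3d} per group: power = {approx_power(0.5, 1.0, n):.2f}")
```

In this invented scenario a trial with 20 subjects per group has only about a one in three chance of detecting the difference, so a non-significant result from such a study says little about whether the effect exists.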

How could the study be improved?

Better design? Different statistical techniques? Better reporting?

How important are the errors?

Are the flaws so bad that you don't believe the result, or are they merely 'cosmetic' and don't shake your faith in the overall conclusions?

A checklist can help

Two are given in Altman (1991: 495-497).

Use the checklists as a guide. However, going down the list ticking off the relevant sections does not constitute a critical appraisal; the checklists are only an aid to it.

Ethical considerations

Statistical errors can have serious effects on patients and other research.

Patients in an invalid study are subjected to procedures that produce no advances in knowledge
Future patients may receive inferior treatment
Other investigators may be led into a false line of investigation
Further research may go unfunded because a false 'solution' has been found
Resources have been wasted
Poor statistical methods may, if unchallenged, be used in future studies