Critical appraisal
Objectives
By the end of the series of discussion seminars (having prepared for and participated in the discussion) students should be able to:
- Demonstrate awareness of the quality of published research
- Explain the importance of critical appraisal in dental research
- Identify the main components of a critical approach to reading dental research papers
- Contribute to constructive discussion of research papers
- Identify the strengths and weaknesses of studies
- Evaluate the relative importance of different aspects of a study
- Reach a balanced conclusion on the basis of the evidence
Bibliography
Altman, D.G., 1982. Interpreting results, in eds. S.M. Gore & D.G. Altman, Statistics in Practice: Articles Published in the British Medical Journal, pp. 18-20. British Medical Association, London.
A brief discussion of the main points in critical appraisal.
Altman, D.G., 1991. Practical Statistics for Medical Research, pp. 477-499. Chapman & Hall, London.
A good overview of how to interpret the medical literature critically.
Campbell, M.J. & D. Machin, 1993. Medical Statistics: A Commonsense Approach, pp. 1-31. John Wiley & Sons, Chichester.
A good introduction to the use and misuse of statistics and to study design. Most chapters in the book include things to look out for when reading the literature.
Tufte, E.R., 1983. The Visual Display of Quantitative Information. Graphics Press, Cheshire, Connecticut.
The classic account of how graphics should and should not be used to present quantitative data.
Misuse of statistics in the medical literature
Number (%) of papers published in Arthritis and Rheumatism containing each type of statistical error:
Error | 1967-68 (n = 47) | 1982 (n = 74) |
---|---|---|
Undefined method | 14 (30%) | 7 (9%) |
Inadequate description of measures of location or dispersion | 6 (13%) | 7 (9%) |
Repeated observations treated as independent | 1 (2%) | 4 (5%) |
Two groups compared on more than 10 variables at 5% level | 3 (6%) | 28 (38%) |
Multiple t tests instead of analysis of variance | 2 (4%) | 18 (24%) |
χ2 tests used when expected frequencies too small | 3 (6%) | 4 (5%) |
At least one of the above errors | 28 (60%) | 49 (66%) |
Felson, D.T., L.A. Cupples & R.F. Meenan, 1984. Misuse of statistical methods in Arthritis and Rheumatism. 1982 versus 1967-68, Arthritis and Rheumatism 27: 1018-1022.
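The rise in papers comparing two groups on more than 10 variables at the 5% level matters because every extra test is another chance of a false positive. As a rough sketch, assuming the tests are independent (which is rarely exactly true in practice), the chance of at least one spuriously "significant" result among k tests is 1 - (1 - alpha)^k. The function names below are illustrative and do not come from Felson et al.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n_tests tests.

    Assumes the tests are independent and every null hypothesis is true.
    """
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the overall rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        rate = familywise_error_rate(0.05, k)
        print(f"{k:2d} tests at 5%: chance of a false positive = {rate:.2f}")
```

With 10 independent tests at the 5% level, the chance of at least one false positive is about 40%. The Bonferroni correction shown is only one simple remedy and is known to be conservative.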
Summary of review of 86 therapeutic trials in perinatal medicine (% of studies fulfilling criteria):
Criterion | Yes | Unclear | No |
---|---|---|---|
Statement of purpose | 94 | 6 | 0 |
Clearly defined outcome variables | 74 | 1 | 25 |
Planned prospective data collection | 48 | 30 | 22 |
Predetermined sample size | 3 | 16 | 71 |
Sample size specified | 93 | 6 | 1 |
Disease/health status of subjects specified | 51 | 20 | 29 |
Exclusion criteria specified | 46 | 9 | 45 |
Randomisation appropriately performed | 9 | 12 | 79 |
Blinding used, if necessary | 49 | 47 | 4 |
Adequate sample size | 15 | 44 | 41 |
Statistical methods used appropriately | 26 | 0 | 74 |
Conclusions justified | 10 | 71 | 19 |
Tyson, J.E., J.A. Furzan, J.S. Reisch & S.G. Mize, 1983. An evaluation of the quality of therapeutic studies in perinatal medicine, J. Pediatr. 102: 10-13.
Where would we look for errors?
Design
Bad design can lead to biased studies with over-optimistic results. Often a retrospective study, using data collected for another purpose, ends up with a bad design.
Symptoms of bad design may include: variations in methods of evaluation; unequal numbers of observations for different subjects; many missing observations; a vagueness about the rationale for the study.
Execution
The study protocol has not been followed properly: perhaps the randomisation has failed, or there are too many missing values.
Analysis
The statistical techniques have been used improperly (see other notes for details).
Presentation
The wrong level of numerical precision can imply a study is more accurate than it is. (P = 0·053673, 96·567894% of cases, r = 0·9999569 are almost certainly examples of spurious precision.)
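One way to avoid spurious precision is to fix the reporting precision in code rather than copying raw output. The sketch below is only an illustration: the choice of two significant figures for P values and whole-number percentages is an assumed convention, not a rule stated in these notes, and the function names are hypothetical.

```python
def report_p(p: float) -> str:
    """Format a P value to at most two significant figures."""
    if p < 0.001:
        # Very small values are reported as a bound, not to many decimals.
        return "P < 0.001"
    return f"P = {p:.2g}"


def report_percentage(count: int, total: int) -> str:
    """Format a percentage as a whole number, keeping the raw counts visible."""
    return f"{100 * count / total:.0f}% ({count}/{total})"


if __name__ == "__main__":
    print(report_p(0.053673))
    print(report_percentage(28, 74))
```

Showing the counts next to the percentage lets readers judge for themselves how much precision the sample size can support.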
Giving the wrong information: means without indications of variability; standard errors for descriptive information; only giving the P value.
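The difference between a standard deviation and a standard error is easy to see numerically. The following sketch uses made-up measurements and a hypothetical helper, `describe`. The standard deviation describes the spread of the observations, while the standard error (SD divided by the square root of n) describes the uncertainty in the estimated mean.

```python
import math
import statistics


def describe(values):
    """Return n, mean, sample SD and standard error of the mean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    se = sd / math.sqrt(n)         # standard error of the mean
    return {"n": n, "mean": mean, "sd": sd, "se": se}


if __name__ == "__main__":
    sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up measurements, for illustration only
    s = describe(sample)
    print(f"n = {s['n']}, mean = {s['mean']:.1f}, SD = {s['sd']:.2f}, SE = {s['se']:.2f}")
```

Because the standard error shrinks as n grows, quoting it in place of the standard deviation makes descriptive data look less variable than they are.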
Graphs can put across the message of an analysis very well but can also distort its meaning. Tufte (1983) gives an excellent overview of the problems. If the numerical data are not given alongside a graphic, it is often impossible for readers to confirm the analysis for themselves.
Interpretation
Ask, amongst other things: Have the P values been given the correct interpretation? Has a false causal link been implied from a statistical association? Does the sample analysed relate properly to the population under study?
Omission
Have all the techniques used been specified?
Has the design been adequately described?
Is it clear if they are using standard deviations or standard errors?
Has all the information used to reach the conclusion been presented?
Appraising a paper
Be constructive
Note what is good as well as what is bad - remember, research is not easy!
Look for other explanations
Did the results occur by chance? (low power? multiple testing?) Is there bias or confounding? Is there another cause which could explain the effect?
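To judge whether low power is a plausible explanation, a quick approximate calculation can help. The sketch below uses a normal approximation for a two-sided comparison of two group means, assuming equal group sizes and a known common standard deviation. The function name and the example inputs are hypothetical, and a proper calculation would normally be based on the t distribution.

```python
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z test.

    effect_size is the true difference in means divided by the common SD.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size / (2 / n_per_group) ** 0.5
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for n in (20, 64):
        print(f"n = {n} per group: power = {approx_power(0.5, n):.2f}")
```

Under these assumptions, a true difference of half a standard deviation is detected only about 35% of the time with 20 subjects per group, compared with about 80% with 64 per group. A "non-significant" result from a small study is therefore weak evidence of no effect.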
How could the study be improved?
Better design? Different statistical techniques? Better reporting?
How important are the errors?
Are the flaws so bad that you don't believe the result, or are they merely 'cosmetic' and don't shake your faith in the overall conclusions?
A checklist can help
Two are given in Altman (1991: 495-497).
Use the checklists as a guide. However, going down the list ticking off the relevant sections does not constitute a critical appraisal; the checklists are only an aid to it.
Ethical considerations
Statistical errors can have serious effects on patients and other research.
- Patients in an invalid study are subjected to procedures that produce no advances in knowledge
- Future patients may receive inferior treatment
- Other investigators may be led into a false line of investigation
- Further research may go unfunded because a false 'solution' has been found
- Resources have been wasted
- Poor statistical methods may, if unchallenged, be used in future studies