# Bias and confounding

### Objectives

At the end of the lecture students should be able to:

1. Explain the concept of bias and identify some potential sources of bias
2. Define the concept of confounding and identify potential confounders
3. Describe some potential solutions to confounding

### Bibliography

Altman, D.G., 1991. Practical statistics for medical research, pp. 74-106. Chapman & Hall, London.

This chapter, on designing research, includes discussion of various types of bias and techniques for avoiding it.

Bland, M., 1995. An Introduction to Medical Statistics, 2nd ed., pp. 5-25. Oxford Medical Publications, Oxford.

This chapter discusses many of the issues involving bias and confounding.

Campbell, M.J. & Machin, D., 1993. Medical Statistics: A Commonsense Approach, 2nd ed., pp. 100-103. John Wiley & Sons, Chichester.

A brief outline of the use of multiple regression to deal with confounding variables.

### Bias

#### Problem

In a study, 20 'normal' people take the standard treatment for bad breath (drug A) and 20 garlic eaters take drug B. The results indicate that the people taking drug A have better breath. Is drug A better than drug B?

#### Solution

Not necessarily. The two groups differed in more than the drug they took, so our study is biased.

Bias can be defined as the distortion of the estimated effects caused by a systematic difference between the groups being compared.

Potential sources of bias include:

• Selection of study subjects (e.g. if the subjects themselves decided which treatment they wanted)
• Collection of information on study subjects (e.g. people with a disease are more likely to remember exposure to a dangerous chemical)
• Not adjusting for confounding (see below)

#### Solutions to the problem of bias

• Design the study properly. This is by far the best way to avoid bias and is often the only way.
• Think through all the ways in which your results could suffer from bias, so that you do not make unfair comparisons
• If an assessment of the magnitude of bias is impossible, it is sometimes possible to guess the direction of any potential bias. If our results go against this bias we may well be able to accept them as indicating an effect even if we cannot deduce the size of the effect.
• Remember, in many situations the information needed to assess the bias in a study is unavailable

### Confounding

If an observed effect could be caused either by the variable you are interested in (e.g. a drug or smoking) or by another variable (e.g. age or sex), and you cannot tell which, then the other variable is called a confounder and is said to cause confounding.

To be a confounder, a variable must be associated both with the exposure you are interested in and with the outcome you are analysing.

Consider the example above, which was biased because a confounder was not adjusted for:

• Garlic eating was associated with the exposure (which drug the subjects took)
• Garlic eating was associated with the outcome (bad breath)

Garlic eating was a confounder, which caused the results to be biased.
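
The garlic example can be made concrete with a small calculation. The sketch below is illustrative only: the breath scores, the size of the garlic effect and the assumption that both drugs work equally well are hypothetical numbers chosen for the demonstration, not results from any study.

```python
from statistics import mean

# Hypothetical numbers, chosen only for illustration (not from a real study).
BASELINE_SCORE = 5.0                 # bad-breath score with no garlic and no drug
GARLIC_PENALTY = 3.0                 # extra bad-breath score caused by eating garlic
DRUG_BENEFIT = {"A": 2.0, "B": 2.0}  # assumption: both drugs are equally effective


def breath_score(eats_garlic: bool, drug: str) -> float:
    """Bad-breath score (higher is worse) for one subject."""
    score = BASELINE_SCORE - DRUG_BENEFIT[drug]
    if eats_garlic:
        score += GARLIC_PENALTY
    return score


# Design from the example: every drug A subject is 'normal',
# every drug B subject eats garlic.
group_a = [breath_score(eats_garlic=False, drug="A") for _ in range(20)]
group_b = [breath_score(eats_garlic=True, drug="B") for _ in range(20)]

observed_difference = mean(group_b) - mean(group_a)
true_drug_difference = DRUG_BENEFIT["A"] - DRUG_BENEFIT["B"]

print(f"Observed difference in breath score (B minus A): {observed_difference}")
print(f"True difference in drug benefit (A minus B):     {true_drug_difference}")
```

Under these assumptions the drugs are identical, yet the drug B group looks worse by exactly the garlic effect. Because garlic eating and drug are perfectly associated in this design, the two effects cannot be separated from the data.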

#### Controlling for potential confounders

• Restrict the study to a narrow range of admission criteria at the design stage (e.g. if you think age is a potential confounder, restrict the study to subjects aged, say, 40 to 50)
• Matching in the design, followed by stratification in the analysis (e.g. match on age and use the difference within each pair in the analysis)
• Stratification in the analysis without matching (e.g. do the analysis within five-year age groups, 31-35, 36-40 etc., and statistically pool the results)
• Minimisation in the design
• Use multivariate methods
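
As an illustration of stratification, the sketch below pools odds ratios across two age strata using the Mantel-Haenszel estimator. This is one common pooling method and is an assumption here, since the notes do not name a specific one. The counts are hypothetical and were chosen so that the exposure has no effect within either stratum.

```python
from typing import NamedTuple


class Table(NamedTuple):
    """One 2x2 table: cases and non-cases among exposed and unexposed subjects."""
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int


def odds_ratio(t: Table) -> float:
    """Odds ratio for a single 2x2 table."""
    return (t.exposed_cases * t.unexposed_noncases) / (
        t.exposed_noncases * t.unexposed_cases
    )


def mantel_haenszel_odds_ratio(strata: list[Table]) -> float:
    """Pooled odds ratio: sum(a*d/n) / sum(b*c/n) over the strata."""
    numerator = 0.0
    denominator = 0.0
    for t in strata:
        n = sum(t)  # total number of subjects in this stratum
        numerator += t.exposed_cases * t.unexposed_noncases / n
        denominator += t.exposed_noncases * t.unexposed_cases / n
    return numerator / denominator


def collapse(strata: list[Table]) -> Table:
    """Ignore the strata and add the tables together (the crude analysis)."""
    return Table(*(sum(values) for values in zip(*strata)))


# Hypothetical counts: older subjects are both more often exposed
# and more often cases, so age satisfies both confounder conditions.
strata = {
    "younger": Table(10, 90, 40, 360),
    "older": Table(120, 280, 30, 70),
}

crude = odds_ratio(collapse(list(strata.values())))
pooled = mantel_haenszel_odds_ratio(list(strata.values()))

for name, table in strata.items():
    print(f"{name}: odds ratio = {odds_ratio(table):.2f}")
print(f"crude (age ignored): odds ratio = {crude:.2f}")
print(f"pooled over age strata: odds ratio = {pooled:.2f}")
```

With these counts the crude odds ratio is above 2, while the odds ratio within each stratum and the pooled estimate are 1. The apparent effect comes entirely from the uneven age distribution between exposed and unexposed subjects.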

#### Multivariate methods

• These are sometimes called modelling techniques, and their results can be referred to as estimates "adjusted for the confounder"
• Often the name of the technique will be preceded by the word "multiple"
• Multiple regression is a regression analysis that estimates the effect of the variable of interest while adjusting for certain potential confounders
• Multiple logistic regression is the corresponding technique for outcomes with only two possible values (e.g. dead/alive). It addresses the same kind of question as a χ² test, while also adjusting for potential confounders
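
To show what "adjusted for the confounder" means in practice, the sketch below fits a crude and an adjusted least-squares regression to a hypothetical variant of the garlic study. The data are invented for illustration: some garlic eaters take each drug (otherwise the two effects could not be separated), the drugs are assumed to be equally effective, and there is no random noise. The helper functions are a minimal hand-written implementation, not the routine a statistics package would use.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in reversed(range(n)):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def least_squares(predictor_rows, outcomes):
    """Ordinary least squares with an intercept: returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(row) for row in predictor_rows]
    p = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, outcomes)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical subjects as (takes_drug_b, eats_garlic) pairs.
# Garlic eaters are over-represented in the drug B group.
subjects = [(0, 0)] * 15 + [(0, 1)] * 5 + [(1, 0)] * 5 + [(1, 1)] * 15

# Assumed truth: garlic adds 3 points to the bad-breath score, the drug adds 0.
scores = [3.0 + 3.0 * garlic for _drug_b, garlic in subjects]

crude = least_squares([[drug_b] for drug_b, _garlic in subjects], scores)
adjusted = least_squares([[drug_b, garlic] for drug_b, garlic in subjects], scores)

print(f"Crude drug B coefficient:    {crude[1]:.2f}")
print(f"Adjusted drug B coefficient: {adjusted[1]:.2f}")
print(f"Garlic coefficient:          {adjusted[2]:.2f}")
```

The crude model attributes part of the garlic effect to drug B. Once garlic eating is included as a second predictor, the drug B coefficient falls to essentially zero, which matches the assumption built into the data.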