Approaches to dealing with multiple comparisons

Perspective on dealing with multiple comparisons at once

Consider what would happen if you made many comparisons and determined whether each result is 'significant' or not. Also assume that you are 'Mother Nature', so you know whether a difference truly exists for each comparison.

In the table below, the top row represents the results of comparisons where the null hypothesis is true: the treatment really doesn't work. Nonetheless, some comparisons will mistakenly yield a 'significant' conclusion. The second row shows the results of comparisons where there truly is a difference. Even so, you won't get a 'significant' result in every experiment.

A, B, C and D represent the numbers of comparisons, so the sum of A+B+C+D equals the total number of comparisons you are making.

                                       "Significant"   "Not significant"   Total

No difference (null hypothesis true)   A               B                   A+B

A difference truly exists              C               D                   C+D

Total                                  A+C             B+D                 A+B+C+D
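
To make the four cells concrete, here is a minimal simulation sketch in Python. It is only an illustration: the function name simulate_table, the 900/100 split between true nulls and real differences, and the 80% power are all assumptions made for this example, not values taken from Prism.

```python
import random

def simulate_table(n_null, n_real, alpha=0.05, power=0.80, seed=0):
    """Count how many comparisons land in cells A, B, C and D.

    Assumptions for this sketch: comparisons are independent, a true null
    is called 'significant' with probability alpha, and a real difference
    is called 'significant' with probability equal to the power.
    """
    rng = random.Random(seed)
    # Top row: null hypothesis true, so 'significant' happens only by chance.
    a = sum(rng.random() < alpha for _ in range(n_null))
    b = n_null - a
    # Second row: a difference truly exists, detected with probability 'power'.
    c = sum(rng.random() < power for _ in range(n_real))
    d = n_real - c
    return {"A": a, "B": b, "C": c, "D": d}

cells = simulate_table(n_null=900, n_real=100)
print(cells)
print("Total comparisons:", sum(cells.values()))
```

In a real experiment you only see the column totals (A+C and B+D); the split into rows is known only to 'Mother Nature', which is why the simulation has to be told the truth for each comparison.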

 

Three approaches can be used to deal with multiple comparisons:

Approach 1: Don't correct for multiple comparisons

Use the standard definition of 'significance', so you expect the ratio A/(A+B) to equal alpha, which is usually 5%. In other words, if the null hypothesis of no difference is in fact true, there is a 5% chance that you will mistakenly conclude that the difference is statistically significant. This 5% value applies to each comparison separately, so it is a per-comparison error rate.

When using this approach, beware of overinterpreting a 'statistically significant' result. You expect a significant result in 5% of comparisons where the null hypothesis is true. If you perform many comparisons, you would be surprised if none of them resulted in a 'statistically significant' conclusion.

This approach is sometimes called "Planned comparisons".
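
The arithmetic behind that warning can be sketched in a few lines of Python. This assumes that every null hypothesis is true and that the comparisons are independent, which is a simplification; the function names are made up for this example.

```python
def expected_false_positives(n_comparisons, alpha=0.05):
    """Average number of 'significant' results when every null is true."""
    return n_comparisons * alpha

def chance_of_any_false_positive(n_comparisons, alpha=0.05):
    """Probability of one or more 'significant' results when every null
    is true, assuming the comparisons are independent."""
    return 1.0 - (1.0 - alpha) ** n_comparisons

for n in (1, 3, 10, 20, 100):
    print(
        f"{n:>3} comparisons: "
        f"expected false positives = {expected_false_positives(n):.2f}, "
        f"chance of at least one = {chance_of_any_false_positive(n):.1%}"
    )
```

With one comparison the chance is simply alpha. It climbs quickly as the number of uncorrected comparisons grows.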

Approach 2: Correct for multiple comparisons

With this approach, you set a stricter threshold for significance, such that alpha is the chance of obtaining one or more 'significant' conclusions if all the null hypotheses are true. In the table above, alpha is the probability that A will be greater than 0. If you set alpha to the usual value of 5%, you need a definition of significance strict enough that, if all null hypotheses are true, there is only a 5% chance of obtaining one or more 'significant' results by chance alone, and thus a 95% chance that none of the comparisons will lead to a 'significant' conclusion. The 5% applies to the entire experiment, so it is sometimes called an experimentwise error rate or familywise error rate.

The advantage of this approach is that you are far less likely to be misled by false conclusions of 'statistical significance'. The disadvantage is that you need to use a stricter threshold of significance, so you will have less power to detect true differences.
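
As a rough illustration of how strict the threshold has to become, the sketch below computes a per-comparison threshold from a familywise alpha using two widely known corrections, Bonferroni and Šidák. These are named here only as examples; this page does not say which correction any particular Prism test uses. The Šidák formula assumes independent comparisons, and the function names are made up for this example.

```python
def bonferroni_threshold(n_comparisons, family_alpha=0.05):
    """Per-comparison threshold: familywise alpha divided by the number
    of comparisons. Slightly conservative, but needs no independence."""
    return family_alpha / n_comparisons

def sidak_threshold(n_comparisons, family_alpha=0.05):
    """Per-comparison threshold that gives exactly family_alpha overall
    when the comparisons are independent."""
    return 1.0 - (1.0 - family_alpha) ** (1.0 / n_comparisons)

def familywise_error(n_comparisons, per_comparison_alpha):
    """Chance that A > 0 when every null is true (independent comparisons)."""
    return 1.0 - (1.0 - per_comparison_alpha) ** n_comparisons

n = 10
for name, threshold in (
    ("Bonferroni", bonferroni_threshold(n)),
    ("Sidak", sidak_threshold(n)),
):
    print(
        f"{name}: per-comparison threshold = {threshold:.5f}, "
        f"familywise error = {familywise_error(n, threshold):.4f}"
    )
```

Either way, each individual comparison must clear a much smaller P value threshold than 0.05, which is where the loss of power comes from.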

Approach 3: False Discovery Rate

The two approaches already discussed ask: If the null hypothesis is true, what is the chance of getting "significant" results? The False Discovery Rate (FDR) answers a different question: If the comparison is "significant", what is the chance that the null hypothesis is true? If you are making only a single comparison, you can't answer this without defining the prior odds and using Bayesian reasoning. But if you have many comparisons, simple methods let you answer that question (at least approximately). In the table above, the False Discovery Rate is the ratio A/(A+C). This ratio is sometimes called Q. If Q is set to 10%, the threshold dividing 'significant' from 'not significant' comparisons is established so that we expect 90% of the 'significant' results to reflect actual differences and 10% to be false positives.
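
One commonly used 'simple method' of this kind is the Benjamini-Hochberg procedure. The Python sketch below is offered only as an illustration under that assumption: this page does not name a specific FDR method, the P values are invented for the example, and the function names are hypothetical.

```python
def false_discovery_rate(a, c):
    """FDR from the table: false positives A over all 'significant' results A+C."""
    return a / (a + c)

def benjamini_hochberg(p_values, q=0.10):
    """Return one boolean per P value: True where the comparison is
    flagged as a discovery at the chosen Q."""
    m = len(p_values)
    # Indices of the P values, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) whose P value is <= (k / m) * Q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Every comparison ranked at or below the cutoff is a discovery.
    discoveries = [False] * m
    for idx in order[:cutoff_rank]:
        discoveries[idx] = True
    return discoveries

# Invented P values from ten hypothetical comparisons.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.065, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(p_values, q=0.10)
for p, flag in zip(p_values, flags):
    print(f"P = {p:.3f} -> {'discovery' if flag else 'not a discovery'}")
```

Note that the third and fourth P values exceed their own rank-based limits but are still flagged, because the fifth one passes and the procedure accepts everything ranked at or below the largest passing rank.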

Prism does not use the concept of False Discovery Rate, except indirectly as part of our method to define outliers in nonlinear regression.



Copyright (c) 2007 GraphPad Software Inc. All rights reserved.
URL: http://www.graphpad.com/help/Prism5/Prism5Help.html?stat_approaches_to_dealing_with_mul.htm