Multiple testing

A typical microarray (as of 2015, at least) can test up to a million separate SNPs. If all of them were used in a genome-wide association study (GWAS), this would result in a million tested hypotheses, i.e. one for each SNP being potentially associated with the trait of interest. With a p-value cutoff of, for example, 0.05 (i.e. a 5% probability of seeing an effect at least this strong by random chance when no real association exists), an average of fifty thousand SNPs would be expected to show a significant association by random chance alone, i.e. a type 1 error (false positive).
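
The arithmetic behind that expectation can be checked with a minimal sketch. The function names are hypothetical, and the simulation assumes that every SNP is truly unassociated, so that each p-value is uniformly distributed between 0 and 1.

```python
import random


def expected_false_positives(num_tests: int, alpha: float) -> float:
    """Expected number of tests with p < alpha when no real association exists."""
    return num_tests * alpha


def simulate_null_hits(num_tests: int, alpha: float, seed: int = 0) -> int:
    """Count chance 'hits' when every p-value is drawn uniformly from [0, 1)."""
    rng = random.Random(seed)
    return sum(1 for _ in range(num_tests) if rng.random() < alpha)


if __name__ == "__main__":
    # One million SNPs at a 0.05 cutoff: about 50,000 expected chance hits.
    print(expected_false_positives(1_000_000, 0.05))
    # A smaller simulated array (100,000 SNPs) should give close to 5,000.
    print(simulate_null_hits(100_000, 0.05))
```

The simulated count varies with the seed, but stays near 5% of the number of tests.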

P-values quoted in most studies should have been corrected for multiple testing in some way, though there are several different correction methods. The simplest, the Bonferroni correction, divides the p-value required for significance by the number of hypotheses tested, so that the 0.05 cutoff becomes 5*10^-8 for a million-SNP study. However, this sort of correction is counter-intuitive, as the interpretation of a finding depends on the number of tests performed. It is also very conservative: with this many tests it leads to a high rate of type 2 errors (false negatives), and it is usually not advised.
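
As an illustration, the Bonferroni correction can be sketched in a few lines. The function names are hypothetical and the p-values in the usage lines are made up for the example.

```python
def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test significance cutoff after Bonferroni correction."""
    return alpha / num_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that passes the Bonferroni-corrected cutoff."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    # A million-SNP study: the 0.05 cutoff shrinks to 5*10^-8.
    print(bonferroni_threshold(0.05, 1_000_000))
    # Three made-up p-values: the cutoff is 0.05 / 3, so 0.04 no longer passes.
    print(bonferroni_significant([1e-9, 0.01, 0.04]))
```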

A popular alternative is the False Discovery Rate (FDR), which is the expected proportion of associations deemed significant that are nonetheless false. Thus an FDR of 0.05 would imply that 5% of the associations deemed significant at that level would be expected to be due to chance (in simpler words, each such finding has roughly a 5% chance of being a false discovery), as opposed to the unmodified p-value cutoff of 0.05, where the 5% would be out of the whole set of hypotheses (SNPs) tested (strictly, out of those with no real association).
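
The text does not say how the FDR is controlled in practice. One widely used method is the Benjamini-Hochberg step-up procedure, sketched below as an assumption about what a study might use rather than as the method of any particular one. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Flag p-values declared significant by the Benjamini-Hochberg procedure.

    Sort the p-values, find the largest rank k with p_(k) <= k * fdr / m,
    and declare the k smallest p-values significant.
    """
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            max_rank = rank
    significant = [False] * m
    for idx in order[:max_rank]:
        significant[idx] = True
    return significant


if __name__ == "__main__":
    # Four made-up p-values. A Bonferroni cutoff of 0.05 / 4 = 0.0125 would
    # keep only the first; the step-up rule keeps all four.
    print(benjamini_hochberg([0.01, 0.02, 0.03, 0.04]))
```

The cutoff grows with the rank of the p-value, which is why this approach typically yields fewer false negatives than Bonferroni on the same data.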

A similar problem occurs when an individual's one million SNPs are compared against a large database. Some small percentage of those SNPs will have been mis-called; even at 0.5% that would still be 5,000 SNPs out of a million, though as most SNPs have no known function it is rare, but possible, for these to show up as malignant variations. A larger share of the matches will come from false positive associations in the database itself, however. If ten thousand studies have been performed with one million SNPs each, this amounts to ten thousand million, or 10^10, hypotheses tested. Assuming each one to be independent, even an unmodified p-value cutoff of 10^-10 would still mean one expected false positive for an individual. This cautions strongly against simply looking at a list of matched associations, though most studies present some theory for how the association works, and the best ones replicate the finding in a second group, significantly reducing the probability of a false positive by random chance.
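
The database estimate can be reproduced with a short sketch. It relies on the same independence assumption as the text, and the function names are hypothetical. It also reports the probability of at least one chance match, 1 - (1 - p)^N, which is not stated above but follows from the same assumption.

```python
import math


def total_hypotheses(num_studies: int, snps_per_study: int) -> int:
    """Number of SNP-trait hypotheses tested across all studies."""
    return num_studies * snps_per_study


def expected_chance_matches(num_hypotheses: int, p_cutoff: float) -> float:
    """Expected number of false positives among independent null hypotheses."""
    return num_hypotheses * p_cutoff


def prob_at_least_one_chance_match(num_hypotheses: int, p_cutoff: float) -> float:
    """1 - (1 - p)^N, computed with log1p/expm1 to stay accurate for tiny p."""
    return -math.expm1(num_hypotheses * math.log1p(-p_cutoff))


if __name__ == "__main__":
    n = total_hypotheses(10_000, 1_000_000)          # 10^10 hypotheses
    print(n)
    print(expected_chance_matches(n, 1e-10))         # about 1
    print(prob_at_least_one_chance_match(n, 1e-10))  # roughly 0.63
```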