A Unified Computational Framework to Compare Direct and Sequential False Discovery Rate Algorithms for Exploratory DNA Microarray Studies

by Danh V. Nguyen

Journal of Data Science, v.3, no.4, 331-352

Abstract

The problem of detecting differential gene expression with microarray data has led to further innovative approaches to controlling false positives in multiple testing. False discovery rate (FDR) has been widely used as a measure of error in this multiple testing context. Direct estimation of FDR was recently proposed by Storey (2002, Journal of the Royal Statistical Society, Series B 64, 479-498) as a substantially more powerful alternative to the traditional sequential FDR controlling procedure pioneered by Benjamini and Hochberg (1995, Journal of the Royal Statistical Society, Series B 57, 289-300). Direct estimation of FDR requires fixing a rejection region of interest and then conservatively estimating the associated FDR. The sequential FDR procedure, on the other hand, requires fixing an FDR control level and then estimating the rejection region. Thus, sequential and direct approaches to FDR control appear very different. In this paper, we introduce a unified computational framework for sequential FDR methods and propose a class of more powerful sequential FDR algorithms that link the direct and sequential approaches. Under the proposed unified computational framework, both approaches simply approximate the least conservative (optimal) sequential FDR procedure. We illustrate the FDR algorithms and concepts with some numerical studies (simulations) and with two real exploratory DNA microarray studies, one on the detection of molecular signatures in BRCA-mutation breast cancer patients and another on the detection of genetic signatures during colon cancer initiation and progression in the rat.
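
To make the contrast between the two approaches concrete, the following is a minimal Python sketch, not the algorithms proposed in the paper. It assumes the standard textbook forms of the two procedures: the Benjamini-Hochberg step-up rule (fix the FDR level, obtain a p-value cutoff) and a Storey-type direct estimate (fix the rejection region [0, t], obtain a conservative FDR estimate). The function names, the tuning parameter `lam` with default 0.5, and the truncation of the estimates at 1 are illustrative assumptions.

```python
import numpy as np


def bh_threshold(pvals, alpha):
    """Sequential approach (Benjamini-Hochberg step-up, textbook form).

    Fix the FDR control level `alpha`; return the p-value cutoff that
    defines the rejection region (0.0 if no hypothesis is rejected).
    """
    p = np.sort(np.asarray(pvals, dtype=float))
    m = p.size
    # Compare the i-th smallest p-value with alpha * i / m.
    below = p <= alpha * np.arange(1, m + 1) / m
    if not below.any():
        return 0.0
    k = np.nonzero(below)[0].max()  # largest index satisfying the bound
    return float(p[k])


def storey_fdr(pvals, t, lam=0.5):
    """Direct approach (Storey-type estimate, textbook form).

    Fix the rejection region [0, t]; return a conservative estimate of
    the FDR incurred by rejecting every p-value <= t.
    `lam` is an assumed tuning parameter for estimating the null proportion.
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    # Estimated proportion of true nulls, truncated at 1 (assumed convention).
    pi0 = min(1.0, np.sum(p > lam) / ((1.0 - lam) * m))
    rejected = max(int(np.sum(p <= t)), 1)  # guard against division by zero
    return float(min(1.0, pi0 * m * t / rejected))
```

Under these assumptions, calling `bh_threshold(pvals, alpha)` and then `storey_fdr(pvals, t)` at the returned cutoff shows the two viewpoints side by side: the first maps an error level to a rejection region, the second maps a rejection region to an estimated error level.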
