Mathematics Ph.D. Dissertations
Estimating the Proportion of True Null Hypotheses in Multiple Testing Problems
Date of Award
2016
Document Type
Dissertation
Degree Name
Doctor of Philosophy (Ph.D.)
Department
Statistics
First Advisor
Hanfeng Chen (Advisor)
Second Advisor
John Laird (Other)
Third Advisor
John Chen (Committee Member)
Fourth Advisor
Junfeng Shang (Committee Member)
Abstract
The problem of estimating the proportion, π0, of true null hypotheses in a multiple testing problem is important when a large number of parallel hypothesis tests are performed independently. While this proportion is a quantity of interest in its own right in many applications, a reliable estimate of π0 is crucial when we want to assess and/or control the false discovery rate in a multiple testing problem.
In this dissertation, we investigate the estimation problem together with assessing and controlling the false discovery rate. The dissertation develops a new estimation procedure under the two-component mixture model. The components of the mixture are the null and alternative distributions, with mixing proportions π0 and 1 - π0 respectively, where π0 is the unknown proportion to be estimated. To address this problem, we develop an innovative non-parametric maximum likelihood estimate of the p-value density, restricting the alternative to the multinomial family of distributions with k categories.
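As a reading aid, the two-component mixture described above can be written out as follows. This is a sketch under assumptions not stated in the abstract: the null p-values are taken to be uniform on [0, 1] (so f_0(p) = 1), the k categories are taken to be equal-width bins, and the symbols f_0, f_1, q_j, and θ_j are introduced here only for illustration.

```latex
f(p) = \pi_0\, f_0(p) + (1-\pi_0)\, f_1(p), \qquad 0 \le p \le 1,
\qquad
q_j = \frac{\pi_0}{k} + (1-\pi_0)\,\theta_j, \quad j = 1,\dots,k, \quad \sum_{j=1}^{k} \theta_j = 1 .
```

Here q_j is the probability that a p-value falls in category j, and θ_j is the multinomial probability that the alternative assigns to that category.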
To apply this approach, we need to settle two things first: (a) select an integer k, and (b) convert the continuous-type observations (p-values) into discrete data with k categories. As many authors have noted, p-values in applications are highly skewed, so we recommend using Sturges' rule modified for skewness to determine k.
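The two preparatory steps above might be sketched as follows. The abstract does not say which skewness modification of Sturges' rule is used; this sketch assumes Doane's formula, a well-known skewness-adjusted variant, and also assumes equal-width categories on [0, 1]. The function names `doane_bins` and `bin_p_values` are hypothetical.

```python
import numpy as np

def doane_bins(x):
    """Number of categories from Doane's formula (assumed variant of Sturges' rule)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    # Sample skewness g1 (third standardized moment).
    g1 = np.mean(centered**3) / np.mean(centered**2) ** 1.5
    # Approximate standard deviation of g1 for a sample of size n.
    sigma_g1 = np.sqrt(6.0 * (n - 2) / ((n + 1) * (n + 3)))
    # Sturges' term 1 + log2(n) plus a skewness correction; rounding up is a choice made here.
    return int(np.ceil(1 + np.log2(n) + np.log2(1 + abs(g1) / sigma_g1)))

def bin_p_values(p_values, k):
    """Convert continuous p-values into counts over k equal-width categories on [0, 1]."""
    counts, _ = np.histogram(p_values, bins=k, range=(0.0, 1.0))
    return counts

# Illustrative, synthetic right-skewed "p-values" (not data from the dissertation).
rng = np.random.default_rng(0)
p = rng.beta(0.5, 3.0, size=1000)
k = doane_bins(p)
counts = bin_p_values(p, k)
```

Because the skewness term is non-negative, this choice of k is never smaller than the plain Sturges value for the same sample size.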
We then propose an iterative optimization technique, the EM algorithm, to compute an approximation to the maximum likelihood estimate of π0. Simulation studies are conducted to assess the performance of the proposed procedure. The simulation results show that our proposed procedure performs significantly better than the existing procedures. The new procedure is applied to the leukemia gene expression dataset and the inherited breast cancer cDNA dataset, both of which have been analyzed by many other statisticians. Again, our procedure performs satisfactorily overall.
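To illustrate the EM step mentioned above, here is a minimal sketch for categorized p-values. It is not the dissertation's procedure. It assumes uniform null p-values, k equal-width categories, and cell probabilities q_j = π0/k + (1 - π0)·θ_j. Without a further restriction π0 is not identifiable in this discretized model, so the sketch adds an illustrative constraint of its own: the alternative puts no mass in the last category (p-values nearest 1). The function name `em_pi0_binned` and the example counts are hypothetical.

```python
import numpy as np

def em_pi0_binned(counts, max_iter=10_000, tol=1e-12):
    """EM for pi0 in a binned two-component mixture (illustrative assumptions only)."""
    counts = np.asarray(counts, dtype=float)
    k = counts.size
    n = counts.sum()
    null_cell = np.full(k, 1.0 / k)      # uniform null over equal-width categories
    pi0 = 0.5                            # starting value
    theta = np.full(k, 1.0 / (k - 1))    # alternative cell probabilities
    theta[-1] = 0.0                      # ASSUMPTION: no alternative mass in the last category
    theta /= theta.sum()
    for _ in range(max_iter):
        # E-step: posterior probability that a p-value in category j is a true null.
        null_mass = pi0 * null_cell
        w = null_mass / (null_mass + (1.0 - pi0) * theta)
        # M-step: update the mixing proportion and the alternative's cell probabilities.
        new_pi0 = float(np.sum(counts * w) / n)
        alt = counts * (1.0 - w)
        theta = alt / alt.sum()
        converged = abs(new_pi0 - pi0) < tol
        pi0 = new_pi0
        if converged:
            break
    return pi0, theta

# Made-up category counts, decreasing toward p = 1 (not data from the dissertation).
pi0_hat, theta_hat = em_pi0_binned([400, 200, 150, 130, 120])
```

For these made-up counts the iteration should settle near 0.6, which is k times the observed proportion in the last category (5 × 120/1000). That is a consequence of the constraint chosen for this sketch, not a result from the dissertation.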
Recommended Citation
Oyeniran, Oluyemi, "Estimating the Proportion of True Null Hypotheses in Multiple Testing Problems" (2016). Mathematics Ph.D. Dissertations. 29.
https://scholarworks.bgsu.edu/math_diss/29