Mathematics Ph.D. Dissertations


On Modern Measures and Tests of Multivariate Independence

Degree Name

Doctor of Philosophy (Ph.D.)



First Advisor

Maria Rizzo (Advisor)

Second Advisor

Junfeng Shang (Committee Member)

Third Advisor

Wei Ning (Committee Member)

Fourth Advisor

Sung Chul Bae (Other)


Over the last ten years, many measures and tests have been proposed for determining the independence of random vectors. This study explores the similarities and differences of some of these new measures and generalizes the properties that are suitable for measuring independence in the bivariate and multivariate cases. Among the measures that have drawn interest from the statistical community are Distance Correlation (dCor) by Székely and Rizzo (2007), the Maximal Information Coefficient (MIC) by Reshef, Reshef, Finucane, Grossman, McVean, Turnbaugh, Lander, Mitzenmacher and Sabeti (2011), Local Gaussian Correlation (LGC) and Global Gaussian Correlation (GGC) by Berentsen and Tjøstheim (2014), the RV Coefficient by Robert and Escoufier (1976), and the HHG test statistic developed by Heller, Heller and Gorfine (2012).

This study gives a state-of-the-art comparison of the measures. We compare the measures in terms of their theoretical properties, considering those that are necessary and desirable for measuring dependence, such as equitability and rigid-motion invariance. We identify which of A. Rényi's postulates (1959) can be established or disproved for each measure; each measure satisfies only two or three of Rényi's properties. Among the measures and tests explored in this study, distance correlation is the only one with the important characterization of being equal to zero if and only if the two random variables or random vectors are independent.
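The characterization above can be illustrated with the sample statistic of Székely and Rizzo (2007). The sketch below is not the dissertation's implementation; it is a minimal NumPy version of the sample distance correlation, computed from double-centered pairwise distance matrices, assuming Euclidean distances and samples given as rows.

```python
import numpy as np

def dist_cor(x, y):
    """Sample distance correlation (Székely and Rizzo, 2007).

    x, y: arrays of shape (n,) or (n, p) / (n, q) with matching n.
    Returns a value in [0, 1]; 0 characterizes independence
    (in the population version).
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)

    def centered(z):
        # Pairwise Euclidean distance matrix, then double centering:
        # subtract row and column means, add back the grand mean.
        d = np.sqrt(((z[:, None, :] - z[None, :, :]) ** 2).sum(-1))
        return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()

    A, B = centered(x), centered(y)
    dcov2 = (A * B).mean()                      # squared sample dCov
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(dcov2 / denom) if denom > 0 else 0.0
```

Because it is built from distance matrices, the same function applies unchanged to random vectors of different dimensions, which is what makes dCor directly usable in the multivariate case.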

Several dependence structures, including linear, quadratic, cubic, exponential, sinusoidal, and diamond, are considered. The coefficients of the dependence measures are computed and compared for each structure. The power performance and empirical Type-I error rates of the dependence measures are also shown and compared.

For detecting bivariate and multivariate association, dCov and HHG are equally powerful. Both are consistent against all dependence alternatives, and the tests achieve good power at finite sample sizes. The RV coefficient matches the power of these two tests only when the relationship is linear.
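Power and empirical Type-I error for such statistics are typically estimated via a permutation test of independence. The sketch below is a generic version of that procedure, not the dissertation's code; the statistic passed in is arbitrary (here the test uses absolute Pearson correlation purely as a stand-in), and the permutation count is an illustrative choice.

```python
import numpy as np

def perm_pvalue(x, y, stat, n_perm=199, seed=0):
    """Permutation p-value for a test of independence.

    stat(x, y) must be large under dependence. Permuting y breaks
    any association with x, giving the null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = stat(x, y)
    count = sum(
        stat(x, y[rng.permutation(len(y))]) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction gives a valid (slightly conservative) p-value.
    return (1 + count) / (1 + n_perm)
```

Repeating this over many simulated datasets and recording the rejection rate at a fixed level yields the empirical power (under a dependence alternative) or the Type-I error rate (under independence).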

Dependence measures are applied to real data sets concerning stock returns and Parkinson's disease.