A Note on Approximation of Likelihood Ratio Statistic in Exploratory Factor Analysis
1. Introduction
Factor analysis [1] [2] is used in various fields to study the interdependence among a set of observed variables by postulating underlying factors. We consider the model of exploratory factor analysis in the form

$\Sigma = \Lambda \Lambda' + \Psi$, (1)

where $\Sigma = (\sigma_{ij})$ is the $p \times p$ covariance matrix of the observed variables, $\Lambda = (\lambda_{ij})$ is a $p \times m$ matrix of factor loadings, and $\Psi$ is a diagonal matrix of error variances with diagonal elements $\psi_i > 0$. Under the assumption of multivariate normal distributions for observations, the parameters are estimated with the method of maximum likelihood, and the goodness-of-fit of the model can be judged by using the likelihood ratio (LR) test for testing the null hypothesis that (1) holds for a specified $m$ against the alternative that $\Sigma$ is unconstrained. From the theory of LR tests, the degrees of freedom, $d$, of the asymptotic chi-square distribution is the difference between the numbers of free parameters in the alternative model and in the null model. In (1), $\Lambda \Lambda'$ remains unchanged if $\Lambda$ is replaced by $\Lambda T$ for any $m \times m$ orthogonal matrix $T$. Hence, $m(m-1)/2$ restrictions are required to eliminate this indeterminacy. Then, the difference between the number $p(p+1)/2$ of nonduplicated elements in $\Sigma$ and the number $pm + p - m(m-1)/2$ of free parameters in $\Lambda$ and $\Psi$ is given by

$d = \frac{1}{2}\{(p - m)^2 - (p + m)\}$. (2)
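As a quick numerical illustration of (2), the following Python sketch (with hypothetical values $p = 6$ and $m = 2$, not taken from the note) computes $d$ both as the difference of the two counts above and from the closed form:

```python
def efa_df(p: int, m: int) -> int:
    """Degrees of freedom d in (2) for exploratory factor analysis."""
    moments = p * (p + 1) // 2               # nonduplicated elements of Sigma
    params = p * m + p - m * (m - 1) // 2    # free parameters in Lambda and Psi
    return moments - params

p, m = 6, 2
print(efa_df(p, m))                          # 4 (= 21 moments - 17 parameters)
print(((p - m) ** 2 - (p + m)) // 2)         # 4, the closed form (2)
```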
2. LR Statistic in Exploratory Factor Analysis
2.1. Approximation of LR Statistic
Let $S = (s_{ij})$ be the usual unbiased estimator of $\Sigma$ based on a random sample of size $n + 1$ from the multivariate normal population with covariance matrix $\Sigma$ in (1). For the existence of consistent estimators, we assume that the solution of (1) is unique. A necessary condition for the uniqueness of the solution up to multiplication on the right of $\Lambda$ by an orthogonal matrix is that each column of $\Lambda T$ has at least three non-zero elements for every non-singular matrix $T$ ([3], Theorem 5.6). This condition implies that $p \geq 3$.
The maximum Wishart likelihood estimators $\hat{\Lambda}$ and $\hat{\Psi}$ are defined as the values of $\Lambda$ and $\Psi$ that minimize

$F(\Lambda, \Psi) = \mathrm{tr}(S \Sigma^{-1}) - \log|S \Sigma^{-1}| - p$. (3)
Then, $\hat{\Lambda}$ and $\hat{\Psi}$ can be shown to be the solutions of the following equations:

$(S - \hat{\Sigma})\hat{\Sigma}^{-1}\hat{\Lambda} = 0$, (4)

$\mathrm{diag}\{\hat{\Sigma}^{-1}(S - \hat{\Sigma})\hat{\Sigma}^{-1}\} = 0$, (5)

where $\hat{\Sigma} = \hat{\Lambda}\hat{\Lambda}' + \hat{\Psi}$. The motivation behind the minimization of $F(\Lambda, \Psi)$ in (3) is that

$\mathrm{LR} = n F(\hat{\Lambda}, \hat{\Psi})$, (6)

that is, $n$ times the minimum value of (3) is the LR statistic described in the previous section. Under (4) and (5), $\mathrm{diag}(S - \hat{\Sigma}) = 0$ and $\mathrm{tr}(S\hat{\Sigma}^{-1}) = p$ can be shown to hold. Hence,

$\mathrm{LR} = -n \log|S\hat{\Sigma}^{-1}|$.
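As a numerical illustration of (3) through (6), the following minimal sketch, assuming simulated data and numpy/scipy (the sizes $p = 6$, $m = 2$, $n = 200$ and the optimizer are illustrative choices, not part of the note), minimizes (3) directly and checks (4), (5), and the two stated consequences:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
p, m, n = 6, 2, 200

# Simulate S: unbiased sample covariance (divisor n) from a true model (1).
L0 = rng.normal(size=(p, m))
psi0 = rng.uniform(0.5, 1.5, size=p)
X = rng.multivariate_normal(np.zeros(p), L0 @ L0.T + np.diag(psi0), size=n + 1)
S = np.cov(X, rowvar=False)

def F(theta):
    """Discrepancy (3): tr(S Sigma^{-1}) - log|S Sigma^{-1}| - p."""
    Lam = theta[:p * m].reshape(p, m)
    Sigma = Lam @ Lam.T + np.diag(np.exp(theta[p * m:]))  # exp keeps psi_i > 0
    A = np.linalg.solve(Sigma, S)                         # Sigma^{-1} S
    return np.trace(A) - np.linalg.slogdet(A)[1] - p

theta0 = np.concatenate([L0.ravel(), np.log(psi0)])       # warm start at truth
res = minimize(F, theta0, method="BFGS")
Lam_h = res.x[:p * m].reshape(p, m)
Sigma_h = Lam_h @ Lam_h.T + np.diag(np.exp(res.x[p * m:]))

R, Si = S - Sigma_h, np.linalg.inv(Sigma_h)
print(np.abs(R @ Si @ Lam_h).max())        # (4): ~ 0 at the minimizer
print(np.abs(np.diag(Si @ R @ Si)).max())  # (5): ~ 0
print(np.abs(np.diag(R)).max())            # diag(S - Sigma_hat) = 0
print(np.trace(S @ Si), n * res.fun)       # tr(S Sigma_hat^{-1}) = p; LR = n F, cf. (6)
```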
From the second-order Taylor formula, we have an approximation of the LR statistic as

$\mathrm{LR} \approx \frac{n}{2}\mathrm{tr}[\{(S - \hat{\Sigma})\hat{\Sigma}^{-1}\}^2]$, (7)

by virtue of $\mathrm{tr}(S\hat{\Sigma}^{-1}) = p$ under (4) and (5) [1] [2]. While the approximation on the right-hand side of (7) shows how the LR statistic is related to the sum of squares of standardized residuals [4], it does not enable us to investigate the distributional properties of the LR statistic. To overcome this difficulty, we express the LR statistic as a function of $S - \Sigma$.
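The Taylor step behind (7) can be checked in isolation: for any positive definite $\hat{\Sigma}$ and an $S$ within $O(n^{-1/2})$ of it, the discrepancy $nF$ and the quadratic form agree up to higher-order terms. A minimal sketch with arbitrary illustrative matrices (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 5, 400

A = rng.normal(size=(p, p))
Sigma_h = A @ A.T + p * np.eye(p)            # an arbitrary positive definite matrix
E = rng.normal(size=(p, p))
S = Sigma_h + (E + E.T) / np.sqrt(n)         # S = Sigma_hat + O(n^{-1/2})

B = np.linalg.solve(Sigma_h, S)              # Sigma_hat^{-1} S
exact = n * (np.trace(B) - np.linalg.slogdet(B)[1] - p)   # n F, cf. (3)
Rstd = B - np.eye(p)                         # Sigma_hat^{-1} (S - Sigma_hat)
approx = n / 2 * np.trace(Rstd @ Rstd)       # right-hand side of (7)
print(exact, approx)                         # close; the gap is of higher order
```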
Let $\tilde{\Lambda}$ and $\tilde{\Psi}$ denote the terms of $\hat{\Lambda} - \Lambda$ and $\hat{\Psi} - \Psi$ linear in the elements of $U = S - \Sigma$. Then we have the following proposition.
Proposition 1. An approximation of the LR statistic is given by

$\mathrm{LR} \approx \frac{n}{2}\mathrm{tr}\{(U\Gamma)^2\} - \frac{n}{2}\mathrm{tr}\{(\tilde{\Psi}\Gamma)^2\}$, (8)

where $\Gamma = (\gamma_{ij})$ is defined by

$\Gamma = \Sigma^{-1} - \Sigma^{-1}\Lambda M^{-1}\Lambda'\Sigma^{-1}$, (9)

with $M = \Lambda'\Sigma^{-1}\Lambda$.
Proof. By substituting $\hat{\Lambda} = \Lambda + \tilde{\Lambda}$, $\hat{\Psi} = \Psi + \tilde{\Psi}$ and $S = \Sigma + U$ into (4) and (5) and considering only linear terms, we have

$(U - \tilde{\Sigma})\Sigma^{-1}\Lambda = 0$, (10)

$\mathrm{diag}\{\Sigma^{-1}(U - \tilde{\Sigma})\Sigma^{-1}\} = 0$, (11)

where $\tilde{\Sigma} = \Lambda\tilde{\Lambda}' + \tilde{\Lambda}\Lambda' + \tilde{\Psi}$. From (10) we derive

$\tilde{\Lambda} = \{(U - \tilde{\Psi})\Sigma^{-1}\Lambda - \Lambda G\}M^{-1}$,

$MG + G'M = \Lambda'\Sigma^{-1}(U - \tilde{\Psi})\Sigma^{-1}\Lambda$,

where $G = \tilde{\Lambda}'\Sigma^{-1}\Lambda$ and $P = \Lambda M^{-1}\Lambda'\Sigma^{-1}$. Then

$\Lambda\tilde{\Lambda}' + \tilde{\Lambda}\Lambda' = P(U - \tilde{\Psi}) + (U - \tilde{\Psi})P' - P(U - \tilde{\Psi})P'$

by virtue of $GM^{-1} + M^{-1}G' = M^{-1}(MG + G'M)M^{-1}$. Thus,

$U - \tilde{\Sigma} = (I_p - P)(U - \tilde{\Psi})(I_p - P')$. (12)

By replacing $\hat{\Sigma}$ in (7) with $\Sigma + \tilde{\Sigma}$, we have

$\mathrm{LR} \approx \frac{n}{2}\mathrm{tr}[\{(U - \tilde{\Sigma})\Sigma^{-1}\}^2] = \frac{n}{2}\mathrm{tr}[\{(U - \tilde{\Psi})\Gamma\}^2]$,

since $(I_p - P')\Sigma^{-1}(I_p - P) = \Gamma$. It follows from (11) and (12) that

$\mathrm{diag}\{\Gamma(U - \tilde{\Psi})\Gamma\} = 0$, (13)

and hence $\mathrm{tr}(U\Gamma\tilde{\Psi}\Gamma) = \mathrm{tr}\{(\tilde{\Psi}\Gamma)^2\}$ because $\tilde{\Psi}$ is diagonal; expanding $\frac{n}{2}\mathrm{tr}[\{(U - \tilde{\Psi})\Gamma\}^2]$ then yields (8), thus establishing the desired result.
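Proposition 1 can be verified numerically. In the sketch below (hypothetical $\Lambda$, $\Psi$ and a random symmetric $U$; numpy assumed), $\tilde{\Psi}$ is obtained by solving (13) as a linear system in its diagonal elements, with $\Omega = \Gamma \circ \Gamma$ as introduced in Section 2.2 below; the two sides of the decomposition in (8) then agree to machine precision:

```python
import numpy as np

rng = np.random.default_rng(2)
p, m, n = 6, 2, 200

Lam = rng.normal(size=(p, m))
Sigma = Lam @ Lam.T + np.diag(rng.uniform(0.5, 1.5, size=p))
Si = np.linalg.inv(Sigma)
Gamma = Si - Si @ Lam @ np.linalg.solve(Lam.T @ Si @ Lam, Lam.T @ Si)  # (9)

E = rng.normal(size=(p, p))
U = (E + E.T) / np.sqrt(n)                   # symmetric O(n^{-1/2}) perturbation

# Solve (13), diag{Gamma (U - Psi_t) Gamma} = 0, for the diagonal of Psi_t.
psi_t = np.linalg.solve(Gamma * Gamma, np.diag(Gamma @ U @ Gamma))
Psi_t = np.diag(psi_t)

lhs = n / 2 * np.trace((U - Psi_t) @ Gamma @ (U - Psi_t) @ Gamma)
rhs = (n / 2 * np.trace(U @ Gamma @ U @ Gamma)
       - n / 2 * np.trace(Psi_t @ Gamma @ Psi_t @ Gamma))      # (8)
print(lhs, rhs)                                                # equal
print(np.abs(np.diag(Gamma @ (U - Psi_t) @ Gamma)).max())      # (13): ~ 0
```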
2.2. Evaluating the Expectation
For the purpose of demonstrating the usefulness of the derived approximation, we show explicitly that the expectation of (8) agrees with the degrees of freedom, $d$, in (2) of the asymptotic chi-square distribution. We now evaluate the expectation of (8) by using

$E[(s_{ij} - \sigma_{ij})(s_{kl} - \sigma_{kl})] = \frac{1}{n}(\sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk})$; (14)

see, for example, Theorem 3.4.4 of [1]. By noting $\Gamma\Sigma\Gamma = \Gamma$ and $\mathrm{tr}(\Gamma\Sigma) = p - m$, we see that the expectation of the first term in (8) is

$E\left[\frac{n}{2}\mathrm{tr}\{(U\Gamma)^2\}\right] = \frac{1}{2}\{\mathrm{tr}(\Gamma\Sigma\Gamma\Sigma) + (\mathrm{tr}\,\Gamma\Sigma)^2\} = \frac{1}{2}\{(p - m)^2 + (p - m)\}$. (15)
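As a sanity check of (15), a Monte Carlo sketch with illustrative sizes (numpy assumed): the average of $\frac{n}{2}\mathrm{tr}\{(U\Gamma)^2\}$ over simulated samples should be close to $\frac{1}{2}\{(p-m)^2 + (p-m)\}$, which is $10$ for $p = 6$, $m = 2$:

```python
import numpy as np

rng = np.random.default_rng(3)
p, m, n, reps = 6, 2, 100, 2000

Lam = rng.normal(size=(p, m))
Sigma = Lam @ Lam.T + np.diag(rng.uniform(0.5, 1.5, size=p))
Si = np.linalg.inv(Sigma)
Gamma = Si - Si @ Lam @ np.linalg.solve(Lam.T @ Si @ Lam, Lam.T @ Si)  # (9)
C = np.linalg.cholesky(Sigma)

vals = []
for _ in range(reps):
    X = rng.normal(size=(n + 1, p)) @ C.T    # sample of size n + 1 from N(0, Sigma)
    U = np.cov(X, rowvar=False) - Sigma      # unbiased S (divisor n) minus Sigma
    vals.append(n / 2 * np.trace(U @ Gamma @ U @ Gamma))

print(np.mean(vals))   # approx ((p - m)**2 + (p - m)) / 2 = 10
```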
To evaluate the expectation of the second term in (8), we need to express $\tilde{\Psi}$ in terms of $U$. Let the symbol $\circ$ denote the Hadamard product of matrices, and define $\Omega = (\omega_{ij})$ by $\Omega = \Gamma \circ \Gamma$. Because $\Gamma$ is positive semidefinite, so is $\Omega$ [5]. If $\Omega$ is positive definite, then (13) can be solved for $\tilde{\Psi}$ in terms of $U$ [3]. An expression of $\tilde{\Psi}$ is

$\tilde{\Psi} = \sum_{i=1}^{p} (\Gamma U \Gamma)_{ii}\, \Delta_i$, (16)

where $\Delta_i$ is a diagonal matrix whose diagonal elements are the $i$-th column (row) of $\Omega^{-1}$ [6].
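The expression (16) is the coordinate form of solving (13) for the diagonal of $\tilde{\Psi}$; a small sketch (hypothetical $\Gamma$ built from (9); numpy assumed) confirming that the $\Delta_i$ form reproduces the direct linear solve:

```python
import numpy as np

rng = np.random.default_rng(4)
p, m = 6, 2
Lam = rng.normal(size=(p, m))
Sigma = Lam @ Lam.T + np.diag(rng.uniform(0.5, 1.5, size=p))
Si = np.linalg.inv(Sigma)
Gamma = Si - Si @ Lam @ np.linalg.solve(Lam.T @ Si @ Lam, Lam.T @ Si)  # (9)

E = rng.normal(size=(p, p)); U = E + E.T     # arbitrary symmetric U
Omega_inv = np.linalg.inv(Gamma * Gamma)     # (Gamma o Gamma)^{-1}

# (16): Psi_t = sum_i (Gamma U Gamma)_{ii} Delta_i,
# where Delta_i is the diagonal matrix of the i-th column of Omega^{-1}.
d = np.diag(Gamma @ U @ Gamma)
Psi_t16 = sum(d[i] * np.diag(Omega_inv[:, i]) for i in range(p))

Psi_t13 = np.diag(np.linalg.solve(Gamma * Gamma, d))   # direct solution of (13)
print(np.abs(Psi_t16 - Psi_t13).max())                 # ~ 0
```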
An interesting property of $\Delta_j$ is

$(\Gamma \Delta_j \Gamma)_{ii} = \delta_{ij}$, (17)

where $\delta_{ij}$ is the Kronecker delta with $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$.
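Property (17) is easy to confirm numerically (same hypothetical setup as above; numpy assumed): stacking the diagonals of $\Gamma \Delta_j \Gamma$ for $j = 1, \dots, p$ as columns should give the identity matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
p, m = 6, 2
Lam = rng.normal(size=(p, m))
Sigma = Lam @ Lam.T + np.diag(rng.uniform(0.5, 1.5, size=p))
Si = np.linalg.inv(Sigma)
Gamma = Si - Si @ Lam @ np.linalg.solve(Lam.T @ Si @ Lam, Lam.T @ Si)  # (9)
Omega_inv = np.linalg.inv(Gamma * Gamma)

# Column j holds the diagonal of Gamma Delta_j Gamma; by (17) this is I_p.
D = np.column_stack([np.diag(Gamma @ np.diag(Omega_inv[:, j]) @ Gamma)
                     for j in range(p)])
print(np.abs(D - np.eye(p)).max())           # ~ 0
```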
Hence, we have

$E\left[-\frac{n}{2}\mathrm{tr}\{(\tilde{\Psi}\Gamma)^2\}\right] = -\frac{n}{2}\sum_{i,j=1}^{p} E[(\Gamma U\Gamma)_{ii}(\Gamma U\Gamma)_{jj}]\,\mathrm{tr}(\Delta_i\Gamma\Delta_j\Gamma) = -\sum_{i,j=1}^{p} \omega_{ij}(\Omega^{-1})_{ij} = -\mathrm{tr}(\Omega\Omega^{-1}) = -p, (18)

because $E[(\Gamma U\Gamma)_{ii}(\Gamma U\Gamma)_{jj}] = (2/n)\,\gamma_{ij}^2 = (2/n)\,\omega_{ij}$ by (14) and $\Gamma\Sigma\Gamma = \Gamma$, while $\mathrm{tr}(\Delta_i\Gamma\Delta_j\Gamma) = (\Omega^{-1})_{ij}$ by (16) and (17).
By combining (15) and (18), we obtain the desired result:

$E[\mathrm{LR}] \approx \frac{1}{2}\{(p - m)^2 + (p - m)\} - p = \frac{1}{2}\{(p - m)^2 - (p + m)\} = d$.
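Finally, putting the pieces together, a Monte Carlo sketch (illustrative sizes; numpy assumed) of the expectation of the right-hand side of (8), which should be close to $d = 4$ for $p = 6$, $m = 2$:

```python
import numpy as np

rng = np.random.default_rng(6)
p, m, n, reps = 6, 2, 100, 2000

Lam = rng.normal(size=(p, m))
Sigma = Lam @ Lam.T + np.diag(rng.uniform(0.5, 1.5, size=p))
Si = np.linalg.inv(Sigma)
Gamma = Si - Si @ Lam @ np.linalg.solve(Lam.T @ Si @ Lam, Lam.T @ Si)  # (9)
Omega = Gamma * Gamma
C = np.linalg.cholesky(Sigma)

vals = []
for _ in range(reps):
    X = rng.normal(size=(n + 1, p)) @ C.T
    U = np.cov(X, rowvar=False) - Sigma
    psi_t = np.linalg.solve(Omega, np.diag(Gamma @ U @ Gamma))   # (13), cf. (16)
    t1 = np.trace(U @ Gamma @ U @ Gamma)
    t2 = np.trace(np.diag(psi_t) @ Gamma @ np.diag(psi_t) @ Gamma)
    vals.append(n / 2 * (t1 - t2))                               # (8)

d = ((p - m) ** 2 - (p + m)) // 2
print(np.mean(vals), d)                      # mean approx d = 4
```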