Tau-Proportionate Form of Congeneric Covariance Structure

From the viewpoint of constructing a composite possessing the largest possible common covariability with all of its constituent tests, the concept of the best equi-covariable composite (BEC) is introduced. When the covariance matrix of the constituent tests is so structured that its row (column) totals are all equal, the corresponding BEC is shown to reduce to the simple aggregated (or averaged) test. Within the purview of congeneric tests, it is shown that there exists a special form, termed the tau-proportionate congeneric form, for which the simple aggregated (or averaged) test is not only the BEC but also the most reliable composite. Statistical techniques for parameter estimation, testing and goodness-of-fit of such a structure are considered with the aid of an illustrative example.


Introduction
In classical test theory, it is assumed that the score obtained by an examinee on a test is the sum of a true score and an error score, the two being uncorrelated. For a set of $p$ tests purportedly measuring the same trait, the scores $x_1, x_2, \ldots, x_p$ with corresponding true scores $\tau_1, \tau_2, \ldots, \tau_p$ are said to be congeneric if every pair of true scores $\tau_g$ and $\tau_h$ has unit correlation. Such a set of observed test scores is represented in vector form as (Jöreskog, 1971, Sec. 2)

$$X = \mu + \beta\tau + e, \tag{1}$$

where $X = (x_1, x_2, \ldots, x_p)'$ is the vector of $p$ scores, $\beta = (\beta_1, \beta_2, \ldots, \beta_p)'$ is a vector of regression coefficients, $e = (e_1, e_2, \ldots, e_p)'$ is the vector of error scores, and $\tau$ is a random variable scaled, without any loss of generality, to zero mean and unit variance, providing $\beta_g\tau$ as the true score for any typical test $g$. The error scores $e_1, e_2, \ldots, e_p$ are assumed to be random variables having zero expectations and zero correlations with $\tau$ as well as among themselves. They may have different variances, say $\psi_{11}, \psi_{22}, \ldots, \psi_{pp}$. Thus, $\tau$ and the components of $X$ and $e$ are all regarded as random variables for a population of examinees (Zimmerman, 1975). The $p$-component vector $\mu$ is the mean score vector of $X$.
Typically, for any two tests $g$ and $h$, $\tau_g = \beta_g\tau$, $\tau_h = \beta_h\tau$, $v(\tau_g) = \beta_g^2 = \mathrm{cov}(x_g, \tau_g)$, $v(\tau_h) = \beta_h^2 = \mathrm{cov}(x_h, \tau_h)$ and $\mathrm{cov}(\tau_g, \tau_h) = \beta_g\beta_h$, implying $\mathrm{corr}(\tau_g, \tau_h) = 1$ (as assumed for congenericity of tests). The covariance matrix $\Sigma = ((\sigma_{gh}))_{p \times p}$ under the congeneric model (1) is given by

$$\Sigma = \beta\beta' + \psi, \tag{2}$$

where $\psi = \mathrm{diag}(\psi_{11}, \psi_{22}, \ldots, \psi_{pp})$. The precision, or more clearly the reproducible capacity, of the test $g$ is quantified by the reliability coefficient ($\rho_{gg}$), defined as the proportion of the observed score variance 'accounted for' by the true score variance (Miller, 1995):

$$\rho_{gg} = \frac{\beta_g^2}{\beta_g^2 + \psi_{gg}}. \tag{3}$$

As the true score variance remains unknown, $\rho_{gg}$ is obtained by constructing another test "parallel" to the test $g$ (say, test $h$), administering both to the same group of individuals and correlating the two series of scores, since the correlation between $x_g$ and $x_h$ then has the same expression as (3). The notion of parallel forms of a test aims at "making no difference which is used". The statistical criterion of parallelity is to have "equality of means, equality of variances and equality of covariances of the tests" (cf. Gulliksen, 1950, Ch. 14). Parallel tests can be used interchangeably.
They have equal reliabilities and equal validities in predicting a given criterion. The requirement of equal means will not be considered in this paper. The covariance matrix ($\Sigma$) arising from a group of $p$ parallel forms of the typical test $g$ would take the form

$$\Sigma = \beta_0^2 E_p + \psi_{00} I_p. \tag{4}$$

Such a structure (4) is otherwise called the "uniform" or "intraclass" structure (ICS) (Geisser, 1964; Srivastava, 1965). Here, $V(x_g) = \beta_0^2 + \psi_{00}$ and $\mathrm{Cov}(x_g, x_h) = \beta_0^2$, $g \neq h$. $E_p$ is the $(p \times p)$ matrix with all elements unity and $I_p$ is the identity matrix of order $p$. The matrix (4) is positive definite and involves only two parameters, $\beta_0^2$ and $\psi_{00}$. The reliability of the test $g$ under parallel forms of tests (PFT) is thus

$$\rho_{gg} = \frac{\beta_0^2}{\beta_0^2 + \psi_{00}}. \tag{5}$$

The congeneric model (1) may be looked upon as equivalent to the factor analytic model arising from Spearman's unifactor theory. The true score ($\tau$) is treated as an unknown latent variable ($f$), called the "common" factor (trait) and assumed to be a random variable. The vector of error scores ($e$) has components called "unique" or "specific" factors: further random variables, each associated with an individual test score but uncorrelated with $f$. The coefficients $\beta_1, \beta_2, \ldots, \beta_p$ are then called "factor loadings", measuring the importance of $f$ in explaining the test scores. Thus the unifactor model is

$$X = \mu + \beta f + e,$$

with assumptions on $f$ and $e$ analogous to those stated for (1), yielding the same covariance matrix $\Sigma$ as expressed by (2). The reliability of any test $g$ as defined in (3) is then interpreted as the common factor's contribution to the test variance, termed the communality of the test $g$ (Jöreskog, 1971, Sec. 2).
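As a minimal sketch of the congeneric structure (2) and the reliability (3), the following Python lines build $\Sigma = \beta\beta' + \psi$ and the test reliabilities from a set of loadings and error variances. The numerical values of `beta` and `psi`, and the function names, are hypothetical choices made only for illustration.

```python
# Illustrative (hypothetical) loadings and error variances for p = 3 tests.
beta = [1.0, 1.5, 2.0]
psi = [0.5, 1.0, 1.5]


def congeneric_cov(beta, psi):
    """Covariance matrix Sigma = beta beta' + diag(psi) of congeneric tests."""
    p = len(beta)
    return [[beta[g] * beta[h] + (psi[g] if g == h else 0.0)
             for h in range(p)] for g in range(p)]


def reliabilities(beta, psi):
    """rho_gg = beta_g^2 / (beta_g^2 + psi_gg) for every test g."""
    return [b * b / (b * b + s) for b, s in zip(beta, psi)]


sigma = congeneric_cov(beta, psi)
rho = reliabilities(beta, psi)
print(sigma[0])  # first row of Sigma
print(rho)       # reliabilities of the three tests
```

Setting all loadings equal and all error variances equal in this sketch gives the intraclass structure (4), with a common reliability for every test.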
As the reliability of a test cannot be determined from a single test administration, its determination requires the use of a parallel form. More often than not, parallel forms are not available. In such situations, a common practice is to obtain indirect information by calculating a lower bound of the reliability, with the implication that when the lower bound is high enough, the reliability is adequate (Ten Berge & Socan, 2004). Keeping in view the unifactor theory, when a battery of tests (items) focuses on a single idea or construct (factor), instead of assessing the reliability of a single test, the quality of the composite formed by the tests of the battery is measured by what is called "internal consistency reliability". Cronbach's (1951) alpha coefficient is generally used for this purpose, indicating indirectly the degree of interrelatedness among the tests (items) measuring different substantive areas within a single construct (unifactor) (Cortina, 1993; Zinbarg, Yovel, Revelle, & McDonald, 2006).
When, instead of a single test, a number of tests are administered to a group of students, the main problem is to find an appropriate method of combining their scores into a linear composite. The weights used in constructing a linear composite depend upon the type of judgment or criterion fixed beforehand. As a common practice, however, the simple aggregate or average of the test scores (AVT) is used as a composite possessing two simple properties. One, it attaches no special weight to any particular test among the $p$ tests. Two, it does not take into account the structure of covariability among the $p$ tests.
Optimum properties of the AVT, such as minimum variance with unbiasedness, are found to hold so long as the tests are parallel forms (PFT). As the associated covariance matrix under PFT has the intraclass covariance structure (ICS), the variance of the AVT is equally affected no matter which one of the $p$ constituent tests is removed from the composite.
The tests belonging to a battery are most often combined into a suitably weighted composite with the aim of increasing reliability. The AVT under the PFT set-up is the "most reliable" composite (MRC), attaining the reliability

$$\rho_{\bar{x}\bar{x}} = \frac{p\beta_0^2}{p\beta_0^2 + \psi_{00}}, \tag{9}$$

while under the congeneric model (2),

$$\rho_{\bar{x}\bar{x}} = \frac{p\bar{\beta}^2}{p\bar{\beta}^2 + \bar{\psi}}, \tag{10}$$

where $\bar{\beta} = \sum_g \beta_g/p$ and $\bar{\psi} = \sum_g \psi_{gg}/p$. The present investigation aims at identifying the situation in which the AVT ($\bar{x}$) may be claimed to be the "most reliable" composite amongst all linear composites.
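The reliability of the averaged test under the congeneric model can be checked numerically. The sketch below, using hypothetical loadings and error variances, compares the closed form $p\bar{\beta}^2/(p\bar{\beta}^2 + \bar{\psi})$ with the general ratio of true-score variance to total variance for a composite with equal weights $1/p$. The function names are illustrative only.

```python
def composite_reliability(gamma, beta, psi):
    """Reliability of y = gamma'X when Sigma = beta beta' + diag(psi)."""
    true_var = sum(g * b for g, b in zip(gamma, beta)) ** 2
    error_var = sum(g * g * s for g, s in zip(gamma, psi))
    return true_var / (true_var + error_var)


def avt_reliability(beta, psi):
    """Closed form p*bbar^2 / (p*bbar^2 + psibar) for the averaged test."""
    p = len(beta)
    beta_bar = sum(beta) / p
    psi_bar = sum(psi) / p
    return p * beta_bar ** 2 / (p * beta_bar ** 2 + psi_bar)


beta = [1.0, 1.5, 2.0]   # hypothetical loadings
psi = [0.5, 1.0, 1.5]    # hypothetical error variances
equal_weights = [1.0 / len(beta)] * len(beta)
print(avt_reliability(beta, psi))
print(composite_reliability(equal_weights, beta, psi))
```

The two printed values agree, since the averaged test is simply the composite with all weights equal to $1/p$.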
In the present investigation, the concept of equi-covariability, together with that of the best equi-covariable composite (BEC), is introduced. It is established that the AVT is the BEC when (2) is restricted to equal row (column) sums (ERS). Another restricted congeneric model, termed the tau-proportionate congeneric model, is developed. The AVT under such a model is shown to be not only the most reliable composite (MRC) but also the best equi-covariable one (BEC). The performance of the internal consistency reliability as measured by alpha is compared with that of the reliability of the MRC within the purview of ERS congeneric tests. A statistical test is put forward to check for an ERS $\Sigma$ from a sample data set. The maximum likelihood estimation of the parameters involved in the tau-proportionate congeneric structure of $\Sigma$ is considered along with some related results. An illustrative example, based on real data, is put forward showing the compatibility of the tau-proportionate congeneric structure, followed finally by some discussion.

Equi-covariable Composite
Definition 1. A linear composite $y = \gamma'X$ is said to have the property of equi-covariability with respect to a random vector $X$ if $y$ has the same covariance with each component of $X$.

Construction: The coefficient vector $\gamma$ of $y = \gamma'X$ is the solution to the system of linear equations

$$\Sigma\gamma = cJ,$$

where the scalar $c$ is the common covariance of $y$ with the variables, $\Sigma$ is the covariance matrix of the random vector $X$ and $J$ is the $p$-component vector of ones. Under the assumption of positive definiteness of $\Sigma$, $\gamma = c\Sigma^{-1}J$. A class of equi-covariable composites may be demarcated for various choices of the scalar $c\,(>0)$. The "best" equi-covariable composite (BEC) is the member of this class indicating the largest covariability, which happens when $c$ is chosen as the variance of $y$. On fixing $c = \mathrm{Var}[(c\Sigma^{-1}J)'X] = c^2 J'\Sigma^{-1}J$, the optimum value of $c$ is $(J'\Sigma^{-1}J)^{-1}$, providing the expression of the BEC as

$$y_0 = \frac{J'\Sigma^{-1}X}{J'\Sigma^{-1}J}. \tag{11}$$

Once the entries of $\Sigma$ are so structured that its row (or column) sums are all equal (ERS), $\Sigma J \propto J$ as well as $\Sigma^{-1}J \propto J$, so that the BEC ($y_0$) reduces simply to $\bar{x}$. Thus, the averaged test is the BEC if and only if $\Sigma$ is ERS. For instance, for an ICS $\Sigma$ (vide (4)), the corresponding BEC is none other than $\bar{x}$.
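The construction can be illustrated by a short Python sketch. It solves $\Sigma u = J$ by elimination and normalises $u$ to obtain the BEC weights $\Sigma^{-1}J/(J'\Sigma^{-1}J)$, then checks that the composite has the same covariance with every test. The helper names `solve` and `bec_weights` are assumptions of this sketch; the matrix `S` is the sample covariance matrix of Table 1, used here in place of $\Sigma$ purely for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def bec_weights(sigma):
    """Weights Sigma^{-1} J / (J' Sigma^{-1} J) of the BEC."""
    u = solve(sigma, [1.0] * len(sigma))   # u = Sigma^{-1} J
    total = sum(u)                         # J' Sigma^{-1} J
    return [x / total for x in u]


# Sample covariance matrix of Table 1 (Bock's vocabulary data).
S = [[3.5654, 3.3082, 3.5503, 2.8523],
     [3.3082, 4.7980, 3.6659, 3.1248],
     [3.5503, 3.6659, 4.7035, 3.3886],
     [2.8523, 3.1248, 3.3886, 3.7075]]

w = bec_weights(S)
cov_with_tests = [sum(S[g][h] * w[h] for h in range(4)) for g in range(4)]
print(w)               # weights sum to one
print(cov_with_tests)  # one common value, equal to the variance of the BEC
```

For an intraclass matrix the same function returns equal weights $1/p$, in line with the reduction of the BEC to the averaged test.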
It appears that for a congeneric ERS covariance matrix,

$$\beta = \frac{(T_0 I_p - \psi)J}{\sqrt{p(T_0 - \bar{\psi})}},$$

where $\bar{\psi}$ = the average of $\psi_{11}, \ldots, \psi_{pp}$ and $T_0$ = the common value of the row or column totals. More explicitly,

$$\Sigma = \frac{(T_0 I_p - \psi)JJ'(T_0 I_p - \psi)}{p(T_0 - \bar{\psi})} + \psi, \tag{13}$$

involving only $(p + 1)$ parameters. It is interesting to note further that if the test score variances (the $\sigma_{gg}$'s, the diagonal elements of $\Sigma$) along with $T_0$ are numerically specified, the $\psi_{gg}$'s can be determined by a convergent iterative scheme [vide Appendix]. For such a congeneric model (13), the BEC (11) simplifies to the averaged test ($\bar{x}$).
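A small numerical check may help here. Under the congeneric form $\Sigma = \beta\beta' + \psi$, the ERS requirement $\Sigma J = T_0 J$ forces $\beta_g = (T_0 - \psi_{gg})/\sqrt{p(T_0 - \bar{\psi})}$. The sketch below builds such a matrix from a hypothetical $T_0$ and hypothetical error variances and confirms that every row total equals $T_0$. The function name and the input values are assumptions of the sketch.

```python
import math


def ers_congeneric(t0, psi):
    """Loadings and covariance matrix of ERS congeneric tests.

    t0  : common row total (assumed to exceed every psi_gg)
    psi : error variances psi_11, ..., psi_pp
    """
    p = len(psi)
    psi_bar = sum(psi) / p
    scale = math.sqrt(p * (t0 - psi_bar))
    beta = [(t0 - s) / scale for s in psi]
    sigma = [[beta[g] * beta[h] + (psi[g] if g == h else 0.0)
              for h in range(p)] for g in range(p)]
    return beta, sigma


beta, sigma = ers_congeneric(10.0, [1.0, 2.0, 3.0])  # hypothetical values
print([sum(row) for row in sigma])  # every row total should equal t0
```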
Note: In the theory of statistical estimation, where a number of competing estimators estimate the same parameter unbiasedly, their optimum unbiased linear combination having the least variance is termed the best linear unbiased estimator (BLUE). The expression (11) is identical to the BLUE when all the test means are equal.

Reliability of BEC under Congeneric Set-up
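Under the congeneric set-up, the variance and the reliability of any typical (linear) composite $y = \gamma'X$ are given by

$$\mathrm{Var}(y) = (\gamma'\beta)^2 + \gamma'\psi\gamma, \qquad \rho_{yy} = \frac{(\gamma'\beta)^2}{(\gamma'\beta)^2 + \gamma'\psi\gamma}.$$

As the ratio $(\gamma'\beta)^2/\gamma'\psi\gamma$ attains its maximum when $\gamma \propto \psi^{-1}\beta$, the "most reliable" composite (MRC) is expressible as $y_m = k(\psi^{-1}\beta)'X$, where $k$ is a positive constant. Consequently,

$$\mathrm{Var}(y_m) = k^2\,\beta'\psi^{-1}\beta\,(1 + \beta'\psi^{-1}\beta)$$

and

$$\rho_{y_m y_m} = \frac{\beta'\psi^{-1}\beta}{1 + \beta'\psi^{-1}\beta}. \tag{15}$$

In the context of the equi-covariability criterion, the best equi-covariable composite (BEC) $y_0$, as expressed in (11), would have

$$\mathrm{Var}(y_0) = (J'\Sigma^{-1}J)^{-1}$$

and

$$\rho_{y_0 y_0} = \frac{(J'\Sigma^{-1}\beta)^2}{J'\Sigma^{-1}J}.$$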
Clearly, $\rho_{y_0 y_0} \le \rho_{y_m y_m}$. The equality holds if and only if $\beta \propto J$, i.e., iff the $\beta_g$'s are all equal (say, $\beta_0$), in which case the corresponding composite is the most-reliable-cum-best equi-covariable composite (MRBEC), expressible as

$$\frac{J'\psi^{-1}X}{J'\psi^{-1}J}, \tag{17}$$

with reliability $\beta_0^2 J'\psi^{-1}J/(1 + \beta_0^2 J'\psi^{-1}J)$, where $\beta_0^2$ signifies the common true score variance of each of the $p$ tests, indicating a special situation called essentially tau-equivalent congeneric tests in the sense of Lord & Novick (1968). Such a composite (17) is obtained when each component of $X$ is given a weight inversely proportional to its error variance ($\psi_{gg}$), these variances being usually different (Jöreskog, 1971, Sec. 2). Now, as a natural query, what situation could make the AVT ($\bar{x}$) the MRC? On equating the reliability formulae corresponding to $\bar{x}$ and $y_m$ (vide (10) and (15)), one finds that $\beta$ must be proportional to $\psi J$, i.e.,

$$\beta_g = \sqrt{k_0}\,\psi_{gg}, \qquad g = 1, 2, \ldots, p,$$

for some constant $k_0 > 0$.
Consequently, the covariance matrix would have the following structure:

$$\Sigma = k_0\,\psi JJ'\psi + \psi. \tag{19}$$

Thus the structure (19) stems from congeneric tests having the key feature that the covariance of any constituent test ($x_g$) with the averaged test ($\bar{x}$) is proportional to its error variance ($\psi_{gg}$). Such tests may be termed "tau-proportionate" congeneric tests.
As the reliability ($\rho_{gg}$) of the $g$th test is $k_0\psi_{gg}/(1 + k_0\psi_{gg})$, the quantity $\rho_{gg}/\{(1-\rho_{gg})\psi_{gg}\}$ (being equal to $k_0$) remains invariant across the tests belonging to a set of tau-proportionate congeneric tests.
If, additionally, (19) possesses the ERS feature, the corresponding MRC averaged test is lifted to the MRBEC averaged test. Clearly, the MRBEC averaged test is available when $\psi^{-1}J \propto J$ (vide (17)), in which case (19) reduces to an ICS structure, indicating the parallelity of the congeneric tests.
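The defining features of tau-proportionate tests can be verified numerically. The sketch below assumes loadings of the form $\beta_g = \sqrt{k_0}\,\psi_{gg}$, with hypothetical values for $k_0$ and the error variances, and checks three things: the MRC weights $\psi^{-1}\beta$ are constant, the averaged test attains the MRC reliability, and $\rho_{gg}/\{(1-\rho_{gg})\psi_{gg}\}$ equals $k_0$ for every test.

```python
import math

k0 = 0.4                      # hypothetical proportionality constant
psi = [0.5, 1.0, 2.0]         # hypothetical error variances
p = len(psi)

# Tau-proportionate loadings: beta_g = sqrt(k0) * psi_gg.
beta = [math.sqrt(k0) * s for s in psi]

# MRC weights are proportional to psi^{-1} beta, here a constant vector.
mrc_weights = [b / s for b, s in zip(beta, psi)]

# Reliability of the MRC: theta / (1 + theta), theta = beta' psi^{-1} beta.
theta = sum(b * b / s for b, s in zip(beta, psi))
rho_mrc = theta / (1.0 + theta)

# Reliability of the averaged test under the congeneric model.
beta_bar, psi_bar = sum(beta) / p, sum(psi) / p
rho_avt = p * beta_bar ** 2 / (p * beta_bar ** 2 + psi_bar)

# rho_gg / ((1 - rho_gg) * psi_gg) should equal k0 for every test.
rho = [b * b / (b * b + s) for b, s in zip(beta, psi)]
invariant = [r / ((1.0 - r) * s) for r, s in zip(rho, psi)]

print(rho_mrc, rho_avt)
print(invariant)
```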

Internal Consistency Reliability and MRC under ERS Congeneric Set-up
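Recalling Cronbach's alpha (vide (8)) as a measure of internal consistency reliability,

$$\alpha = \frac{pb}{a + (p-1)b},$$

where $a$ = the average of the variances and $b$ = the average of the covariances.
In the case of ERS congeneric tests (vide (13)), $a$ and $b$ take the following expressions:

$$a = \bar{\psi} + \frac{T_0 - \bar{\psi}}{p} + \frac{s_\psi^2}{p(T_0 - \bar{\psi})}, \qquad b = \frac{T_0 - a}{p - 1},$$

so that

$$\alpha = 1 - \frac{\bar{\psi}}{T_0} - \frac{s_\psi^2}{(p-1)\,T_0\,(T_0 - \bar{\psi})}, \tag{23}$$

where $s_\psi^2 = \frac{1}{p}\sum_{g}(\psi_{gg} - \bar{\psi})^2$ is the variance of the $\psi_{gg}$'s. Clearly, with more variation among the $\psi_{gg}$'s, $a$ is expected to be larger while $b$ and, in effect, $\alpha$ are expected to be smaller.
In view of the maximum possible reliability of any composite (MRC) under ERS congeneric tests, let us recall (15) and obtain

$$\rho_{y_m y_m} = 1 - \frac{\bar{\psi}}{T_0} + \frac{\bar{\psi} - \psi_h}{T_0 - \psi_h}, \tag{24}$$

where $\psi_h$ denotes $p/\sum_{g=1}^{p}(1/\psi_{gg})$, the harmonic mean of the $\psi_{gg}$'s. It is to be noted that $\bar{\psi} \ge \psi_h$ and that the greater the variation among the $\psi_{gg}$'s, the larger the quantities $\bar{\psi} - \psi_h$ and $s_\psi^2$, producing a larger $\rho_{y_m y_m}$ but a smaller $\alpha$ owing to the second terms of (24) and (23) respectively. Thus, for a fixed value of $\bar{\psi}$, $\alpha$, as a lower-bound indicator of reliability, never exceeds $1 - \bar{\psi}/T_0$, while $\rho_{y_m y_m}$ of the related MRC is at least $1 - \bar{\psi}/T_0$. Clearly, their equality is possible when the $\psi_{gg}$'s are all equal, implying trivially an ICS structure of $\Sigma$.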
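The ordering of alpha, the quantity $1 - \bar{\psi}/T_0$ and the MRC reliability can be checked on a small example. The sketch below assumes ERS congeneric loadings $\beta_g = (T_0 - \psi_{gg})/\sqrt{p(T_0 - \bar{\psi})}$ and computes alpha directly as $\frac{p}{p-1}\{1 - \mathrm{tr}\,\Sigma/(J'\Sigma J)\}$. The input values and the function name are hypothetical.

```python
import math


def ers_alpha_and_mrc(t0, psi):
    """Cronbach's alpha, the bound 1 - psibar/T0, and the MRC reliability."""
    p = len(psi)
    psi_bar = sum(psi) / p
    scale = math.sqrt(p * (t0 - psi_bar))
    beta = [(t0 - s) / scale for s in psi]           # ERS congeneric loadings
    trace = sum(b * b + s for b, s in zip(beta, psi))
    total = sum(beta) ** 2 + sum(psi)                # J' Sigma J (= p * t0)
    alpha = p / (p - 1) * (1.0 - trace / total)
    theta = sum(b * b / s for b, s in zip(beta, psi))
    rho_mrc = theta / (1.0 + theta)
    return alpha, 1.0 - psi_bar / t0, rho_mrc


alpha, bound, rho_mrc = ers_alpha_and_mrc(10.0, [1.0, 2.0, 3.0])
print(alpha, bound, rho_mrc)   # alpha <= bound <= rho_mrc
```

With equal error variances, for example `ers_alpha_and_mrc(10.0, [2.0, 2.0, 2.0])`, the three quantities coincide, which corresponds to the intraclass case.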

A Statistical Test for the Tenability of ERS Σ
A statistical test procedure may now be put forward to test the tenability of the ERS structure of $\Sigma$ under the assumption of multinormality of the score vector $X\,(p \times 1)$ comprising the $p$ scores obtained by an individual on $p$ congeneric forms. Let $S$ be the sample covariance matrix based on the $(p \times n)$ data set comprising the score vectors of $n$ individuals.

As $J/\sqrt{p}$ is necessarily an eigenvector of any ERS matrix, the tenability of the ERS structure of $\Sigma$ may be assured equivalently by the tenability of $J/\sqrt{p}$ as an eigenvector of $\Sigma$. Following the testing procedure of Mallows (1961), an F-statistic is computed by the expression

$$F = \frac{n-p}{p-1}\left[\frac{1}{p^2}\Big(\sum_{g}\sum_{h} s_{gh}\Big)\Big(\sum_{g}\sum_{h} s^{gh}\Big) - 1\right],$$

where $s_{gh}$ and $s^{gh}$ are the typical elements of $S$ and $S^{-1}$ respectively. If the computed value of $F$ is less than $F_\alpha(p-1, n-p)$, the tenability of an ERS $\Sigma$ is asserted at the $100\alpha\%$ level of significance.
Consequently, the common row total ($T_0$) of an ERS $\Sigma$ may be estimated by noticing that the variable $(J/\sqrt{p})'X$ is distributed as univariate normal with variance $T_0$. Considering the joint distribution of $n$ such variables in respect of $n$ individuals, the maximum likelihood estimate (MLE) of $T_0$ is obtained as

$$\hat{T}_0 = \frac{p}{n}\sum_{i=1}^{n}(\bar{x}_i - \bar{x})^2,$$

where $\bar{x}_i$ is the average score obtained by the $i$th individual while $\bar{x}$ is the overall average score $\big(\frac{1}{n}\sum_{i=1}^{n}\bar{x}_i\big)$. Noticeably, the least squares estimate of $T_0$ based on the $p$ row totals of $S$ is $\frac{1}{p}\sum_g\sum_h s_{gh}$, which reduces to $\hat{T}_0$.
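The test and the estimate of $T_0$ can be sketched in a few lines of Python. The statistic is taken here in the form $\frac{n-p}{p-1}\{(J'SJ)(J'S^{-1}J)/p^2 - 1\}$, which is an assumption of this sketch about the exact form of the Mallows statistic. Applied to the covariance matrix of Table 1 with $n = 64$, it should land close to the value 2.2184 quoted in the illustrative example. The helper names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mallows_f(s, n):
    """F statistic for J/sqrt(p) being an eigenvector of Sigma (assumed form)."""
    p = len(s)
    total = sum(sum(row) for row in s)        # J' S J
    total_inv = sum(solve(s, [1.0] * p))      # J' S^{-1} J
    return (n - p) / (p - 1) * (total * total_inv / p ** 2 - 1.0)


# Sample covariance matrix of Table 1 (Bock's vocabulary data), n = 64.
S = [[3.5654, 3.3082, 3.5503, 2.8523],
     [3.3082, 4.7980, 3.6659, 3.1248],
     [3.5503, 3.6659, 4.7035, 3.3886],
     [2.8523, 3.1248, 3.3886, 3.7075]]

f_stat = mallows_f(S, 64)
t0_hat = sum(sum(row) for row in S) / len(S)   # (1/p) * sum of all s_gh
print(f_stat)   # to be compared with F_alpha(p - 1, n - p)
print(t0_hat)
```

Because the product $(J'SJ)(J'S^{-1}J)$ is unchanged when $S$ is rescaled, the statistic does not depend on whether $S$ is computed with divisor $n$ or $n - 1$.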

ML Estimation of Tau-proportionate Congeneric Structure and Some Related Results
Under the usual multinormality assumptions, the maximum likelihood estimates (MLE) of the $(p + 1)$ parameters, viz. $k_0$ and $\psi_{11}, \ldots, \psi_{pp}$ as involved in (19), are obtained by solving the likelihood equations (27) and (28), where $S$ is the sample covariance matrix with divisor $n$, the sample size. Equations (27) and (28) are reducible to the implicit equations (29) and (30), in which $c_g$ is the $g$th column total of $S$. Starting with suitable trial solutions for $k_0$ and the $\psi_{gg}$'s, (29) and (30) may be solved by iteration. Keeping the $\wedge$ mark over any parameter as a generic notation to indicate its MLE, some straightforward results are furnished below.
iii) α
It may be noted that the above results are rather trivial when $\hat{\Sigma} = S$, which happens for a non-structured $\Sigma$. But in the case of a structured $\Sigma$, as the number of independent parameters is less than the number of non-duplicated elements, $\hat{\Sigma}$ would no longer be equal to $S$. As such, the findings furnished above are quite structure-dependent. One or more of them may well hold for other structures also.
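A possible numerical route to the estimates is sketched below. It is a derivation made for this sketch under the structure $\Sigma = k_0\psi JJ'\psi + \psi$ and multinormality, and is not necessarily the form of the likelihood equations referred to above. Using $|\Sigma| = |\psi|(1 + k_0 t)$ and $\Sigma^{-1} = \psi^{-1} - k_0 JJ'/(1 + k_0 t)$ with $t = \sum_g \psi_{gg}$, the stationarity conditions appear to reduce to $k_0\psi_{gg}^2 + \psi_{gg} = s_{gg}$ for each $g$ and $k_0 t^2 + t = \sum_g\sum_h s_{gh}$. The code solves these by bisection on $k_0$; the function name is hypothetical.

```python
import math


def fit_tau_proportionate(s, iters=200):
    """Stationary point of the normal likelihood for Sigma = k0*psi*J*J'*psi + psi."""
    p = len(s)
    diag = [s[g][g] for g in range(p)]
    total = sum(sum(row) for row in s)
    if total <= sum(diag):
        raise ValueError("needs a positive sum of covariances")

    def psi_given(k):
        # Positive root of k*psi^2 + psi - s_gg = 0, in a numerically stable form.
        return [2.0 * d / (1.0 + math.sqrt(1.0 + 4.0 * k * d)) for d in diag]

    def gap(k):
        t = sum(psi_given(k))
        return k * t * t + t - total

    lo, hi = 0.0, 1.0
    while gap(hi) < 0.0:          # expand the bracket until the sign changes
        hi *= 2.0
        if hi > 1e12:
            raise ValueError("no root found; structure may not suit the data")
    for _ in range(iters):        # plain bisection on k0
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k0 = 0.5 * (lo + hi)
    return k0, psi_given(k0)


# Sample covariance matrix of Table 1 (Bock's vocabulary data).
S = [[3.5654, 3.3082, 3.5503, 2.8523],
     [3.3082, 4.7980, 3.6659, 3.1248],
     [3.5503, 3.6659, 4.7035, 3.3886],
     [2.8523, 3.1248, 3.3886, 3.7075]]

k0_hat, psi_hat = fit_tau_proportionate(S)
sigma_hat = [[k0_hat * psi_hat[g] * psi_hat[h] + (psi_hat[g] if g == h else 0.0)
              for h in range(4)] for g in range(4)]
print(k0_hat, psi_hat)
```

Under these conditions the fitted matrix reproduces the diagonal of $S$ and the grand total of its elements, so Cronbach's alpha computed from the fitted matrix coincides with alpha computed from $S$.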
In practice, the sample covariance matrix (6) is used in the computation instead of its population counterpart, which is essentially never available. It is well known that a model-based estimated covariance matrix, such as $\hat{\Sigma}$, can be more efficient than $S$ (Bentler, 2009).

Analysis of Bock's Vocabulary Data
To study the evolution of the vocabulary of children, the relevant data were drawn by Bock (1975) from the test results on file in the Records Office of the Laboratory School of the University of Chicago. They consist of scores, obtained from a cohort of pupils from the 8th through 11th grade levels, on alternative forms of the vocabulary section of the Co-operative Reading test.
For a sample of $n = 64$ pupils, the scaled scores (after a suitable change of origin and unit) are shown in the form of a $(64 \times 4)$ array in Härdle and Hlávka (2007). Table 1 shows the computed covariance matrix ($S$). As the scores under consideration are concerned with measurements of the same trait (vocabulary) through various grade levels, the present analysis aims at studying the nature of congenericity of the tests.
Table 1. Covariance matrix of Bock's vocabulary data

        X1       X2       X3       X4
X1    3.5654
X2    3.3082   4.7980
X3    3.5503   3.6659   4.7035
X4    2.8523   3.1248   3.3886   3.7075

Recalling the Mallows test (vide Section 5), the computed F statistic has the value 2.2184, which is less than $F_{.05}(3, 60)$ = 2.7580, leading to acceptance (tenability) of the ERS general structure of the covariance matrix of Bock's vocabulary data. However, for a rigorous study of the ERS structure in a congeneric frame of reference, we consider various hypothesized structures of $\Sigma$ in connection with congenericity, e.g., intraclass (ICS) reflecting the parallelity of the tests, tau-equivalent congeneric, tau-proportionate congeneric and equal-row-sum congeneric (ERS congeneric), selected as the variants of congenericity. It may be noted that the ERS congeneric structure is also a special case of the equal-row-sum structure of the covariance matrix (ERS general, vide Sec. 2). In Table 2, these structures are arranged in increasing order of the number of parameters involved. The parameters are all estimated by the method of maximum likelihood (ML) under the assumption of multinormality of the data. Most of the likelihood equations are not explicitly solvable. However, by an iterative procedure (vide Appendix), convergent ML solutions are obtained. For testing the tenability of the hypothesized structures, the conventional likelihood ratio criterion provides the test statistic $(n-1)\log(|\hat{\Sigma}|/|S|)$, distributed asymptotically as a chi-square with appropriate degrees of freedom (d.f.) ($\nu$), $\hat{\Sigma}$ and $S$ being respectively the MLE of $\Sigma$ and the sample covariance matrix. Columns 3, 4 and 5 of Table 2 show respectively the computed chi-square, d.f. and P-value. On comparing the P-values with the 5% level of significance, it is clear that, except for the intraclass structure (ICS), all the structures are tenable.
In order to critically judge the competency of the structures, the assessment of their goodness-of-fit is now to be used to supplement the chi-square test (Hu & Bentler, 1998).
A high goodness-of-fit index value may be an encouraging sign that the structure is useful even when it fails to fit exactly on statistical grounds and/or when several competing structures are tenable. Among many, the most widely used goodness-of-fit indices, proposed by Jöreskog and Sörbom (1981), are computed by the following formulae:

$$\mathrm{GFI} = 1 - \frac{\mathrm{tr}\big[(\hat{\Sigma}^{-1}S - I_p)^2\big]}{\mathrm{tr}\big[(\hat{\Sigma}^{-1}S)^2\big]}$$

and, adjusting for the degrees of freedom ($\nu$) of the hypothesized structure, the modified version of GFI,

$$\mathrm{AGFI} = 1 - \frac{p(p+1)}{2\nu}\,(1 - \mathrm{GFI}).$$

According to a recommended rule of thumb (Hu & Bentler, 1998), the computed GFI and AGFI both exceed the "cut-off" point 0.90 (vide Table 2) for only two hypothesized structures, viz. congeneric and tau-proportionate congeneric, indicating a closer fit than that of the other structures.
We further consider the measures of goodness-of-fit based on the deviations between the elements of $\hat{\Sigma}$ and $S$. Those are the Average Deviation (AD) by Werts, Pike, Rock and Grandy (1981), the Root-mean-square Residual (unstandardized) (RMR) by Jöreskog and Sörbom (1981) and the Standardized Root-mean-square Residual (SRMR) by Bentler (1995), having the following expressions.
where $\hat{\sigma}_{ij}$ and $s_{ij}$ are the $(i, j)$ elements of the MLE $\hat{\Sigma}$ and of $S$ respectively. The corresponding computed figures are shown in the last three columns of Table 2, among which the lowest values lie along the rows of the congeneric and tau-proportionate congeneric structures. In particular, a cut-off value close to 0.08 for SRMR (as recommended by Hu & Bentler, 1998) selects both structures. Although both structures fit the data equally well, to break the tie the tau-proportionate structure is preferable, as it is more restricted and has larger degrees of freedom (Forster, 1998; Graham, 2006; Kenny, Kaniskar & McCoach, 2011). Using the ML estimates of the parameters of the hypothesized structures, the ML estimates of Cronbach's alpha, $\rho_{\bar{x}\bar{x}}$ and $\rho_{y_m y_m}$ are shown in the columns of Table 3. All the entries are recorded to five decimal places, after computing them to six places, for a side-by-side comparison. The column of alpha indicates the lower-bound estimate of the internal consistency reliability for each structure. For the tau-equivalent congeneric structure, the averaged test has reliability equal to the corresponding alpha and is thus the "least reliable composite". By contrast, for the tau-proportionate congeneric structure the situation is reversed, indicating that the averaged test is the "most reliable composite". For parallel forms of tests, providing the intraclass structure, the averaged test is trivially a unique composite, as its reliability equals both the maximum and alpha.
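The residual-based measures can be sketched as follows, taking RMR and SRMR under their usual definitions: the root of the mean squared residual over the $p(p+1)/2$ non-duplicated elements, with residuals divided by $\sqrt{s_{ii}s_{jj}}$ in the standardized version. Those definitions are an assumption of this sketch. The fitted matrix is obtained here by simply averaging the variances and the covariances of Table 1 into an intraclass form, purely for illustration; the function names are hypothetical.

```python
import math


def rmr(s, sigma_hat, standardized=False):
    """Root-mean-square residual over the lower triangle (diagonal included)."""
    p = len(s)
    acc = 0.0
    for i in range(p):
        for j in range(i + 1):
            r = s[i][j] - sigma_hat[i][j]
            if standardized:
                r /= math.sqrt(s[i][i] * s[j][j])
            acc += r * r
    return math.sqrt(2.0 * acc / (p * (p + 1)))


def intraclass_fit(s):
    """Intraclass matrix built from the average variance and average covariance."""
    p = len(s)
    var = sum(s[g][g] for g in range(p)) / p
    cov = (sum(sum(row) for row in s) - p * var) / (p * (p - 1))
    return [[var if g == h else cov for h in range(p)] for g in range(p)]


# Sample covariance matrix of Table 1 (Bock's vocabulary data).
S = [[3.5654, 3.3082, 3.5503, 2.8523],
     [3.3082, 4.7980, 3.6659, 3.1248],
     [3.5503, 3.6659, 4.7035, 3.3886],
     [2.8523, 3.1248, 3.3886, 3.7075]]

fitted = intraclass_fit(S)
print(rmr(S, fitted))                     # unstandardized RMR
print(rmr(S, fitted, standardized=True))  # SRMR
```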

Discussion
The main concern of this article is to search for a constrained congeneric test/item set for which the aggregated/averaged test (AVT) formed from the set (sometimes termed the scaled score (Kano & Azuma, 2003) or a composite with unit weights (Bentler, 2004)) would be the most reliable composite (MRC). The determination is made exclusively within the framework of classical test theory (CTT). In the case of the intraclass covariance structure arising from parallel forms of tests (PFT), Cronbach's alpha estimates exactly the reliability of the AVT, which attains the maximum possible reliability for any composite. However, in the case of the (essentially) tau-equivalent structure arising from 'weak' PFT, although the corresponding alpha estimates the reliability of the AVT exactly, it tends to underestimate the test reliability as such (Novick & Lewis, 1967). The present investigation indicates that there may exist a composite which is most reliable but different from the AVT (e.g., $y_m$ of (15)). The set of tau-proportionate congeneric tests is, of course, a set of constrained congeneric tests establishing the AVT not only as the MRC but also as the best equi-covariable composite (BEC).

Sometimes the forms of the tests are considered less restrictive than parallel forms; e.g., tau-equivalent tests as well as essentially tau-equivalent tests have equal true score variances but possibly different error variances. Thus, here the $\beta_g$'s are equal while the $\psi_{gg}$'s are different. Such tests cannot be used interchangeably. The test with the smallest error variance (smallest $\psi_{gg}$) is the most reliable test, and each test indicates a different validity in predicting a given criterion. The corresponding covariance matrix for tau-equivalent tests would have the structure

$$\Sigma = \beta_0^2 E_p + \psi,$$

where $\beta_0$ denotes the common value of the $\beta_g$'s.

Table 2. Goodness-of-fit measures of different hypothesized structures for Bock's data

Table 3. ML Estimation of Alpha and Reliability of the Averaged Test