Extended Marginal Homogeneity Model Based on Complementary Log-Log Transform for Square Tables

For square contingency tables with the same ordinal row and column classifications, McCullagh (1977) gave the marginal cumulative logistic model, which is an extension of the marginal homogeneity (MH) model using the logit transform. The present paper proposes a different extension of the MH model using the complementary log-log transform. In addition, the present paper gives a theorem that the MH model holds if and only if both the proposed model and the equality of row and column marginal means hold. In data analysis, if the MH model fits the data poorly, the theorem may be useful for identifying the reason for the poor fit. As an example, the occupational status data for British father-son pairs are analyzed.


Introduction
Consider a square contingency table with the same ordinal row and column classifications. In the data in Table 1, taken from Bishop, Fienberg & Holland (1975, p. 100), each observation is a pair consisting of a father's occupational status and his son's occupational status. For such data, statistical independence between the row and column classifications generally does not hold because observations concentrate on the main diagonal cells. Instead of independence, we are interested in whether there is a structure of symmetry in the table. For example, Stuart (1955) gave the marginal homogeneity (MH) model, which states that the row marginal distribution is identical to the column marginal distribution. It is known that the MH model can be expressed as the equality of the marginal cumulative probabilities of the row and column variables. For the data in Table 1, the MH model indicates that the probability that a father's status is i equals the probability that his son's status is i, for every category i.
In data analysis, when the MH model fits the data poorly, many statisticians may be interested in comparing the two marginal distributions of the row and column variables, say X and Y. One such analysis is inferring whether X tends to be stochastically less than Y or vice versa. We are especially interested in applying an extension of the MH model, for example, the marginal cumulative logistic (ML) model (McCullagh, 1977; Agresti, 1984, p. 205) based on the logit transform. The ML model states that one marginal distribution is a location shift of the other marginal distribution on a logistic scale. If the ML model fits the data poorly, we are then interested in another extension of the MH model based on the complementary log-log transform rather than the logit transform. Miyamoto, Niibe & Tomizawa (2005) gave the theorem that the MH model holds if and only if the ML model and the equality of row and column marginal means hold simultaneously. We refer to such a relation as a decomposition of the model (i.e., the MH model is decomposed into the ML model and the equality of row and column marginal means). See also Tahata & Tomizawa (2008) and Kurakami, Tahata & Tomizawa (2013) for decompositions of the MH model. We are interested in whether the decomposition still holds when the ML model is replaced by the proposed model. When the MH model fits the data poorly, such a decomposition may be useful for identifying the reason for the poor fit.
In this paper, Section 2 proposes a new model which is an extension of the MH model based on the complementary log-log transform. Section 3 gives the decomposition of the MH model using the proposed model. Section 4 describes the goodness-of-fit test. Section 5 analyzes the occupational mobility data for fathers and sons in Britain. Using this example, we show that the new model and the decomposition are useful for inferring relationships between the marginal distributions.

Models
For an $r \times r$ square contingency table with ordered categories, let $p_{ij}$ denote the probability that an observation falls in the $i$th row and $j$th column of the table ($i = 1, \ldots, r$; $j = 1, \ldots, r$). The MH model is defined by
\[ p_{i\cdot} = p_{\cdot i} \quad (i = 1, \ldots, r), \]
where $p_{i\cdot} = \sum_{t=1}^{r} p_{it}$ and $p_{\cdot i} = \sum_{s=1}^{r} p_{si}$ (Stuart, 1955; Tahata & Tomizawa, 2014). This model states that the marginal distributions of the row and column variables are identical.

Let $F^X_i$ and $F^Y_i$ denote the marginal cumulative probabilities of $X$ and $Y$, respectively; namely
\[ F^X_i = \sum_{k=1}^{i} p_{k\cdot}, \qquad F^Y_i = \sum_{k=1}^{i} p_{\cdot k} \quad (i = 1, \ldots, r-1). \]
The MH model may also be expressed as
\[ F^X_i = F^Y_i \quad (i = 1, \ldots, r-1). \]

Let $L^X_i$ and $L^Y_i$ denote the marginal cumulative logit transforms of $X$ and $Y$, respectively; namely
\[ L^X_i = \log\frac{F^X_i}{1 - F^X_i}, \qquad L^Y_i = \log\frac{F^Y_i}{1 - F^Y_i} \quad (i = 1, \ldots, r-1). \]
The MH model may further be expressed as
\[ L^X_i = L^Y_i \quad (i = 1, \ldots, r-1). \]
The ML model (McCullagh, 1977) is defined by
\[ L^X_i = L^Y_i + \delta \quad (i = 1, \ldots, r-1), \]
where the parameter $\delta$ is unspecified. The ML model is one extension of the MH model. It indicates that the odds that $X$ is $i$ or below rather than $i + 1$ or above are $\exp(\delta)$ times the odds that $Y$ is $i$ or below rather than $i + 1$ or above, for $i = 1, \ldots, r-1$. Therefore this model states that one marginal distribution is a location shift of the other marginal distribution on a logistic scale.
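The quantities defined above can be computed directly from a table of cell probabilities. The following is a minimal sketch, assuming a hypothetical 3 × 3 table (the values of `p` are invented for illustration and are not taken from Table 1); the helper names `cumulative` and `logit` are likewise illustrative.

```python
import math

# Hypothetical 3 x 3 table of cell probabilities p[i][j] (rows: X, columns: Y).
# These values are made up for illustration only.
p = [
    [0.20, 0.10, 0.05],
    [0.05, 0.20, 0.10],
    [0.05, 0.05, 0.20],
]
r = len(p)

row = [sum(p[i]) for i in range(r)]                       # p_{i.}
col = [sum(p[i][j] for i in range(r)) for j in range(r)]  # p_{.j}

def cumulative(marginal):
    """Return F_1, ..., F_{r-1}; F_r = 1 is omitted."""
    out, total = [], 0.0
    for value in marginal[:-1]:
        total += value
        out.append(total)
    return out

def logit(x):
    """Logit transform log(x / (1 - x)) for 0 < x < 1."""
    return math.log(x / (1.0 - x))

FX, FY = cumulative(row), cumulative(col)
LX, LY = [logit(f) for f in FX], [logit(f) for f in FY]

# Under MH every difference is 0; under ML all differences equal one delta.
logit_diff = [lx - ly for lx, ly in zip(LX, LY)]
print("F^X:", FX)
print("F^Y:", FY)
print("L^X - L^Y:", logit_diff)
```

For this particular hypothetical table the marginal distributions differ, so the MH model does not hold, but the two logit differences coincide, so the table is consistent with the ML form with a common shift.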

Let $C^X_i$ and $C^Y_i$ denote the marginal cumulative complementary log-log transforms of $X$ and $Y$, respectively; namely
\[ C^X_i = \log\{-\log(1 - F^X_i)\}, \qquad C^Y_i = \log\{-\log(1 - F^Y_i)\} \quad (i = 1, \ldots, r-1). \]
The MH model may be expressed as
\[ C^X_i = C^Y_i \quad (i = 1, \ldots, r-1). \]
We now consider the marginal cumulative complementary log-log (MCL) model, which is defined by
\[ C^X_i = C^Y_i + \log \Delta \quad (i = 1, \ldots, r-1), \]
where $\Delta > 0$ is unspecified. Equivalently, $1 - F^X_i = (1 - F^Y_i)^{\Delta}$; that is, the probability that $X$ is $i + 1$ or above equals the probability that $Y$ is $i + 1$ or above raised to the power $\Delta$, for $i = 1, \ldots, r-1$. Thus this model states that one marginal distribution is a location shift of the other marginal distribution on a complementary log-log scale. Note that if $\Delta = 1$, then we have the MH model. Under the MCL model, $\Delta > 1$ is equivalent to
\[ F^X_i > F^Y_i \quad (i = 1, \ldots, r-1), \]
because $0 < 1 - F^Y_i < 1$. Therefore the parameter $\Delta$ in the MCL model reflects the degree of inhomogeneity between $\{F^X_i\}$ and $\{F^Y_i\}$.
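The power relation and the constant shift on the complementary log-log scale can be checked numerically. The sketch below assumes hypothetical cumulative probabilities `FY` and a hypothetical value `Delta = 1.5`; neither comes from the data analyzed in this paper.

```python
import math

def cloglog(f):
    """Complementary log-log transform log(-log(1 - f)) for 0 < f < 1."""
    return math.log(-math.log(1.0 - f))

# Hypothetical cumulative probabilities of Y (r = 5) and a hypothetical Delta.
FY = [0.10, 0.35, 0.60, 0.85]
Delta = 1.5

# MCL structure: P(X >= i + 1) = P(Y >= i + 1) ** Delta.
FX = [1.0 - (1.0 - f) ** Delta for f in FY]

# On the complementary log-log scale the shift is the same for every i.
shifts = [cloglog(fx) - cloglog(fy) for fx, fy in zip(FX, FY)]
print("shifts:", shifts)
print("log(Delta):", math.log(Delta))

# With Delta > 1, every cumulative probability of X exceeds that of Y.
print("F^X > F^Y for all i:", all(fx > fy for fx, fy in zip(FX, FY)))
```

Each entry of `shifts` agrees with `log(Delta)` up to floating-point error, which is the sense in which the model is a location shift on this scale.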

Decomposition
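Suppose that known scores $\{u_k\}$ can be assigned to both rows and columns, satisfying
\[ u_1 \le u_2 \le \cdots \le u_r \quad (\text{or } u_1 \ge u_2 \ge \cdots \ge u_r), \]
where at least one strict inequality holds. Using the function $g(k) = u_k$ for $k = 1, \ldots, r$, consider the marginal mean equality (ME) model defined by
\[ E(g(X)) = E(g(Y)), \]
where $E(g(X)) = \sum_{k=1}^{r} g(k)\, p_{k\cdot}$ and $E(g(Y)) = \sum_{k=1}^{r} g(k)\, p_{\cdot k}$. We now obtain the following theorem.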

Theorem 1. The MH model holds if and only if both the MCL and ME models hold.
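Proof. If the MH model holds, then the MCL model (with $\Delta = 1$) and the ME model hold. Conversely, assume that both the MCL and ME models hold; we show that the MH model holds. For $d_k = u_{k+1} - u_k$ ($k = 1, \ldots, r-1$),
\[ E(g(X)) = \sum_{k=1}^{r} u_k\, p_{k\cdot} = u_1 + \sum_{k=1}^{r-1} d_k \,(1 - F^X_k). \]
Similarly, we have
\[ E(g(Y)) = u_1 + \sum_{k=1}^{r-1} d_k \,(1 - F^Y_k). \]
Since the ME and MCL models hold, we have
\[ \sum_{k=1}^{r-1} d_k \left\{ (1 - F^X_k) - (1 - F^Y_k) \right\} = 0 \tag{1} \]
and
\[ 1 - F^X_k = (1 - F^Y_k)^{\Delta} \quad (k = 1, \ldots, r-1). \tag{2} \]
Equations (1) and (2) lead to
\[ \sum_{k=1}^{r-1} d_k \left\{ (1 - F^Y_k)^{\Delta} - (1 - F^Y_k) \right\} = 0. \]
Because $0 < 1 - F^Y_k < 1$, every term in braces is negative if $\Delta > 1$ and positive if $\Delta < 1$. Thus we obtain $\Delta = 1$, i.e., the MH model holds, because $d_k \ge 0$ (or $d_k \le 0$) for all $k = 1, \ldots, r-1$, with at least one of the $\{d_k\}$ not equal to zero. The proof is completed.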
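The key steps of the proof can be illustrated numerically: the marginal mean written in terms of score differences and cumulative probabilities, and the fact that, under the MCL structure, the two marginal means agree only when ∆ = 1. This is a sketch under assumed inputs; the scores `u`, the cumulative probabilities `FY`, and the function names are hypothetical.

```python
def mean_score(F, u):
    """E[g(X)] written as u_1 + sum_k d_k (1 - F_k), with d_k = u_{k+1} - u_k."""
    d = [u[k + 1] - u[k] for k in range(len(u) - 1)]
    return u[0] + sum(dk * (1.0 - fk) for dk, fk in zip(d, F))

def mean_direct(F, u):
    """E[g(X)] computed directly as sum_k u_k p_k, with p_k = F_k - F_{k-1}."""
    probs = [b - a for a, b in zip([0.0] + F, F + [1.0])]
    return sum(uk * pk for uk, pk in zip(u, probs))

# Hypothetical integer scores and cumulative probabilities of Y (r = 5).
u = [1, 2, 3, 4, 5]
FY = [0.10, 0.35, 0.60, 0.85]

def me_gap(Delta):
    """E[g(X)] - E[g(Y)] when X follows the MCL structure with parameter Delta."""
    FX = [1.0 - (1.0 - f) ** Delta for f in FY]
    return mean_score(FX, u) - mean_score(FY, u)

print("two forms of the mean:", mean_score(FY, u), mean_direct(FY, u))
for Delta in (0.7, 1.0, 1.5):
    print("Delta =", Delta, "gap =", me_gap(Delta))
```

With nondecreasing scores, the gap is positive for ∆ < 1, negative for ∆ > 1, and zero (up to rounding) at ∆ = 1, matching the sign argument in the proof.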

Goodness-of-fit Test
Let $n_{ij}$ denote the observed frequency in the $i$th row and $j$th column of the $r \times r$ table with $n = \sum\sum n_{ij}$, and let $m_{ij}$ denote the corresponding expected frequency for $i = 1, \ldots, r$; $j = 1, \ldots, r$. We assume that a multinomial distribution applies to the table. The maximum likelihood estimates (MLEs) of the expected frequencies under each model can be obtained by applying the Newton-Raphson method to the log-likelihood equations (see Appendix for the log-likelihood equations). The likelihood ratio chi-squared statistic for testing the goodness-of-fit of model M is given by
\[ G^2(\mathrm{M}) = 2 \sum_{i=1}^{r} \sum_{j=1}^{r} n_{ij} \log\!\left( \frac{n_{ij}}{\hat{m}_{ij}} \right), \]
where $\hat{m}_{ij}$ is the MLE of $m_{ij}$ under the model. The numbers of degrees of freedom (df) for testing the goodness-of-fit of the MH, ML, MCL, and ME models are $r - 1$, $r - 2$, $r - 2$, and 1, respectively.

Consider two nested models, say $M_1$ and $M_2$, such that if model $M_1$ holds, then model $M_2$ holds. For testing the goodness-of-fit of model $M_1$ assuming that model $M_2$ holds, the conditional likelihood ratio statistic is given by $G^2(M_1 \mid M_2) = G^2(M_1) - G^2(M_2)$. The number of df for the conditional test is the difference between the numbers of df for models $M_1$ and $M_2$.
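As a small illustration of the statistic, the sketch below evaluates $G^2$ for the MH model on a hypothetical 2 × 2 table. The case $r = 2$ is chosen because there the MH model reduces to $p_{12} = p_{21}$, so the MLEs have a closed form and no Newton-Raphson iteration is needed; this shortcut does not apply to larger tables. The counts and the function name `g2` are invented for illustration.

```python
import math

def g2(observed, fitted):
    """Likelihood ratio statistic 2 * sum n_ij * log(n_ij / m_ij).

    Cells with n_ij = 0 contribute 0 to the sum.
    """
    total = 0.0
    for n_row, m_row in zip(observed, fitted):
        for n, m in zip(n_row, m_row):
            if n > 0:
                total += n * math.log(n / m)
    return 2.0 * total

# Hypothetical 2 x 2 table of counts (not from the paper).
n = [[40, 25],
     [10, 25]]

# For r = 2, MH is equivalent to p_12 = p_21, so under multinomial sampling
# the off-diagonal MLEs are the average of the two off-diagonal counts and
# the diagonal cells are fitted exactly.
off = (n[0][1] + n[1][0]) / 2.0
m_hat = [[n[0][0], off],
         [off, n[1][1]]]

g2_mh = g2(n, m_hat)
df_mh = len(n) - 1  # r - 1 degrees of freedom for the MH model
print("G2(MH) =", round(g2_mh, 3), "with", df_mh, "df")
```

The resulting value would be compared with the chi-squared distribution with $r - 1 = 1$ df.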

Example
Consider again the data in Table 1, which cross-classify the occupational status categories of fathers and their sons for a British sample. A smaller category number means a higher status. We analyze the data using the new model and the decomposition of the MH model.

Table 2 gives the values of the likelihood ratio statistic $G^2$ for testing the goodness-of-fit of the models. We set $u_k = k$ for $k = 1, \ldots, 5$. The MH, ML and ME models fit the data poorly ($G^2(\mathrm{MH}) = 32.80$ with 4 df; $G^2(\mathrm{ML}) = 9.75$ with 3 df; $G^2(\mathrm{ME}) = 20.28$ with 1 df). The MCL model fits the data well ($G^2(\mathrm{MCL}) = 4.26$ with 3 df). Using Theorem 1, which decomposes the MH model into the MCL and ME models, we consider the reason why the MH model fits the data poorly. According to Theorem 1 and Table 2, the poor fit of the MH model is attributable to the lack of the ME structure rather than to the lack of the MCL structure. Note that, using the decomposition of the MH model into the ML and ME models, it is difficult to identify the reason for the poor fit of the MH model because both the ML and ME models fit the data poorly.
Since the MCL model, which is implied by the MH model, fits well, we can test the goodness-of-fit of the MH model under the assumption that the MCL model holds, i.e., the hypothesis that $\Delta = 1$ under that assumption. The difference between the $G^2$ values for the MH and MCL models is $G^2(\mathrm{MH} \mid \mathrm{MCL}) = G^2(\mathrm{MH}) - G^2(\mathrm{MCL}) = 28.54$ with $4 - 3 = 1$ df, and thus the hypothesis that $\Delta = 1$ is rejected at the 0.05 significance level. This is strong evidence of $\Delta \neq 1$ in the MCL model. Therefore the MCL model is preferable to the MH model for these data. Under the MCL model, the MLE of $\Delta$ is $\hat{\Delta} = 1.13$.
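The arithmetic of this conditional test can be reproduced from the reported $G^2$ values. The sketch below uses only the figures quoted above; the p-value formula via `math.erfc` is the chi-squared upper tail for exactly 1 df and would need replacing for other df. The variable names are illustrative.

```python
import math

# Values reported in the text for the data in Table 1.
g2_mh, df_mh = 32.80, 4
g2_mcl, df_mcl = 4.26, 3

# Conditional likelihood ratio statistic and its degrees of freedom.
g2_cond = g2_mh - g2_mcl
df_cond = df_mh - df_mcl

# Upper-tail probability of a chi-squared variable with 1 df:
# P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)). Valid only for 1 df.
p_value = math.erfc(math.sqrt(g2_cond / 2.0))

print("G2(MH|MCL) =", round(g2_cond, 2), "with", df_cond, "df")
print("p-value below 0.05:", p_value < 0.05)
```

The statistic far exceeds 3.84, the upper 5% point of the chi-squared distribution with 1 df, which agrees with the rejection of $\Delta = 1$ stated above.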

Table 1. Occupational status for British father-son pairs (Bishop et al., 1975, p. 100)
Note: The parenthesized values are the MLEs of expected frequencies under the MCL model.

Table 2. Likelihood ratio chi-square values $G^2$ for models applied to the data in Table 1
Note: The scores $u_k$ for the ME model are integer scores. * means significant at the 0.05 level.