Proposed Distance-Based Test for Testing Multivariate Multiple Regression Coefficients under Restricted Alternatives

In constructing estimation and hypothesis testing procedures, it is important to use all available information, such as the sign of a parameter, in order to maximize the power of the test. Prior information is often available about the sign of the regression coefficients (parameters) under test, the best-known example being that variances cannot be negative. Ignoring information about the signs of regression parameters can lead to a loss of power in small samples. With this problem in mind, this paper is concerned with developing a restricted estimation and hypothesis testing approach in the context of the multivariate multiple regression model. The main contributions of this paper are a technique for estimating constrained regression coefficients and a method for testing restricted parameters with the aid of an information-theoretic distance. The existing two-sided test statistics follow a central chi-square distribution, whereas the test statistic of our proposed distance-based one-sided test follows a weighted mixture of chi-square distributions. Monte Carlo simulation indicates that our newly proposed tests perform better than the existing tests in terms of power.


Introduction
In decision making, it is essential to set up a model based on the assumptions of the related field. Applications of models are numerous and occur in almost every field, including engineering, physical science, economics, management and life science. A model can be univariate or multivariate. Univariate analysis is carried out separately for each variable and does not consider the correlation or interdependence among the variables. Multivariate analysis considers all the variables jointly and takes into account the interdependence among them (Kothari, 2004). The benefit of multivariate analysis over univariate analysis is that it allows better decisions to be made.
The subject of multivariate analysis deals with the statistical analysis of data collected on more than one (response) variable. Accounting for the statistical dependence among the response variables is the central modelling issue, and this dependence is often described by their joint probability distribution. The multivariate multiple regression model is an extension of the multiple regression model that predicts several outcome variables from the same set of independent variables (Johnson, 2002). The fit of a model should always be assessed in the context of the purpose of the modelling. It is essential to use an appropriate estimation technique to fit the model correctly.
Multivariate regression yields the same coefficient estimates as separate ordinary least squares (OLS) regressions for each response; in addition, it jointly estimates the between-equation covariances. To test the regression parameters jointly in the context of the multivariate multiple regression model, the existing multivariate test statistics are Wilks' lambda, Pillai's trace, the Hotelling-Lawley trace and the likelihood ratio (LR) test.
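To make the relation between the classical trace-type statistics concrete, the following minimal sketch computes them from the eigenvalues of E^{-1}H, where H and E are assumed here to denote the hypothesis and error sums-of-squares-and-cross-products matrices. The function name and the example eigenvalues are hypothetical and are not taken from the paper.

```python
import math


def classical_statistics(eigenvalues):
    """Return (Wilks' lambda, Pillai's trace, Hotelling-Lawley trace).

    `eigenvalues` are assumed to be the non-negative eigenvalues of
    E^{-1} H (notation introduced for this illustration only).
    """
    wilks = math.prod(1.0 / (1.0 + lam) for lam in eigenvalues)
    pillai = sum(lam / (1.0 + lam) for lam in eigenvalues)
    hotelling_lawley = sum(eigenvalues)
    return wilks, pillai, hotelling_lawley


# Hypothetical eigenvalues for a model with two response variables.
wilks, pillai, hotelling_lawley = classical_statistics([1.0, 0.25])
print(wilks, pillai, hotelling_lawley)
```

All three statistics are functions of the same eigenvalues, which is consistent with their close agreement in the simulated powers reported later.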
Estimation and hypothesis testing play a significant role in statistical inference. Under the null hypothesis, the existing test statistics asymptotically follow a central chi-square distribution (Johnson, 2002), and the tests are two-sided in nature. In real-life problems, however, tests are not always exactly two-sided; they may be strictly one-sided or partially one-sided. Thus, in constructing estimation and hypothesis testing procedures, it is important to use all available information, such as the sign of a parameter, in order to maximize the power of the test.
Prior information is often available about the sign of the regression coefficients (parameters) under test, the best-known example being that variances cannot be negative. King and Smith (1986) showed that ignoring information about the signs of regression parameters can lead to a loss of power in small samples. Clearly, this information should not be ignored; as a consequence, the parameters are restricted, which leads to one-sided hypothesis testing. Hence, setting up an appropriate hypothesis, estimating the restricted parameter(s) and testing against restricted alternative(s) in the context of the multivariate multiple regression model are the key features of this study.
Consider the following multivariate multiple regression model, $Y = Z\beta + \varepsilon$, (1) where $\beta$ is the matrix of regression coefficients to be estimated (see Johnson, 2002, for details).
In the restricted estimation problem a two-stage estimation method is applied: in the first stage we estimate the parameter(s) by a usual method (e.g., least squares or maximum likelihood), and in the second stage we estimate the constrained parameter(s) under the restriction using a numerical optimization subroutine (e.g., the Newton-Raphson method). For testing the null hypothesis $H_0: \beta = 0$, the restriction (2) on the alternative may be strictly one-sided or partially one-sided. In such a situation the usual two-sided test is not appropriate. The aim of this paper is to develop a one-sided or partially one-sided testing approach for testing multivariate multiple regression coefficients under restricted alternatives and to compare it with the existing classical tests in terms of power properties by Monte Carlo simulation.
The specific objectives of the study are i) to develop a technique for estimating restricted multivariate multiple regression coefficients by constrained optimization, ii) to develop a one-sided testing approach for testing multivariate multiple regression coefficients under restricted alternatives and iii) to demonstrate the performance of our proposed restricted tests relative to the existing tests in terms of power properties by Monte Carlo simulation.
The paper is organized as follows. Section 2 discusses the proposed distance-based restricted estimation technique. Section 3 discusses the distribution of the proposed distance-based test statistic under restricted alternatives and the determination of weights. Section 4 analyzes the performance of the proposed tests. Concluding remarks are made in Section 5.

Proposed Distance-Based Restricted Estimation Technique
The shortest distance plays the key role in estimating the optimal values of the parameters. Statistical distance is a very useful component of multivariate analysis. In obtaining the maximum likelihood estimators (MLE), the main concern is to minimize the squared distance (3). The distance-based approach suggests that we determine whether the estimated parameters under test are likely to be closer to the null hypothesis or to the alternative hypothesis. Majumder's (1999) approach to the general testing problem is outlined below, where $I(\cdot)$ denotes the information matrix. Following Shapiro (1988) and Kodde and Palm (1986), Majumder (1999) suggests finding the point in the maintained hypothesis that is closest to the unconstrained point. This closest point is the solution of the distance function, or optimal function, (5), taken in the metric of the information matrix. The closest point, that is, the optimized parameter value, can be used in any correct two-sided test to obtain the corresponding distance-based one-sided and partially one-sided tests. The asymptotic null distribution is generally a mixture of the corresponding two-sided distributions (see, for example, Majumder, 1999).

Application of General Approaches to Multivariate Multiple Regression Model
Suppose we are interested in estimating the model (1) subject to restriction (2). In our approach, we first estimate the parameters by least squares, denoted by $\hat{\beta}$. Using $\hat{\beta}$ and the information matrix, we then determine the optimal value of $\beta$ by minimizing the distance function (5) subject to restriction (2).
Our estimation methodology thus incorporates additional information. As noted earlier, we expect our estimates to be more efficient than the unconstrained estimates. We illustrate this argument by Monte Carlo simulation.
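The second estimation stage can be illustrated with a minimal sketch, assuming only two coefficients and a simple non-negativity restriction. The function names are hypothetical, and the matrix `W` merely stands in for the information matrix; the restriction (2) and the optimization subroutine used in the paper may differ. The sketch finds the point in the restricted region closest to the unrestricted estimate by checking every face of the region, which is feasible only for a small number of restrictions.

```python
def quad_distance(b, b_hat, W):
    """Squared distance (b - b_hat)' W (b - b_hat) for two parameters."""
    d0, d1 = b[0] - b_hat[0], b[1] - b_hat[1]
    return W[0][0] * d0 * d0 + 2.0 * W[0][1] * d0 * d1 + W[1][1] * d1 * d1


def restricted_estimate(b_hat, W):
    """Closest point to b_hat in {b >= 0} in the metric W.

    W is assumed symmetric and positive definite. Each candidate is the
    minimizer on one face of the restricted region; the closest feasible
    candidate is the constrained optimum.
    """
    candidates = [
        (b_hat[0], b_hat[1]),                                   # no restriction active
        (0.0, b_hat[1] + W[0][1] * b_hat[0] / W[1][1]),         # first coefficient set to 0
        (b_hat[0] + W[0][1] * b_hat[1] / W[0][0], 0.0),         # second coefficient set to 0
        (0.0, 0.0),                                             # both set to 0
    ]
    feasible = [b for b in candidates if b[0] >= 0.0 and b[1] >= 0.0]
    return min(feasible, key=lambda b: quad_distance(b, b_hat, W))


# Hypothetical unrestricted estimate that violates the sign restriction.
W = [[2.0, 1.0], [1.0, 2.0]]
print(restricted_estimate((-1.0, 2.0), W))
```

When the unrestricted estimate already satisfies the restriction, the first candidate has distance zero and is returned unchanged, so the restricted and unrestricted estimates coincide in that case.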

Distribution of the Proposed Distance-Based Test Statistic under Restricted Alternatives
In hypothesis testing, knowing the distribution of the test statistic is essential, in particular for calculating the critical value and the power of the test. The general form of the usual two-sided likelihood ratio (LR) statistic is $-2\ln LR = 2\,[\,l(\hat{\beta}_a) - l(\beta_0)\,]$, (6) where $l(\beta_0)$ and $l(\hat{\beta}_a)$ are the maximized log-likelihood functions under the null and the alternative hypotheses, respectively. The asymptotic null distribution of (6) is a central chi-square distribution with $m(r - q)$ degrees of freedom (Johnson, 2002).
However, the usual two-sided LR test is not appropriate when the alternative hypothesis is strictly one-sided. Hence, the proposed distance-based one-sided LR (DBOLR) test statistic is defined as in (7), where the optimized estimate is the solution of (5) subject to restriction (2). Under the null hypothesis, the test statistic (7) asymptotically follows a weighted mixture of chi-square distributions with $m(r - q)$ degrees of freedom (Kodde and Palm, 1986; Shapiro, 1988; Majumder, 1999). We can also construct a partially one-sided LR test for testing $H_{a3}$, which gives a partially one-sided LR statistic of the form (7) that is again asymptotically distributed as a weighted mixture of chi-square distributions.
Similarly, for the proposed distance-based one-sided Wilks' lambda (DBOWL), distance-based one-sided Pillai's trace (DBOPT) and distance-based one-sided Hotelling-Lawley trace (DBOHL) tests, we determine the optimal value of $\beta$ according to the general formulation of the distance-based approach. Under the null hypothesis, the DBOWL, DBOPT and DBOHL test statistics asymptotically follow a weighted mixture of chi-square distributions.
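Once the weights of the mixture are available, tail probabilities and critical values follow directly. The sketch below is a minimal illustration under stated assumptions: the weights are taken as given, `weights[i]` is attached to the chi-square component with `i` degrees of freedom (with zero degrees of freedom meaning a point mass at zero), and all function names are hypothetical. The example uses the weights (1/2, 1/2) that apply to a single one-sided restriction.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a central chi-square with integer df >= 0."""
    if x <= 0.0:
        return 1.0
    if df == 0:
        return 0.0  # point mass at zero
    half = x / 2.0
    # Start from df = 1 or df = 2, then use the recurrence df -> df + 2.
    if df % 2 == 1:
        sf, k = math.erfc(math.sqrt(half)), 1
    else:
        sf, k = math.exp(-half), 2
    while k < df:
        sf += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return sf


def mixture_sf(c, weights):
    """Tail probability of a weighted mixture of chi-square distributions."""
    return sum(w * chi2_sf(c, df) for df, w in enumerate(weights))


def mixture_critical_value(alpha, weights, upper=200.0):
    """Critical value c with mixture_sf(c) = alpha, found by bisection."""
    lo, hi = 0.0, upper
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mixture_sf(mid, weights) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# One restricted parameter: weights 1/2 on df = 0 and 1/2 on df = 1.
print(mixture_critical_value(0.05, [0.5, 0.5]))
```

Because part of the probability mass sits on components with fewer degrees of freedom, the one-sided critical value is smaller than the two-sided chi-square critical value at the same level, which is the source of the power gain.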

Determination of Weights
An important issue in estimation and testing is the determination of the weights of the sampling distribution of the test statistics when the parameters are restricted. Under the null hypothesis $H_0: \beta = 0$, the test statistic $-2\ln LR$ has an asymptotic distribution of the form (8), which is a probability mixture of independent chi-squared distributions $\chi^2_i$ with different degrees of freedom. Similarly, the distributions of the other test statistics (Wilks' lambda, Pillai's trace and Hotelling-Lawley trace) can be represented in the form (8). However, the weights must be determined. In some cases this is very difficult, particularly when more than seven parameters are to be estimated or tested. Wolak (1989) provides formulas for these weights for up to seven restricted parameters. Gourieroux et al. (1982) proposed numerical simulation to determine the weights.
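The idea of obtaining the weights numerically can be sketched for the simplest non-trivial case. The code below is an illustrative assumption rather than the procedure of Gourieroux et al. (1982): it considers two non-negativity restrictions, draws from a bivariate normal distribution with unit variances and correlation `rho`, projects each draw onto the restricted region in the metric of the inverse covariance matrix, and records how many components of the projection are positive. The relative frequencies estimate the weights. For two restrictions the weights also have a standard closed form, which is used here only as a check. All function names are hypothetical.

```python
import math
import random


def project_onto_orthant(z, rho):
    """Closest point to z in {b >= 0} in the metric of the inverse covariance."""
    # Inverse of [[1, rho], [rho, 1]] up to a positive factor,
    # which does not affect the minimizer.
    W = ((1.0, -rho), (-rho, 1.0))

    def dist(b):
        d0, d1 = b[0] - z[0], b[1] - z[1]
        return W[0][0] * d0 * d0 + 2.0 * W[0][1] * d0 * d1 + W[1][1] * d1 * d1

    candidates = [
        (z[0], z[1]),
        (0.0, z[1] - rho * z[0]),
        (z[0] - rho * z[1], 0.0),
        (0.0, 0.0),
    ]
    feasible = [b for b in candidates if b[0] >= 0.0 and b[1] >= 0.0]
    return min(feasible, key=dist)


def simulate_weights(rho, n_draws=20000, seed=0):
    """Estimate the weights on 0, 1 and 2 degrees of freedom by simulation."""
    rng = random.Random(seed)
    counts = [0, 0, 0]
    for _ in range(n_draws):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z = (e1, rho * e1 + math.sqrt(1.0 - rho * rho) * e2)
        b = project_onto_orthant(z, rho)
        counts[sum(1 for v in b if v > 0.0)] += 1
    return [c / n_draws for c in counts]


def closed_form_weights(rho):
    """Closed-form weights for two restrictions with correlation rho."""
    w0 = math.acos(rho) / (2.0 * math.pi)
    return [w0, 0.5, 0.5 - w0]


print(simulate_weights(0.5))
print(closed_form_weights(0.5))
```

With more restrictions the projection requires a general quadratic programming routine, and closed forms become unwieldy, which is why simulation is attractive in higher dimensions.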
Here, we use real and artificially generated explanatory variables ($Z$). To carry out the Monte Carlo simulation, we first take some positive and some negative values of $\beta$ and then generate the multivariate multiple regression model (1). We consider different combinations of sample size ($n$), explanatory variables ($Z$) and covariance matrices ($\Sigma$) for two and three response variables ($m = 2, 3$), as well as different characteristic roots (eigenvalues), such as covariance matrices with one or two dominant eigenvalues or with equal eigenvalues. We perform 3000 replications to calculate the (size-corrected) simulated powers of the new and existing tests.
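The logic of a size-corrected power comparison can be sketched with a deliberately simplified toy problem. This is not the design used in the paper: instead of a regression model, it assumes a bivariate normal statistic with identity covariance, a two-sided statistic given by its squared length, and a one-sided statistic given by the squared length of its projection onto the non-negative region. Critical values are taken as empirical quantiles of statistics simulated under the null hypothesis, so that both tests have the same simulated size. All names and the chosen shift are hypothetical.

```python
import random


def statistics(z):
    """Return (two-sided, one-sided) statistics for a bivariate draw z."""
    two_sided = z[0] ** 2 + z[1] ** 2
    one_sided = max(z[0], 0.0) ** 2 + max(z[1], 0.0) ** 2
    return two_sided, one_sided


def simulate(shift, n_rep, rng):
    """Simulate both statistics when each component has mean `shift`."""
    return [
        statistics((rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)))
        for _ in range(n_rep)
    ]


def size_corrected_power(shift, n_rep=3000, alpha=0.05, seed=1):
    """Simulated power of both tests using empirical null critical values."""
    rng = random.Random(seed)
    null_draws = simulate(0.0, n_rep, rng)
    alt_draws = simulate(shift, n_rep, rng)
    powers = []
    for j in range(2):
        null_sorted = sorted(s[j] for s in null_draws)
        # Empirical (1 - alpha) quantile of the null distribution.
        crit = null_sorted[int(round((1.0 - alpha) * n_rep))]
        powers.append(sum(s[j] > crit for s in alt_draws) / n_rep)
    return tuple(powers)


two_sided_power, one_sided_power = size_corrected_power(1.0)
print(two_sided_power, one_sided_power)
```

In this toy setting the one-sided statistic has higher simulated power when the true shift lies in the restricted region, which mirrors the qualitative pattern the paper reports for its regression designs.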

Experimental Design
In order to compare the power properties of the proposed optimized distance-based one-sided tests with those of the usual two-sided tests, we use two different types of design matrices, namely real and artificially generated data. We consider the following design matrices:
$D_3$: A constant dummy and measurements from the assembly of a driveshaft for an automobile, which requires circle welding that must be controlled to be within certain operating limits where a machine produces welds of good quality. In order to control the process, one process engineer measures four critical variables: Voltage (volts), Current (amps), Feed speed (in/min) and inert Gas flow (cm) (see Johnson, 2002, pp. 244-245).
For testing the maintained hypothesis, Tables 1-3 report the results for all the design matrices defined in Section 4.2. In all tables, the estimated size of the considered tests is 0.05 when asymptotic critical values at the five percent nominal level are used. Thus, all the tests have size-corrected power. We observe from Table 1 and Figure 1 that the simulated powers of our newly proposed optimized DBOLR, DBOWL, DBOPT and DBOHL tests are clearly higher than those of the usual two-sided tests, especially near the null value(s). For example, the simulated powers of the Wilks' lambda, Pillai's trace, Hotelling-Lawley trace and LR tests are 0.428, 0.425, 0.429 and 0.428, and those of the proposed optimized DBOWL, DBOPT and DBOHL tests are 0.499, 0.496 and 0.500, respectively.

Conclusions
This study develops the distance-based one-sided DBOLR, DBOWL, DBOPT and DBOHL tests for testing regression coefficients in a multivariate multiple regression model under restricted alternatives. We compare the performance of our proposed one-sided tests with that of the existing two-sided multivariate tests in terms of power properties, using real and artificially generated data in a Monte Carlo simulation. The simulation results reveal that our newly proposed optimized distance-based one-sided tests perform better than the existing two-sided tests for all the design matrices discussed in this paper.

$D_1$: A constant dummy and 42 measurements on air-pollution variables recorded at 12:00 noon in the Los Angeles area on different days. The air-pollution variables are Wind, Solar radiation, CO, NO, NO2, O3 and HC (see Johnson, 2002, pp. 39-40).
$D_2$: A constant dummy and $r$ variables (where $r = 3, 4$) drawn from the standard normal distribution, generated using the GAUSS program.

For the multivariate multiple regression model (1), we estimate simulated powers for testing the restricted (strictly one-sided and partially one-sided) hypotheses in (2). This section compares the powers of the newly proposed optimized distance-based one-sided LR (DBOLR), distance-based one-sided Wilks' lambda (DBOWL), distance-based one-sided Pillai's trace (DBOPT) and distance-based one-sided Hotelling-Lawley trace (DBOHL) tests with those of the usual two-sided likelihood ratio (LR), Wilks' lambda, Pillai's trace and Hotelling-Lawley trace tests in the context of the multivariate multiple regression model (1). The estimated simulated powers of these tests are presented in Tables 1-3.

The design considered here contains two response variables and four explanatory variables, with a sign-restricted alternative hypothesis.
The simulated powers of the DBOLR, DBOWL, DBOPT and DBOHL tests and of the usual two-sided multivariate tests using artificially generated data for $n = 38$, containing two response variables and three explanatory variables, are presented in Table 2. The results reveal that the proposed optimized DBOLR, DBOWL, DBOPT and DBOHL tests perform better than the usual two-sided tests in terms of power. The power curves in Figure 2 show that the powers of the proposed one-sided tests are considerably higher near the null value(s) than those of their counterparts. For instance, the simulated powers of the Wilks' lambda, Pillai's trace, Hotelling-Lawley trace and LR tests are 0.364, 0.360, 0.364 and 0.361, and those of the proposed optimized DBOWL, DBOPT, DBOHL and DBOLR tests are 0.396, 0.392, 0.397 and 0.397, respectively.
For the real data with $n = 38$, we observe that the simulated powers of our newly proposed optimized DBOLR, DBOWL, DBOPT and DBOHL tests are clearly higher than those of the usual two-sided tests. The power curves show that the powers of the newly proposed optimized one-sided tests are considerably higher near the null value(s) than those of their counterparts (Figure 3). For instance, the simulated powers of the Wilks' lambda, Pillai's trace, Hotelling-Lawley trace and LR tests are 0.089, 0.088, 0.088 and 0.089, and those of the proposed optimized DBOWL, DBOPT and DBOHL tests are 0.125, 0.124 and 0.124, respectively.
In the Monte Carlo simulation study, in all cases our proposed DBOLR, DBOWL, DBOPT and DBOHL tests perform better than the usual two-sided tests. All the tables and figures show that the simulated powers of the proposed distance-based one-sided tests are consistently higher than those of the usual two-sided tests. In conclusion, using the proposed distance-based one-sided tests rather than their traditional counterparts typically yields higher power.
Let $\hat{\theta}$ be an appropriate estimate of $\theta \in R^p$ such that $\hat{\theta}$ is asymptotically normally distributed with variance-covariance matrix