Parametric and Semiparametric Estimations of Bivariate Truncated Type I Generalized Logistic Models driven from Copulas

The truncated type I generalized logistic distribution has been used in a variety of applications. In this article, new bivariate truncated type I generalized logistic (BTTGL) distributional models driven from three different copula functions are introduced. Some of their properties are studied. Parametric and semiparametric methods are used to estimate the parameters of the BTTGL models. Maximum likelihood and inference functions for margins estimates of the BTTGL parameters are compared with semiparametric estimates using a real data set. Further, the BTTGL, bivariate generalized exponential, and bivariate exponentiated Weibull models are compared using the Akaike information criterion and the maximized log-likelihood. An extensive Monte Carlo simulation study is carried out for different parameter values and sample sizes to compare the performance of the parametric and semiparametric estimators based on the relative mean square error.


Introduction
The truncated logistic distribution has been used effectively in different lifetime applications. It was first introduced by Kjelsberg (1962), and Balakrishnan (1985) then studied the half-logistic distribution and its use as a lifetime model. AL-Angary (1997) introduced the truncated type I generalized logistic (TTGL) distribution. In addition, several researchers, such as Al-Hussaini et al. (2006), Atea (2001), Al-Hussaini and Ateya (2003, 2005), Ateya and Ahmed (2013), and Rao (2015), have studied the properties of and inference for the TTGL distribution and suggested many applications in different fields.
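The two-parameter TTGL distribution has the following cumulative distribution function (Cdf):
$$F(x;\alpha,\sigma) = \frac{\left(1+e^{-x/\sigma}\right)^{-\alpha} - 2^{-\alpha}}{1-2^{-\alpha}}, \quad x > 0,\ \alpha > 0,\ \sigma > 0. \tag{1}$$
The probability density function (Pdf) is
$$f(x;\alpha,\sigma) = \frac{\alpha\, e^{-x/\sigma}}{\sigma\left(1-2^{-\alpha}\right)\left(1+e^{-x/\sigma}\right)^{\alpha+1}}, \quad x > 0,$$
where $\alpha$ is the shape parameter and $\sigma$ is the scale parameter (Rao, 2015).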
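As a rough illustration, the sketch below evaluates a TTGL-type Cdf, Pdf, and quantile function. It assumes the zero-truncated form $F(x) = [(1+e^{-x/\sigma})^{-\alpha} - 2^{-\alpha}]/(1-2^{-\alpha})$ for $x > 0$, which reduces to the half-logistic Cdf when $\alpha = 1$ and $\sigma = 1$. The function names `ttgl_cdf`, `ttgl_pdf`, and `ttgl_quantile` are hypothetical.

```python
import math


def ttgl_cdf(x, alpha, sigma):
    """Cdf of a type I generalized logistic law truncated to x > 0 (assumed form)."""
    if x <= 0.0:
        return 0.0
    untruncated = (1.0 + math.exp(-x / sigma)) ** (-alpha)
    mass_below_zero = 2.0 ** (-alpha)  # untruncated Cdf evaluated at x = 0
    return (untruncated - mass_below_zero) / (1.0 - mass_below_zero)


def ttgl_pdf(x, alpha, sigma):
    """Derivative of ttgl_cdf with respect to x."""
    if x < 0.0:
        return 0.0
    z = math.exp(-x / sigma)
    norm = sigma * (1.0 - 2.0 ** (-alpha))
    return alpha * z / (norm * (1.0 + z) ** (alpha + 1.0))


def ttgl_quantile(u, alpha, sigma):
    """Inverse of ttgl_cdf for 0 < u < 1 (useful for inverse-transform sampling)."""
    mass_below_zero = 2.0 ** (-alpha)
    t = mass_below_zero + u * (1.0 - mass_below_zero)
    return -sigma * math.log(t ** (-1.0 / alpha) - 1.0)
```

A quick sanity check is that `ttgl_cdf(ttgl_quantile(u, a, s), a, s)` returns `u` up to rounding for any `u` strictly between 0 and 1.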
There are many lifetime applications where we need to consider bivariate lifetime distributions. Lately, the copula has become a popular tool for constructing bivariate and multivariate distributions from given marginals because of its flexibility and practical use in a variety of fields, for example, biostatistics, finance, and actuarial science (Nelsen, 2007; Trivedi & Zimmer, 2005). The main advantage of a copula is that it allows the marginals to be modeled and analyzed separately from the dependence structure. Its mathematical simplicity is another advantage. Several authors have considered different copula functions to construct bivariate and multivariate distributions. These include the multivariate Gompertz-type distribution (Adham & Walker, 2001), the bivariate half-logistic-type distribution (Adham et al., 2009), the bivariate Birnbaum-Saunders distribution derived from the Gaussian copula (Kundu, Balakrishnan, & Jamalizadeh, 2010), the bivariate Sinh-normal distribution (Kundu, 2014), the absolutely continuous bivariate generalized exponential distribution derived from the Clayton copula (Kundu & Gupta, 2011), the multivariate generalized exponential distribution (Al-Hussaini & Ateya, 2006; Ateya & Al-Alazwari, 2013), and the bivariate Burr Type X distribution (Elaal, Mahmoud, EL-Gohary, & Baharith, 2016).
The main aim of this article is to establish bivariate truncated type I generalized logistic (BTTGL) distributional models driven from three different copula functions. These are the Gaussian, Frank, and Clayton copulas, which are widely used in the literature to fit bivariate lifetime data because they represent different dependence structures between variables (Nelsen, 2007). The marginals of these distributional models are univariate truncated type I generalized logistic distributions. The flexibility of the BTTGL distribution arises from its five parameters and the different shapes of its joint Pdf, which can be used quite effectively to analyze bivariate lifetime data.
The rest of this article is organized as follows. Section 2 introduces the BTTGL distributional models driven from different copula functions. Parametric and semiparametric methods are used to estimate the parameters of the proposed distribution in Section 3. In Section 4, a goodness of fit test is presented to check the appropriateness of different copula functions. A real data set is analyzed to illustrate the performance of the proposed bivariate distributional models in Section 5. In Section 6, a simulation study is carried out to investigate and compare the performance of the different estimators. Finally, concluding remarks are presented in Section 7.

Bivariate Truncated Type I Generalized Logistic Distributions Driven from Copulas
The copula approach is based on Sklar's theorem (Sklar, 1959), which states that any multivariate distribution with continuous marginals can be decomposed into a copula and its marginals. For the bivariate case, copulas link two marginal distributions to a joint distribution such that for every bivariate distribution function $H(x_1,x_2)$ with continuous marginals $F_1(x_1)$ and $F_2(x_2)$, there exists a unique copula function $C$ such that
$$H(x_1,x_2) = C\left(F_1(x_1), F_2(x_2)\right),$$
where $C$ is a distribution function with uniform(0,1) margins. Therefore, the density function of the bivariate distribution can be written as
$$h(x_1,x_2) = c\left(F_1(x_1), F_2(x_2)\right) f_1(x_1)\, f_2(x_2),$$
where $f_1$ and $f_2$ are the density functions corresponding to $F_1$ and $F_2$, and $c$ is the copula density; see Nelsen (1999).
There are several copula functions that can be used to construct a BTTGL distribution with TTGL marginals given by (1). In this article, we use the Gaussian, Frank, and Clayton copulas to construct the BTTGL distributional models.
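The Gaussian copula has a copula parameter $\theta$ that measures the degree of association and dependence; for this copula it is denoted by $\rho$ to indicate the relation with the normal distribution. The copula takes the following form:
$$C\left(F_1(x_1), F_2(x_2); \rho\right) = \Phi_2\left(\Phi^{-1}(F_1(x_1)), \Phi^{-1}(F_2(x_2)); \rho\right),$$
where $F_1$ and $F_2$ are the marginal Cdfs of the TTGL distribution given by (1) for the random variables $X_1$ and $X_2$, respectively, $\Phi_2$ denotes the distribution function of a bivariate standard normal random variable with correlation $\rho$, and $\Phi^{-1}$ represents the inverse of the standard normal distribution function (Mardia, 1970).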
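The joint Pdf of the BTTGL distribution with TTGL marginals driven from the Gaussian copula becomes
$$h(x_1,x_2) = \frac{1}{\sqrt{1-\rho^2}} \exp\left\{-\frac{\rho^2\left(z_1^2+z_2^2\right) - 2\rho z_1 z_2}{2\left(1-\rho^2\right)}\right\} f_1(x_1)\, f_2(x_2),$$
where $f_j$ is the Pdf of the TTGL distribution given by
$$f_j(x_j) = \frac{\alpha_j\, e^{-x_j/\sigma_j}}{\sigma_j\left(1-2^{-\alpha_j}\right)\left(1+e^{-x_j/\sigma_j}\right)^{\alpha_j+1}}, \quad x_j > 0.$$
Here, $\alpha_j$ and $\sigma_j$ represent the shape and scale parameters, respectively, $\rho$ is the copula parameter, and $z_j = \Phi^{-1}(F_j(x_j))$, $j = 1, 2$.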
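The factorization of a joint density into a copula density times the marginal densities can be sketched as follows for the Gaussian copula. This is a minimal illustration using the standard closed form of the Gaussian copula density; the function names are hypothetical, and the marginal Cdf and Pdf are passed in as callables so that any continuous marginal (TTGL or otherwise) can be plugged in.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def gaussian_copula_density(u1, u2, rho):
    """Density of the bivariate Gaussian copula at (u1, u2), with |rho| < 1."""
    z1 = _STD_NORMAL.inv_cdf(u1)  # normal scores of the uniform arguments
    z2 = _STD_NORMAL.inv_cdf(u2)
    one_minus_r2 = 1.0 - rho * rho
    quad = rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2
    return math.exp(-quad / (2.0 * one_minus_r2)) / math.sqrt(one_minus_r2)


def joint_density(x1, x2, cdf1, pdf1, cdf2, pdf2, rho):
    """Joint density: copula density at (F1(x1), F2(x2)) times f1(x1) * f2(x2)."""
    c = gaussian_copula_density(cdf1(x1), cdf2(x2), rho)
    return c * pdf1(x1) * pdf2(x2)
```

With `rho = 0` the copula density equals 1 everywhere, so the joint density collapses to the product of the two marginal densities, which corresponds to independence.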
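The Frank and Clayton copula functions are members of the Archimedean family, which has been used widely in many disciplines. The Frank copula has symmetric dependence patterns, with a strong association between $X_1$ and $X_2$ for large $\theta$ in absolute value. Writing $u_j = F_j(x_j)$, $j = 1, 2$, the Frank copula takes the following form:
$$C(u_1,u_2;\theta) = -\frac{1}{\theta} \log\left[1 + \frac{\left(e^{-\theta u_1}-1\right)\left(e^{-\theta u_2}-1\right)}{e^{-\theta}-1}\right], \quad \theta \neq 0.$$
Note that the limiting case $\theta \to 0$ implies independence between the two variables. The joint Pdf of $X_1$ and $X_2$ driven from the Frank copula is given by
$$h(x_1,x_2) = \frac{\theta\left(1-e^{-\theta}\right) e^{-\theta(u_1+u_2)}}{\left[\left(1-e^{-\theta}\right) - \left(1-e^{-\theta u_1}\right)\left(1-e^{-\theta u_2}\right)\right]^2}\, f_1(x_1)\, f_2(x_2),$$
where $F_1$ and $F_2$ are the Cdfs of the TTGL distribution given by (1).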
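The Clayton copula has asymmetric dependence patterns, with higher dependence between $X_1$ and $X_2$ in the left tails. With $u_j = F_j(x_j)$, the Clayton copula and the corresponding joint Pdf of $X_1$ and $X_2$ are, respectively, given by
$$C(u_1,u_2;\theta) = \left(u_1^{-\theta} + u_2^{-\theta} - 1\right)^{-1/\theta}, \quad \theta > 0,$$
$$h(x_1,x_2) = (1+\theta)\left(u_1 u_2\right)^{-(\theta+1)} \left(u_1^{-\theta} + u_2^{-\theta} - 1\right)^{-(2+1/\theta)} f_1(x_1)\, f_2(x_2).$$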
The main advantage of the Frank and Clayton copulas is that their copula and density functions have simple closed-form expressions. For details, see Clayton (1978), Frank (1979), Hutchinson and Lai (1990), and Nelsen (2007).
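To make the closed forms concrete, the sketch below codes the standard Frank and Clayton copula functions and their densities. The function names are hypothetical. The Frank functions assume a nonzero parameter, the Clayton functions assume a positive parameter, and arguments are assumed to lie strictly inside the unit square.

```python
import math


def frank_cdf(u, v, theta):
    """Frank copula C(u, v) for theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def frank_density(u, v, theta):
    """Frank copula density (mixed partial derivative of frank_cdf)."""
    a = -math.expm1(-theta)        # 1 - exp(-theta)
    bu = -math.expm1(-theta * u)   # 1 - exp(-theta * u)
    bv = -math.expm1(-theta * v)   # 1 - exp(-theta * v)
    return theta * a * math.exp(-theta * (u + v)) / (a - bu * bv) ** 2


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) for theta > 0."""
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)


def clayton_density(u, v, theta):
    """Clayton copula density (mixed partial derivative of clayton_cdf)."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * s ** (-2.0 - 1.0 / theta)
```

Multiplying either density, evaluated at the marginal Cdf values, by the two marginal densities gives the corresponding joint Pdf.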

Dependency Measures
According to Schweizer and Wolff (1981), the copula captures all the dependence between the variables $X_1$ and $X_2$. This dependence can be measured using Spearman's rho and Kendall's tau, which are, respectively, given by
$$\rho_s = 12 \int_0^1\!\!\int_0^1 C(u_1,u_2)\, du_1\, du_2 - 3, \tag{5}$$
$$\tau = 4 \int_0^1\!\!\int_0^1 C(u_1,u_2)\, dC(u_1,u_2) - 1. \tag{6}$$
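For the three copulas considered here, some of these dependence measures have well-known closed forms, which the following sketch evaluates. It uses the standard results that Kendall's tau is $(2/\pi)\arcsin\rho$ and Spearman's rho is $(6/\pi)\arcsin(\rho/2)$ for the Gaussian copula, that Kendall's tau is $\theta/(\theta+2)$ for the Clayton copula, and that Kendall's tau is $1 - (4/\theta)(1 - D_1(\theta))$ for the Frank copula, where $D_1$ is the first Debye function. The function names and the Simpson-rule integration are choices of this sketch.

```python
import math


def gaussian_tau(rho):
    return 2.0 / math.pi * math.asin(rho)


def gaussian_spearman(rho):
    return 6.0 / math.pi * math.asin(rho / 2.0)


def clayton_tau(theta):
    return theta / (theta + 2.0)


def debye1(theta, steps=2000):
    """D1(theta) = (1/theta) * integral_0^theta t / (e^t - 1) dt via Simpson's rule."""
    def integrand(t):
        return 1.0 if t == 0.0 else t / math.expm1(t)  # limit at t = 0 is 1

    h = theta / steps  # steps must be even; h is negative when theta < 0
    total = integrand(0.0) + integrand(theta)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0 / theta


def frank_tau(theta):
    """Kendall's tau of the Frank copula for theta != 0."""
    return 1.0 - 4.0 / theta * (1.0 - debye1(theta))
```

Since each of these maps is monotone in the copula parameter, it can be inverted to move between a target dependence level and the parameter value.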

Parameter Estimation
In this section, parametric and semiparametric estimation methods are used to estimate the unknown parameters of the proposed distribution.

Parametric Methods of Estimation
There are two methods for fitting a copula model. The first estimates the parameters of the marginal distributions and the copula simultaneously. The second is a two-stage method: the first stage estimates the parameters of the marginal distributions by maximizing the marginal log-likelihoods, and the second stage estimates the copula parameter by maximizing the copula log-likelihood with the marginal parameters fixed at their first-stage estimates.

Maximum likelihood estimation
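Maximum likelihood (ML) estimation is usually used to estimate all the parameters simultaneously. Let $(x_{1i}, x_{2i})$, $i = 1, \ldots, n$, be a bivariate random sample from the BTTGL distribution with density function given in (4); then the log-likelihood function of the joint distribution can be expressed as
$$\ell(\alpha_1,\sigma_1,\alpha_2,\sigma_2,\theta) = \sum_{i=1}^{n} \log c\left(u_{1i}, u_{2i}; \theta\right) + \sum_{i=1}^{n} \sum_{j=1}^{2} \log f_j(x_{ji}; \alpha_j, \sigma_j),$$
where $u_{ji} = F_j(x_{ji}; \alpha_j, \sigma_j)$, $j = 1, 2$.
Setting the first partial derivatives of $\ell$ with respect to $\alpha_1$, $\sigma_1$, $\alpha_2$, $\sigma_2$, and $\theta$ equal to zero yields a system of nonlinear likelihood equations. The solution of this system gives the ML estimates of the unknown parameters of the proposed BTTGL distribution; it is obtained numerically using nonlinear optimization algorithms such as a quasi-Newton algorithm.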
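Confidence intervals (CI) can be obtained using the large-sample approximation, in which the ML estimates $\hat{\Theta} = (\hat{\alpha}_1, \hat{\sigma}_1, \hat{\alpha}_2, \hat{\sigma}_2, \hat{\theta})$ are approximately multivariate normal with mean $\Theta = (\alpha_1, \sigma_1, \alpha_2, \sigma_2, \theta)$ and covariance matrix $I^{-1}(\hat{\Theta})$, where $I^{-1}$ is the inverse of the information matrix. The asymptotic $100(1-\gamma)\%$ CI for the $k$th parameter using this approach is then $\hat{\Theta}_k \pm z_{\gamma/2} \sqrt{[I^{-1}(\hat{\Theta})]_{kk}}$.

Two-stage estimation
The two-stage approach is known as the method of inference functions for margins (IFM). The first stage of this method estimates the parameters of the marginal distributions by maximizing the marginal log-likelihoods (Joe, 1997; Joe & Xu, 1996). The marginal log-likelihoods for $X_1$ and $X_2$ are given by
$$\ell_j(\alpha_j,\sigma_j) = \sum_{i=1}^{n} \log f_j(x_{ji}; \alpha_j, \sigma_j), \quad j = 1, 2.$$
Solving the corresponding nonlinear likelihood equations gives the ML estimates of the marginal parameters, $(\hat{\alpha}_j, \hat{\sigma}_j)$. The second stage then estimates the copula parameter by maximizing the copula log-likelihood after substituting the marginal parameter estimates from the first stage, that is,
$$\hat{\theta} = \arg\max_{\theta} \sum_{i=1}^{n} \log c\left(F_1(x_{1i}; \hat{\alpha}_1, \hat{\sigma}_1), F_2(x_{2i}; \hat{\alpha}_2, \hat{\sigma}_2); \theta\right),$$
where $(\hat{\alpha}_1, \hat{\sigma}_1)$ and $(\hat{\alpha}_2, \hat{\sigma}_2)$ denote the ML estimates of the marginal parameters obtained from the first stage. This method has the advantage of reducing computational time and mathematical complexity (Joe & Xu, 1996).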
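The two-stage logic can be sketched in a few lines. To keep the example short, it uses exponential margins as a stand-in for the TTGL margins, because the exponential ML estimate is available in closed form; with TTGL margins the first stage would instead maximize the marginal log-likelihood numerically. The Gaussian copula is used in the second stage. All function names and the golden-section optimizer are choices of this sketch.

```python
import math
from statistics import NormalDist, fmean

_STD_NORMAL = NormalDist()


def golden_max(f, lo, hi, tol=1e-6):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:  # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = f(d)
        else:        # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = f(c)
    return (a + b) / 2.0


def gaussian_copula_loglik(rho, z1, z2):
    """Gaussian copula log-likelihood evaluated on normal scores z1, z2."""
    r2 = 1.0 - rho * rho
    return sum(
        -0.5 * math.log(r2)
        - (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * r2)
        for a, b in zip(z1, z2)
    )


def normal_scores(x, rate):
    """Apply the fitted exponential Cdf, clip away from 0 and 1, map to normal scores."""
    scores = []
    for xi in x:
        p = min(max(-math.expm1(-rate * xi), 1e-12), 1.0 - 1e-12)
        scores.append(_STD_NORMAL.inv_cdf(p))
    return scores


def ifm_fit(x1, x2):
    """Two-stage IFM fit: marginal ML first, then the copula parameter alone."""
    # Stage 1: exponential marginal ML estimates (rate = 1 / sample mean).
    rate1, rate2 = 1.0 / fmean(x1), 1.0 / fmean(x2)
    # Stage 2: maximize the copula log-likelihood with the margins held fixed.
    z1, z2 = normal_scores(x1, rate1), normal_scores(x2, rate2)
    rho = golden_max(lambda r: gaussian_copula_loglik(r, z1, z2), -0.99, 0.99)
    return rate1, rate2, rho
```

Because the second stage is a one-dimensional search, it is much cheaper than maximizing the full log-likelihood over all five parameters at once, which is the computational advantage noted above.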

Semiparametric Methods of Estimation
Three semiparametric methods are used to estimate the copula parameter: the maximum pseudo-likelihood method and two method-of-moments estimators, namely inversion of Kendall's tau and inversion of Spearman's rho.

Maximum Pseudo-Likelihood Method
Genest et al. (1995) proposed the maximum pseudo-likelihood (MPL) method, which estimates the copula parameter independently of the marginal fitting. The method uses the empirical distribution of the marginals, so that the data are transformed to pseudo-observations having approximately uniform margins. That is, the maximum pseudo-likelihood estimator of $\theta$ is obtained by maximizing the following log-likelihood:
$$\ell(\theta) = \sum_{i=1}^{n} \log c\left(\hat{U}_{1i}, \hat{U}_{2i}; \theta\right),$$
where $(\hat{U}_{1i}, \hat{U}_{2i})$ are the pseudo-observations from $C$, calculated from the data as follows:
$$\hat{U}_{1i} = \frac{R_{1i}}{n+1}, \quad \hat{U}_{2i} = \frac{R_{2i}}{n+1}, \tag{7}$$
and $R_{1i}$ and $R_{2i}$ are, respectively, the ranks of $x_{1i}$ and $x_{2i}$.
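As a minimal sketch of this procedure, the code below computes rank-based pseudo-observations, assuming scaled ranks of the form rank divided by n + 1 and no ties, and then maximizes the Clayton pseudo-log-likelihood by a golden-section search. The Clayton copula, the search bounds, and the function names are choices of this sketch; any copula density could be substituted.

```python
import math


def pseudo_observations(x):
    """Scaled ranks rank_i / (n + 1); assumes there are no ties in x."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def clayton_log_density(u, v, theta):
    """Log of the Clayton copula density for theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return (math.log1p(theta)
            - (theta + 1.0) * (math.log(u) + math.log(v))
            - (2.0 + 1.0 / theta) * math.log(s))


def mpl_clayton(x1, x2, lo=0.01, hi=20.0, tol=1e-6):
    """Maximum pseudo-likelihood estimate of the Clayton parameter on [lo, hi]."""
    u, v = pseudo_observations(x1), pseudo_observations(x2)

    def loglik(theta):
        return sum(clayton_log_density(a, b, theta) for a, b in zip(u, v))

    # Golden-section search, assuming the pseudo-log-likelihood is unimodal.
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = loglik(c), loglik(d)
    while b - a > tol:
        if fc < fd:
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = loglik(d)
        else:
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = loglik(c)
    return (a + b) / 2.0
```

Since only ranks enter the estimator, applying any strictly increasing transformation to either margin leaves the estimate unchanged.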

Method-of-Moments
This method estimates the copula parameter using the relation between the copula and Kendall's tau or Spearman's rho (Genest, 1987; Genest & Rivest, 1993). That is, the copula parameter is estimated by matching the sample Spearman's rho or Kendall's tau to the dependence measures given in (5) and (6), which are independent of the marginal distributions. Therefore, consistent estimators of $\theta$, called inversion of Kendall's tau (itau) and inversion of Spearman's rho (irho), are, respectively, given by
$$\hat{\theta}_{\text{itau}} = \tau^{-1}(\tau_n), \quad \hat{\theta}_{\text{irho}} = \rho_s^{-1}(\rho_n),$$
where $\tau_n$ and $\rho_n$ are the sample versions of Kendall's tau and Spearman's rho, computed from $R_{1i}$ and $R_{2i}$, which are, respectively, the ranks of $x_{1i}$ and $x_{2i}$ (Kojadinovic & Yan, 2010).
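The inversion idea can be illustrated as follows. The sketch computes sample Kendall's tau by pair counting and sample Spearman's rho as the correlation of ranks, assuming no ties, and then inverts the standard closed-form relations for the Gaussian copula (tau and rho) and the Clayton copula (tau). The function names are hypothetical.

```python
import math


def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


def ranks(x):
    """Ranks 1..n of the entries of x; assumes no ties."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r


def spearman_rho(x, y):
    """Sample Spearman's rho as the Pearson correlation of the ranks."""
    n = len(x)
    mean_rank = (n + 1) / 2.0
    num = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(ranks(x), ranks(y)))
    return num / (n * (n * n - 1) / 12.0)  # sum of squared centered ranks


def itau_gaussian(tau):
    return math.sin(math.pi * tau / 2.0)   # inverts tau = (2/pi) asin(rho)


def irho_gaussian(rho_s):
    return 2.0 * math.sin(math.pi * rho_s / 6.0)  # inverts rho_s = (6/pi) asin(rho/2)


def itau_clayton(tau):
    return 2.0 * tau / (1.0 - tau)         # inverts tau = theta / (theta + 2)
```

For example, with `x = [1, 2, 3, 4]` and `y = [1, 3, 2, 4]`, five of the six pairs are concordant, so the sample tau is 2/3 and `itau_clayton` returns 4.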

Goodness of Fit Tests for Copula
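The idea of this test is to compare the empirical copula with a parametric estimator derived under the null hypothesis (Dobrić & Schmid, 2007; Fermanian, 2005). That is, the test checks whether the copula $C$ is well represented by a specific parametric copula family, $H_0: C \in \{C_\theta\}$. Two approaches are commonly used in the literature to test the goodness of fit of a copula: the parametric bootstrap (Genest & Rémillard, 2008) and the fast multiplier approach (Genest, Rémillard, & Beaudoin, 2009; Kojadinovic, Yan, & Holmes, 2011). The goodness of fit tests are based on the empirical process
$$\sqrt{n}\left\{C_n(u_1,u_2) - C_{\theta_n}(u_1,u_2)\right\},$$
where $C_n$ is the empirical copula of the data $(x_{1i}, x_{2i})$ and is given by
$$C_n(u_1,u_2) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\left(\hat{U}_{1i} \le u_1, \hat{U}_{2i} \le u_2\right),$$
where $(\hat{U}_{1i}, \hat{U}_{2i})$ are given by (7), $C_n$ is a consistent estimator of $C$, and $\theta_n$ is an estimator of $\theta$ obtained using the pseudo-observations. According to Genest et al. (2009), the test statistic is the Cramér-von Mises statistic, defined as
$$S_n = \sum_{i=1}^{n} \left\{C_n(\hat{U}_{1i}, \hat{U}_{2i}) - C_{\theta_n}(\hat{U}_{1i}, \hat{U}_{2i})\right\}^2. \tag{8}$$
For details, see Genest and Rémillard (2008), Genest et al. (2009), and Kojadinovic et al. (2011).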
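The statistic itself is simple to compute once pseudo-observations and a fitted copula are available, as the following sketch shows. It evaluates the empirical copula by direct counting and sums the squared differences to a user-supplied parametric copula function; the function names are hypothetical. The p-value, which requires a parametric bootstrap or multiplier procedure, is not covered here.

```python
def empirical_copula(u, v, a, b):
    """Proportion of pseudo-observations (u_i, v_i) with u_i <= a and v_i <= b."""
    n = len(u)
    return sum(1 for ui, vi in zip(u, v) if ui <= a and vi <= b) / n


def cramer_von_mises(u, v, copula_cdf):
    """Sum of squared gaps between the empirical and the fitted copula,
    evaluated at the pseudo-observations themselves."""
    return sum(
        (empirical_copula(u, v, a, b) - copula_cdf(a, b)) ** 2
        for a, b in zip(u, v)
    )
```

As a small usage example, for the perfectly dependent pseudo-observations `u = v = [0.25, 0.5, 0.75]` compared against the independence copula `lambda a, b: a * b`, the statistic is about 0.4384, reflecting the poor fit of independence to comonotone data.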

Illustrative Data Analysis
This section illustrates the application and performance of the proposed BTTGL distributional models by analyzing a football data set that describes the UEFA Champions League for 2004-2005 and 2005-2006; see Meintanis (2007). The data represent the games in which the home team scored at least one goal and at least one goal was scored by any team. That is, $X_1$ = time of the first goal scored by any team in minutes, and $X_2$ = time of the first goal scored by the home team. Meintanis (2007) analyzed these data using the Marshall-Olkin bivariate exponential distribution, while Kundu and Dey (2009) and Kundu and Gupta (2009) studied them using the Marshall-Olkin bivariate Weibull and bivariate generalized exponential distributions, respectively. In this article, these data are used to analyze the BTTGL distribution. Some descriptive statistics of $X_1$ and $X_2$ are reported in Table 2. In addition, the scatter plot of $X_1$ and $X_2$ shows a positive correlation. There is no adequate goodness of fit test for a general bivariate distribution, as stated in Kundu and Gupta (2011). Therefore, the TTGL distribution was first fitted to the marginals separately, and the ML estimates of the marginal parameters were found to be (16.98, 6.78) and (19.66, 2.47), respectively. These estimates are used as initial values for the model parameters of the BTTGL distribution. Then the Kolmogorov-Smirnov (K-S) goodness of fit test statistic is calculated to verify that the fits based on the TTGL distribution are suitable for the data. The K-S statistic, with the associated p-value in brackets, is 0.0979 (0.8703) for $X_1$ and 0.0952 (0.8904) for $X_2$. Also, Figure 1 shows the plots of the fitted and empirical Cdfs for both marginals based on the ML estimates. These results indicate that the TTGL distribution provides an appropriate fit for the data set and can be used to fit the marginals.
Figure 1. The plot of the fitted and empirical Cdf of the TTGL distribution using ML estimates

Goodness of fit tests for copula functions
To check whether the Gaussian, Frank, and Clayton copula functions are suitable for the data, the goodness of fit test statistic of Section 4 given in equation (8) is calculated. The results in Table 3 show non-significant p-values using the parametric bootstrap, which indicates that these copula functions provide a suitable fit for the data.

Models Comparison
For comparison purposes, the univariate generalized exponential (GE) (Kundu & Gupta, 2009) and exponentiated Weibull (EW) distributions are fitted to the marginals using this data set. Table 5 reports the K-S test statistics, the associated p-values, and the maximized log-likelihood values (LL) for these distributions. The p-values are not significant for the TTGL, GE, and EW marginals, so any of these distributions can be used to fit the data. Therefore, the BTTGL, the bivariate generalized exponential (BGE), and the bivariate exponentiated Weibull (BEW) distributions can be used for this data set.

Simulation Studies
Here we conduct a Monte Carlo simulation to illustrate the performance of the proposed BTTGL distribution driven from the Gaussian copula function and to compare the different estimation methods. The Gaussian copula has the advantage that it underlies the multivariate normal distribution and incorporates dependency in the same way, using only pairwise correlations among variables, which gives more flexibility to the constructed bivariate distribution. That is, the flexibility and analytical tractability of the Gaussian copula suggest that it is a good choice to represent dependency.
To study the effect of the marginal parameters on the dependency measures, different sample sizes and different values of the parameters are used. In each case, parametric and semiparametric estimates of the parameters are obtained, and their averages and relative mean square errors (RMSE) are computed over 1000 replications. Tables 7 and 8 present the results of the simulation.
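A scaled-down version of such an experiment might look like the sketch below. It draws samples from a Gaussian copula with TTGL-type margins by inverse transform, estimates the copula parameter by inversion of Kendall's tau, and reports the average estimate and a relative mean square error. Two assumptions are worth flagging: the quantile function uses the zero-truncated type I generalized logistic form, and the relative mean square error is taken as the mean squared error divided by the squared true value, which is one common convention and may differ from the definition used for the tables. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

_STD_NORMAL = NormalDist()


def ttgl_quantile(u, alpha, sigma):
    """Quantile of the assumed zero-truncated type I generalized logistic law."""
    low = 2.0 ** (-alpha)
    t = low + u * (1.0 - low)
    return -sigma * math.log(t ** (-1.0 / alpha) - 1.0)


def sample_bttgl_gaussian(n, rho, a1, s1, a2, s2, rng):
    """Draw n pairs: correlated normals -> uniforms -> TTGL-type margins."""
    x1, x2 = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        u1 = min(max(_STD_NORMAL.cdf(z1), 1e-12), 1.0 - 1e-12)  # keep inside (0, 1)
        u2 = min(max(_STD_NORMAL.cdf(z2), 1e-12), 1.0 - 1e-12)
        x1.append(ttgl_quantile(u1, a1, s1))
        x2.append(ttgl_quantile(u2, a2, s2))
    return x1, x2


def kendall_tau(x, y):
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


def simulate_itau(rho, n, reps, rng, a1=2.8, s1=1.2, a2=2.8, s2=1.1):
    """Average itau estimate and relative MSE (assumed: MSE / rho**2) over reps."""
    estimates = []
    for _ in range(reps):
        x1, x2 = sample_bttgl_gaussian(n, rho, a1, s1, a2, s2, rng)
        estimates.append(math.sin(math.pi * kendall_tau(x1, x2) / 2.0))
    mse = fmean([(e - rho) ** 2 for e in estimates])
    return fmean(estimates), mse / rho ** 2
```

A call such as `simulate_itau(0.5, 60, 30, random.Random(0))` runs a small experiment; larger `n` and `reps` values would be needed to approach the scale of the study described above.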
Table 7. The average estimates and the corresponding RMSE (in brackets) of parametric and semiparametric estimates of BTTGL driven from the Gaussian copula function for σ_1 = 1.2, σ_2 = 1.1, α_1 = α_2 = 2.8 with different values of the copula parameter ρ.
From Tables 7 and 8, we conclude the following:
1. In all methods of estimation, as the value of the dependency parameter (copula parameter) increases, its RMSE becomes smaller.
2. The ML method mostly performs better than the IFM method based on RMSE for different sample sizes and different values of the parameters.
3. As the sample size increases, the RMSE decreases for all the parameters using parametric and semiparametric methods, as expected.
4. Among the semiparametric methods, the itau method provides better estimates of the dependency parameter (copula parameter) and smaller RMSE than the MPL and irho methods.
5. Based on the RMSE of the copula parameter, the parametric methods perform better than the semiparametric methods. This is consistent with the results found by Genest et al. (1995).
6. The marginal parameters generally have a small effect on the estimation of the copula parameter, as seen in Tables 7 and 8 for different values of the parameters.

Concluding Remarks
In this article, bivariate truncated type I generalized logistic distributional models are introduced. The proposed bivariate distributional models are derived from commonly used copula functions with truncated type I generalized logistic distributions as marginals. Different methods of estimation are used to estimate the unknown parameters. The proposed bivariate distributional models derived from the Gaussian, Frank, and Clayton copula functions are fitted to a real-life data set, and the results of the analysis show that the BTTGL distribution provides a more suitable fit than the bivariate generalized exponential and bivariate exponentiated Weibull distributions. The results also indicate that the BTTGL distribution driven from the Gaussian copula provides a better fit for the data set than the BTTGL distributional models driven from the Frank and Clayton copulas. Therefore, particular attention was directed to the BTTGL distribution driven from the Gaussian copula, and a Monte Carlo simulation was performed to compare the performance of the ML, IFM, MPL, and moment estimators. The simulation shows that the parametric methods perform better than the semiparametric methods based on RMSE.

Table 1 reports the ranges of Kendall's tau and Spearman's rho for the Gaussian, Frank, and Clayton copulas.
Table 1. Kendall's tau and Spearman's rho ranges
Copula    Kendall's tau range    Spearman's rho range
Gaussian

Table 2. Descriptive statistics of $X_1$ and $X_2$.

Table 3. Cramér-von Mises goodness of fit test statistics with associated p-values.
Table 4 displays the parameter estimates with associated 95% confidence intervals (CI) for the proposed BTTGL distributional models driven from different copula functions using parametric and semiparametric methods.
Table 4. ML, IFM, MPL, and moment estimates with associated 95% CI (in brackets) for the BTTGL distributional models driven from selected copula functions.

Table 5. Kolmogorov-Smirnov goodness of fit test statistics with associated p-values and maximized log-likelihood for the two marginals under the TTGL, GE, and EW distributions.

Table 6. Maximum likelihood estimates of the copula parameter with the associated CI (in brackets), AIC, and LL of the BTTGL, BGE, and BEW distributions.

Table 8
The average estimates and the corresponding RMSE (in brackets) of parametric and semiparametric estimates of