The Topp-Leone Marshall-Olkin-G Family of Distributions With Applications

A new generalized family of distributions, namely the Topp-Leone Marshall-Olkin-G family, is developed. The density of the new family is a linear combination of exponentiated-G densities. We considered three sub-families of the proposed family of distributions. The distribution can handle heavy-tailed data and various forms of the hazard rate function. A simulation study was conducted to evaluate the consistency of the estimators of the model parameters. Three applications are provided to demonstrate the usefulness of the new model in comparison with competing non-nested models.


Introduction
There is an increasing demand for generalized distributions that can handle various levels of skewness and kurtosis. Moreover, there is an increased need for generalized distributions that can fit data exhibiting various shapes of the hazard rate function. These generalized models have wide applications in reliability and engineering, as well as in hydrology, medicine, economics, finance and insurance. In response to this demand, many generators have been proposed in the literature. These include beta-G by Eugene, Lee, and Famoye (2002), Marshall-Olkin-G (MO-G) by Marshall and Olkin (1997), Kumaraswamy-G (Kw-G) by Cordeiro and de Castro (2011), gamma-G by Zografos and Balakrishnan (2009), Weibull-G (W-G) by Bourguignon, Silva, and Cordeiro (2014), T-X family by Alzaatreh and Ghosh (2013), beta odd Lindley-G (BOL-G) by Chipepa, Oluyede, Makubate, and Fagbamigbe (2019), Kumaraswamy odd Lindley-G, and Topp-Leone odd log-logistic-G (TLOLL-G) by Brito, Cordeiro, Yousof, Alizadeh, and Silva (2017), to mention a few.
Furthermore, Topp and Leone (1955) developed a model that always exhibits bathtub-shaped hazard rates. The Topp-Leone distribution is an extension of the triangular distribution, and as such it is not a very flexible distribution since its domain is restricted to (0, 1). This distribution has cumulative distribution function (cdf) defined as
$$F_{TL}(x; b) = \left[1 - (1 - x)^2\right]^b, \qquad (1)$$
for 0 < x < 1 and b > 0. Marshall and Olkin (1997) introduced a new distribution defined by
$$F_{MO}(x; \delta) = 1 - \frac{\delta\,\bar{G}(x)}{1 - \bar{\delta}\,\bar{G}(x)}, \qquad (2)$$
where $\delta > 0$ is the tilt parameter, $\bar{\delta} = 1 - \delta$, G(x) is the baseline cdf and $\bar{G}(x) = 1 - G(x)$. The MO-G distribution is more flexible compared to other distributions like exponential, Weibull and gamma. We propose a new family that enables us to model data that are
• heavy tailed
• heavily skewed
• platykurtic and leptokurtic compared to the baseline distribution
The proposed new model can also be applied to data that have non-monotonic hazard rate functions.
In this paper, we develop the Topp-Leone Marshall-Olkin-G (TL-MO-G) family of distributions. In Section 2, we develop the new generalized distribution. Section 3 contains sub-families. In Section 4, we present structural properties of the new distribution. In Section 5, we derive maximum likelihood estimates. We present results of the simulation study in Section 6. In Section 7, we present applications of the new model to real data examples, followed by concluding remarks.

The Topp-Leone-Marshall-Olkin-G Family of Distributions
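We use Equation (1) and the generalization by Marshall and Olkin given in Equation (2) to derive the Topp-Leone-Marshall-Olkin-G (TL-MO-G) family of distributions. Replacing $1 - x$ in Equation (1) with the MO-G survival function from Equation (2), the cdf of the TL-MO-G family of distributions is given by
$$F(x; b, \delta, \xi) = \left[1 - \frac{\delta^2\,\bar{G}^2(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^2}\right]^b, \qquad (3)$$
with corresponding probability density function (pdf)
$$f(x; b, \delta, \xi) = \frac{2b\,\delta^2\, g(x; \xi)\,\bar{G}(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^3}\left[1 - \frac{\delta^2\,\bar{G}^2(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^2}\right]^{b-1}, \qquad (4)$$
for $b, \delta > 0$, where $\bar{\delta} = 1 - \delta$, $\bar{G}(x; \xi) = 1 - G(x; \xi)$, $g(x; \xi)$ is the baseline pdf and $\xi$ is a vector of parameters from the baseline distribution G(.).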
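The construction above can be checked numerically. The following is a minimal sketch, assuming the cdf $[1 - \delta^2\bar{G}^2/(1 - \bar{\delta}\bar{G})^2]^b$ and the pdf obtained by differentiating it; the function names, the Weibull baseline and all parameter values are our own illustrative choices and are not taken from the paper.

```python
import math

def tl_mo_g_cdf(x, b, delta, G):
    """TL-MO-G cdf for a baseline cdf G (a callable)."""
    gbar = 1.0 - G(x)                                     # baseline survival function
    ratio = delta * gbar / (1.0 - (1.0 - delta) * gbar)   # Marshall-Olkin survival function
    return (1.0 - ratio ** 2) ** b

def tl_mo_g_pdf(x, b, delta, G, g):
    """TL-MO-G pdf for a baseline cdf G and baseline pdf g (callables)."""
    gbar = 1.0 - G(x)
    denom = 1.0 - (1.0 - delta) * gbar
    ratio = delta * gbar / denom
    return (2.0 * b * delta ** 2 * g(x) * gbar / denom ** 3
            * (1.0 - ratio ** 2) ** (b - 1.0))

# Hypothetical Weibull baseline, chosen only for illustration.
lam, omega = 1.0, 1.5

def G(x):
    return 1.0 - math.exp(-lam * x ** omega)

def g(x):
    return lam * omega * x ** (omega - 1.0) * math.exp(-lam * x ** omega)

b, delta = 2.0, 0.5      # illustrative parameter values
x0, h = 0.8, 1e-5

# A central difference of the cdf should agree with the pdf at x0.
numeric_pdf = (tl_mo_g_cdf(x0 + h, b, delta, G)
               - tl_mo_g_cdf(x0 - h, b, delta, G)) / (2.0 * h)
analytic_pdf = tl_mo_g_pdf(x0, b, delta, G, g)
print(numeric_pdf, analytic_pdf)
```

Passing the baseline as callables keeps the sketch usable for any sub-family; only `G` and `g` need to change.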

Linear Representation
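We derive the series representation of the TL-MO-G distribution by expanding the pdf in Equation (4) with the generalized binomial series
$$(1 - z)^{a} = \sum_{i=0}^{\infty} (-1)^i \binom{a}{i} z^i, \qquad |z| < 1,$$
applied in turn to each power term of the pdf. By applying these expansions, we can write the linear representation of the TL-MO-G distribution as
$$f(x; b, \delta, \xi) = \sum_{p=0}^{\infty} v_p\, g_p(x; \xi), \qquad (5)$$
where $v_p$ are constant coefficients (Equation (6)) and $g_p(x; \xi) = (p + 1)\,g(x; \xi)\,G^p(x; \xi)$ is the pdf of an exponentiated-G (Exp-G) distribution with power parameter $(p + 1)$. The pdf of the TL-MO-G family of distributions is therefore a linear combination of Exp-G densities.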
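One way to carry out the expansion is sketched below. It starts from the TL-MO-G pdf $f = 2b\delta^2 g\,\bar{G}\,(1 - \bar{\delta}\bar{G})^{-3}[1 - \delta^2\bar{G}^2/(1 - \bar{\delta}\bar{G})^2]^{b-1}$ and assumes $|\bar{\delta}\,\bar{G}| < 1$ (which holds, for example, when $0 < \delta < 2$). The order of the expansions and the index names $i, j, p$ are our own choices, so the resulting expression for the coefficient may be arranged differently from the authors' version.

```latex
\begin{aligned}
f(x; b, \delta, \xi)
 &= 2b\,\delta^{2}\, g\,\bar{G}\,(1 - \bar{\delta}\bar{G})^{-3}
    \left[1 - \frac{\delta^{2}\bar{G}^{2}}{(1 - \bar{\delta}\bar{G})^{2}}\right]^{b-1} \\
 &= 2b\, g \sum_{i=0}^{\infty} (-1)^{i} \binom{b-1}{i}\,
    \delta^{2i+2}\, \bar{G}^{\,2i+1}\, (1 - \bar{\delta}\bar{G})^{-(2i+3)} \\
 &= 2b\, g \sum_{i,j=0}^{\infty} (-1)^{i} \binom{b-1}{i} \binom{2i+j+2}{j}\,
    \delta^{2i+2}\, \bar{\delta}^{\,j}\, \bar{G}^{\,2i+j+1} \\
 &= \sum_{p=0}^{\infty} v_{p}\,(p+1)\, g\, G^{p}, \\[4pt]
v_{p} &= \frac{2b}{p+1} \sum_{i,j=0}^{\infty} (-1)^{i+p}
    \binom{b-1}{i} \binom{2i+j+2}{j} \binom{2i+j+1}{p}\,
    \delta^{2i+2}\, \bar{\delta}^{\,j}.
\end{aligned}
```

The second line uses $(1 - z)^{b-1} = \sum_i (-1)^i \binom{b-1}{i} z^i$, the third uses $(1 - z)^{-m} = \sum_j \binom{m+j-1}{j} z^j$ with $m = 2i + 3$, and the last uses $\bar{G}^{\,n} = (1 - G)^n = \sum_p (-1)^p \binom{n}{p} G^p$ with $n = 2i + j + 1$.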

Sub-Families
We present some sub-families of the TL-MO-G family of distributions. We consider the cases in which the baseline distribution is uniform, log-logistic, Weibull or normal.

Topp-Leone-Marshall-Olkin-Uniform (TL-MO-U) Distribution
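By taking the baseline distribution to be the uniform distribution, we obtain the Topp-Leone-Marshall-Olkin-Uniform (TL-MO-U) distribution. The uniform distribution has $g(x; \theta) = 1/\theta$ and $G(x; \theta) = x/\theta$, for $0 < x < \theta$. Therefore, the TL-MO-U distribution has cdf and pdf given by
$$F(x; b, \delta, \theta) = \left[1 - \frac{\delta^2 (1 - x/\theta)^2}{\left[1 - \bar{\delta}(1 - x/\theta)\right]^2}\right]^b$$
and
$$f(x; b, \delta, \theta) = \frac{2b\,\delta^2 (1 - x/\theta)}{\theta \left[1 - \bar{\delta}(1 - x/\theta)\right]^3}\left[1 - \frac{\delta^2 (1 - x/\theta)^2}{\left[1 - \bar{\delta}(1 - x/\theta)\right]^2}\right]^{b-1},$$
respectively, for $0 < x < \theta$ and $b, \delta > 0$. From Figure 1 we deduce that the new distribution can handle data that are symmetric, left skewed or right skewed, and that the TL-MO-U model can fit data sets that have increasing hazard rate functions (hrf) or upside-down bathtub followed by bathtub hrf shapes.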

Topp-Leone-Marshall-Olkin-Log-Logistic (TL-MO-LLo) Distribution
By taking the baseline distribution to be the log-logistic distribution, we obtain the Topp-Leone-Marshall-Olkin-log-logistic (TL-MO-LLo) distribution. The log-logistic distribution has $g(x; c) = c\,x^{c-1}(1 + x^c)^{-2}$ and $G(x; c) = 1 - (1 + x^c)^{-1}$, for $x > 0$ and $c > 0$. Therefore, the TL-MO-LLo distribution has cdf and pdf given by
$$F(x; b, \delta, c) = \left[1 - \frac{\delta^2 (1 + x^c)^{-2}}{\left[1 - \bar{\delta}(1 + x^c)^{-1}\right]^2}\right]^b$$
and
$$f(x; b, \delta, c) = \frac{2b\,\delta^2 c\,x^{c-1}(1 + x^c)^{-3}}{\left[1 - \bar{\delta}(1 + x^c)^{-1}\right]^3}\left[1 - \frac{\delta^2 (1 + x^c)^{-2}}{\left[1 - \bar{\delta}(1 + x^c)^{-1}\right]^2}\right]^{b-1},$$
respectively, for $b, \delta, c > 0$.
International Journal of Statistics and Probability Vol. 9, No. 4; 2020
Figure 2. Plots of the pdf and hrf for the TL-MO-LLo distribution
Figure 2 shows that the new distribution applies to heavy-tailed data. The distribution also accommodates variation in both kurtosis and skewness. The TL-MO-LLo model can fit various hazard rate shapes, including bathtub followed by upside-down bathtub, decreasing, increasing, and upside-down bathtub or uni-modal.
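As a worked simplification (our own algebra, starting from the general TL-MO-G cdf $[1 - \delta^2\bar{G}^2/(1 - \bar{\delta}\bar{G})^2]^b$ with $\bar{G}(x) = (1 + x^c)^{-1}$ and $\bar{\delta} = 1 - \delta$), the log-logistic case collapses to a compact form that also makes the tail behaviour visible:

```latex
\begin{aligned}
\frac{\delta\,\bar{G}(x)}{1 - \bar{\delta}\,\bar{G}(x)}
  &= \frac{\delta\,(1 + x^{c})^{-1}}{1 - (1 - \delta)(1 + x^{c})^{-1}}
   = \frac{\delta}{x^{c} + \delta}, \\[4pt]
F(x; b, \delta, c) &= \left[1 - \frac{\delta^{2}}{(x^{c} + \delta)^{2}}\right]^{b}, \\[4pt]
f(x; b, \delta, c) &= \frac{2b\,\delta^{2} c\, x^{c-1}}{(x^{c} + \delta)^{3}}
   \left[1 - \frac{\delta^{2}}{(x^{c} + \delta)^{2}}\right]^{b-1}, \\[4pt]
1 - F(x; b, \delta, c) &\approx b\,\delta^{2}\, x^{-2c} \quad \text{as } x \to \infty .
\end{aligned}
```

The last line shows a polynomially decaying (heavy) right tail, so the $s$-th moment is finite only when $s < 2c$.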

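Topp-Leone-Marshall-Olkin-Weibull (TL-MO-W) Distribution
By taking the baseline distribution to be the Weibull distribution, we obtain the Topp-Leone-Marshall-Olkin-Weibull (TL-MO-W) distribution. The Weibull distribution has $g(x; \lambda, \omega) = \lambda\omega\,x^{\omega-1} e^{-\lambda x^{\omega}}$ and $G(x; \lambda, \omega) = 1 - e^{-\lambda x^{\omega}}$, for $\lambda, \omega > 0$. The TL-MO-W distribution has cdf and pdf given by
$$F(x; b, \delta, \lambda, \omega) = \left[1 - \frac{\delta^2 e^{-2\lambda x^{\omega}}}{\left[1 - \bar{\delta}\,e^{-\lambda x^{\omega}}\right]^2}\right]^b$$
and
$$f(x; b, \delta, \lambda, \omega) = \frac{2b\,\delta^2 \lambda\omega\,x^{\omega-1} e^{-2\lambda x^{\omega}}}{\left[1 - \bar{\delta}\,e^{-\lambda x^{\omega}}\right]^3}\left[1 - \frac{\delta^2 e^{-2\lambda x^{\omega}}}{\left[1 - \bar{\delta}\,e^{-\lambda x^{\omega}}\right]^2}\right]^{b-1},$$
respectively, for $b, \delta, \lambda, \omega > 0$. The pdf of the TL-MO-W distribution can take uni-modal, left-skewed, right-skewed and reverse-J shapes. Also, the TL-MO-W distribution exhibits various shapes for the hazard rate function.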

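Topp-Leone-Marshall-Olkin-Normal (TL-MO-N) Distribution
Taking as the baseline the normal distribution with pdf $g(x; \mu, \sigma) = \sigma^{-1}\phi\left(\frac{x - \mu}{\sigma}\right)$ and cdf $G(x; \mu, \sigma) = \Phi\left(\frac{x - \mu}{\sigma}\right)$, for $\mu \in \mathbb{R}$ and $\sigma > 0$, we obtain the Topp-Leone Marshall-Olkin-normal (TL-MO-N) distribution with cdf and pdf given by
$$F(x; b, \delta, \mu, \sigma) = \left[1 - \frac{\delta^2 \left[1 - \Phi\left(\frac{x - \mu}{\sigma}\right)\right]^2}{\left[1 - \bar{\delta}\left(1 - \Phi\left(\frac{x - \mu}{\sigma}\right)\right)\right]^2}\right]^b$$
and
$$f(x; b, \delta, \mu, \sigma) = \frac{2b\,\delta^2 \sigma^{-1}\phi\left(\frac{x - \mu}{\sigma}\right)\left[1 - \Phi\left(\frac{x - \mu}{\sigma}\right)\right]}{\left[1 - \bar{\delta}\left(1 - \Phi\left(\frac{x - \mu}{\sigma}\right)\right)\right]^3}\left[1 - \frac{\delta^2 \left[1 - \Phi\left(\frac{x - \mu}{\sigma}\right)\right]^2}{\left[1 - \bar{\delta}\left(1 - \Phi\left(\frac{x - \mu}{\sigma}\right)\right)\right]^2}\right]^{b-1},$$
respectively. The plots show that the TL-MO-N distribution can take various shapes for its pdf. Also, the hazard rate function of the TL-MO-N distribution exhibits various shapes.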

Distribution of Order Statistics
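We can use Equation (7) to determine the distribution of the $i$-th order statistic from the TL-MO-G family of distributions. The pdf of the $i$-th order statistic from a random sample of size $n$ is
$$f_{i:n}(x) = \frac{f(x)}{B(i, n - i + 1)} \sum_{j=0}^{n-i} (-1)^j \binom{n-i}{j} F(x)^{j+i-1}, \qquad (7)$$
where B(., .) is the beta function. Using Equations (3) and (4), $f(x)F(x)^{j+i-1}$ from Equation (7) simplifies to
$$f(x)F(x)^{j+i-1} = \frac{2b\,\delta^2\, g(x; \xi)\,\bar{G}(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^3}\left[1 - \frac{\delta^2\,\bar{G}^2(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^2}\right]^{b(j+i)-1}.$$
Applying the binomial expansion successively to the bracketed power, to the resulting negative power of $1 - \bar{\delta}\,\bar{G}(x; \xi)$ and to the powers of $\bar{G}(x; \xi)$, as in the linear representation of the pdf, this product can be written as a linear combination of the densities $g_q(x; \xi) = (q + 1)\,g(x; \xi)\,G^q(x; \xi)$, where $g_q$ is the pdf of the Exp-G distribution with power parameter $(q + 1)$. It follows that the distribution of the $i$-th order statistic from the TL-MO-G family can be obtained directly from that of the Exp-G distribution.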

Entropy
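There are two common measures of entropy: Rényi entropy by Rényi (1960) and Shannon entropy by Shannon (1951). In this paper, we derive the Rényi entropy ($I_R(\nu)$) of the TL-MO-G family of distributions using the formula
$$I_R(\nu) = \frac{1}{1 - \nu}\log\left[\int_{-\infty}^{\infty} f^{\nu}(x)\,dx\right], \qquad \nu > 0,\ \nu \neq 1.$$
Substituting Equation (4) for f(x), we get
$$f^{\nu}(x) = \frac{(2b\,\delta^2)^{\nu}\, g^{\nu}(x; \xi)\,\bar{G}^{\nu}(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^{3\nu}}\left[1 - \frac{\delta^2\,\bar{G}^2(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^2}\right]^{\nu(b-1)}.$$
Applying the binomial expansion in turn to the bracketed power, to the negative power of $1 - \bar{\delta}\,\bar{G}(x; \xi)$ and to the resulting powers of $\bar{G}(x; \xi)$, we can write the integral of $f^{\nu}$ as a weighted sum of integrals of the form $\int g^{\nu}(x; \xi)\,G^{q}(x; \xi)\,dx$, each of which is, up to a constant, the integral that defines the Rényi entropy of an Exp-G distribution. It follows that the Rényi entropy of the TL-MO-G distribution can be obtained directly from that of the Exp-G distribution.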

Moments and Moment Generating Function
We derive the $s$-th ordinary moment of the TL-MO-G family of distributions using Equation (5). It is given by
$$\mu'_s = E(X^s) = \sum_{p=0}^{\infty} v_p\, E(Y_p^s), \qquad (14)$$
where $Y_p$ is an Exp-G random variable with power parameter $(p + 1)$ and $v_p$ is given by Equation (6). The $r$-th central moment of X is given by
$$\mu_r = E\left[(X - \mu'_1)^r\right] = \sum_{s=0}^{r} \binom{r}{s} (-\mu'_1)^{r-s}\, \mu'_s.$$
The cumulants of X follow recursively from
$$\kappa_s = \mu'_s - \sum_{k=1}^{s-1} \binom{s-1}{k-1} \kappa_k\, \mu'_{s-k}, \qquad \kappa_1 = \mu'_1.$$
We use the ordinary moments to determine the standard deviation, skewness and kurtosis.
Furthermore, we can find the $r$-th incomplete moment of X as follows:
$$\phi_r(z) = \int_{-\infty}^{z} x^r f(x)\,dx. \qquad (15)$$
We use the incomplete moment to estimate Lorenz and Bonferroni curves, which are useful in science, engineering, economics and demography. These quantities can be expressed mathematically by $L(p) = \phi_1(q)/\mu'_1$ and $B(p) = \phi_1(q)/(p\,\mu'_1)$, respectively, where $\mu'_1$ is given by Equation (14) with $s = 1$ and $q = Q(p)$ is the quantile function of X at p. The incomplete moment (Equation (15)) can also be expressed as
$$\phi_r(z) = \sum_{p=0}^{\infty} v_p\, H_p(z),$$
where $H_p(z) = \int_{-\infty}^{z} x^r g_p(x; \xi)\,dx$ is the $r$-th incomplete moment of the Exp-G distribution. We present the first five moments of the TL-MO-LLo distribution, together with the standard deviation (SD or σ), coefficient of variation (CV), coefficient of skewness (CS) and coefficient of kurtosis (CK), for selected parameter values. The results are shown in Table 1. Furthermore, we obtain the moment generating function (mgf) of the TL-MO-G distribution as
$$M_X(t) = \sum_{p=0}^{\infty} v_p\, M_p(t),$$
where $M_p(t)$ is the mgf of the Exp-G distribution.
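The moment-based summaries can also be obtained by direct numerical integration instead of the series. The sketch below is our own illustration for the TL-MO-LLo case: it assumes the simplified pdf $2b\delta^2 c\,x^{c-1}(x^c + \delta)^{-3}[1 - \delta^2/(x^c + \delta)^2]^{b-1}$, uses a plain midpoint rule after the substitution $x = t/(1 - t)$, and picks parameter values for which the required moments exist ($s < 2c$). It is not the authors' code and does not reproduce their Table 1.

```python
import math

def tl_mo_llo_pdf(x, b, delta, c):
    """TL-MO-LLo pdf in simplified form (log-logistic baseline)."""
    surv = delta / (x ** c + delta)       # Marshall-Olkin survival function
    return (2.0 * b * delta ** 2 * c * x ** (c - 1.0) / (x ** c + delta) ** 3
            * (1.0 - surv * surv) ** (b - 1.0))

def raw_moment(order, b, delta, c, n=20000):
    """E(X^order) by the midpoint rule after mapping x = t / (1 - t), t in (0, 1)."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        x = t / (1.0 - t)
        total += x ** order * tl_mo_llo_pdf(x, b, delta, c) / (1.0 - t) ** 2
    return total / n

b, delta, c = 2.0, 0.5, 5.0               # illustrative values only
m = [raw_moment(order, b, delta, c) for order in range(6)]   # m[0] should be 1

var = m[2] - m[1] ** 2
sd = math.sqrt(var)
cv = sd / m[1]
cs = (m[3] - 3.0 * m[1] * m[2] + 2.0 * m[1] ** 3) / sd ** 3
ck = (m[4] - 4.0 * m[1] * m[3] + 6.0 * m[1] ** 2 * m[2] - 3.0 * m[1] ** 4) / sd ** 4
print(m[1:], sd, cv, cs, ck)
```

The zeroth moment serves as a sanity check on the integration grid: it should be close to 1 before the other values are trusted.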

Probability Weighted Moments
We can use Probability Weighted Moments (PWMs) to estimate parameters of distributions which are not in closed form.
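The $(j, i)$-th PWM of X is $E\left(X^j F(X)^i\right) = \int_{-\infty}^{\infty} x^j f(x) F(x)^i\,dx$. From Equation (8), we have
$$f(x)F(x)^{i} = \frac{2b\,\delta^2\, g(x; \xi)\,\bar{G}(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^3}\left[1 - \frac{\delta^2\,\bar{G}^2(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^2}\right]^{b(i+1)-1},$$
which, after applying the same binomial expansions used for the linear representation, simplifies to a linear combination of the densities $g_q(x; \xi)$, where $g_q(x; \xi)$ is an Exp-G pdf. Therefore, the PWM is given by the corresponding linear combination of the moments $E(T_q^j)$, where $T_q^j$ is the $j$-th power of an Exp-G distributed random variable with power parameter $(q + 1)$.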

Quantile Function
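To obtain the quantile function of the TL-MO-G distribution, we invert the cdf given in Equation (3). Note that, for $0 < u < 1$,
$$\left[1 - \frac{\delta^2\,\bar{G}^2(x; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x; \xi)\right]^2}\right]^b = u$$
can be written as
$$\frac{\delta\,\bar{G}(x; \xi)}{1 - \bar{\delta}\,\bar{G}(x; \xi)} = \left(1 - u^{1/b}\right)^{1/2},$$
which can be written as
$$\bar{G}(x; \xi) = \frac{\left(1 - u^{1/b}\right)^{1/2}}{\delta + \bar{\delta}\left(1 - u^{1/b}\right)^{1/2}}.$$
We can therefore determine the quantiles of the TL-MO-G family of distributions by solving the equation
$$G(x; \xi) = 1 - \frac{\left(1 - u^{1/b}\right)^{1/2}}{\delta + \bar{\delta}\left(1 - u^{1/b}\right)^{1/2}}$$
using iterative methods in Matlab or R software. We present quantiles of the TL-MO-LLo distribution for some selected parameter values. The results are shown in Table 2.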
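For the log-logistic baseline the inversion can be done by hand, because $\bar{G}(x) = (1 + x^c)^{-1}$ has an explicit inverse. The sketch below is our own illustration and not the authors' Matlab or R code; it compares the resulting closed-form quantile with a simple bisection on the cdf, which stands in for the iterative route needed when the baseline has no explicit inverse. Function names and parameter values are assumptions.

```python
import math

def tl_mo_llo_cdf(x, b, delta, c):
    """TL-MO-LLo cdf in simplified form: [1 - (delta / (x^c + delta))^2]^b."""
    return (1.0 - (delta / (x ** c + delta)) ** 2) ** b

def tl_mo_llo_quantile(u, b, delta, c):
    """Closed-form quantile obtained by inverting the cdf step by step."""
    a = math.sqrt(1.0 - u ** (1.0 / b))        # value of delta*Gbar / (1 - (1-delta)*Gbar)
    gbar = a / (delta + (1.0 - delta) * a)     # baseline survival at the quantile
    return max(1.0 / gbar - 1.0, 0.0) ** (1.0 / c)   # invert Gbar(x) = 1 / (1 + x^c)

def quantile_by_bisection(u, b, delta, c, iters=200):
    """Iterative alternative: bisection on the monotone cdf."""
    lo, hi = 0.0, 1.0
    while tl_mo_llo_cdf(hi, b, delta, c) < u:  # grow the bracket until it contains u
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tl_mo_llo_cdf(mid, b, delta, c) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

b, delta, c = 2.0, 0.5, 3.0                    # illustrative values only
for u in (0.1, 0.5, 0.9):
    print(u, tl_mo_llo_quantile(u, b, delta, c), quantile_by_bisection(u, b, delta, c))
```

The closed form also gives an inverse-transform sampler: evaluating `tl_mo_llo_quantile` at uniform random numbers yields TL-MO-LLo variates.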

Maximum Likelihood Estimation
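Let $X_i \sim \text{TL-MO-G}(b, \delta; \xi)$ with the parameter vector $\Delta = (b, \delta; \xi)^T$. The total log-likelihood $\ell = \ell(\Delta)$ from a random sample of size n is given by
$$\ell(\Delta) = n\log(2b\,\delta^2) + \sum_{i=1}^{n}\log g(x_i; \xi) + \sum_{i=1}^{n}\log \bar{G}(x_i; \xi) - 3\sum_{i=1}^{n}\log\left[1 - \bar{\delta}\,\bar{G}(x_i; \xi)\right] + (b - 1)\sum_{i=1}^{n}\log\left[1 - \frac{\delta^2\,\bar{G}^2(x_i; \xi)}{\left[1 - \bar{\delta}\,\bar{G}(x_i; \xi)\right]^2}\right].$$
The elements of the score vector $U = \left(\frac{\partial \ell}{\partial b}, \frac{\partial \ell}{\partial \delta}, \frac{\partial \ell}{\partial \xi_k}\right)$ are obtained by differentiating $\ell$ with respect to each parameter. The equations obtained by setting the score vector to zero do not have closed-form solutions and are solved numerically by iterative methods using software such as R, MATLAB and SAS.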
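Although the full likelihood equations need an iterative solver, one component can be solved by hand: for fixed $\delta$ and $\xi$, setting $\partial\ell/\partial b = n/b + \sum_i \log[1 - \delta^2\bar{G}^2(x_i)/(1 - \bar{\delta}\bar{G}(x_i))^2] = 0$ gives $\hat{b} = -n / \sum_i \log[\,\cdot\,]$. The sketch below illustrates this for the TL-MO-LLo case on simulated data. It is a simplified illustration under the assumption that $\delta$ and $c$ are known, not the authors' estimation procedure (which estimates all parameters jointly); the seed, sample size and parameter values are arbitrary.

```python
import math
import random

def quantile(u, b, delta, c):
    """TL-MO-LLo quantile (closed form for the log-logistic baseline)."""
    a = math.sqrt(1.0 - u ** (1.0 / b))
    gbar = a / (delta + (1.0 - delta) * a)
    return (1.0 / gbar - 1.0) ** (1.0 / c)

def log_likelihood(data, b, delta, c):
    """TL-MO-LLo log-likelihood using the simplified pdf."""
    total = 0.0
    for x in data:
        surv = delta / (x ** c + delta)
        total += (math.log(2.0 * b * delta ** 2 * c) + (c - 1.0) * math.log(x)
                  - 3.0 * math.log(x ** c + delta)
                  + (b - 1.0) * math.log(1.0 - surv * surv))
    return total

def profile_mle_b(data, delta, c):
    """Maximizer of the log-likelihood in b when delta and c are held fixed."""
    log_sum = sum(math.log(1.0 - (delta / (x ** c + delta)) ** 2) for x in data)
    return -len(data) / log_sum

rng = random.Random(2020)                       # arbitrary seed
b_true, delta_true, c_true = 2.0, 0.5, 3.0      # illustrative "true" values
sample = [quantile(rng.uniform(1e-12, 1.0 - 1e-12), b_true, delta_true, c_true)
          for _ in range(5000)]

b_hat = profile_mle_b(sample, delta_true, c_true)
print(b_hat, log_likelihood(sample, b_hat, delta_true, c_true))
```

In a full fit, this closed form could be used to profile out $b$ and reduce the dimension of the numerical search over the remaining parameters.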

Simulation Study
We conducted a simulation study to evaluate the consistency of the maximum likelihood estimators. We ran N = 1000 replications for each of the sample sizes n = 60, 120, 240, 480, 960 and 1920. Simulation results are shown in Table 3. From the Monte Carlo simulation results, we conclude that the estimators are consistent: as the sample size increases, the mean estimates approach the true parameter values, and the root mean square error (RMSE) and average bias decay towards zero for all parameter values.
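The logic of such a study can be sketched in a few lines. The example below is a deliberately reduced version and not the authors' simulation: it uses the TL-MO-LLo model, estimates only $b$ (with $\delta$ and $c$ assumed known, so that the estimator has the closed form $\hat{b} = -n/\sum_i \log[1 - \delta^2/(x_i^c + \delta)^2]$), and runs 200 replications on three sample sizes. The seed and parameter values are arbitrary.

```python
import math
import random

def quantile(u, b, delta, c):
    """TL-MO-LLo quantile, used here for inverse-transform sampling."""
    a = math.sqrt(1.0 - u ** (1.0 / b))
    gbar = a / (delta + (1.0 - delta) * a)
    return (1.0 / gbar - 1.0) ** (1.0 / c)

def profile_mle_b(data, delta, c):
    """Closed-form estimate of b with delta and c held fixed (a simplifying assumption)."""
    log_sum = sum(math.log(1.0 - (delta / (x ** c + delta)) ** 2) for x in data)
    return -len(data) / log_sum

def simulate(n, reps, b, delta, c, rng):
    """Return (average bias, RMSE) of the estimate of b over `reps` samples of size n."""
    estimates = []
    for _ in range(reps):
        sample = [quantile(rng.uniform(1e-12, 1.0 - 1e-12), b, delta, c)
                  for _ in range(n)]
        estimates.append(profile_mle_b(sample, delta, c))
    bias = sum(estimates) / reps - b
    rmse = math.sqrt(sum((e - b) ** 2 for e in estimates) / reps)
    return bias, rmse

rng = random.Random(1)                          # arbitrary seed
b_true, delta_true, c_true = 2.0, 0.5, 3.0      # illustrative "true" values
results = {n: simulate(n, 200, b_true, delta_true, c_true, rng)
           for n in (60, 240, 960)}
for n, (bias, rmse) in results.items():
    print(n, round(bias, 4), round(rmse, 4))
```

The RMSE should shrink roughly like $1/\sqrt{n}$ as the sample size grows, which is the pattern the consistency argument relies on.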

Applications
We applied the TL-MO-LLo model to three real data examples to demonstrate the usefulness of the new distribution compared to its sub-models and other known non-nested distributions. The best fitting model was assessed using the goodness-of-fit statistics, namely, -2loglikelihood (-2 log L), Akaike Information Criterion (AIC), Consistent Akaike Information Criterion (AICC), Bayesian Information Criterion (BIC), Cramér-von Mises (W*) and Anderson-Darling (A*), as described by Chen and Balakrishnan (1995). The preferred model is the one with the smallest values of these statistics. We used R software to estimate the model parameters via the nlm function. Model parameter estimates (standard errors in parentheses) and the goodness-of-fit statistics for the three data sets are shown in Tables 4, 5 and 6. We also present plots of the fitted densities, the histogram of the data and probability plots (Chambers, Cleveland, Kleiner and Tukey, 1983) to show how well our model fits the observed data sets. The plots are shown in Figures 5, 6 and 7.
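The likelihood-based criteria are simple functions of the maximized log-likelihood, the number of estimated parameters $k$ and the sample size $n$. The sketch below uses the standard definitions of AIC and BIC. For AICC it assumes the small-sample corrected form $\text{AIC} + 2k(k+1)/(n - k - 1)$; if the consistent variant is intended, the formula would instead be $-2\log L + k(\log n + 1)$, so this choice is an assumption on our part. The numbers in the example are hypothetical and are not results from the paper.

```python
import math

def information_criteria(loglik, k, n):
    """Return -2 log L, AIC, corrected AIC (assumed form) and BIC."""
    minus2ll = -2.0 * loglik
    aic = minus2ll + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)   # assumed small-sample correction
    bic = minus2ll + k * math.log(n)
    return {"-2logL": minus2ll, "AIC": aic, "AICc": aicc, "BIC": bic}

# Hypothetical fit: log-likelihood -100 with k = 3 parameters and n = 101 observations.
ic = information_criteria(loglik=-100.0, k=3, n=101)
for name, value in ic.items():
    print(name, round(value, 4))
```

Since every criterion adds a penalty to the same $-2\log L$, models with the same number of parameters are ranked identically by all of them; the criteria can only disagree when the competing models differ in $k$.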

Kevlar 49/Epoxy Strands Failure at 90% Data
The first data set consists of 101 observations representing failure times (in hours) of Kevlar 49/epoxy strands subjected to constant sustained pressure at the 90% stress level (see Andrews and Herzberg, 2012 or Barlow, Toland and Freeman, 1984).
Figure 5. Fitted pdfs and probability plots for kevlar data set
From the results shown in Table 4, we can conclude that the TL-MO-LLo distribution fits the kevlar data set better than the non-nested models considered. Furthermore, from the fitted density plots (Figure 5), we can conclude that the proposed model fits heavy-tailed data better than the sub-models.

Strengths of 1.5 cm Glass Fibres Data
The second data set was analyzed by Bourguignon, Silva and Cordeiro (2014) and Smith and Naylor (1987). We can also conclude from the results shown in Table 5 that the TL-MO-LLo distribution fits the glass fibres data set better than the non-nested models considered. The TL-MO-LLo distribution has smaller values of the goodness-of-fit statistics and a bigger p-value for the K-S statistic. Furthermore, from the fitted density plots (Figure 6), we can see the improvement achieved by using the TL-MO-LLo distribution in fitting the glass fibres data compared to the sub-models.

Concluding Remarks
We developed a new family of distributions by combining the Topp-Leone and the Marshall-Olkin-G distributions. The new distribution can handle heavy-tailed data and accommodates non-monotonic hazard rate shapes. The pdf of the proposed distribution is a linear combination of Exp-G densities. We applied the new distribution to three real data sets, and our model performs better than the competing non-nested models, as shown in Tables 4, 5 and 6.