The Exponentiated Generalized Standardized Half-logistic Distribution

We study a new two-parameter lifetime model called the exponentiated generalized standardized half-logistic distribution, which extends the half-logistic distribution pioneered by Balakrishnan in the eighties. We provide explicit expressions for the moments, generating and quantile functions, mean deviations, Bonferroni and Lorenz curves, and order statistics. The model parameters are estimated by the maximum likelihood method. A simulation study reveals that the estimators have desirable properties, such as small biases and variances, even for moderate sample sizes. We show empirically that the new distribution provides a better fit to a real data set than other competitive models.


Introduction
It is hardly necessary to emphasize that probabilistic models are commonly employed in practical situations in which a deterministic model is not feasible. Probabilistic models still fascinate applied scholars and researchers, and this interest materializes in the large number of works dedicated to the proposal of new distributions. The half-logistic (HL) distribution pioneered by Balakrishnan is the distribution of the absolute value of a random variable following the logistic distribution. It has a monotonically increasing hazard rate function (hrf) for all parameter values, a property shared by relatively few distributions with support on the positive real line. Recently, the HL distribution has been discussed by several authors. We refer to the following works: (Balakrishnan, N. & Wong, K. H. T., 1991) obtained approximate maximum likelihood estimates (MLEs) for the location and scale parameters under type-II right censoring; (Balakrishnan, N. & Chan, P., 1992) presented the estimation for the scaled HL distribution under type-II censoring; (Panichkitkosolkul, W. & Saothayanun, 2012) investigated bootstrap confidence intervals for the process capability index under this distribution. More recently, (Oliveira, J., 2016) introduced a new extension of the HL model by considering the standardized half-logistic (SHL) distribution, which is an attractive model for statisticians and applied researchers since it has no parameters and its mathematical properties can be easily obtained. The cumulative distribution function (cdf) and probability density function (pdf) (for t > 0) of the SHL distribution are given by
$$G(t) = \frac{1 - e^{-t}}{1 + e^{-t}} \qquad (1)$$
and
$$g(t) = \frac{2\,e^{-t}}{(1 + e^{-t})^{2}}, \qquad (2)$$
respectively.
Let T be a random variable having density (2). The HL distribution is defined by the linear transformation $W = \mu + \sigma T$, where $\mu \in \mathbb{R}^{+}$ and $\sigma > 0$. Without loss of generality, we can work with the SHL model. The nth moment of T is
$$E(T^{n}) = 2\, n!\, (1 - 2^{1-n})\, \zeta(n),$$
where ζ(·) is the Riemann zeta function and the case n = 1 is understood as the limit n → 1. For details on the Riemann zeta function, see the Wolfram website http://mathworld.wolfram.com/RiemannZetaFunction.html. In particular, the first two moments of T are $E(T) = \log(4)$ and $E(T^{2}) = \pi^{2}/3$. In addition, the hrf of T is given by $\lambda(t) = 1/(1 + e^{-t})$. The moment generating function (mgf) of T, say $M_T(s) = E(e^{-sT})$, is given by
$$M_T(s) = 2 \int_0^{\infty} \frac{e^{(1-s)t}}{(1 + e^{t})^{2}}\, dt = 2\, J_1(1 + s, 1 - s),$$
where $J_p(a, b) = \int_0^{p} u^{a-1} (1+u)^{-(a+b)}\, du$ (a, b > 0) is the type II incomplete beta function. For more properties of the HL distribution, see (Balakrishnan, N., et al., 1991; Balakrishnan, N. & Chan, P., 1992; Panichkitkosolkul, W. & Saothayanun., 2012; Oliveira, J., et al., 2016; Cordeiro, G. M., et al., 2015). The addition of parameters to the SHL model may generate new distributions with great adjustment capability and, for this reason, we propose a generalization of it. The recent literature has suggested several ways of extending well-known distributions, among them the generator approach, to provide more realistic statistical models in a great variety of applications. Some of the most important generators were recently discussed by Mansoor, M., et al. (2016).
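The SHL moments quoted above can be checked numerically. The following is a minimal sketch, assuming the SHL density g(t) = 2e^{-t}/(1 + e^{-t})^2; the helper names (`shl_pdf`, `simpson`, `shl_moment`) and the truncation point 60 are illustrative choices, not part of the original study.

```python
import math


def shl_pdf(t):
    """SHL density g(t) = 2 e^{-t} / (1 + e^{-t})^2 for t > 0."""
    e = math.exp(-t)
    return 2.0 * e / (1.0 + e) ** 2


def simpson(func, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0


def shl_moment(n, upper=60.0):
    """Approximate E(T^n); the upper limit truncates a negligible tail."""
    return simpson(lambda t: t ** n * shl_pdf(t), 0.0, upper)


mean = shl_moment(1)      # compare with log(4)
second = shl_moment(2)    # compare with pi^2 / 3
print(mean, math.log(4.0))
print(second, math.pi ** 2 / 3.0)
```

The same integrator can be reused for higher moments, provided the truncation point is large enough for the integrand's tail.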
For a baseline continuous cdf G(x), (Cordeiro, G. M., et al., 2013) defined the exponentiated generalized (EG for short) class of distributions by
$$F(x) = \left[1 - \{1 - G(x)\}^{a}\right]^{b}, \qquad (3)$$
where a > 0 and b > 0 are two extra parameters whose role is to govern skewness and generate distributions with heavier or lighter tails. They are introduced to furnish a more flexible distribution. Because of its tractable distribution function (3), this class can be used quite effectively even if the data are censored. The EG class is suitable for modeling continuous univariate data that can be in any interval of the real line. The pdf corresponding to (3) is given by
$$f(x) = a\, b\, g(x)\, \{1 - G(x)\}^{a-1} \left[1 - \{1 - G(x)\}^{a}\right]^{b-1}, \qquad (4)$$
where g(x) = dG(x)/dx is the baseline pdf, which is a special case of (4) when a = b = 1. Setting a = 1 gives the exponentiated-G ("exp-G") class. If b = 1, we obtain the Lehmann type II class. So, the family (4) generalizes both the Lehmann type I and type II classes; that is, this method can be interpreted as a double construction of Lehmann alternatives. Note that even if g(x) is a symmetric density, the density f(x) will not be symmetric.
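The special cases of the EG class can be illustrated in a few lines. This sketch assumes the EG cdf F = [1 - (1 - G)^a]^b and the corresponding pdf a b g (1 - G)^{a-1} [1 - (1 - G)^a]^{b-1}, with the SHL distribution as baseline; the function names and the evaluation point are hypothetical.

```python
import math


def eg_cdf(G, a, b):
    """EG cdf evaluated at a baseline cdf value G in [0, 1]."""
    return (1.0 - (1.0 - G) ** a) ** b


def eg_pdf(G, g, a, b):
    """EG pdf from the baseline cdf value G and the baseline pdf value g."""
    return (a * b * g * (1.0 - G) ** (a - 1.0)
            * (1.0 - (1.0 - G) ** a) ** (b - 1.0))


def shl_cdf(t):
    """SHL baseline cdf."""
    e = math.exp(-t)
    return (1.0 - e) / (1.0 + e)


def shl_pdf(t):
    """SHL baseline pdf."""
    e = math.exp(-t)
    return 2.0 * e / (1.0 + e) ** 2


x, a, b, h = 0.8, 1.5, 1.2, 1e-5
# Central finite difference of the cdf, to compare with the pdf.
fd = (eg_cdf(shl_cdf(x + h), a, b) - eg_cdf(shl_cdf(x - h), a, b)) / (2.0 * h)
pdf = eg_pdf(shl_cdf(x), shl_pdf(x), a, b)
print(fd, pdf)
```

Setting a = 1 in `eg_cdf` returns G^b (the exp-G class) and setting b = 1 returns 1 - (1 - G)^a (the Lehmann type II class).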
The rest of the paper is organized as follows. In Section 2, we define the exponentiated generalized standard half-logistic (EGSHL) distribution by inserting (1) in equation (3). In Section 3, we study the shapes of its pdf and hrf. Its hrf can take non-monotone forms, such as bathtub and inverted bathtub, which describe many real phenomena.
A detailed study of the quantile function (qf) and some applications is addressed in Section 4. In Section 5, we obtain a useful linear representation for the new density. Some properties of the exp-SHL model are given in Section 6. Explicit expressions for the ordinary and incomplete moments, mean deviations, Bonferroni and Lorenz curves, generating function and reliability of the EGSHL distribution are obtained in Section 7. Sections 8 and 9 are related to the probability weighted moments (PWMs) and Rényi entropy, respectively. In addition, for each important equation associated with the new model, we provide plots and numerical studies in order to illustrate its usefulness. The order statistics and their moments are investigated in Section 10. We discuss maximum likelihood estimation of the model parameters in Section 11. In Section 12, we present a simulation study. An application to real data in Section 13 shows the usefulness of the proposed distribution. Finally, concluding remarks are addressed in Section 14.

The New Distribution
Let X be a random variable with support on the positive real line having the EGSHL(a, b) distribution, say X ∼ EGSHL(a, b). The cdf of X, for x > 0, is defined by inserting (1) in equation (3):
$$F(x) = \left[1 - \left(\frac{2e^{-x}}{1 + e^{-x}}\right)^{a}\right]^{b}, \qquad (5)$$
where a > 0 and b > 0. Equation (5) has a simple closed form, which is important for generating EGSHL variates by the inversion method. The density of X becomes
$$f(x) = \frac{a\, b\, (2e^{-x})^{a}}{(1 + e^{-x})^{a+1}} \left[1 - \left(\frac{2e^{-x}}{1 + e^{-x}}\right)^{a}\right]^{b-1}. \qquad (6)$$
For brevity of notation, we shall drop the explicit reference to the parameters a and b unless otherwise stated. For a = b = 1, equation (6) reduces to the SHL density. The EGSHL model also includes the Lehmann type I and type II transformations of the SHL distribution, denoted by ESHLI and ESHLII. For example, the exponentiated SHL distribution, say ESHLI, follows when a = 1, whereas ESHLII follows when b = 1. Some plots of the pdf (6) are displayed in Figures 1 and 2. These plots reveal that the pdf of X is quite flexible and can take symmetric and asymmetric forms, among others. In summary, they reinforce the importance of the proposed model. The sf and hrf of X are given by
$$S(x) = 1 - \left[1 - \left(\frac{2e^{-x}}{1 + e^{-x}}\right)^{a}\right]^{b}$$
and
$$h(x) = \frac{f(x)}{S(x)} = \frac{a\, b\, (2e^{-x})^{a}\, (1 + e^{-x})^{-(a+1)} \left[1 - \left(\frac{2e^{-x}}{1 + e^{-x}}\right)^{a}\right]^{b-1}}{1 - \left[1 - \left(\frac{2e^{-x}}{1 + e^{-x}}\right)^{a}\right]^{b}}, \qquad (7)$$
respectively. Some plots of (7) are displayed in Figure 3. Besides monotone forms, the hrf of X can take bathtub and inverted bathtub shapes. These non-monotone forms are particularly important because of their great practical applicability. Human lifetime is just one of many phenomena to which a bathtub-shaped hrf applies (Lee, E. T., 1992).
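The cdf, pdf and hrf of the new model are simple to code. The sketch below assumes the EGSHL cdf F(x) = [1 - {2e^{-x}/(1 + e^{-x})}^a]^b and the density obtained by differentiating it; the function names, the parameter values and the grids are illustrative assumptions.

```python
import math


def one_minus_G(x):
    """1 - G(x) = 2 e^{-x} / (1 + e^{-x}) for the SHL baseline."""
    e = math.exp(-x)
    return 2.0 * e / (1.0 + e)


def egshl_cdf(x, a, b):
    return (1.0 - one_minus_G(x) ** a) ** b


def egshl_pdf(x, a, b):
    e = math.exp(-x)
    return (a * b * (2.0 * e) ** a / (1.0 + e) ** (a + 1.0)
            * (1.0 - one_minus_G(x) ** a) ** (b - 1.0))


def egshl_hrf(x, a, b):
    """h(x) = f(x) / {1 - F(x)}; inaccurate far in the tail, where 1 - F underflows."""
    return egshl_pdf(x, a, b) / (1.0 - egshl_cdf(x, a, b))


# Midpoint-rule check that the density integrates to one (illustrative parameters).
a, b, step = 2.0, 3.0, 0.005
area = step * sum(egshl_pdf((k + 0.5) * step, a, b) for k in range(8000))

# Tabulate the hrf on a coarse grid to inspect its shape for one parameter pair.
grid = [0.25 * k for k in range(1, 21)]
hrf_values = [egshl_hrf(x, 0.5, 0.5) for x in grid]
print(area)
print([round(v, 4) for v in hrf_values])
```

Tabulating `egshl_hrf` for several (a, b) pairs is a quick way to see which shape (increasing, bathtub or inverted bathtub) a given parameter combination produces.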

Shapes
Some plots of log{f(x)} obtained with the Wolfram Mathematica software for selected parameter values are displayed in Figure 3. We can investigate the shapes of the pdf and hrf of X from their first and second derivatives.
The first derivative of log{f(x)} can be written in terms of $\eta(x) = 1 + e^{-x}$, $v_1(x) = -2^{a} e^{-a x} + \eta^{a}(x)$ and $v_2(x) = 2^{a} e^{x(1-a)} - \eta^{a-1}(x)$. Thus, the critical values of f(x) are the roots of the equation obtained by setting this derivative equal to zero. A value $x_0$ that solves this equation can be a maximum, a minimum or an inflection point. To check this, we evaluate the sign of the second derivative of log{f(x)} at $x = x_0$. It is often difficult to obtain an analytical solution for the critical points of this function. Therefore, it is common to obtain numerical solutions with high accuracy through the optimization routines available in most mathematical and statistical platforms. Some plots of the first derivative of log{f(x)} for selected parameter values are displayed in Figure 5. Similarly, the shapes of the hrf follow from the first and second derivatives of log{h(x)}: the critical values of log{h(x)} are the roots of its first derivative, and their nature is determined by the sign of its second derivative.
Some plots of the first derivative of log{h(x)} for selected parameter values are displayed in Figure 6.
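A critical point of the density can be located numerically as described above. The sketch below uses a form of d log f(x)/dx derived here by direct differentiation of the EGSHL density, namely [e^{-x} - a + a(b - 1) 2^a e^{-ax} / v_1(x)] / η(x); this is an assumption of the example and not necessarily the expression printed in the original paper. The bisection routine and its bracketing interval are likewise illustrative.

```python
import math


def log_pdf(x, a, b):
    """log f(x) for the EGSHL density."""
    e = math.exp(-x)
    s = 2.0 * e / (1.0 + e)
    return (math.log(a * b) + a * math.log(2.0) - a * x
            - (a + 1.0) * math.log1p(e) + (b - 1.0) * math.log1p(-s ** a))


def dlog_pdf(x, a, b):
    """d/dx log f(x), written with eta(x) = 1 + e^{-x} and v1(x) = eta^a - 2^a e^{-a x}."""
    e = math.exp(-x)
    eta = 1.0 + e
    v1 = eta ** a - 2.0 ** a * math.exp(-a * x)
    return (e - a + a * (b - 1.0) * 2.0 ** a * math.exp(-a * x) / v1) / eta


def find_mode(a, b, lo=1e-6, hi=50.0, tol=1e-10):
    """Bisection for a root of dlog_pdf; requires a sign change on [lo, hi]."""
    if dlog_pdf(lo, a, b) <= 0.0 or dlog_pdf(hi, a, b) >= 0.0:
        raise ValueError("no sign change on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dlog_pdf(mid, a, b) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


mode = find_mode(1.5, 1.2)
print(mode, dlog_pdf(mode, 1.5, 1.2))
```

For b > 1 the density vanishes at the origin and decays in the tail, so the bracket always contains a sign change; for b ≤ 1 the routine raises an error instead of returning a spurious root.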

Quantile Function
In the previous sections, we provided some important functions that characterize the random variable X ∼ EGSHL(a, b). By inverting (5), we obtain the qf of X as
$$Q(u) = \log\left\{\frac{2 - (1 - u^{1/b})^{1/a}}{(1 - u^{1/b})^{1/a}}\right\}, \qquad (8)$$
where u ∈ (0, 1). The proposed distribution is easily simulated from a uniform random variable U by X = Q(U). Next, we use (8) to generate 100 EGSHL(1.5, 1.2) observations. Figure 7 displays the histogram and empirical cdf for the simulated data and also the exact pdf and cdf of X. These plots reinforce the adequacy of the model for practical applications.
For similar studies, see (Jafari, A. A., et al., 2014; Jafari, A. A. & Mahmoudi, E., 2015), among others. As mentioned earlier, the qf has numerous practical uses. For example, Q(1/2) determines the median of the model. Table 1 gives the results of a small simulation study using the R software. The goal is to compare the empirical medians (EMed) obtained for different parameter values and random samples of size n = 10, 20, 40, 100 with their corresponding theoretical medians (Med) given by Q(1/2). As expected, the difference between EMed and Med decreases when n increases.
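The inversion method and the median comparison can be sketched as follows. The quantile function used is Q(u) = log{(2 - s)/s} with s = (1 - u^{1/b})^{1/a}, obtained by inverting the EGSHL cdf; the function names, the clamping of u and the seed are assumptions of this example, which is written in Python rather than in the R software used for the study.

```python
import math
import random
import statistics


def egshl_quantile(u, a, b):
    """Q(u) = log{(2 - s) / s}, with s = (1 - u^{1/b})^{1/a}."""
    s = (1.0 - u ** (1.0 / b)) ** (1.0 / a)
    return math.log((2.0 - s) / s)


def egshl_cdf(x, a, b):
    e = math.exp(-x)
    return (1.0 - (2.0 * e / (1.0 + e)) ** a) ** b


def sample(n, a, b, rng):
    """Inversion method: X = Q(U); u is clamped away from 0 and 1."""
    out = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        out.append(egshl_quantile(u, a, b))
    return out


rng = random.Random(103)
a, b = 1.5, 1.2
theoretical = egshl_quantile(0.5, a, b)                      # Med = Q(1/2)
empirical = {n: statistics.median(sample(n, a, b, rng))      # EMed for each n
             for n in (10, 20, 40, 100)}
print(theoretical)
print(empirical)
```

Because each sample is a single random draw, the gap between the empirical and theoretical medians shrinks only on average as n grows; averaging over replications gives a smoother picture.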
Table 1. Theoretical and empirical medians (for n = 10, 20, 40, 100) of X for some parameter values

For an arbitrary baseline cdf G(x), a random variable $Y_a$ has the exp-G distribution with power parameter a > 0, say $Y_a$ ∼ exp-G(a), if its cdf and pdf are given by $H_a(x) = G(x)^{a}$ and $h_a(x) = a\, g(x)\, G(x)^{a-1}$, respectively. For a comprehensive discussion about the exponentiated class, see a recent paper by Tahir, M. H. and Nadarajah, S. (2015).
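Here, we consider the generalized binomial expansion
$$(1 - z)^{c} = \sum_{k=0}^{\infty} (-1)^{k} \binom{c}{k} z^{k}, \qquad (9)$$
which holds for any real c and |z| < 1. Using (9) twice in equation (4), the EG density function can be expressed as
$$f(x) = \sum_{j=0}^{\infty} w_{j+1}\, h_{j+1}(x), \qquad (10)$$
where
$$w_{j+1} = \sum_{m=1}^{\infty} (-1)^{j+m+1} \binom{b}{m} \binom{m a}{j+1}$$
and $h_{j+1}(x) = (j + 1)\, g(x)\, G(x)^{j}$ is the exp-G pdf with power parameter j + 1. Equation (10) reveals that the EG density is a linear combination of exp-G densities. We can derive some structural properties of the EG class from those exp-G properties. The cdf F(x) comes from (10) by simple integration, namely
$$F(x) = \sum_{j=0}^{\infty} w_{j+1}\, H_{j+1}(x), \qquad (11)$$
where $H_{j+1}(x) = G(x)^{j+1}$ is the exp-G cdf with power parameter j + 1.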
Equations (10) and (11) were obtained by Cordeiro, G. M. and Lemonte, A. J. (2014). They hold for any baseline distribution G. It is not difficult to show numerically that $\sum_{j=0}^{\infty} w_{j+1} = 1$. Moreover, for most practical purposes, we can set the upper limits equal to 20.
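The weights of the linear representation can be checked numerically. The sketch below assumes the weight formula w_{j+1} = Σ_{m≥1} (-1)^{j+m+1} C(b, m) C(ma, j+1) with generalized binomial coefficients, and uses integer values of a and b so that both sums terminate and the check is exact up to rounding. The function names are hypothetical, and the accuracy of truncation for non-integer parameters is not assessed here.

```python
def gbinom(c, k):
    """Generalized binomial coefficient C(c, k) for real c and integer k >= 0."""
    out = 1.0
    for i in range(k):
        out *= (c - i) / (i + 1)
    return out


def eg_weight(j, a, b, m_max):
    """w_{j+1} = sum_{m=1}^{m_max} (-1)^{j+m+1} C(b, m) C(m a, j+1)."""
    return sum((-1) ** (j + m + 1) * gbinom(b, m) * gbinom(m * a, j + 1)
               for m in range(1, m_max + 1))


a, b = 2, 3                      # integer values: both sums terminate
weights = [eg_weight(j, a, b, m_max=b) for j in range(a * b)]

G = 0.37                         # arbitrary baseline cdf value
series_cdf = sum(w * G ** (j + 1) for j, w in enumerate(weights))
exact_cdf = (1.0 - (1.0 - G) ** a) ** b
print(weights, sum(weights))
print(series_cdf, exact_cdf)
```

For a = 2 and b = 3 the cdf is (2G - G^2)^3 = 8G^3 - 12G^4 + 6G^5 - G^6, so the nonzero weights can also be read off by hand.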
We can adopt (10) for the EGSHL distribution and obtain its mathematical properties from those of the exp-SHL distribution. Let $Y_{j+1}$ be a random variable having the exp-SHL density with power parameter j + 1 (j ≥ 0) given by
$$h_{j+1}(x) = \frac{2\,(j + 1)\, e^{-x}\, (1 - e^{-x})^{j}}{(1 + e^{-x})^{j+2}}, \quad x > 0. \qquad (12)$$
Clearly, several mathematical properties of X can be determined from the linear representation (10) and those of the exp-SHL distribution given by Oliveira, J. et al. (2016) and reported in the next section.

Properties of the Exp-SHL Distribution
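Henceforth, let $Y_{j+1}$ ∼ exp-SHL(j + 1) have the density function (12). We use throughout an equation for a power series raised to a positive integer j = 1, 2, . . .,
$$\left(\sum_{i=0}^{\infty} a_i\, x^{i}\right)^{j} = \sum_{i=0}^{\infty} c_{j,i}\, x^{i}, \qquad (13)$$
where $a_0 \neq 0$, $c_{j,0} = a_0^{j}$ and the coefficients $c_{j,i}$ (for i ≥ 1) are determined recursively by
$$c_{j,i} = (i\, a_0)^{-1} \sum_{m=1}^{i} [m(j + 1) - i]\, a_m\, c_{j,i-m}.$$
The nth moment of $Y_{j+1}$, derived from the power series expansion of the SHL quantile function $2\,\mathrm{arctanh}(u)$, is given by
$$E(Y_{j+1}^{n}) = 2^{n} (j + 1) \sum_{i=0}^{\infty} \frac{c_{n,i}}{n + 2i + j + 1}, \qquad (14)$$
where the quantities $c_{n,i}$ are obtained from equation (13) by taking $a_i = (2i + 1)^{-1}$. For empirical purposes, the shape of many distributions can be usefully described by the incomplete moments. They form natural building blocks for measuring inequality: for example, the Lorenz and Bonferroni curves depend upon the first incomplete moment of an income distribution. The nth incomplete moment of $Y_{j+1}$ is given by
$$m_{j+1}(n; y) = \int_0^{y} x^{n} h_{j+1}(x)\, dx = 2^{n} (j + 1) \sum_{i=0}^{\infty} \frac{c_{n,i}\, \tanh(y/2)^{\,n + 2i + j + 1}}{n + 2i + j + 1}, \qquad (15)$$
where tanh(·) is the hyperbolic tangent function.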
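The recursion for the coefficients of a power series raised to an integer power is easy to implement. The sketch below codes the standard recursion c_{j,i} = (i a_0)^{-1} Σ_{m=1}^{i} [m(j+1) - i] a_m c_{j,i-m} and then, as an assumption of this example, evaluates the exp-SHL moment through the series 2^n (j+1) Σ_i c_{n,i}/(n + 2i + j + 1) with a_i = 1/(2i+1), which follows from expanding the SHL quantile function 2 arctanh(u). Function names and the truncation point are illustrative.

```python
import math


def power_series_coeffs(a, j):
    """Coefficients c_{j,i} of (sum_i a[i] x^i)^j, truncated to len(a) terms."""
    c = [a[0] ** j]
    for i in range(1, len(a)):
        acc = sum((m * (j + 1) - i) * a[m] * c[i - m] for m in range(1, i + 1))
        c.append(acc / (i * a[0]))
    return c


def exp_shl_moment(n, j, n_terms=1000):
    """Truncated series for E(Y_{j+1}^n) with a_i = 1 / (2 i + 1)."""
    a = [1.0 / (2 * i + 1) for i in range(n_terms)]
    c = power_series_coeffs(a, n)
    return (j + 1) * 2.0 ** n * sum(ci / (n + 2 * i + j + 1)
                                    for i, ci in enumerate(c))


# (1 + 2x + 3x^2)^2 = 1 + 4x + 10x^2 + ...
print(power_series_coeffs([1.0, 2.0, 3.0], 2))
# j = 0 gives the SHL distribution, whose mean is log(4).
print(exp_shl_moment(1, 0), math.log(4.0))
```

The series converges slowly (the truncation error for the mean decays like the reciprocal of the number of terms), so direct numerical integration may be preferable when high accuracy is needed.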
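The mgf of $Y_{j+1}$, say $M_{j+1}(s) = E(e^{sY_{j+1}})$, can be expressed in terms of the regularized hypergeometric function ${}_2\tilde{F}_1$ defined by
$${}_2\tilde{F}_1(a, b; c; z) = \sum_{k=0}^{\infty} \frac{(a)_k\, (b)_k}{\Gamma(c + k)}\, \frac{z^{k}}{k!},$$
where $(a)_k = a(a + 1) \cdots (a + k - 1)$ is the rising factorial, $(a)_0 = 1$, and $\Gamma(a) = \int_0^{\infty} x^{a-1} e^{-x}\, dx$ is the gamma function. For |z| < 1 and arbitrary parameters a, b and c, the above infinite sum is convergent. For more details, see (Oliveira, J., et al., 2016).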

Properties of the EGSHL Distribution
In this section, we obtain explicit expressions for some quantities of the EGSHL distribution. The formulae derived can be handled in most symbolic computation platforms, such as Mathematica and Maple, more efficiently than by direct numerical integration of the density function (6). The infinite upper limits of the sums can be replaced by a large positive integer such as 20 or 30 for most practical purposes.

Moments
The statistical relevance of moments, especially in applied research, is widely known in the literature. Next, we provide two ways to compute the nth moment of X with density (6). The first formula follows as
$$E(X^{n}) = \int_0^{\infty} x^{n} f(x)\, dx = a\, b\, 2^{a} \int_0^{\infty} \frac{x^{n}\, e^{-a x}}{(1 + e^{-x})^{a+1}} \left[1 - \left(\frac{2e^{-x}}{1 + e^{-x}}\right)^{a}\right]^{b-1} dx. \qquad (17)$$
Although we do not have a closed form for this integral, it is very simple to evaluate it computationally. For illustrative purposes, we provide a small numerical study by computing $E(X^{n})$ and the variance of X from (17) numerically. We consider several parameter values and n = 1, 2, 3, 4, 5, and the results are given in Table 7.1 with five decimal digits. All computations are performed using the Wolfram Mathematica platform. Plots of the moments for some parameter values are displayed in Figure 10.
Based on the values in Table 7.1 and the plots in Figure 10, we conclude that the additional parameters a and b have a large impact on the moments of X. These values and plots reveal that, in general, for a fixed value of a, the moments and the variance increase when b increases. The opposite happens when b is fixed and a increases.
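The effect of a and b on the moments can be explored with a short script. Instead of integrating x^n f(x) directly, this sketch uses the equivalent form E(X^n) = ∫_0^1 Q(u)^n du obtained by the substitution u = F(x), with Q the EGSHL quantile function, and a midpoint rule that never evaluates the endpoints. The function names, grid size and parameter values are assumptions, and Python replaces the Mathematica platform used in the study.

```python
import math


def egshl_quantile(u, a, b):
    """Q(u) = log{(2 - s) / s}, with s = (1 - u^{1/b})^{1/a}."""
    s = (1.0 - u ** (1.0 / b)) ** (1.0 / a)
    return math.log((2.0 - s) / s)


def egshl_moment(n, a, b, grid=20000):
    """E(X^n) = int_0^1 Q(u)^n du, approximated by the midpoint rule."""
    h = 1.0 / grid
    return h * sum(egshl_quantile((k + 0.5) * h, a, b) ** n for k in range(grid))


def egshl_variance(a, b):
    m1 = egshl_moment(1, a, b)
    return egshl_moment(2, a, b) - m1 ** 2


means_in_b = [egshl_moment(1, 1.5, b) for b in (0.5, 1.0, 2.0, 3.0)]   # a fixed
means_in_a = [egshl_moment(1, a, 1.2) for a in (0.5, 1.0, 2.0, 3.0)]   # b fixed
print([round(m, 5) for m in means_in_b])
print([round(m, 5) for m in means_in_a])
```

The first list increases and the second decreases, which agrees with the stochastic ordering implied by the cdf: F decreases pointwise in b and increases pointwise in a.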
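Alternatively, the nth moment of X can be obtained from equations (10) and (14) as
$$E(X^{n}) = 2^{n} \sum_{j=0}^{\infty} \sum_{i=0}^{\infty} \frac{(j + 1)\, w_{j+1}\, c_{n,i}}{n + 2i + j + 1}, \qquad (18)$$
where $w_{j+1}$ are the weights in (10) and the quantities $c_{n,i}$ are defined in (14), that is, they follow from the recursion in (13) with $a_i = (2i + 1)^{-1}$.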
Figure 10. Plots of the moments of X for some parameter values

Incomplete Moments and Their Applications
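The nth incomplete moment of X, say $m(n; y) = \int_0^{y} x^{n} f(x)\, dx$, can be determined from (10) and (15) as
$$m(n; y) = 2^{n} \sum_{j=0}^{\infty} \sum_{i=0}^{\infty} \frac{(j + 1)\, w_{j+1}\, c_{n,i}}{n + 2i + j + 1}\, \tanh(y/2)^{\,n + 2i + j + 1}, \qquad (19)$$
where $w_{j+1}$ are the weights in (10) and $c_{n,i}$ follow from (13) with $a_i = (2i + 1)^{-1}$. The first incomplete moment of a distribution is of particular interest. The mean residual life function follows from (19) with n = 1 as $[\mu'_1 - m(1; y)]/[1 - F(y)] - y$. Further, we can obtain the mean deviations about the mean and the median, given by $\delta_1 = 2\mu'_1 F(\mu'_1) - 2\, m(1; \mu'_1)$ and $\delta_2 = \mu'_1 - 2\, m(1; M)$, respectively, where the mean $\mu'_1$ and the median M follow from (18) and (8), respectively. Equation (19) with n = 1 is also useful to derive the Bonferroni and Lorenz curves defined (for a given probability π) by $B(\pi) = m(1; q)/(\pi \mu'_1)$ and $L(\pi) = m(1; q)/\mu'_1$, respectively, where q = Q(π) follows from (8).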
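The Bonferroni and Lorenz curves can be computed without the double series. This sketch relies on the identity m(1; Q(π)) = ∫_0^π Q(u) du, which follows from the substitution u = F(x), and evaluates it by a midpoint rule; it is a numerical shortcut assumed for illustration, not the series formula of the paper. The function names and parameter values are hypothetical.

```python
import math


def egshl_quantile(u, a, b):
    """Q(u) = log{(2 - s) / s}, with s = (1 - u^{1/b})^{1/a}."""
    s = (1.0 - u ** (1.0 / b)) ** (1.0 / a)
    return math.log((2.0 - s) / s)


def partial_mean(p, a, b, grid=20000):
    """m(1; Q(p)) = int_0^p Q(u) du, approximated by the midpoint rule."""
    h = p / grid
    return h * sum(egshl_quantile((k + 0.5) * h, a, b) for k in range(grid))


def lorenz(p, a, b):
    """L(p) = m(1; Q(p)) / mean."""
    return partial_mean(p, a, b) / partial_mean(1.0, a, b)


def bonferroni(p, a, b):
    """B(p) = L(p) / p."""
    return lorenz(p, a, b) / p


def mean_deviation_median(a, b):
    """delta_2 = mean - 2 m(1; M), where M = Q(1/2)."""
    return partial_mean(1.0, a, b) - 2.0 * partial_mean(0.5, a, b)


a, b = 1.5, 1.2
curve = [(p, lorenz(p, a, b), bonferroni(p, a, b)) for p in (0.25, 0.5, 0.75)]
print(curve)
print(mean_deviation_median(a, b))
```

Since the Lorenz curve lies below the diagonal, each printed L(π) should be smaller than π, and B(π) smaller than one.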

Reliability
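Here, we derive the reliability, say $R = P(X_1 > X_2)$, when $X_1$ ∼ EGSHL($a_1$, $b_1$) and $X_2$ ∼ EGSHL($a_2$, $b_2$) are two independent random variables. Let $f_1(x)$ denote the pdf of $X_1$ and $F_2(x)$ denote the cdf of $X_2$. The reliability can be expressed as $R = \int_0^{\infty} f_1(x)\, F_2(x)\, dx$ and using equations (10) and (11) gives
$$R = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} w^{(1)}_{j+1}\, w^{(2)}_{k+1} \int_0^{\infty} h_{j+1}(x)\, H_{k+1}(x)\, dx,$$
where $w^{(1)}_{j+1}$ and $w^{(2)}_{k+1}$ denote the weights in (10) evaluated at ($a_1$, $b_1$) and ($a_2$, $b_2$), respectively.
Thus, the reliability reduces to
$$R = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \frac{(j + 1)\, w^{(1)}_{j+1}\, w^{(2)}_{k+1}}{j + k + 2}. \qquad (20)$$
Table 7.4 gives some values of R for different parameter values. Clearly, for $a_1 = a_2$ and $b_1 = b_2$, we obtain $R = P(X_1 > X_2) = 1/2$. All computations are done using the Wolfram Mathematica software by taking the upper limits equal to 30 in (20).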
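The double series for the reliability can be evaluated exactly when all parameters are integers, because the weights then vanish beyond a finite index. The sketch below assumes the weight formula w_{j+1} = Σ_m (-1)^{j+m+1} C(b, m) C(ma, j+1) and the series R = Σ_j Σ_k (j+1) w_{j+1}(a_1, b_1) w_{k+1}(a_2, b_2)/(j + k + 2), which follows from ∫ h_{j+1}(x) H_{k+1}(x) dx = (j+1)/(j+k+2). The function names are hypothetical.

```python
def gbinom(c, k):
    """Generalized binomial coefficient C(c, k) for real c and integer k >= 0."""
    out = 1.0
    for i in range(k):
        out *= (c - i) / (i + 1)
    return out


def eg_weights(a, b):
    """Weights w_1, ..., w_{ab} for integer a and b (all later weights vanish)."""
    return [sum((-1) ** (j + m + 1) * gbinom(b, m) * gbinom(m * a, j + 1)
                for m in range(1, b + 1))
            for j in range(a * b)]


def reliability(a1, b1, a2, b2):
    """R = P(X1 > X2) from the double series, for integer parameters."""
    w1, w2 = eg_weights(a1, b1), eg_weights(a2, b2)
    return sum((j + 1) * wj * wk / (j + k + 2)
               for j, wj in enumerate(w1)
               for k, wk in enumerate(w2))


print(reliability(2, 3, 2, 3))                             # equal parameters
print(reliability(1, 1, 1, 2))                             # SHL against exp-SHL(2)
print(reliability(2, 1, 1, 3) + reliability(1, 3, 2, 1))   # complementary events
```

For non-integer parameters the sums must be truncated, and the effect of the truncation point should be checked case by case.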

Probability Weighted Moments
The PWMs are used to derive estimators of the parameters and quantiles of generalized distributions. The moment method of estimation is formulated by equating the population and sample PWMs. Estimators based on PWMs have low variances and no severe biases, and they compare favorably with estimators obtained by maximum likelihood. The (s, r)th PWM of X is defined by $\delta_{s,r} = E[X^{s} F(X)^{r}]$. Clearly, the ordinary moments follow as $\delta_{s,0} = E(X^{s})$. Next, we derive simple expressions for the PWMs of X defined by
$$\delta_{s,r} = \int_0^{\infty} x^{s}\, F(x)^{r}\, f(x)\, dx. \qquad (21)$$
Inserting (5) and (6) in equation (21), the PWMs of X can be expressed in the simple form
$$\delta_{s,r} = a\, b\, 2^{a} \int_0^{\infty} \frac{x^{s}\, e^{-a x}}{(1 + e^{-x})^{a+1}} \left[1 - \left(\frac{2e^{-x}}{1 + e^{-x}}\right)^{a}\right]^{b(r+1)-1} dx. \qquad (22)$$
Table 4 gives from (22) the values of $\delta_{s,r}$ for X ∼ EGSHL(2, 3) and some values of s and r. All computations are performed using the Wolfram Mathematica software. Based on the values in Table 4, we conclude that, for fixed r, the PWMs increase when s increases. The opposite happens when we fix s and r increases. We now present a simpler expression for the PWMs of X. After simple algebraic manipulation, we can write $\delta_{s,r}$ as
$$\delta_{s,r} = \frac{1}{r + 1} \int_0^{\infty} x^{s}\, f[x; a, (r + 1)b]\, dx, \qquad (23)$$
where f[x; a, (r + 1)b] is the EGSHL density with parameters a and (r + 1)b. Equation (23) reveals that the PWMs of X can be expressed in terms of the ordinary moments of $X_r$ ∼ EGSHL[a, (r + 1)b].
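The relation between the PWMs and ordinary moments can be verified numerically. This sketch assumes the identity δ_{s,r} = E(X_r^s)/(r + 1) with X_r ∼ EGSHL[a, (r + 1)b], and evaluates both sides in the quantile domain, using δ_{s,r} = ∫_0^1 Q(u)^s u^r du. The function names, grid size and parameter values are illustrative.

```python
import math


def egshl_quantile(u, a, b):
    """Q(u) = log{(2 - s) / s}, with s = (1 - u^{1/b})^{1/a}."""
    s = (1.0 - u ** (1.0 / b)) ** (1.0 / a)
    return math.log((2.0 - s) / s)


def pwm(s, r, a, b, grid=20000):
    """delta_{s,r} = E[X^s F(X)^r] = int_0^1 Q(u)^s u^r du (midpoint rule)."""
    h = 1.0 / grid
    total = 0.0
    for k in range(grid):
        u = (k + 0.5) * h
        total += egshl_quantile(u, a, b) ** s * u ** r
    return h * total


def pwm_via_moment(s, r, a, b):
    """E(X_r^s) / (r + 1), where X_r ~ EGSHL(a, (r + 1) b)."""
    return pwm(s, 0, a, (r + 1) * b) / (r + 1)


a, b = 2.0, 3.0
for s, r in ((1, 1), (1, 2), (2, 2)):
    print(s, r, pwm(s, r, a, b), pwm_via_moment(s, r, a, b))
```

The two columns agree up to quadrature error, so any routine for the ordinary moments of the EGSHL model also delivers its PWMs.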

Rényi Entropy
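Given a certain random phenomenon under study, it is important to quantify the uncertainty associated with the random variable of interest. One of the most popular measures used to quantify the uncertainty of X is the Rényi entropy. See, for example, Da Silva et al. (2013) for the gamma extended Fréchet model, Alshangiti et al. (2014) for the Marshall-Olkin extended modified Weibull distribution and (Castellares, F. & Santos, M. A. C., 2015) for an extended logistic distribution.
The Rényi entropy of X is defined by $I_R(\rho) = (1 - \rho)^{-1} \log \int_0^{\infty} f(x)^{\rho}\, dx$, where ρ > 0 and ρ ≠ 1. By inserting (6) in this equation, we obtain
$$I_R(\rho) = \frac{1}{1 - \rho} \log\left\{ (a\, b\, 2^{a})^{\rho} \int_0^{\infty} \frac{e^{-a \rho x}}{(1 + e^{-x})^{\rho(a+1)}} \left[1 - \left(\frac{2e^{-x}}{1 + e^{-x}}\right)^{a}\right]^{\rho(b-1)} dx \right\}. \qquad (24)$$
Equation (24) can be easily implemented computationally and the values of $I_R(\rho)$ are obtained in a few seconds. Table 9 gives some values of $I_R(\rho)$ for different parameter values. All computations use the Wolfram Mathematica software.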
Based on the values in Table 9, we note that, independently of a and b, $I_R(\rho)$ decreases when ρ increases. For fixed ρ, the Rényi entropy is larger for a < b.
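The monotone behaviour of the Rényi entropy in ρ can be reproduced with a simple quadrature. The sketch assumes the usual definition I_R(ρ) = (1 - ρ)^{-1} log ∫ f(x)^ρ dx and the EGSHL density; the midpoint rule, the truncation point 60, and the parameter values are illustrative assumptions (the study itself used Mathematica).

```python
import math


def egshl_pdf(x, a, b):
    """EGSHL density f(x; a, b) for x > 0."""
    e = math.exp(-x)
    s = 2.0 * e / (1.0 + e)
    return (a * b * (2.0 * e) ** a / (1.0 + e) ** (a + 1.0)
            * (1.0 - s ** a) ** (b - 1.0))


def renyi_entropy(rho, a, b, upper=60.0, grid=60000):
    """I_R(rho) = log(int_0^upper f(x)^rho dx) / (1 - rho), rho > 0, rho != 1."""
    h = upper / grid
    integral = h * sum(egshl_pdf((k + 0.5) * h, a, b) ** rho for k in range(grid))
    return math.log(integral) / (1.0 - rho)


a, b = 2.0, 3.0
entropies = [renyi_entropy(rho, a, b) for rho in (0.5, 2.0, 3.0)]
print([round(v, 5) for v in entropies])
```

The truncation point should be increased for small values of aρ, since the integrand then decays more slowly.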

Order Statistics
The importance of order statistics and their applications is widely recognized in the literature. As defined by Balakrishnan, N. and Cohen, A. C. (2014), the main objective of the study of order statistics is the investigation of properties and applications of ordered random variables, as well as functions of these variables. The density function $f_{i:n}(x)$ of the ith order statistic, say $X_{i:n}$, based on a random sample $X_1, \ldots, X_n$, can be expressed as (for i = 1, . . ., n)
$$f_{i:n}(x) = \frac{1}{B(i, n - i + 1)}\, f(x)\, F(x)^{i-1}\, [1 - F(x)]^{n-i}, \qquad (25)$$
where B(·, ·) is the beta function. By inserting (5) and (6) in the above expression, the density function of the EGSHL order statistics follows as
$$f_{i:n}(x) = \frac{a\, b\, (2e^{-x})^{a}}{B(i, n - i + 1)\, (1 + e^{-x})^{a+1}} \left[1 - s(x)^{a}\right]^{b i - 1} \left\{1 - \left[1 - s(x)^{a}\right]^{b}\right\}^{n-i}, \qquad (26)$$
where $s(x) = 2e^{-x}/(1 + e^{-x})$. There are many practical applications in which we can employ the above equation. Perhaps the most important of these refers to the moments of $X_{i:n}$. The rth moment of $X_{i:n}$ comes from (26) as
$$E(X_{i:n}^{r}) = \int_0^{\infty} x^{r} f_{i:n}(x)\, dx. \qquad (27)$$
The rth moment of $X_{i:n}$ can be easily obtained numerically from (27) with any symbolic computing platform. In Table 6, we present a small illustration, in which we calculate the first five moments of $X_{i:10}$ for a = b = 2 and some values of r and i. All computations are performed using the Wolfram Mathematica platform. For a similar study, readers may see a paper by Barreto-Souza, W. et al. (2010), who evaluated $E(X_{i:n}^{r})$ numerically for the Weibull-geometric distribution. Finally, we provide a linear representation for $f_{i:n}(x)$. After a simple algebraic manipulation, we can write
$$f_{i:n}(x) = \sum_{j=0}^{n-i} \xi_{i,j}\, f[x; a, (i + j)b], \qquad (28)$$
where $\xi_{i,j} = \frac{(-1)^{j}}{(i + j)\, B(i, n - i + 1)} \binom{n - i}{j}$ and f[x; a, (i + j)b] is the EGSHL density with parameters a and (i + j)b. Equation (28) reveals that the pdf of $X_{i:n}$ is a linear combination of EGSHL densities. So, the moments, incomplete moments and other quantities for the EGSHL order statistics can be determined from the above expression.
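The linear representation of the order statistic density can be checked pointwise. The sketch compares the standard formula f(x) F(x)^{i-1} [1 - F(x)]^{n-i} / B(i, n-i+1) with the mixture Σ_j ξ_{i,j} f[x; a, (i+j)b], where ξ_{i,j} = (-1)^j C(n-i, j) / {(i+j) B(i, n-i+1)}; this form of the coefficients is an assumption derived from a binomial expansion of [1 - F(x)]^{n-i}. Function names and evaluation points are illustrative.

```python
import math


def egshl_cdf(x, a, b):
    e = math.exp(-x)
    return (1.0 - (2.0 * e / (1.0 + e)) ** a) ** b


def egshl_pdf(x, a, b):
    e = math.exp(-x)
    s = 2.0 * e / (1.0 + e)
    return (a * b * (2.0 * e) ** a / (1.0 + e) ** (a + 1.0)
            * (1.0 - s ** a) ** (b - 1.0))


def beta_int(i, n):
    """B(i, n - i + 1) for integers 1 <= i <= n."""
    return math.factorial(i - 1) * math.factorial(n - i) / math.factorial(n)


def order_pdf_direct(x, i, n, a, b):
    """Density of the ith order statistic from the standard formula."""
    F = egshl_cdf(x, a, b)
    return egshl_pdf(x, a, b) * F ** (i - 1) * (1.0 - F) ** (n - i) / beta_int(i, n)


def order_pdf_mixture(x, i, n, a, b):
    """Same density as a linear combination of EGSHL(a, (i + j) b) densities."""
    total = 0.0
    for j in range(n - i + 1):
        xi = (-1) ** j * math.comb(n - i, j) / ((i + j) * beta_int(i, n))
        total += xi * egshl_pdf(x, a, (i + j) * b)
    return total


for x in (0.5, 1.0, 2.0):
    print(x, order_pdf_direct(x, 3, 10, 2.0, 2.0), order_pdf_mixture(x, 3, 10, 2.0, 2.0))
```

The alternating signs in the mixture can cause cancellation for large n - i, so the direct formula is the safer choice for evaluation, while the mixture is convenient for deriving moments.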

Estimation and Inference
Several approaches for parameter estimation have been proposed in the literature, but the maximum likelihood method is the most commonly employed. The maximum likelihood estimators (MLEs) enjoy desirable properties and can be used for constructing confidence intervals and also in test statistics. The normal approximation for these estimators in large sample distribution theory is easily handled either analytically or numerically. So, we consider the estimation of the unknown parameters for the EGSHL distribution from complete samples only by maximum likelihood. Let $x_1, \ldots, x_n$ be observed values from this distribution with parameters a and b.
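The log-likelihood function for the vector of parameters $\theta = (a, b)^{\top}$, say ℓ(θ), can be expressed as
$$\ell(\theta) = n \log(a b) + n\, a \log 2 - a \sum_{i=1}^{n} x_i - (a + 1) \sum_{i=1}^{n} \log(1 + e^{-x_i}) + (b - 1) \sum_{i=1}^{n} \log(1 - s_i^{a}), \qquad (29)$$
where $s_i = 2e^{-x_i}/(1 + e^{-x_i})$. Equation (29) can be maximized either directly by using R (optim function), SAS (PROC NLMIXED) or Ox (sub-routine MaxBFGS), or by solving the nonlinear likelihood equations obtained by differentiating (29). The components of the score function are
$$U_a(\theta) = \frac{n}{a} + \sum_{i=1}^{n} \log s_i - (b - 1) \sum_{i=1}^{n} \frac{s_i^{a} \log s_i}{1 - s_i^{a}}, \qquad U_b(\theta) = \frac{n}{b} + \sum_{i=1}^{n} \log(1 - s_i^{a}).$$
The elements of the observed information matrix $J(\theta) = -\partial^{2} \ell(\theta)/\partial\theta\, \partial\theta^{\top}$ are given by
$$J_{aa} = \frac{n}{a^{2}} + (b - 1) \sum_{i=1}^{n} \frac{s_i^{a} (\log s_i)^{2}}{(1 - s_i^{a})^{2}}, \qquad J_{ab} = \sum_{i=1}^{n} \frac{s_i^{a} \log s_i}{1 - s_i^{a}}, \qquad J_{bb} = \frac{n}{b^{2}}.$$
For large n, the distribution of $(\hat{\theta} - \theta)$ can be approximated by a bivariate normal distribution with zero means and variance-covariance matrix $J(\theta)^{-1}$. Some asymptotic properties of $\hat{\theta}$ can be based on this normal approximation.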
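The log-likelihood and score functions translate directly into code. The sketch below assumes the log-likelihood obtained by summing log f(x_i) for the EGSHL density, written in terms of log s_i = log 2 - x_i - log(1 + e^{-x_i}), and checks the analytic score against central finite differences. The data values and function names are made up for illustration.

```python
import math


def log_s(x):
    """log of s = 2 e^{-x} / (1 + e^{-x})."""
    return math.log(2.0) - x - math.log1p(math.exp(-x))


def loglik(a, b, xs):
    """EGSHL log-likelihood for the sample xs."""
    total = len(xs) * math.log(a * b)
    for x in xs:
        ls = log_s(x)
        total += (a * ls - math.log1p(math.exp(-x))
                  + (b - 1.0) * math.log1p(-math.exp(a * ls)))
    return total


def score(a, b, xs):
    """Score vector (U_a, U_b)."""
    n = len(xs)
    u_a, u_b = n / a, n / b
    for x in xs:
        ls = log_s(x)
        sa = math.exp(a * ls)                    # s^a
        u_a += ls - (b - 1.0) * sa * ls / (1.0 - sa)
        u_b += math.log1p(-sa)
    return u_a, u_b


xs = [0.3, 0.8, 1.1, 1.9, 2.5]                   # hypothetical observations
a, b, h = 1.5, 1.2, 1e-6
fd_a = (loglik(a + h, b, xs) - loglik(a - h, b, xs)) / (2.0 * h)
fd_b = (loglik(a, b + h, xs) - loglik(a, b - h, xs)) / (2.0 * h)
print(score(a, b, xs), (fd_a, fd_b))
```

Supplying the analytic score to a gradient-based optimizer usually speeds up convergence compared with numerical differentiation.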

Simulation Study
In this section, we verify whether the parameter estimates are obtained with precision, since the inferences and the decision processes depend directly on the quality of the estimates. In this context, Monte Carlo simulation is one of the most widely used methods to evaluate the performance of estimators; see, for example, the following works: Lemonte, A. J. (2013), Cordeiro, G. M. and Lemonte, A. J. (2014), Alshangiti, A. M., et al. (2014), Silva, A. O., et al. (2014), Jafari, A. A., et al. (2014) and De Andrade, et al. (2015). We investigate the behavior of the MLEs for the parameters of the EGSHL model by generating from (8) samples of sizes n = 20, 40, 80, 120 with selected values for a and b and 10,000 replications. The simulation is performed in the R software using the simulated-annealing (SANN) maximization method in the maxLik package. To ensure the reproducibility of the experiment, we set the seed of the random number generator with set.seed(103). The initial values are taken as half of the true values of the parameters in each scenario. The results of the simulations are presented in Tables 7 and 8, which contain the estimates and their estimated asymptotic variances in parentheses. These results reveal that the EGSHL estimates have desirable properties even for small to moderate sample sizes. In general, the biases and variances decrease as the sample size increases, as expected.
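A scaled-down version of such a Monte Carlo experiment can be sketched in Python. Note the swapped-in technique: instead of the simulated-annealing method of the maxLik package used in the study, this example profiles out b in closed form (the score equation in b gives b = -n / Σ log(1 - s_i^a)) and maximizes the profile log-likelihood in a by golden-section search, which assumes that the profile is unimodal on the search interval. The function names, search interval, number of replications and seed handling are all assumptions of the example.

```python
import math
import random


def egshl_quantile(u, a, b):
    s = (1.0 - u ** (1.0 / b)) ** (1.0 / a)
    return math.log((2.0 - s) / s)


def draw_sample(n, a, b, rng):
    """Inversion method; u is clamped away from 0 and 1 to avoid log(0)."""
    return [egshl_quantile(min(max(rng.random(), 1e-12), 1.0 - 1e-12), a, b)
            for _ in range(n)]


def fit_profile(xs, lo=0.05, hi=20.0, tol=1e-6):
    """MLE of (a, b): b is profiled out, a is found by golden-section search."""
    n = len(xs)
    log_s = [math.log(2.0) - x - math.log1p(math.exp(-x)) for x in xs]
    sum_log_s = sum(log_s)

    def sum_log_term(a):
        return sum(math.log1p(-math.exp(a * l)) for l in log_s)

    def profile(a):
        t = sum_log_term(a)
        b = -n / t                                 # root of the score equation in b
        return n * math.log(a * b) + a * sum_log_s + (b - 1.0) * t

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - ratio * (hi - lo), lo + ratio * (hi - lo)
    fc, fd = profile(c), profile(d)
    while hi - lo > tol:
        if fc > fd:                                # maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - ratio * (hi - lo)
            fc = profile(c)
        else:                                      # maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + ratio * (hi - lo)
            fd = profile(d)
    a_hat = 0.5 * (lo + hi)
    return a_hat, -n / sum_log_term(a_hat)


rng = random.Random(103)
true_a, true_b, n, reps = 1.5, 1.2, 80, 200
fits = [fit_profile(draw_sample(n, true_a, true_b, rng)) for _ in range(reps)]
bias_a = sum(f[0] for f in fits) / reps - true_a
bias_b = sum(f[1] for f in fits) / reps - true_b
print(round(bias_a, 3), round(bias_b, 3))
```

With only 200 replications the estimated biases are themselves noisy; the 10,000 replications of the study give far more stable summaries.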
Finally, Figure 12 displays the histogram of the phosphorus concentration in leaves data and the estimated pdf and cdf of the EGSHL model. These plots reveal that the proposed model is quite suitable for these data.

Conclusions
In this paper, we introduce a univariate continuous distribution with two parameters that govern the asymmetry and kurtosis, named the exponentiated generalized standard half-logistic model, say EGSHL. We provide a comprehensive mathematical treatment and show by numerical studies that the formulas related to the new model are computationally manageable. In particular, the maximum likelihood estimates are easily computed. The estimators have desirable properties, such as low biases and variances, even for small or moderate sample sizes. A study using real data shows that the new distribution can be used in practical situations due to its great power of adjustment when compared to other competitive models. We hope that the proposed model can be useful for applied statisticians and other researchers who need a model with few parameters that is flexible enough to accommodate data supported on the positive real line. In future research, we will study bootstrap bias correction for the estimators in small samples.

Figure 1. Plots of the EGSHL density function for some parameter values

Figure 3. Plots of the EGSHL hazard function for some parameter values

Figure 5. Plots of the first derivative of log{f(x)}

Figure 6. Plots of the first derivative of log{h(x)}

Figure 7. Plots of the EGSHL(1.5, 1.2) pdf, histogram, exact and empirical cdfs for simulated data with n = 100

We can use the qf of X to determine the Bowley skewness B (Kenney, J. F. & Keeping, E. S., 1962) and the Moors kurtosis M (Moors, J. J. A., 1988). The Bowley skewness is based on quartiles.

Figure 8. Plots of the Bowley skewness of X

Figure 11. Dispersion of the 128 data units of the phosphorus concentration in the leaves

Figure 12. Estimated pdf and cdf of the EGSHL model for phosphorus concentration in leaves data

Table 4. The PWMs of X for some values of s and r

Table 6. The first five moments of X i:10 for a = b = 2 and some values of r and i

Table 7. MLEs for several a and b parameter values (variances in parentheses)

Table 8. MLEs for several a and b parameter values (variances in parentheses)

Table 9. Descriptive statistics for phosphorus concentration in leaves data

Table 10. MLEs (and their standard errors in parentheses), AIC, BIC and CAIC for phosphorus concentration in leaves data