The Extended Generalized Gamma Geometric Distribution

We propose and study the so-called extended generalized gamma geometric distribution. The proposed distribution has five parameters and can accommodate increasing, decreasing, bathtub and unimodal shaped hazard functions. The new distribution has a large number of well-known lifetime special sub-models, such as the generalized gamma geometric, Weibull geometric, gamma geometric, exponential geometric, Rayleigh geometric and half-normal geometric distributions, among others. We provide a mathematical treatment of the new distribution, including explicit expressions for the moments, moment generating function, mean deviations, reliability and order statistics. The method of maximum likelihood and a Bayesian procedure are adopted for estimating the model parameters. Finally, an application of the new distribution to a real data set is presented.

Introduction
Stacy (1962) introduced the generalized gamma (GG) distribution, an extensive family that contains a variety of special sub-models, including the gamma, Weibull, log-normal and Maxwell distributions, among others. This distribution is useful for modeling lifetime data and phenomena with different types of hazard rate function: monotonically increasing, monotonically decreasing, bathtub-shaped and unimodal (Cox et al., 2007).
Several distributions based on extensions or mixtures of existing distributions have been developed in recent years, providing more flexibility for modeling survival data. Adamidis and Loukas (1998) introduced a two-parameter distribution with decreasing hazard rate, the so-called exponential geometric (EG) distribution. Silva et al. (2010) proposed the generalized exponential geometric distribution, with decreasing, increasing or unimodal hazard rate depending on its parameters. Barreto-Souza et al. (2011) defined the Weibull geometric (WG) distribution, which extends the EG distribution and is suitable for modeling monotone or unimodal hazard rates. Cordeiro et al. (2011) introduced the exponentiated generalized gamma distribution, Pascoa et al. (2011) proposed the Kumaraswamy generalized gamma distribution and Cordeiro et al. (2013b) defined the beta-Weibull geometric distribution. Ortega et al. (2011), following the idea of Adamidis and Loukas (1998) for mixing distributions, presented the generalized gamma geometric (GGG) distribution with four parameters, which generalizes a number of well-known lifetime models such as the GG, EG and WG distributions, among others. The GGG distribution can also be obtained from the geometric generated family of distributions, a special case of the well-known Marshall-Olkin family proposed by Marshall and Olkin (1997).
The GGG distribution has monotonically increasing, monotonically decreasing, bathtub-shaped and unimodal hazard rates. However, this distribution and its sub-models do not provide a reasonable parametric fit for some practical applications in which the data may be bimodal.
In this work we propose the so-called extended generalized gamma geometric (ExGGG for short) distribution with five parameters and derive some of its properties, with the hope that it will attract wider applications in reliability, engineering and other areas of research. We are motivated to study the ExGGG distribution because of the wide usage of the GGG distribution and its sub-models in survival analysis. It is also suitable for testing the goodness of fit of some special sub-models, such as the GGG, GG, WG, EG, Weibull and exponential distributions. Furthermore, the current extension provides density and hazard rate functions flexible enough to model complex data in a great variety of applications, including the bimodal, skewed and heavy-tailed cases.
The paper is outlined as follows. In Section 2, we define the ExGGG distribution and some of its sub-models. In Section 3, we derive useful expansions for its density function. In Section 4, we obtain two alternative expansions for the moments. In Section 5, we provide an explicit expression for the moment generating function. The mean deviations are determined in Section 6. The reliability is derived in Section 7. In Section 8, we derive the density function of the ith order statistic. The maximum likelihood method and a Bayesian approach for estimating the model parameters are discussed in Section 9. The usefulness of the new model is illustrated by means of an application to real data in Section 10. Some conclusions are offered in Section 11.

The ExGGG Distribution
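The GGG distribution with four parameters α > 0, τ > 0, k > 0 and p ∈ (0, 1), defined by Ortega et al. (2011), has the probability density function (pdf) given by
$$g(x) = \frac{\tau (1-p)}{\alpha\,\Gamma(k)} \left(\frac{x}{\alpha}\right)^{\tau k - 1} \exp\left[-\left(\frac{x}{\alpha}\right)^{\tau}\right] \left\{1 - p\left[1 - \gamma_1\left(k, \left(\frac{x}{\alpha}\right)^{\tau}\right)\right]\right\}^{-2}, \quad x > 0, \tag{1}$$
where $\gamma(k, x) = \int_0^x w^{k-1} e^{-w}\,dw$ is the incomplete gamma function and $\gamma_1(k, x) = \gamma(k, x)/\Gamma(k)$ is the incomplete gamma ratio function. The corresponding cumulative distribution function (cdf) is $G(x) = \gamma_1(k, (x/\alpha)^{\tau}) / \{1 - p[1 - \gamma_1(k, (x/\alpha)^{\tau})]\}$.
Let G(x) be a cdf. The extended class of distributions (also referred to as the Lehmann type II class of distributions) presented by Cordeiro et al. (2013a) corresponding to G(x) is defined by
$$F(x) = 1 - [1 - G(x)]^{\lambda},$$
where λ is a positive real number. Hence, the cdf of the ExGGG distribution with five parameters α > 0, τ > 0, k > 0, λ > 0 and p ∈ (0, 1) has the form
$$F(x) = 1 - \left\{\frac{(1-p)\left[1 - \gamma_1\left(k, (x/\alpha)^{\tau}\right)\right]}{1 - p\left[1 - \gamma_1\left(k, (x/\alpha)^{\tau}\right)\right]}\right\}^{\lambda}, \quad x > 0, \tag{2}$$
and the pdf is given by
$$f(x) = \frac{\lambda\,\tau\,(1-p)^{\lambda}}{\alpha\,\Gamma(k)} \left(\frac{x}{\alpha}\right)^{\tau k - 1} \exp\left[-\left(\frac{x}{\alpha}\right)^{\tau}\right] \frac{\left[1 - \gamma_1\left(k, (x/\alpha)^{\tau}\right)\right]^{\lambda - 1}}{\left\{1 - p\left[1 - \gamma_1\left(k, (x/\alpha)^{\tau}\right)\right]\right\}^{\lambda + 1}}, \quad x > 0. \tag{3}$$
A random variable X having pdf (3) is denoted by X ∼ ExGGG(α, τ, k, p, λ). Clearly, when λ = 1 we obtain the GGG distribution. Other distributions arise from (3) as particular cases: for k = 1 we have the extended Weibull geometric (ExWG) distribution, which is new; for k = λ = 1 we have the WG distribution; and for τ = k = λ = 1 we obtain the EG distribution. The GG distribution is the limiting distribution (in the sense of convergence in distribution) of the ExGGG distribution when p → 0+ and λ = 1. On the other hand, if p → 1−, we obtain the distribution of a random variable Y such that P(Y = 0) = 1. Some important ExGGG sub-models are listed in Table 2.
The survival and hazard rate functions corresponding to (2) are
$$S(x) = \left\{\frac{(1-p)\left[1 - \gamma_1\left(k, (x/\alpha)^{\tau}\right)\right]}{1 - p\left[1 - \gamma_1\left(k, (x/\alpha)^{\tau}\right)\right]}\right\}^{\lambda}$$
and
$$h(x) = \frac{f(x)}{S(x)} = \frac{\lambda\,\tau}{\alpha\,\Gamma(k)} \left(\frac{x}{\alpha}\right)^{\tau k - 1} \exp\left[-\left(\frac{x}{\alpha}\right)^{\tau}\right] \left[1 - \gamma_1\left(k, (x/\alpha)^{\tau}\right)\right]^{-1} \left\{1 - p\left[1 - \gamma_1\left(k, (x/\alpha)^{\tau}\right)\right]\right\}^{-1},$$
respectively. Plots of the ExGGG density and hazard rate function for selected parameter values are given in Figures 1 and 2, respectively. These plots illustrate the versatility of the ExGGG distribution.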
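As a numerical companion to these definitions, the following is a minimal Python sketch of the ExGGG survival, cdf, pdf and hazard functions. It assumes the Lehmann type II construction F(x) = 1 − [1 − G(x)]^λ applied to a GGG cdf of the form G(x) = γ1/(1 − p[1 − γ1]), with γ1 = γ1(k, (x/α)^τ) the incomplete gamma ratio; this form should be checked against Ortega et al. (2011) before serious use. All function names are illustrative and do not belong to any library, and the incomplete gamma routine is a standard series and continued fraction scheme written out only to keep the sketch self-contained.

```python
import math

def reg_lower_gamma(k, x, tol=1e-14, max_iter=10_000):
    """Incomplete gamma ratio gamma(k, x) / Gamma(k) for k > 0."""
    if x <= 0.0:
        return 0.0
    log_prefactor = k * math.log(x) - x - math.lgamma(k)
    if x < k + 1.0:
        # Power series, which converges quickly when x < k + 1.
        a, term = k, 1.0 / k
        total = term
        for _ in range(max_iter):
            a += 1.0
            term *= x / a
            total += term
            if abs(term) < abs(total) * tol:
                break
        return total * math.exp(log_prefactor)
    # Continued fraction (modified Lentz) for the upper tail, then complement.
    tiny = 1e-300
    b = x + 1.0 - k
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - k)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < tol:
            break
    return 1.0 - math.exp(log_prefactor) * h

def exggg_sf(x, alpha, tau, k, p, lam):
    """Survival function S(x) under the assumed ExGGG form."""
    if x <= 0.0:
        return 1.0
    s = 1.0 - reg_lower_gamma(k, (x / alpha) ** tau)  # GG survival
    return ((1.0 - p) * s / (1.0 - p * s)) ** lam

def exggg_cdf(x, alpha, tau, k, p, lam):
    return 1.0 - exggg_sf(x, alpha, tau, k, p, lam)

def exggg_pdf(x, alpha, tau, k, p, lam):
    """Density f(x) under the assumed ExGGG form (x > 0)."""
    if x <= 0.0:
        return 0.0
    w = (x / alpha) ** tau
    s = 1.0 - reg_lower_gamma(k, w)
    log_gg = (math.log(tau / alpha) - math.lgamma(k)
              + (tau * k - 1.0) * math.log(x / alpha) - w)
    return (lam * (1.0 - p) ** lam * math.exp(log_gg)
            * s ** (lam - 1.0) / (1.0 - p * s) ** (lam + 1.0))

def exggg_hazard(x, alpha, tau, k, p, lam):
    return exggg_pdf(x, alpha, tau, k, p, lam) / exggg_sf(x, alpha, tau, k, p, lam)
```

For τ = k = λ = 1 the sketch reduces to the EG density (1 − p) e^{−x/α} / {α (1 − p e^{−x/α})²}, which gives a quick sanity check. Far in the right tail the GG survival can underflow to zero, so the pdf and hazard should only be evaluated where the survival is numerically positive.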
Note in Figure 1(a) that the density function of the ExGGG distribution has very flexible shapes, including bimodal ones. This is a great advantage of the proposed distribution over its sub-models, none of which has a bimodal density. Figure 1 indicates that α is a scale parameter, whereas τ, k, p and λ are shape parameters. The hazard rate function also presents some peculiar shapes. For instance, the blue hazard rate function in Figure 2(a) is initially increasing, then decreasing and finally increasing again.
The ExGGG distribution has an attractive physical interpretation whenever λ is a positive integer. Consider a device made of λ independent components in a series system, whose lifetimes are identically distributed according to the GGG cdf G(x) corresponding to (1). The device fails if any component fails. Let $X_1, \ldots, X_\lambda$ denote the lifetimes of the components, with common cdf G(x), and let X denote the lifetime of the device. Thus, the cdf F(x) of X is
$$F(x) = P(\min\{X_1, \ldots, X_\lambda\} \le x) = 1 - [1 - G(x)]^{\lambda}.$$
So, the lifetime of the device obeys the ExGGG distribution.
Figure panel titles: ExGGG(α, τ = 1.2, k = 3, p = 0.2, λ = 1.2); ExGGG(α = 2, τ = 2.5, k, p = 0.9, λ = 0.15).
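The series-system interpretation can be illustrated by simulation. The sketch below is written under two assumptions that should be checked against the cited papers: a GGG lifetime is generated as the minimum of N independent GG lifetimes with P(N = n) = (1 − p)p^{n−1} (the Adamidis and Loukas mixing scheme), and a GG(α, τ, k) variate is obtained as α W^{1/τ} with W ∼ Gamma(k, 1). The function names are hypothetical. For τ = k = 1 the device lifetime has the closed-form cdf 1 − [(1 − p)e^{−x/α} / (1 − p e^{−x/α})]^λ, which is compared with the empirical cdf.

```python
import math
import random

def sample_gg(rng, alpha, tau, k):
    # If W ~ Gamma(k, 1), then alpha * W**(1/tau) follows GG(alpha, tau, k).
    return alpha * rng.gammavariate(k, 1.0) ** (1.0 / tau)

def sample_ggg(rng, alpha, tau, k, p):
    # Assumed construction: minimum of N i.i.d. GG lifetimes,
    # with N geometric on {1, 2, ...}, P(N = n) = (1 - p) * p**(n - 1).
    u = 1.0 - rng.random()                      # uniform on (0, 1]
    n = 1 + int(math.log(u) / math.log(p))      # inverse-cdf geometric draw
    return min(sample_gg(rng, alpha, tau, k) for _ in range(n))

def sample_series_device(rng, alpha, tau, k, p, lam):
    # A series system of lam components fails at the first component failure.
    return min(sample_ggg(rng, alpha, tau, k, p) for _ in range(lam))

def exeg_cdf(x, alpha, p, lam):
    # Closed-form device cdf in the special case tau = k = 1.
    s = math.exp(-x / alpha)
    return 1.0 - ((1.0 - p) * s / (1.0 - p * s)) ** lam

if __name__ == "__main__":
    rng = random.Random(2024)
    draws = [sample_series_device(rng, 2.0, 1.0, 1.0, 0.4, 3) for _ in range(20_000)]
    empirical = sum(d <= 0.5 for d in draws) / len(draws)
    print("empirical cdf at 0.5:", round(empirical, 3))
    print("closed-form cdf at 0.5:", round(exeg_cdf(0.5, 2.0, 0.4, 3), 3))
```

The two printed values should agree up to Monte Carlo error, which is of order 0.004 for 20,000 draws.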

Expansion of the Density Function
Now, we demonstrate that the density function (3) can be expressed as a linear combination of GG density functions. This result is important because it yields mathematical properties of the ExGGG distribution directly from those of the GG distribution.
Let $g_{\alpha,\tau,k}(x)$ be the density function of the GG(α, τ, k) distribution, given by
$$g_{\alpha,\tau,k}(x) = \frac{\tau}{\alpha\,\Gamma(k)} \left(\frac{x}{\alpha}\right)^{\tau k - 1} \exp\left[-\left(\frac{x}{\alpha}\right)^{\tau}\right], \quad x > 0.$$
For |z| < 1 and ρ ∈ R, we consider the power series expansion (4). Substituting (4) into (3), grouping common terms, and using (4) again together with the binomial expansion, the pdf of the ExGGG(α, τ, k, p, λ) distribution takes the form (5), whose coefficients $s_{j,m}(\lambda, p)$ are defined by expression (6). Therefore, using result (28) (given in Appendix A) in expression (5), the pdf f(x) can be written as a linear combination of GG densities, in the form
$$f(x) = \sum_{j,m,q=0}^{\infty} w_{j,m,q}(k, p, \lambda)\, g_{\alpha,\tau,k^{\bullet}}(x), \tag{7}$$
where $k^{\bullet} = k(j + m + 1) + q$, $g_{\alpha,\tau,k^{\bullet}}(x)$ is the GG(α, τ, $k^{\bullet}$) density, the weights $w_{j,m,q}(k, p, \lambda)$ are given by (8) and the coefficients $c_{j+m,q}$ are determined from the recurrence relation (27) (Appendix A).
Expression (7) shows that the ExGGG density function can be written as a linear combination of GG densities.

Moments
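Some important features of a distribution, such as dispersion, skewness and kurtosis, can be studied through its moments. In this section we obtain two alternative expansions for the moments of the ExGGG distribution. First, recall that the rth ordinary moment of the GG(α, τ, k) distribution, denoted by $\mu'_{r,GG}$, is
$$\mu'_{r,GG} = \frac{\alpha^{r}\,\Gamma(k + r/\tau)}{\Gamma(k)}. \tag{9}$$
It then follows from expressions (7) and (9) that the rth ordinary moment of the ExGGG(α, τ, k, p, λ) distribution is given by
$$\mu'_r = \alpha^{r} \sum_{j,m,q=0}^{\infty} w_{j,m,q}(k, p, \lambda)\, \frac{\Gamma(k^{\bullet} + r/\tau)}{\Gamma(k^{\bullet})}, \tag{10}$$
where $k^{\bullet} = k(j + m + 1) + q$. Expression (10) depends on the quantities $c_{j+m,q}$, which are obtained recursively from (27).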
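The GG moment formula α^r Γ(k + r/τ)/Γ(k), which is the building block of the first expansion, is easy to evaluate numerically. The following minimal Python sketch computes it on the log scale and derives the skewness and kurtosis of a GG distribution from the first four moments. The function names are illustrative; the ExGGG moments would additionally require the weights of the linear combination, which depend on the Appendix A recurrence and are not reproduced here.

```python
import math

def gg_raw_moment(r, alpha, tau, k):
    """r-th ordinary moment of GG(alpha, tau, k): alpha^r * Gamma(k + r/tau) / Gamma(k)."""
    # Work with log-gamma values to avoid overflow for large k.
    return alpha ** r * math.exp(math.lgamma(k + r / tau) - math.lgamma(k))

def gg_skewness_kurtosis(alpha, tau, k):
    """Skewness and kurtosis of GG(alpha, tau, k) from its first four raw moments."""
    m1, m2, m3, m4 = (gg_raw_moment(r, alpha, tau, k) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2

if __name__ == "__main__":
    # Exponential special case (tau = k = 1): skewness 2 and kurtosis 9.
    print(gg_skewness_kurtosis(2.0, 1.0, 1.0))
```

The exponential case (τ = k = 1) is a convenient check because its mean, skewness and kurtosis are α, 2 and 9.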
Another infinite sum representation for $\mu'_r$ is obtained by computing the moment directly from the pdf (3), which gives expression (11) after a change of variable in terms of $(x/\alpha)^{\tau}$. Applying (4) twice in (11), then the binomial expansion, and rewriting the resulting sums so that the inner index starts at l = m in expression (12), $\mu'_r$ can be rewritten as a series whose coefficients $s_{j,m}(\lambda, p)$ are defined by expression (6) and whose terms involve an integral. This integral can be determined from expressions (24) and (25) of Nadarajah (2008) in terms of the Lauricella function of type A (Exton, 1978; Aarts, 2000), defined by
$$F_A^{(n)}(a; b_1, \ldots, b_n; c_1, \ldots, c_n; x_1, \ldots, x_n) = \sum_{m_1=0}^{\infty} \cdots \sum_{m_n=0}^{\infty} \frac{(a)_{m_1 + \cdots + m_n}\,(b_1)_{m_1} \cdots (b_n)_{m_n}}{(c_1)_{m_1} \cdots (c_n)_{m_n}}\, \frac{x_1^{m_1} \cdots x_n^{m_n}}{m_1! \cdots m_n!},$$
where $(a)_i$ is the ascending factorial defined by $(a)_i = a(a + 1) \cdots (a + i - 1)$, with $(a)_0 = 1$. Numerical routines for the direct computation of the Lauricella function of type A are available (Exton, 1978; Trott, 2006). Graphical representations of the skewness and kurtosis measures as functions of λ, for selected values of α, τ, k and p, are shown in Figure 3.

Moment Generating Function
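Here, we provide two expressions for the mgf of the ExGGG distribution based on the mgf of the GG distribution. The mgf of the GG(α, τ, k) distribution is
$$M_{\alpha,\tau,k}(s) = \frac{\tau}{\alpha\,\Gamma(k)} \int_0^{\infty} \exp(sx) \left(\frac{x}{\alpha}\right)^{\tau k - 1} \exp\left[-\left(\frac{x}{\alpha}\right)^{\tau}\right] dx.$$
Expanding the first exponential in a Taylor series and using $\int_0^{\infty} x^{i}\, g_{\alpha,\tau,k}(x)\,dx = \alpha^{i}\,\Gamma(k + i/\tau)/\Gamma(k)$, we obtain
$$M_{\alpha,\tau,k}(s) = \frac{1}{\Gamma(k)} \sum_{i=0}^{\infty} \frac{(s\alpha)^{i}}{i!}\, \Gamma\left(k + \frac{i}{\tau}\right). \tag{13}$$
Using the result in (7), the mgf of the ExGGG(α, τ, k, p, λ) distribution is given by
$$M(s) = \sum_{j,m,q=0}^{\infty} w_{j,m,q}(k, p, \lambda)\, M_{\alpha,\tau,k^{\bullet}}(s),$$
where $k^{\bullet} = k(j + m + 1) + q$ and $w_{j,m,q}(k, p, \lambda)$ is given by (8). Therefore,
$$M(s) = \sum_{j,m,q=0}^{\infty} \frac{w_{j,m,q}(k, p, \lambda)}{\Gamma(k^{\bullet})} \sum_{i=0}^{\infty} \frac{(s\alpha)^{i}}{i!}\, \Gamma\left(k^{\bullet} + \frac{i}{\tau}\right).$$
However, for τ > 1, the GG mgf can be simplified by considering the Wright generalized hypergeometric function (Wright, 1935), defined by
$${}_p\Psi_q\left[\begin{matrix}(a_1, A_1), \ldots, (a_p, A_p) \\ (b_1, B_1), \ldots, (b_q, B_q)\end{matrix};\, x\right] = \sum_{i=0}^{\infty} \frac{\prod_{j=1}^{p} \Gamma(a_j + A_j i)}{\prod_{j=1}^{q} \Gamma(b_j + B_j i)}\, \frac{x^{i}}{i!}. \tag{14}$$
This function exists if $1 + \sum_{j=1}^{q} B_j - \sum_{j=1}^{p} A_j > 0$. Combining the results in (14) to rewrite (13), we have
$$M_{\alpha,\tau,k}(s) = \frac{1}{\Gamma(k)}\; {}_1\Psi_0\left[\begin{matrix}(k, \tau^{-1}) \\ - \end{matrix};\, s\alpha\right]. \tag{15}$$
Finally, the mgf of the ExGGG distribution can be written from expressions (7) and (15) as
$$M(s) = \sum_{j,m,q=0}^{\infty} \frac{w_{j,m,q}(k, p, \lambda)}{\Gamma(k^{\bullet})}\; {}_1\Psi_0\left[\begin{matrix}(k^{\bullet}, \tau^{-1}) \\ - \end{matrix};\, s\alpha\right].$$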
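The series form of the GG mgf, Σ_i (sα)^i Γ(k + i/τ) / {i! Γ(k)}, can be checked numerically by truncation. The sketch below is a minimal illustration and not a production routine: the function name and the default number of terms are assumptions, and the truncation is only sensible for τ > 1, or for τ = 1 with |sα| < 1, since the terms grow without bound for τ < 1.

```python
import math

def gg_mgf_series(s, alpha, tau, k, n_terms=200):
    """Truncated series for the mgf of GG(alpha, tau, k).

    Intended for tau > 1 (or tau = 1 with |s * alpha| < 1), where the
    terms of the series decay.
    """
    total = 0.0
    for i in range(n_terms):
        # Gamma(k + i/tau) / (Gamma(k) * i!) evaluated on the log scale.
        log_coef = math.lgamma(k + i / tau) - math.lgamma(k) - math.lgamma(i + 1)
        total += (s * alpha) ** i * math.exp(log_coef)
    return total

if __name__ == "__main__":
    # Exponential case (tau = k = 1): the mgf is 1 / (1 - s * alpha) for s * alpha < 1.
    print(gg_mgf_series(0.15, 2.0, 1.0, 1.0), 1.0 / (1.0 - 0.3))
```

A second check is that the numerical derivative of the truncated series at s = 0 recovers the GG mean α Γ(k + 1/τ)/Γ(k).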

Mean Deviations
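The amount of scatter in a population is evidently measured to some extent by the totality of deviations from the mean and median. If X has the ExGGG distribution with density function f(x), we can derive the mean deviations about the mean $\mu'_1 = E(X)$ and about the median $m_1$ from the relations
$$\delta_1 = \int_0^{\infty} |x - \mu'_1|\, f(x)\,dx \quad \text{and} \quad \delta_2 = \int_0^{\infty} |x - m_1|\, f(x)\,dx.$$
The measures $\delta_1$ and $\delta_2$ can be expressed as
$$\delta_1 = 2\mu'_1 F(\mu'_1) - 2 I(\mu'_1) \quad \text{and} \quad \delta_2 = \mu'_1 - 2 I(m_1), \tag{16}$$
where $I(a) = \int_0^{a} x f(x)\,dx$. The integral $I(m_1)$ can be obtained from expression (7) as
$$I(m_1) = \sum_{j,m,q=0}^{\infty} w_{j,m,q}(k, p, \lambda)\, J(\alpha, \tau, k^{\bullet}, m_1), \qquad J(\alpha, \tau, k^{\bullet}, m_1) = \int_0^{m_1} x\, g_{\alpha,\tau,k^{\bullet}}(x)\,dx. \tag{17}$$
By setting u = x/α in expression (17), we obtain
$$J(\alpha, \tau, k^{\bullet}, m_1) = \frac{\alpha\,\tau}{\Gamma(k^{\bullet})} \int_0^{m_1/\alpha} u^{\tau k^{\bullet}} \exp(-u^{\tau})\,du.$$
The substitution $w = u^{\tau}$ yields $J(\alpha, \tau, k^{\bullet}, m_1)$ in terms of the incomplete gamma function $\gamma(a, x) = \int_0^{x} t^{a-1} e^{-t}\,dt$:
$$J(\alpha, \tau, k^{\bullet}, m_1) = \frac{\alpha}{\Gamma(k^{\bullet})}\, \gamma\left(k^{\bullet} + \tau^{-1}, \left(\frac{m_1}{\alpha}\right)^{\tau}\right).$$
Hence, inserting the last result into (17) gives
$$I(m_1) = \alpha \sum_{j,m,q=0}^{\infty} \frac{w_{j,m,q}(k, p, \lambda)}{\Gamma(k^{\bullet})}\, \gamma\left(k^{\bullet} + \tau^{-1}, \left(\frac{m_1}{\alpha}\right)^{\tau}\right).$$
The ExGGG mean deviations follow from (16) and the last expression. The result for $I(\mu'_1)$ is analogous.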
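The incomplete gamma form of the partial mean can be illustrated for a single GG component. The sketch below assumes the closed form ∫_0^q x g(x) dx = α γ(k + 1/τ, (q/α)^τ)/Γ(k) for the GG(α, τ, k) density g and the standard identity δ1 = 2μF(μ) − 2I(μ) for the mean deviation about the mean. The function names are illustrative, and the incomplete gamma function is summed from its power series purely to keep the example self-contained. The ExGGG case would combine such terms with the weights of the linear combination, which are not reproduced here.

```python
import math

def lower_gamma(a, x, tol=1e-15, max_iter=100_000):
    """Lower incomplete gamma function gamma(a, x), summed from its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    ap = a
    for _ in range(max_iter):
        ap += 1.0
        term *= x / ap
        total += term
        if term < total * tol:
            break
    return total * math.exp(a * math.log(x) - x)

def gg_cdf(x, alpha, tau, k):
    """cdf of GG(alpha, tau, k): gamma(k, (x/alpha)^tau) / Gamma(k)."""
    return lower_gamma(k, (x / alpha) ** tau) / math.gamma(k)

def gg_partial_mean(q, alpha, tau, k):
    """Integral of x * g(x) over (0, q) for the GG(alpha, tau, k) density g."""
    return alpha * lower_gamma(k + 1.0 / tau, (q / alpha) ** tau) / math.gamma(k)

def gg_mean_deviation_about_mean(alpha, tau, k):
    """delta_1 = 2 * mu * F(mu) - 2 * I(mu) for the GG distribution."""
    mean = alpha * math.gamma(k + 1.0 / tau) / math.gamma(k)
    return (2.0 * mean * gg_cdf(mean, alpha, tau, k)
            - 2.0 * gg_partial_mean(mean, alpha, tau, k))

if __name__ == "__main__":
    # Exponential case (tau = k = 1): the mean deviation about the mean is 2 * alpha / e.
    print(gg_mean_deviation_about_mean(2.0, 1.0, 1.0), 4.0 / math.e)
```

Because the power series is used for all arguments, the sketch is best suited to moderate values of (q/α)^τ; math.gamma also overflows for shape values above roughly 171.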

Reliability
The reliability measure of the ExGGG distribution is determined from the expression presented by Cordeiro et al. (2016), in which f(x) and F(x) are calculated from (7) and (2), respectively. Writing the reliability explicitly and applying expansion (4) twice gives expression (18). Using the binomial expansion in expression (18), followed by a change of variable, reduces the reliability to a series involving the weights $w_{j,m,q}(k, p, \lambda)$, defined by expression (8), and an integral. Using the Lauricella function of type A (defined in Section 4), this last integral can be expressed in terms of that function.

Order Statistics
The density function $f_{i:n}(x)$ of the ith order statistic, say $X_{i:n}$, for i = 1, ..., n, from random variables $X_1, \ldots, X_n$ having density (3), is given by
$$f_{i:n}(x) = \frac{1}{B(i, n - i + 1)}\, f(x)\, F(x)^{i-1}\, [1 - F(x)]^{n-i},$$
where f(x) and F(x) are the pdf and cdf of the ExGGG distribution, respectively, and B(·, ·) denotes the beta function. Using the binomial expansion, we readily obtain
$$f_{i:n}(x) = \frac{1}{B(i, n - i + 1)} \sum_{l=0}^{n-i} (-1)^{l} \binom{n-i}{l} f(x)\, F(x)^{i+l-1}. \tag{20}$$
Now, if F(x) is the cdf of the ExGGG distribution defined in (2) and u is a positive integer, $F(x)^{u}$ can be expanded using the binomial expansion and (4) twice, followed by a further binomial expansion, which leads to expression (21) for $f(x) F(x)^{u}$. Inserting $f(x) F(x)^{u}$ given by (21) in expression (20), applying expansion (28) and rearranging terms, the density function of the ith ExGGG order statistic is expressed as the linear combination (22) of GG(α, τ, $k^{\bullet}$) densities, in which $c_{j+m,q}$ is calculated recursively from (27).
Expression (22) gives the density function of the ith order statistic as a linear combination of GG densities. Hence, some mathematical quantities of the ExGGG order statistics can be derived from those of the GG distribution. For example, the rth ordinary moment and the mgf of $X_{i:n}$ are the corresponding linear combinations of the GG moments (9) and of the GG mgfs $M_{\alpha,\tau,k^{\bullet}}(s)$, which can be calculated from expression (13) or (15).
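The generic order-statistic density f(x) F(x)^{i−1} [1 − F(x)]^{n−i} / B(i, n − i + 1), and its binomial expansion in powers of F(x), can be verified numerically for any parent distribution. The following minimal Python sketch takes the parent pdf and cdf as arguments, so an ExGGG implementation could be plugged in; the function names are illustrative and the uniform parent in the usage lines is chosen only because the answer is known in closed form.

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def order_stat_pdf(x, i, n, pdf, cdf):
    """Density of the i-th order statistic out of n, in product form."""
    F = cdf(x)
    return pdf(x) * F ** (i - 1) * (1.0 - F) ** (n - i) / beta_fn(i, n - i + 1)

def order_stat_pdf_expanded(x, i, n, pdf, cdf):
    """Same density after expanding [1 - F(x)]^(n - i) with the binomial theorem."""
    F = cdf(x)
    total = sum((-1) ** l * math.comb(n - i, l) * F ** (i + l - 1)
                for l in range(n - i + 1))
    return pdf(x) * total / beta_fn(i, n - i + 1)

if __name__ == "__main__":
    # Uniform(0, 1) parent, i = 3, n = 5: density 30 * x^2 * (1 - x)^2, i.e. 1.728 at x = 0.4.
    uniform_pdf = lambda x: 1.0
    uniform_cdf = lambda x: x
    print(order_stat_pdf(0.4, 3, 5, uniform_pdf, uniform_cdf))
    print(order_stat_pdf_expanded(0.4, 3, 5, uniform_pdf, uniform_cdf))
```

The alternating sum in the expanded form can lose accuracy when n − i is large, which is one practical reason to prefer the product form for direct evaluation.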

Inference and Estimation
In this section, we discuss the maximum likelihood method and a Bayesian approach for inference on and estimation of the ExGGG model parameters. We also assess the performance of the maximum likelihood method for estimating the ExGGG parameters using Monte Carlo simulation.

Maximum Likelihood Estimation
Here, we consider the estimation of the parameters of the ExGGG distribution by the maximum likelihood method.
Let $X_i$ be a random variable following (3) with parameter vector $\theta = (\alpha, \tau, k, p, \lambda)^{T}$. Suppose that the data consist of n independent observations $x_i$ of $X_i$, for i = 1, ..., n. Parametric inference for such data is usually based on likelihood methods and their asymptotic theory. The log-likelihood ℓ(θ) for the model parameters is the sum of the logarithms of the density (3) evaluated at the observations, $\ell(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta)$. The maximum likelihood estimate (MLE) $\hat{\theta}$ of θ is obtained numerically by solving the nonlinear equations that set the score components, i.e., the partial derivatives of ℓ(θ) with respect to α, τ, k, p and λ, equal to zero. For interval estimation and hypothesis tests on the model parameters, we require the 5 × 5 unit observed information matrix, say J(θ), whose elements are given in Appendix B. Under conditions that are fulfilled for parameters in the interior of the parameter space but not on the boundary, the asymptotic distribution of $\sqrt{n}(\hat{\theta} - \theta)$ is $N_5(0, I(\theta)^{-1})$, where I(θ) is the expected information matrix. This matrix can be replaced by $J(\hat{\theta})$, i.e., the observed information matrix evaluated at $\hat{\theta}$. The multivariate normal $N_5(0, J(\hat{\theta})^{-1})$ distribution can be used to construct approximate confidence intervals for the individual parameters.
For Bayesian estimation of the parameters of the ExGGG distribution, the following prior distributions were considered: α ∼ Γ(0.01, 0.01), τ ∼ Γ(0.01, 0.01), k ∼ Γ(0.01, 0.01), p ∼ Be(0.5, 0.5) and λ ∼ Γ(0.01, 0.01). The estimation was done using the Metropolis-Hastings algorithm: for each parameter, two independent chains of 100,000 iterations each were generated, the first 10,000 iterations were discarded as burn-in and every 10th draw was retained. In the end, 9,000 samples were obtained for each parameter in each chain. The convergence of the chains was monitored using the test of Gelman and Rubin (1992) ($\hat{R}$), which indicated that all the parameters converged. Table 8 shows the posterior means, standard errors and 95% highest posterior density (HPD) intervals. The posterior means are similar to the MLEs. The approximate posterior marginal density functions for the parameters are presented in Figure 6.
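To make the likelihood concrete, the following minimal Python sketch evaluates the log-likelihood of the ExWG sub-model (k = 1), for which the incomplete gamma ratio has the closed form 1 − exp[−(x/α)^τ]. It assumes the density λτ(1 − p)^λ α^{−1} (x/α)^{τ−1} exp[−λ(x/α)^τ] {1 − p exp[−(x/α)^τ]}^{−(λ+1)}, obtained by applying the Lehmann type II construction to the WG distribution; this form, the function names and the data in the usage lines are illustrative assumptions, not values from the application. In practice the function would be handed to a numerical optimizer, and the full ExGGG case would replace the closed-form survival by the incomplete gamma ratio.

```python
import math

def exwg_logpdf(x, alpha, tau, p, lam):
    """Log-density of the assumed ExWG form (the ExGGG sub-model with k = 1)."""
    w = (x / alpha) ** tau
    return (math.log(lam * tau / alpha)
            + lam * math.log1p(-p)
            + (tau - 1.0) * math.log(x / alpha)
            - lam * w
            - (lam + 1.0) * math.log1p(-p * math.exp(-w)))

def exwg_loglik(data, alpha, tau, p, lam):
    """Log-likelihood: sum of log-densities over independent observations."""
    return sum(exwg_logpdf(x, alpha, tau, p, lam) for x in data)

if __name__ == "__main__":
    # Hypothetical observations, used only to show the call.
    data = [1.2, 3.4, 0.8, 5.1, 2.7]
    print(exwg_loglik(data, alpha=3.0, tau=1.5, p=0.3, lam=0.8))
```

Working with log1p keeps the terms log(1 − p) and log(1 − p e^{−w}) accurate when p or p e^{−w} is small.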
Considering the MLEs for the ExGGG and ExWG distributions (Table 5) and their estimated survival functions (Figure 5), we obtain some useful results. The estimated median permanence time of Brazilian immigrants in Japan is approximately thirteen years and nine months under the ExGGG distribution and twelve years and one month under the ExWG distribution. The probability that a Brazilian immigrant stays less than five years in Japan is 15.25% under the ExGGG distribution and 17.55% under the ExWG distribution. The probability that an immigrant remains less than twenty years is 89.60% under the ExGGG distribution and 87.08% under the ExWG distribution.
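Quantities such as the median and the probability of staying less than a given time follow directly from the fitted cdf. For the ExWG sub-model the cdf can be inverted in closed form, as in the minimal sketch below. It assumes the cdf F(x) = 1 − {(1 − p)e^{−w} / (1 − p e^{−w})}^λ with w = (x/α)^τ; the function names are illustrative and the parameter values in the usage lines are hypothetical, since the estimates of Table 5 are not reproduced here.

```python
import math

def exwg_cdf(x, alpha, tau, p, lam):
    """cdf of the assumed ExWG form (ExGGG with k = 1)."""
    s = math.exp(-((x / alpha) ** tau))
    return 1.0 - ((1.0 - p) * s / (1.0 - p * s)) ** lam

def exwg_quantile(u, alpha, tau, p, lam):
    """Closed-form inverse of exwg_cdf for 0 < u < 1."""
    s = (1.0 - u) ** (1.0 / lam)            # Weibull-geometric survival at the quantile
    w = math.log((1.0 - p + p * s) / s)     # solves (1 - p) e^{-w} / (1 - p e^{-w}) = s
    return alpha * w ** (1.0 / tau)

if __name__ == "__main__":
    # Hypothetical parameter values (time measured in years), not the fitted ones.
    params = dict(alpha=15.0, tau=2.0, p=0.3, lam=1.2)
    print("median:", exwg_quantile(0.5, **params))
    print("P(X < 5):", exwg_cdf(5.0, **params))
    print("P(X < 20):", exwg_cdf(20.0, **params))
```

The same quantile function also gives an inverse-cdf sampler: applying it to uniform random numbers produces ExWG variates under the stated assumption.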

Conclusions
We introduced a new five-parameter distribution called the extended generalized gamma geometric (ExGGG) distribution, whose special sub-models include the exponential geometric and Rayleigh geometric distributions, among others. The ExGGG distribution is therefore suggested for a variety of problems in modeling lifetime data, including bimodal and skewed data. We demonstrated that the ExGGG density function can be expressed as a mixture of GG density functions. We derived explicit expressions for the moments, moment generating function, mean deviations, reliability and order statistics. The estimation of the parameters was approached by the method of maximum likelihood and by a Bayesian method. Additionally, the observed information matrix was determined. Furthermore, an application of the ExGGG distribution to real data showed that it could provide a better fit than other statistical models frequently used in lifetime data analysis.

Figure 1. Plots of the ExGGG density for some parameter values.

Figure 2. The ExGGG hazard rate function. (a) Plots of the hazard rate function for some parameter values. (b) Unimodal hazard rate function. (c) Bathtub hazard rate function. (d) Increasing and decreasing hazard rate function.

Figure 3. Skewness and kurtosis of the ExGGG distribution as a function of the parameter λ.

Figure 4. Histogram and fitted density functions for the permanence time data in Japan. (a) Fitted ExGGG, GGG, GG and gamma distributions. (b) Fitted ExWG, WG, Weibull and exponential distributions.

Figure 6. Approximate posterior marginal densities for the parameters from the ExGGG model for the permanence time data in Japan.

Table 2. Empirical means and the MSEs in parentheses.

Table 3. Permanence time (years) in Japan of the Brazilian immigrants (n = 147).

Table 4. Descriptive statistics of the permanence time in Japan.

Table 5. MLEs of the model parameters for the permanence time data in Japan and the corresponding SEs in parentheses.

Table 6. Goodness-of-fit statistics for the permanence time data in Japan and the corresponding p-values in parentheses.

Table 8. Posterior summaries for the parameters from the ExGGG model for the permanence time data in Japan.