Dagum Distribution: Properties and Different Methods of Estimation

This article addresses the various properties and different methods of estimation of the unknown parameters of a three-parameter Dagum distribution from the frequentist point of view. Although our main focus is on estimation from the frequentist point of view, various mathematical and statistical properties of the Dagum distribution (such as quantiles, moments, moment generating function, hazard rate, mean residual lifetime, mean past lifetime, mean deviation about the mean and median, various entropies, Bonferroni and Lorenz curves, and order statistics) are also derived. We briefly describe different frequentist approaches, namely maximum likelihood estimators, moments estimators, L-moment estimators, percentile-based estimators, least squares estimators, maximum product of spacings estimators, minimum distance estimators, Cramér-von Mises estimators, and Anderson-Darling and right-tail Anderson-Darling estimators, and compare them using extensive numerical simulations. Monte Carlo simulations are performed to compare the performance of the proposed methods of estimation for both small and large samples. Finally, a real data set is analyzed for illustrative purposes.


Introduction
The Dagum distribution was introduced by Dagum (Dagum, C., 1977) for modeling personal income data as an alternative to the Pareto and log-normal models. This distribution has been used extensively in various fields, such as income and wealth data, meteorological data, and reliability and survival analysis. The Dagum distribution is also known as the inverse Burr XII distribution, especially in the actuarial literature. An important characteristic of the Dagum distribution is that its hazard function can be monotonically decreasing, upside-down bathtub shaped, or bathtub and then upside-down bathtub shaped; for details see Domma (Domma, F., 2002). This behavior has led several authors to study the model in different fields. In fact, the Dagum distribution has recently been studied from a reliability point of view and used to analyze survival data (Domma, et al., 2011; Domma, et al., 2013). Kleiber and Kotz (Kotz, S., 2003) and Kleiber (Kleiber, C., 2008) provided an exhaustive review of the origin of the Dagum model and its applications. Domma et al. (Domma, et al., 2011) estimated the parameters of the Dagum distribution with censored samples. Shahzad and Asghar (Shahzad, M. N., & Asghar, Z., 2013) used TL-moments to estimate the parameters of this distribution. Oluyede and Ye (Oluyede, B. O., & Ye, Y., 2014) presented the class of weighted Dagum and related distributions. Domma and Condino (Domma, F., & Condino, F., 2013) proposed the five-parameter beta-Dagum distribution.
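A continuous random variable T is said to have a three-parameter Dagum distribution, abbreviated as T ∼ Dag(β, λ, δ), if its probability density function (pdf) is given as
$$f(t; \beta, \lambda, \delta) = \beta \lambda \delta\, t^{-\delta-1} \left(1 + \lambda t^{-\delta}\right)^{-\beta-1}, \quad t > 0, \tag{1}$$
where λ > 0 is the scale parameter and its two shape parameters β and δ are both positive.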
The corresponding distribution function of (1) is given by
$$F(t; \beta, \lambda, \delta) = \left(1 + \lambda t^{-\delta}\right)^{-\beta}, \quad t > 0. \tag{2}$$
The main aim of this paper is to consider different estimation methods and study how the estimators behave for different sample sizes and for different parameter values. We mainly compare the maximum likelihood estimators, moments estimators, L-moment estimators, percentile-based estimators, least squares estimators, maximum product of spacings estimators, minimum distance estimators, Cramér-von Mises estimators, and Anderson-Darling and right-tail Anderson-Darling estimators.
Maximum likelihood estimation (MLE) and the method of moments estimation (MME) are the traditional methods of estimation. Although MLE is advantageous in terms of its efficiency and its nice theoretical properties, there is evidence that it does not perform well, especially for small samples. The method of moments is easily applicable and often gives explicit forms for estimators of unknown parameters. There are, however, cases where the method of moments does not give explicit estimators, e.g., for the parameters of the Weibull and Gompertz distributions. Therefore, other methods have been proposed in the literature as alternatives to the traditional methods of estimation. Among them, the L-moments estimator (LME), the least squares estimator (LSE), the generalized spacing estimator (GSE) and the percentile estimator (PE) are often suggested. Generally, these methods do not have good theoretical properties, but in some cases they can provide better estimates of the unknown parameters than the MLE and the MME.
The appeal of the methods of estimation varies from user to user and according to the area of application. This paper considers ten different frequentist estimators for the Dagum distribution and evaluates their performance for different sample sizes and different parameter values. Simulations are used to compare the performance because it is not possible to compare all estimators theoretically; see Gupta and Kundu (Gupta, R. D. & Kundu, D., 2001; Gupta, R. D. & Kundu, D., 2007).
The paper is organized as follows. In Section 2, we study the mathematical and statistical properties of the distribution. Section 3 deals with parameter estimation; the simulation study and a real data application are presented in Section 4. The paper ends with a brief conclusion in Section 5.

Some Mathematical and Statistical Properties
In this section, we provide some important mathematical and statistical properties of the Dagum distribution, such as quantiles, moments, moment generating function, hazard rate and mean residual life functions, conditional moments, mean deviation, Bonferroni and Lorenz curves, and Rényi and Shannon entropies.

Shape of Pdf
The study of shapes is useful to determine if a data set can be modeled by the Dag(β, λ, δ) distribution. The Dagum density tends to 0 as x → ∞; as x → 0 it tends to ∞ when βδ < 1, to λ^(−β) when βδ = 1, and to 0 when βδ > 1. The following theorem gives simple conditions under which the pdf (1) is decreasing or unimodal.
Proof. The first derivative of ln f(x) is
$$(\ln f(x))' = \frac{\lambda(\beta\delta - 1)\,x^{-\delta} - (\delta + 1)}{x\left(1 + \lambda x^{-\delta}\right)}.$$
Now, it is easy to see that for βδ ≤ 1 the function (ln f(x))′ is negative, which implies that the pdf f(x) is a decreasing function with f(∞) = 0 (and f(0) = ∞ when βδ < 1). Consider now the case βδ > 1: the function (ln f(x))′ is positive for x < x₀ and negative for x > x₀, where x₀ = [λ(βδ − 1)/(δ + 1)]^(1/δ), so the pdf is unimodal with mode x₀. □
Figure 1 shows different shapes of the pdf of the Dagum distribution for various parameter specifications.
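The shape result can be checked numerically. The following is a minimal sketch, assuming the density f(x) = βλδ x^(−δ−1) (1 + λx^(−δ))^(−β−1) and the stationary point x₀ = [λ(βδ − 1)/(δ + 1)]^(1/δ) obtained by setting the derivative of ln f to zero; the function names are illustrative and are not taken from the paper.

```python
def dagum_pdf(x, beta, lam, delta):
    """Assumed Dagum density, valid for x > 0 and positive parameters."""
    return (beta * lam * delta * x ** (-delta - 1.0)
            * (1.0 + lam * x ** (-delta)) ** (-beta - 1.0))


def dagum_mode(beta, lam, delta):
    """Interior mode when beta*delta > 1; None when the density is decreasing."""
    if beta * delta <= 1.0:
        return None
    return (lam * (beta * delta - 1.0) / (delta + 1.0)) ** (1.0 / delta)


# Example: beta*delta = 3 > 1, so an interior mode exists.
x0 = dagum_mode(1.0, 1.0, 3.0)
values_near_mode = [dagum_pdf(c * x0, 1.0, 1.0, 3.0) for c in (0.9, 1.0, 1.1)]
```

In the example, the middle entry of `values_near_mode` is the largest, as expected at a mode.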

Quantile Function
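Let X denote a random variable with the pdf given by (1). The quantile function, denoted by Q(p), is Q(p) = F^(−1)(p), 0 < p < 1. From (2), it follows that the quantile function is
$$Q(p) = \lambda^{1/\delta}\left(p^{-1/\beta} - 1\right)^{-1/\delta}. \tag{3}$$
The first quartile, the median and the third quartile can be obtained simply by applying (3). In particular, for p = 0.5 we have the median of the Dagum distribution as follows:
$$M = \lambda^{1/\delta}\left(2^{1/\beta} - 1\right)^{-1/\delta}. \tag{4}$$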
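Because the quantile function is explicit, samples can be drawn by inverse-transform sampling. The sketch below assumes the cdf F(t) = (1 + λt^(−δ))^(−β) and its inverse Q(p) = λ^(1/δ) (p^(−1/β) − 1)^(−1/δ); the function names and the seed argument are illustrative choices, not part of the paper.

```python
import random


def dagum_cdf(t, beta, lam, delta):
    """Assumed Dagum cdf for t > 0."""
    return (1.0 + lam * t ** (-delta)) ** (-beta)


def dagum_quantile(p, beta, lam, delta):
    """Inverse of the assumed cdf for 0 < p < 1."""
    return lam ** (1.0 / delta) * (p ** (-1.0 / beta) - 1.0) ** (-1.0 / delta)


def dagum_sample(n, beta, lam, delta, seed=0):
    """Draw n values by applying the quantile function to uniform variates."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()          # uniform on [0, 1)
        if u > 0.0:               # skip u == 0, where the quantile is undefined
            draws.append(dagum_quantile(u, beta, lam, delta))
    return draws


median = dagum_quantile(0.5, 2.0, 1.5, 3.0)
sample = dagum_sample(1000, 2.0, 1.5, 3.0, seed=1)
```

About half of the simulated values should fall below `median`, which gives a simple sanity check on the sampler.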

Moments
We hardly need to emphasize the necessity and importance of moments in any statistical analysis, especially in applied work. Some of the most important features and characteristics of a distribution can be studied through moments, e.g., central tendency, dispersion, skewness, and kurtosis.
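If the random variable T is distributed as Dag(β, λ, δ), then its kth moment around zero can be expressed as
$$E[T^k] = \beta \lambda^{k/\delta}\, B\!\left(\beta + \frac{k}{\delta},\, 1 - \frac{k}{\delta}\right), \quad k < \delta, \tag{5}$$
where B(·, ·) is the complete beta function. From relation (5) we obtain the mean, µ = E[T], and the variance, σ² = V(T), of T as follows:
$$\mu = \beta \lambda^{1/\delta}\, B\!\left(\beta + \frac{1}{\delta},\, 1 - \frac{1}{\delta}\right), \quad \delta > 1,$$
$$\sigma^2 = \beta \lambda^{2/\delta}\left[ B\!\left(\beta + \frac{2}{\delta},\, 1 - \frac{2}{\delta}\right) - \beta\, B^2\!\left(\beta + \frac{1}{\delta},\, 1 - \frac{1}{\delta}\right)\right], \quad \delta > 2.$$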
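The moment formula is straightforward to evaluate with log-gamma functions. This is a minimal sketch, assuming the kth raw moment E[T^k] = βλ^(k/δ) B(β + k/δ, 1 − k/δ) for k < δ; the helper names are hypothetical.

```python
import math


def beta_fn(a, b):
    """Complete beta function B(a, b) for a, b > 0, computed via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def dagum_raw_moment(k, beta, lam, delta):
    """k-th raw moment under the assumed formula; it exists only for k < delta."""
    if k >= delta:
        raise ValueError("the k-th moment exists only for k < delta")
    return beta * lam ** (k / delta) * beta_fn(beta + k / delta, 1.0 - k / delta)


def dagum_mean_variance(beta, lam, delta):
    """Mean and variance from the first two raw moments (requires delta > 2)."""
    m1 = dagum_raw_moment(1, beta, lam, delta)
    m2 = dagum_raw_moment(2, beta, lam, delta)
    return m1, m2 - m1 ** 2


mean, variance = dagum_mean_variance(1.0, 1.0, 4.0)
```

The explicit check on k < δ mirrors the heavy right tail of the distribution: requesting a moment of order δ or higher raises an error rather than returning a meaningless number.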

Moment Generating Function
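Many of the interesting characteristics and features of a distribution can be obtained via its moment generating function (mgf) and moments. Let T denote a random variable with the probability density function (1). By the definition of the moment generating function of T and using (1), we have
$$M_T(s) = E\left[e^{sT}\right] = \int_0^\infty e^{st}\, \beta \lambda \delta\, t^{-\delta-1}\left(1 + \lambda t^{-\delta}\right)^{-\beta-1} dt.$$
Consequently, the rth moment of T is
$$\mu'_r = E[T^r] = \beta \lambda^{r/\delta}\, B\!\left(\beta + \frac{r}{\delta},\, 1 - \frac{r}{\delta}\right), \quad r < \delta.$$
The coefficient of variation (CV), skewness (CS) and kurtosis (CK) are, respectively, given by
$$CV = \frac{\sqrt{\mu'_2 - \mu^2}}{\mu}, \qquad CS = \frac{\mu'_3 - 3\mu\mu'_2 + 2\mu^3}{\left(\mu'_2 - \mu^2\right)^{3/2}}, \qquad CK = \frac{\mu'_4 - 4\mu\mu'_3 + 6\mu^2\mu'_2 - 3\mu^4}{\left(\mu'_2 - \mu^2\right)^{2}}.$$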

Hazard Function
Among the basic tools for studying the ageing and reliability characteristics of a system is the hazard rate (HR) function.
The HR gives the rate of failure of the system immediately after time t. Thus the hazard rate function of the Dagum distribution is given by
$$h(t) = \frac{f(t)}{1 - F(t)} = \frac{\beta \lambda \delta\, t^{-\delta-1}\left(1 + \lambda t^{-\delta}\right)^{-\beta-1}}{1 - \left(1 + \lambda t^{-\delta}\right)^{-\beta}}, \quad t > 0.$$
Using Glaser's lemma (Glaser, 1980), it can be proved that if T ∼ Dag(β, λ, δ) then its hazard rate can be decreasing, upside-down bathtub shaped, or bathtub and then upside-down bathtub shaped; see Domma (Domma, 2002).
Figure 2 gives different shapes of the hrf of the Dagum distribution for various parameter specifications.
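The hazard shapes can be explored numerically. The sketch below assumes h(t) = f(t)/(1 − F(t)) with the Dagum pdf and cdf written in terms of 1 + λt^(−δ); the function name is illustrative, and `log1p`/`expm1` are used only to limit cancellation error in the survival function for large t.

```python
import math


def dagum_hazard(t, beta, lam, delta):
    """Hazard rate f(t) / (1 - F(t)) under the assumed Dagum pdf and cdf."""
    log_z = math.log1p(lam * t ** (-delta))          # ln(1 + lam * t^-delta)
    pdf = beta * lam * delta * t ** (-delta - 1.0) * math.exp(-(beta + 1.0) * log_z)
    survival = -math.expm1(-beta * log_z)            # 1 - (1 + lam * t^-delta)^-beta
    return pdf / survival


# beta = lam = delta = 1 gives a decreasing hazard, 1 / (1 + t).
decreasing = [dagum_hazard(t, 1.0, 1.0, 1.0) for t in (0.5, 1.0, 3.0)]
# beta = lam = 1, delta = 3 gives an upside-down bathtub hazard, 3 t^2 / (1 + t^3).
unimodal = [dagum_hazard(t, 1.0, 1.0, 3.0) for t in (0.1, 1.26, 10.0)]
```

The two parameter choices are convenient because the hazard reduces to a simple closed form in each case, so the output can be checked by hand.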

Mean Residual Lifetime
The mean residual life (MRL) is the expected remaining life, T − t, given that the item has survived to time t. Thus, in life testing situations, the expected additional lifetime given that a component has survived until time t is called the MRL.
Since the MRL function is the expected remaining life, t must be subtracted, yielding
$$m(t) = E(T - t \mid T > t) = \frac{1}{1 - F(t)} \int_t^\infty x f(x)\, dx - t.$$

Figure 2. The hrf of the Dagum distribution for various parameter specifications

Mean Past Lifetime (MPL)
In real-life situations, where systems are often not monitored continuously, one might be interested in inferring more about the history of the system, e.g., when the individual components failed. Assume now that a component with lifetime T has failed at or some time before t, t ≥ 0. Consider the conditional random variable t − T | T ≤ t. This conditional random variable shows, in fact, the time elapsed since the failure of the component given that its lifetime is less than or equal to t. Hence, the mean past lifetime (MPL) of the component can be defined as
$$k(t) = E(t - T \mid T \le t) = \frac{1}{F(t)} \int_0^t F(x)\, dx = t - \frac{1}{F(t)} \int_0^t x f(x)\, dx.$$
Since 0 ≤ k(t) ≤ t, one can easily show that k(t) → 0 as t → 0.

Conditional Moments
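For lifetime models, it is also of interest to know what E(T^n | T > t) is. It can be easily seen that
$$E(T^n \mid T > t) = \frac{1}{1 - F(t)} \int_t^\infty x^n f(x)\, dx, \quad n < \delta.$$
In particular, for n = 1 this gives E(T | T > t), and the mean residual lifetime function is E(T | T > t) − t.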

Mean Deviation
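The mean deviations about the mean and the median can be used as measures of spread in a population. Let µ = E(T) and M be the mean and the median of the Dagum distribution, respectively. The mean deviations about the mean and about the median can be calculated as
$$D(\mu) = \int_0^\infty |t - \mu|\, f(t)\, dt = 2\mu F(\mu) - 2\int_0^\mu t f(t)\, dt$$
and
$$D(M) = \int_0^\infty |t - M|\, f(t)\, dt = \mu - 2\int_0^M t f(t)\, dt,$$
respectively.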

Entropies
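The concept of entropy is important in different areas such as physics, probability and statistics, communication theory, economics, etc. Several measures of entropy have been studied and compared in the literature. The entropy of a random variable T is a measure of variation of the uncertainty. If T has the probability density function f(t), then the Shannon entropy (see Shannon (1951)) is defined by
$$H(f) = -\int_0^\infty f(t) \ln f(t)\, dt = -\ln(\beta\lambda\delta) + \frac{\delta + 1}{\delta}\left[\ln\lambda + \psi(\beta) - \psi(1)\right] + \frac{\beta + 1}{\beta},$$
where ψ(·) is the digamma function. The Rényi entropy (Rényi (1961)) of order θ > 0, θ ≠ 1, can be expressed as
$$I_R(\theta) = \frac{1}{1 - \theta} \ln \int_0^\infty f^{\theta}(t)\, dt = \frac{1}{1 - \theta} \ln\left[\beta^{\theta}\delta^{\theta - 1}\lambda^{(1-\theta)/\delta} \int_0^1 z^{\beta\theta + (1-\theta)/\delta - 1}(1 - z)^{\theta - (1-\theta)/\delta - 1}\, dz\right],$$
where z = (1 + λt^(−δ))^(−1). When θ → 1, the Rényi entropy converges to the Shannon entropy.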

Order Statistics
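Suppose T_1, T_2, …, T_n is a random sample from (2). Let T_{1:n} ≤ T_{2:n} ≤ … ≤ T_{n:n} denote the corresponding order statistics. It is well known that the probability density function of the rth order statistic, say T_{r:n}, 1 ≤ r ≤ n, is given by
$$f_{r:n}(t) = \frac{n!}{(r-1)!\,(n-r)!}\,[F(t)]^{r-1}\,[1 - F(t)]^{n-r} f(t)$$
and the cumulative distribution function by
$$F_{r:n}(t) = \sum_{k=r}^{n} \binom{n}{k} [F(t)]^{k}\,[1 - F(t)]^{n-k}$$
for r = 1, 2, …, n. It follows from (1) and (2) that
$$f_{r:n}(t) = \frac{n!}{(r-1)!\,(n-r)!}\,\beta\lambda\delta\, t^{-\delta-1}\left(1 + \lambda t^{-\delta}\right)^{-\beta r - 1}\left[1 - \left(1 + \lambda t^{-\delta}\right)^{-\beta}\right]^{n-r}$$
and
$$F_{r:n}(t) = \sum_{k=r}^{n} \binom{n}{k} \left(1 + \lambda t^{-\delta}\right)^{-\beta k}\left[1 - \left(1 + \lambda t^{-\delta}\right)^{-\beta}\right]^{n-k}.$$
The jth moment of T_{r:n}, for j < δ, can be expressed as
$$E\left[T_{r:n}^{j}\right] = \frac{n!}{(r-1)!\,(n-r)!}\,\beta\lambda^{j/\delta} \sum_{i=0}^{n-r} (-1)^{i} \binom{n-r}{i} B\!\left(\beta(r+i) + \frac{j}{\delta},\, 1 - \frac{j}{\delta}\right).$$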

Bonferroni and Lorenz Curves
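Bonferroni and Lorenz curves were proposed by Bonferroni (1930). These curves have applications not only in economics, to study income and poverty, but also in other fields like reliability, demography, insurance and medicine. They are defined as
$$B(p) = \frac{1}{p\mu} \int_0^q x f(x)\, dx \tag{16}$$
and
$$L(p) = \frac{1}{\mu} \int_0^q x f(x)\, dx, \tag{17}$$
where µ = E[X] and q = F^(−1)(p). By using (1), one can reduce (16) and (17), for δ > 1, to
$$B(p) = \frac{1}{p}\, I_{p^{1/\beta}}\!\left(\beta + \frac{1}{\delta},\, 1 - \frac{1}{\delta}\right) \quad \text{and} \quad L(p) = I_{p^{1/\beta}}\!\left(\beta + \frac{1}{\delta},\, 1 - \frac{1}{\delta}\right),$$
where I_z(a, b) denotes the regularized incomplete beta function.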

Methods of Estimation
In this section we describe ten estimation methods for estimating the parameters β, λ and δ of the Dag(β, λ, δ) distribution. For all methods, we consider the case when all the parameters β, λ and δ are unknown. This is also considered in the simulation study presented in Section 4.

Method of Maximum Likelihood
The method of maximum likelihood is the most frequently used method of parameter estimation (see Casella and Berger (1990)). The method's success stems no doubt from its many desirable properties, including consistency, asymptotic efficiency and invariance, as well as its intuitive appeal. Let t_1, …, t_n be a random sample of size n from the Dag(β, λ, δ) distribution with parameters β, λ and δ.
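From (1), the likelihood function is
$$L(\beta, \lambda, \delta) = (\beta\lambda\delta)^{n} \prod_{i=1}^{n} t_i^{-\delta-1} \prod_{i=1}^{n} \left(1 + \lambda t_i^{-\delta}\right)^{-\beta-1} \tag{18}$$
and the log-likelihood function is
$$\ell(\beta, \lambda, \delta) = n\ln\beta + n\ln\lambda + n\ln\delta - (\delta + 1)\sum_{i=1}^{n} \ln t_i - (\beta + 1)\sum_{i=1}^{n} \ln\left(1 + \lambda t_i^{-\delta}\right). \tag{19}$$
The maximum likelihood estimators β_MLE, λ_MLE and δ_MLE of the parameters β, λ and δ can be obtained numerically by maximizing the log-likelihood function (19) with respect to β, λ and δ. Equivalently, they are obtained by solving the following non-linear equations in β, λ and δ:
$$\frac{\partial \ell}{\partial \beta} = \frac{n}{\beta} - \sum_{i=1}^{n} \ln\left(1 + \lambda t_i^{-\delta}\right) = 0,$$
$$\frac{\partial \ell}{\partial \lambda} = \frac{n}{\lambda} - (\beta + 1)\sum_{i=1}^{n} \frac{t_i^{-\delta}}{1 + \lambda t_i^{-\delta}} = 0,$$
$$\frac{\partial \ell}{\partial \delta} = \frac{n}{\delta} - \sum_{i=1}^{n} \ln t_i + (\beta + 1)\lambda \sum_{i=1}^{n} \frac{t_i^{-\delta} \ln t_i}{1 + \lambda t_i^{-\delta}} = 0.$$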
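In practice the log-likelihood is handed to a numerical optimizer. The following is a minimal sketch of the objective and its gradient, assuming the density f(t) = βλδ t^(−δ−1) (1 + λt^(−δ))^(−β−1); the function names are hypothetical, and the choice of optimizer is left open.

```python
import math


def dagum_loglik(params, data):
    """Log-likelihood of (beta, lam, delta) under the assumed Dagum density."""
    beta, lam, delta = params
    if min(beta, lam, delta) <= 0.0:
        return float("-inf")                 # outside the parameter space
    n = len(data)
    sum_log_t = sum(math.log(t) for t in data)
    sum_log_z = sum(math.log1p(lam * t ** (-delta)) for t in data)
    return (n * math.log(beta * lam * delta)
            - (delta + 1.0) * sum_log_t
            - (beta + 1.0) * sum_log_z)


def dagum_score(params, data):
    """Partial derivatives of the log-likelihood w.r.t. beta, lam and delta."""
    beta, lam, delta = params
    n = len(data)
    d_beta = n / beta - sum(math.log1p(lam * t ** (-delta)) for t in data)
    ratio = [t ** (-delta) / (1.0 + lam * t ** (-delta)) for t in data]
    d_lam = n / lam - (beta + 1.0) * sum(ratio)
    d_delta = (n / delta - sum(math.log(t) for t in data)
               + (beta + 1.0) * lam * sum(r * math.log(t) for r, t in zip(ratio, data)))
    return d_beta, d_lam, d_delta
```

Returning negative infinity for non-positive parameters lets an unconstrained optimizer reject invalid points without a separate constraint mechanism; the analytic score can be compared with finite differences before being trusted.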

Method of Moments
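The MMEs of the three-parameter Dag(β, λ, δ) distribution can be obtained by equating the first three theoretical moments of (1) with the sample moments, that is, by solving
$$\frac{1}{n}\sum_{i=1}^{n} t_i^{k} = \beta\lambda^{k/\delta}\, B\!\left(\beta + \frac{k}{\delta},\, 1 - \frac{k}{\delta}\right), \quad k = 1, 2, 3,$$
numerically for β, λ and δ. The third theoretical moment exists only when δ > 3.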

Method of L-Moments
In this subsection we provide the L-moments estimators, which can be obtained as linear combinations of order statistics. The L-moments estimators were originally proposed by Hosking (1990), and it is observed that the L-moments estimators are more robust than the usual moment estimators. The L-moment estimators are obtained in the same way as the ordinary moment estimators, i.e., by equating the sample L-moments with the population L-moments. L-moment estimation provides an alternative method of estimation analogous to conventional moments and has the advantage that L-moments exist whenever the mean of the distribution exists, even though some higher moments may not exist, and are relatively robust to the effects of outliers (Hosking, 1994).
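Let t_{1:n} < ⋯ < t_{n:n} be the order statistics of a random sample of size n from the Dag(β, λ, δ) distribution. From Hosking (1990), the first, second and third sample L-moments, respectively, are
$$l_1 = \frac{1}{n}\sum_{i=1}^{n} t_{i:n}, \qquad l_2 = \frac{2}{n(n-1)}\sum_{i=1}^{n} (i-1)\, t_{i:n} - l_1,$$
$$l_3 = \frac{6}{n(n-1)(n-2)}\sum_{i=1}^{n} (i-1)(i-2)\, t_{i:n} - \frac{6}{n(n-1)}\sum_{i=1}^{n} (i-1)\, t_{i:n} + l_1.$$
Since the quantile function of the Dag(β, λ, δ) distribution is as given in (3), the first, second and third population L-moments, as functions of θ = (β, λ, δ), respectively, are
$$L_1(\theta) = \int_0^1 Q(p)\, dp, \qquad L_2(\theta) = \int_0^1 Q(p)(2p - 1)\, dp$$
and
$$L_3(\theta) = \int_0^1 Q(p)\left(6p^2 - 6p + 1\right) dp.$$
The L-moments estimators β_LME, λ_LME and δ_LME of the parameters β, λ and δ can be obtained by solving numerically the equations
$$L_1(\theta) = l_1, \qquad L_2(\theta) = l_2, \qquad L_3(\theta) = l_3.$$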
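Both sides of the L-moment equations can be computed directly. This sketch uses the unbiased sample L-moments of Hosking (1990) and, as an assumption derived here from the cdf F(t) = (1 + λt^(−δ))^(−β) rather than quoted from the paper, the probability-weighted moments E[T F(T)^r] = βλ^(1/δ) B((r + 1)β + 1/δ, 1 − 1/δ) for δ > 1. All function names are illustrative.

```python
import math


def sample_l_moments(data):
    """First three unbiased sample L-moments (needs at least 3 observations)."""
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("at least three observations are required")
    # Zero-based index i equals (rank - 1) in the order-statistic formulas.
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    return b0, 2.0 * b1 - b0, 6.0 * b2 - 6.0 * b1 + b0


def beta_fn(a, b):
    """Complete beta function via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def dagum_l_moments(beta, lam, delta):
    """First three population L-moments from the assumed PWM formula (delta > 1)."""
    if delta <= 1.0:
        raise ValueError("L-moments require a finite mean, i.e. delta > 1")
    pwm = [beta * lam ** (1.0 / delta)
           * beta_fn((r + 1) * beta + 1.0 / delta, 1.0 - 1.0 / delta)
           for r in range(3)]
    return pwm[0], 2.0 * pwm[1] - pwm[0], 6.0 * pwm[2] - 6.0 * pwm[1] + pwm[0]
```

An estimator would then search for the (β, λ, δ) at which `dagum_l_moments` matches `sample_l_moments` of the data, using any root-finding or least-squares routine.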

Method of Maximum Product of Spacings
Cheng and Amin (1979, 1983) introduced the maximum product of spacings (MPS) method as an alternative to MLE for the estimation of parameters of continuous univariate distributions. Ranneby (1984) independently developed the same method as an approximation to the Kullback-Leibler measure of information.
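Let t_{1:n} < ⋯ < t_{n:n} be the ordered sample and define the uniform spacings
$$D_i(\beta, \lambda, \delta) = F(t_{i:n} \mid \beta, \lambda, \delta) - F(t_{i-1:n} \mid \beta, \lambda, \delta), \quad i = 1, \ldots, n + 1,$$
where F(t_{0:n} | β, λ, δ) = 0 and F(t_{n+1:n} | β, λ, δ) = 1, so that the spacings sum to one. The MPS estimators are obtained by maximizing the geometric mean of the spacings,
$$G(\beta, \lambda, \delta) = \left[\prod_{i=1}^{n+1} D_i(\beta, \lambda, \delta)\right]^{1/(n+1)},$$
or, equivalently, by maximizing the function
$$H(\beta, \lambda, \delta) = \frac{1}{n + 1}\sum_{i=1}^{n+1} \ln D_i(\beta, \lambda, \delta).$$
The estimators θ_MPS = (β_MPS, λ_MPS, δ_MPS) of the parameters θ = (β, λ, δ) can also be obtained by solving the nonlinear equations
$$\frac{1}{n + 1}\sum_{i=1}^{n+1} \frac{\Delta_s(t_{i:n}) - \Delta_s(t_{i-1:n})}{D_i(\beta, \lambda, \delta)} = 0, \quad s = 1, 2, 3,$$
where Δ_s(t_{0:n}) = Δ_s(t_{n+1:n}) = 0 and
$$\Delta_1(t) = \frac{\partial F}{\partial \beta} = -\left(1 + \lambda t^{-\delta}\right)^{-\beta} \ln\left(1 + \lambda t^{-\delta}\right), \quad \Delta_2(t) = \frac{\partial F}{\partial \lambda} = -\beta t^{-\delta}\left(1 + \lambda t^{-\delta}\right)^{-\beta-1}, \quad \Delta_3(t) = \frac{\partial F}{\partial \delta} = \beta\lambda\, t^{-\delta} \ln t \left(1 + \lambda t^{-\delta}\right)^{-\beta-1}.$$
Cheng and Amin (1983) showed that maximizing H as a method of parameter estimation is as efficient as MLE and that the MPS estimators are consistent under more general conditions than the MLE estimators.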
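The MPS criterion is simple to code as an objective for a numerical maximizer. The sketch below assumes the cdf F(t) = (1 + λt^(−δ))^(−β) and the standard spacings definition with F = 0 below the smallest and F = 1 above the largest observation; the function names are hypothetical.

```python
import math


def dagum_cdf(t, beta, lam, delta):
    """Assumed Dagum cdf for t > 0."""
    return (1.0 + lam * t ** (-delta)) ** (-beta)


def mps_objective(params, data):
    """Mean log-spacing H; larger is better. Returns -inf for invalid input."""
    beta, lam, delta = params
    if min(beta, lam, delta) <= 0.0:
        return float("-inf")
    ordered = sorted(data)
    cdf_values = [0.0] + [dagum_cdf(t, beta, lam, delta) for t in ordered] + [1.0]
    total = 0.0
    for lower, upper in zip(cdf_values, cdf_values[1:]):
        spacing = upper - lower
        if spacing <= 0.0:                   # tied observations or underflow
            return float("-inf")
        total += math.log(spacing)
    return total / (len(ordered) + 1)
```

Because the spacings sum to one, H can never exceed ln(1/(n + 1)), the value reached when all spacings are equal; this bound is a handy check on an implementation. Tied observations produce zero spacings, which this sketch simply rejects.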

Methods of Ordinary and Weighted Least-Squares
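The least squares estimators and weighted least squares estimators were proposed by Swain et al. (1988) to estimate the parameters of Beta distributions. Using the same notation as in Subsection 3.3, it is well known that
$$E\left[F(t_{i:n})\right] = \frac{i}{n + 1} \quad \text{and} \quad Var\left[F(t_{i:n})\right] = \frac{i(n - i + 1)}{(n + 1)^2 (n + 2)}.$$
The least squares estimators are obtained by minimizing, with respect to β, λ and δ, the function
$$S(\beta, \lambda, \delta) = \sum_{i=1}^{n} \left[F(t_{i:n} \mid \beta, \lambda, \delta) - \frac{i}{n + 1}\right]^2,$$
and the weighted least squares estimators by minimizing
$$W(\beta, \lambda, \delta) = \sum_{i=1}^{n} \frac{(n + 1)^2 (n + 2)}{i(n - i + 1)} \left[F(t_{i:n} \mid \beta, \lambda, \delta) - \frac{i}{n + 1}\right]^2.$$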
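Both least-squares criteria compare the fitted cdf at each order statistic with its expected value i/(n + 1). The following is a minimal sketch, assuming the cdf F(t) = (1 + λt^(−δ))^(−β) and weights equal to the reciprocal of Var[F(t_{i:n})] = i(n − i + 1)/((n + 1)²(n + 2)); the function names and the `weighted` flag are illustrative.

```python
def dagum_cdf(t, beta, lam, delta):
    """Assumed Dagum cdf for t > 0."""
    return (1.0 + lam * t ** (-delta)) ** (-beta)


def least_squares_objective(params, data, weighted=False):
    """Ordinary (default) or weighted least-squares distance to be minimized."""
    beta, lam, delta = params
    ordered = sorted(data)
    n = len(ordered)
    total = 0.0
    for i, t in enumerate(ordered, start=1):
        residual = dagum_cdf(t, beta, lam, delta) - i / (n + 1.0)
        weight = (n + 1.0) ** 2 * (n + 2.0) / (i * (n - i + 1.0)) if weighted else 1.0
        total += weight * residual ** 2
    return total


ordinary = least_squares_objective((1.0, 1.0, 1.0), [1.0, 3.0])
weighted = least_squares_objective((1.0, 1.0, 1.0), [1.0, 3.0], weighted=True)
```

The weights grow toward both ends of the sample, so the weighted version penalizes misfit in the tails more heavily than the ordinary version does.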

Method of Percentiles
The Dagum distribution has an explicit distribution function; therefore, the unknown parameters β, λ and δ can be estimated by equating the sample percentile points with the population percentile points. This is known as the percentile method. It was originally suggested by Kao (1958, 1959) and has been used for the Weibull distribution and for the generalized exponential distribution. In this paper, we apply the same technique to the Dagum distribution. If p_i denotes an estimate of F(t_{i:n} | β, λ, δ), then the percentile estimators β_PCE, λ_PCE and δ_PCE of the parameters β, λ and δ can be obtained by minimizing, with respect to β, λ and δ, the function
$$P(\beta, \lambda, \delta) = \sum_{i=1}^{n} \left[t_{i:n} - \lambda^{1/\delta}\left(p_i^{-1/\beta} - 1\right)^{-1/\delta}\right]^2,$$
where p_i = i/(n + 1) is the unbiased estimator of F(t_{i:n} | β, λ, δ) (see Mann et al. (1974)). Equivalently, the estimates of β, λ and δ can be obtained by solving the nonlinear equations that result from setting the partial derivatives of P with respect to β, λ and δ equal to zero.
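The percentile criterion compares each order statistic with the fitted quantile at p_i = i/(n + 1). This sketch assumes the quantile function Q(p) = λ^(1/δ) (p^(−1/β) − 1)^(−1/δ); the function names are hypothetical.

```python
def dagum_quantile(p, beta, lam, delta):
    """Assumed Dagum quantile function for 0 < p < 1."""
    return lam ** (1.0 / delta) * (p ** (-1.0 / beta) - 1.0) ** (-1.0 / delta)


def percentile_objective(params, data):
    """Sum of squared gaps between order statistics and fitted quantiles."""
    beta, lam, delta = params
    ordered = sorted(data)
    n = len(ordered)
    total = 0.0
    for i, t in enumerate(ordered, start=1):
        p_i = i / (n + 1.0)
        total += (t - dagum_quantile(p_i, beta, lam, delta)) ** 2
    return total


value = percentile_objective((1.0, 1.0, 1.0), [1.0, 3.0])
```

Unlike the least-squares criteria, this objective works on the scale of the data rather than on the probability scale, so large observations in the right tail can dominate the sum.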

Methods of Minimum Distances
In this section we present three estimation methods for β, λ and δ based on the minimization of goodness-of-fit statistics with respect to β, λ and δ. This class of statistics is based on the difference between the estimates of the cumulative distribution function and the empirical distribution function.

Methods of Anderson-Darling and Right-tail Anderson-Darling
The Anderson-Darling test was developed by Anderson and Darling (1952, 1954) as an alternative to other statistical tests for detecting departures of sample distributions from normality. Specifically, the AD test converges very quickly towards the asymptote.
The right-tail Anderson-Darling estimates β_RTADE, λ_RTADE and δ_RTADE of the parameters β, λ and δ are obtained by minimizing, with respect to β, λ and δ, the function
$$R(\beta, \lambda, \delta) = \frac{n}{2} - 2\sum_{i=1}^{n} F(t_{i:n} \mid \beta, \lambda, \delta) - \frac{1}{n}\sum_{i=1}^{n} (2i - 1) \ln\left[1 - F(t_{n+1-i:n} \mid \beta, \lambda, \delta)\right].$$
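The three minimum-distance criteria can be coded as one helper. The sketch below assumes the cdf F(t) = (1 + λt^(−δ))^(−β) and the textbook forms of the Cramér-von Mises, Anderson-Darling and right-tail Anderson-Darling statistics; these forms and the function names are assumptions of this example, not quotations from the paper.

```python
import math


def dagum_cdf(t, beta, lam, delta):
    """Assumed Dagum cdf for t > 0."""
    return (1.0 + lam * t ** (-delta)) ** (-beta)


def minimum_distance_objectives(params, data):
    """Return (Cramer-von Mises, Anderson-Darling, right-tail AD) statistics."""
    beta, lam, delta = params
    ordered = sorted(data)
    n = len(ordered)
    cdf = [dagum_cdf(t, beta, lam, delta) for t in ordered]
    cvm = 1.0 / (12.0 * n)
    ad_sum = 0.0
    rt_sum = 0.0
    for i in range(1, n + 1):
        lower = cdf[i - 1]                        # F at the i-th order statistic
        log_upper = math.log1p(-cdf[n - i])       # ln(1 - F) at the (n+1-i)-th
        cvm += (lower - (2.0 * i - 1.0) / (2.0 * n)) ** 2
        ad_sum += (2.0 * i - 1.0) * (math.log(lower) + log_upper)
        rt_sum += (2.0 * i - 1.0) * log_upper
    ad = -n - ad_sum / n
    rtad = n / 2.0 - 2.0 * sum(cdf) - rt_sum / n
    return cvm, ad, rtad
```

Each returned value is a distance to be minimized over (β, λ, δ); the right-tail version drops the ln F term and therefore gives more relative weight to the fit in the upper tail.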

Conclusion
In this article, we provide explicit expressions for the quantiles, moments, moment generating function, conditional moments, hazard rate, mean residual lifetime, mean past lifetime, mean deviation about mean and median, various entropies, order statistics and

Figure 1. The pdf of the Dagum distribution for various parameter specifications

Figure 3. Estimated densities for breaking stress of carbon fibers data.

Table 3. MLEs (standard errors in parentheses) for breaking stress of carbon fibers data.

Table 4. Goodness-of-fit tests and the measures AIC, BIC, HQIC and CAIC for breaking stress of carbon fibers data.

Table 5. Estimates of the parameters of Dagum distribution for breaking stress of carbon fibers data.