On Bias Correction in a Class of Inflated Beta Regression Models

Inflated beta regression models are of practical use for modeling rates and proportions measured continuously in the presence of zeros and/or ones. In this article, the second-order bias of the maximum likelihood estimators of the zero-or-one inflated beta regression model parameters is derived, which makes it possible to obtain corrected estimators that are approximately unbiased. Numerical results show that the corrected estimators outperform the maximum likelihood estimators in terms of bias and mean-square error.

The inflated beta regression model at point c is defined as follows. The independent random variables y_1, …, y_n are such that y_t, for t = 1, …, n, has density (1) with parameters α = α_t, μ = μ_t, and φ. It is assumed that α_t and μ_t are defined through

h(α_t) = ζ_t = Σ_{i=1}^{M} z_{ti} γ_i and g(μ_t) = η_t = Σ_{i=1}^{m} x_{ti} β_i, (3)

where γ = (γ_1, …, γ_M)′ and β = (β_1, …, β_m)′ are unknown regression parameter vectors such that γ ∈ ℝ^M and β ∈ ℝ^m, and z_{t1}, …, z_{tM} and x_{t1}, …, x_{tm} are observations on known exogenous variables, with m + M < n. Note that the values of the z's and x's may fully or partially coincide. It is assumed that the link functions h : (0, 1) → ℝ and g : (0, 1) → ℝ are strictly monotone and twice differentiable. Here, φ is a precision parameter that is constant across observations.
The likelihood function for the parameter vector θ = (γ′, β′, φ)′ of the zero-or-one inflated beta regression model is given by

L(θ) = ∏_{t=1}^{n} bi_c(y_t; α_t, μ_t, φ) = L_1(γ) L_2(β, φ), (4)

with

L_1(γ) = ∏_{t=1}^{n} α_t^{1l_{{c}}(y_t)} (1 − α_t)^{1 − 1l_{{c}}(y_t)} and L_2(β, φ) = ∏_{t: y_t ∈ (0,1)} f(y_t; μ_t, φ),

www.ccsenet.org/ijsp International Journal of Statistics and Probability Vol. 1, No. 2

where 1l_A(y) = 1 if y ∈ A and 1l_A(y) = 0 if y ∉ A, and the parameters μ_t and α_t satisfy (3), that is, α_t = h^{-1}(ζ_t) and μ_t = g^{-1}(η_t). Note that the likelihood function (4) factors into two terms, one that depends solely on the parameter vector γ and another that depends solely on β and φ. Therefore, the parameter vectors γ and (β′, φ)′ are separable (Pace & Salvan, 1997, p. 128), and maximum likelihood inference on (β′, φ)′ can be conducted separately from that on γ, as if the value of γ were known, and vice versa. Further, note that the discrete component L_1(γ) involves only the parameters used to model the probability of occurrence of zero or one, whereas the continuous component L_2(β, φ) involves only the parameters used to model the conditional distribution of the response variable given that it belongs to the interval (0, 1).
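To make the factorization concrete, the following minimal sketch (an illustration, not the authors' implementation) evaluates the log of L_1(γ) L_2(β, φ) for the one-inflated case c = 1. It assumes logit links for α_t and μ_t, and the mean-precision beta density f(y; μ, φ) with shape parameters μφ and (1 − μ)φ; the data and parameter values are made up for the example.

```python
import math

def loglik(gamma, beta, phi, z, x, y, c=1.0):
    """Log-likelihood of the inflated beta model at c, split into the
    discrete part ell1(gamma) and the continuous part ell2(beta, phi);
    logit links for alpha_t and mu_t are assumed."""
    ell1 = ell2 = 0.0
    for zt, xt, yt in zip(z, x, y):
        alpha = 1.0 / (1.0 + math.exp(-sum(g * v for g, v in zip(gamma, zt))))
        mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xt))))
        ind = 1.0 if yt == c else 0.0            # indicator 1l_{c}(y_t)
        ell1 += ind * math.log(alpha) + (1.0 - ind) * math.log(1.0 - alpha)
        if 0.0 < yt < 1.0:                        # continuous component
            p, q = mu * phi, (1.0 - mu) * phi     # beta shape parameters
            ell2 += (math.lgamma(phi) - math.lgamma(p) - math.lgamma(q)
                     + (p - 1.0) * math.log(yt) + (q - 1.0) * math.log(1.0 - yt))
    return ell1 + ell2

# toy data: intercept-only designs, one inflated observation at c = 1
z = x = [(1.0,)] * 4
y = [0.2, 0.5, 0.7, 1.0]
print(loglik((-1.0,), (0.3,), 50.0, z, x, y))
```

Because the likelihood factors, the two components could just as well be evaluated (and maximized) separately, which is exactly the separability property used below.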
The log-likelihood function for θ is given by

ℓ(θ) = ℓ_1(γ) + ℓ_2(β, φ), (5)

where

ℓ_1(γ) = Σ_{t=1}^{n} [1l_{{c}}(y_t) log α_t + (1 − 1l_{{c}}(y_t)) log(1 − α_t)] and ℓ_2(β, φ) = Σ_{t: y_t ∈ (0,1)} log f(y_t; μ_t, φ).

By the separability of the parameter vectors γ and (β′, φ)′, one can obtain the score function for γ and the score function for (β′, φ)′ independently.
From the separability of the parameters γ and (β′, φ)′, the MLE of γ is obtained independently of that of (β′, φ)′ as the solution to the nonlinear system U_γ(γ) = 0. Likewise, the MLE of (β′, φ)′ is obtained as the solution of the nonlinear system (U_β(β, φ)′, U_φ(β, φ))′ = 0. Note that, in both cases, these estimators do not have closed form. They can be obtained numerically by maximizing the log-likelihood function with a nonlinear optimization algorithm, such as a Newton-type algorithm (Newton-Raphson, Fisher scoring, BHHH, etc.) or a quasi-Newton algorithm (BFGS); see Press et al. (1992, Chapters 9 and 10).
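By separability, the MLE of γ can be computed on its own; with a logit link, maximizing ℓ_1(γ) is just a logistic regression of the indicator 1l_{c}(y_t) on the covariates in Z. The sketch below (illustrative, not the paper's code) runs plain Newton-Raphson for an intercept plus one covariate, solving the 2 × 2 Newton system by hand.

```python
import math

def logit_mle(Z, d, iters=25):
    """Newton-Raphson for the discrete component: maximizes
    ell1(gamma) = sum_t [d_t*log(alpha_t) + (1 - d_t)*log(1 - alpha_t)],
    where alpha_t = logit^{-1}(g0*z_t0 + g1*z_t1) and d_t = 1l_{c}(y_t).
    Z is n x 2 (intercept and one covariate); returns gamma_hat."""
    g0 = g1 = 0.0
    for _ in range(iters):
        u0 = u1 = k00 = k01 = k11 = 0.0
        for (z0, z1), dt in zip(Z, d):
            a = 1.0 / (1.0 + math.exp(-(g0 * z0 + g1 * z1)))
            r = dt - a                 # score contribution: (d_t - alpha_t) z_t
            w = a * (1.0 - a)          # Fisher information weight
            u0 += r * z0
            u1 += r * z1
            k00 += w * z0 * z0
            k01 += w * z0 * z1
            k11 += w * z1 * z1
        det = k00 * k11 - k01 * k01    # solve the 2 x 2 system K * step = U
        g0 += (k11 * u0 - k01 * u1) / det
        g1 += (k00 * u1 - k01 * u0) / det
    return g0, g1

# indicators 1l_{c}(y_t) and covariates from a small artificial sample
Z = [(1.0, 0.1), (1.0, 0.4), (1.0, 0.5), (1.0, 0.8), (1.0, 0.9), (1.0, 0.2)]
d = [0, 1, 0, 1, 0, 1]
print(logit_mle(Z, d))
```

The same scheme, with the appropriate score and information, applies to the continuous component (β′, φ)′; in practice a quasi-Newton method such as BFGS is usually preferred, as noted above.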
From the standard formula for the inverse of a partitioned matrix (see, for instance, Rao, 1973, p. 33), it can be deduced that the inverse of the Fisher information matrix is given by

where

with I_m being the m × m identity matrix. The inverse of Fisher's information matrix is useful for computing asymptotic standard errors of the MLEs. For further details, see Ospina and Ferrari (2012).
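The partitioned-inverse formula (Rao, 1973) is easy to check numerically. The sketch below uses illustrative numbers (not values from the paper) to recover the (β, β) block of the inverse information matrix as (K_ββ − K_βφ K_φφ^{-1} K_φβ)^{-1}, which plays the role of the asymptotic covariance matrix of the estimator of β.

```python
def inv2(m):
    """Inverse of a 2 x 2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

# Fisher information partitioned as K = [[A, B], [B', D]], with
# A = K_{beta,beta} (2 x 2), B = K_{beta,phi} (2 x 1), D = K_{phi,phi} (scalar);
# the numbers are illustrative, not values from the paper.
A = ((4.0, 1.0), (1.0, 3.0))
B = (0.5, 0.2)
D = 2.0

# (beta, beta) block of K^{-1} via the partitioned-inverse formula:
# K^{beta beta} = (A - B D^{-1} B')^{-1}
S = ((A[0][0] - B[0] * B[0] / D, A[0][1] - B[0] * B[1] / D),
     (A[1][0] - B[1] * B[0] / D, A[1][1] - B[1] * B[1] / D))
K_bb = inv2(S)
print(K_bb)
```

The square roots of the diagonal entries of K_bb are the asymptotic standard errors of the corresponding regression estimates.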

Cox and Snell's Formula
Let ℓ(ω) be the log-likelihood function of a parameter vector ω = (ω_1, …, ω_k)′ for a sample of n observations, and let ω̂ = (ω̂_1, …, ω̂_k)′ be the MLE of ω, obtained as the solution of the system of equations U(ω) = ∂ℓ(ω)/∂ω = 0. Cox and Snell (1968) obtained the O(n^{-1}) term of the bias of ω̂_r as

B(ω̂_r) = Σ_{s,t,u} κ^{r,s} κ^{t,u} { (1/2) κ_{stu} + κ_{st,u} }, (11)

where κ^{r,s} is the (r, s) element of the inverse of the Fisher information matrix {−κ_{r,s}}, κ_{stu} = E(∂³ℓ(ω)/∂ω_s∂ω_t∂ω_u), and κ_{st,u} = E(∂²ℓ(ω)/∂ω_s∂ω_t × ∂ℓ(ω)/∂ω_u). In many situations, it is convenient to use the Bartlett identity κ_{st,u} = κ_{st}^{(u)} − κ_{stu}, with κ_{st}^{(u)} = ∂κ_{st}/∂ω_u, to express the bias in matrix notation. The great usefulness of (11) lies in defining a corrected MLE, unbiased up to order O(n^{-1}), given by

ω̃_r = ω̂_r − B̂(ω̂_r),

where B̂(ω̂_r) is (11) evaluated at ω̂. This new estimator, ω̃_r, has bias of order O(n^{-2}), since E(ω̃_r) = ω_r + O(n^{-2}), and may be preferred over ω̂_r, which has bias of order O(n^{-1}).
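As a self-contained illustration of (11) in a one-parameter model (not the inflated beta model itself): for an exponential density with rate λ, the MLE is λ̂ = 1/ȳ and the Cox-Snell formula gives B(λ̂) = λ/n, so the corrected estimator is λ̃ = λ̂ − λ̂/n. The sketch below checks by simulation that λ̃ is far less biased than λ̂.

```python
import random
import statistics

def simulate(lam=2.0, n=20, reps=20000, seed=42):
    """Monte Carlo check of the Cox-Snell correction for the exponential
    rate: lam_hat = 1/ybar has second-order bias B = lam/n, so
    lam_tilde = lam_hat - lam_hat/n removes the O(1/n) bias term.
    Returns the estimated biases of lam_hat and lam_tilde."""
    rng = random.Random(seed)
    hats, tildes = [], []
    for _ in range(reps):
        ybar = sum(rng.expovariate(lam) for _ in range(n)) / n
        lam_hat = 1.0 / ybar
        hats.append(lam_hat)
        tildes.append(lam_hat - lam_hat / n)   # subtract estimated bias B = lam_hat/n
    return statistics.mean(hats) - lam, statistics.mean(tildes) - lam

bias_hat, bias_tilde = simulate()
print(bias_hat, bias_tilde)
```

In this particular model the correction is in fact exact, since E(1/ȳ) = nλ/(n − 1); in general the corrected estimator is only unbiased to order O(n^{-1}), as stated above.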
Our aim is to obtain an expression to calculate the second-order bias of the vector of the MLEs for zero-or-one inflated beta regression models using formula (11).

Bias of γ̂ and (β̂′, φ̂)′
We now obtain an expression for the second-order biases of the MLEs in a class of inflated beta regression models using Cox and Snell's (1968) general formula. This expression will, in turn, allow us to obtain bias-corrected estimates of the unknown parameters. The notation to be used is introduced below. The derivatives of the log-likelihood function (5) with respect to the unknown parameters are indicated by indices: the letters R, S, … correspond to derivatives with respect to the elements of γ, the letters r, s, … correspond to derivatives with respect to the elements of β, and φ corresponds to derivatives with respect to φ. For instance, U_R = ∂ℓ/∂γ_R, U_r = ∂ℓ/∂β_r, U_φ = ∂ℓ/∂φ, U_{φs} = ∂²ℓ/∂φ∂β_s, U_{rsφ} = ∂³ℓ/∂β_r∂β_s∂φ, and so forth. The notation for the cumulants of the log-likelihood derivatives is borrowed from Lawley (1956): κ_{RS} = E(U_{RS}), κ_{rsφ} = E(U_{rsφ}), and so on. In the zero-or-one inflated beta regression model, the separability between γ and ϑ = (β′, φ)′ enables us to obtain the second-order biases of γ̂ and (β̂′, φ̂)′ independently. From (11) and the aforementioned Bartlett identity, the second-order bias of the b-th element of γ̂ = (γ̂_1, …, γ̂_M)′ reduces to

The terms in (12) are obtained from the cumulants of the log-likelihood function given in the Appendix.
Let W_0 = diag{w_{01}, …, w_{0n}} be the n × n diagonal matrix with t-th diagonal element given by

and let δ_{γγ} represent the n × 1 vector obtained from the main diagonal of Z K^{γγ} Z′ = Z(Z′QZ)^{-1}Z′. We obtain

where Z is the covariate matrix used to model the discrete component of the model. Now, from (11), the second-order bias of the a-th element of ϑ̂ = (β̂_1, …, β̂_m, φ̂)′ can be written as

for a = 1, …, m + 1. Note that β and φ are not orthogonal parameters, since the block K_{βφ} of the information matrix is not a null matrix; this forces us to compute all the terms in B(ϑ̂_a). The cumulants of ℓ_2(β, φ) that appear in (14) are given in the Appendix. After some calculations, we arrive at

where diagonal(·) represents the row vector formed by the main diagonal of a square matrix, K^{ββ}, K^{φφ}, and K^{βφ} are blocks of the inverse of the Fisher information matrix, and W_i, for i = 1, …, 5, are given in the Appendix.
The second-order bias of β̂ can be written as

where X is given in (8).
As before, after some algebraic manipulations, we have that

Let K^{φ*} = (K^{φβ} K^{φφ}) be a 1 × (m + 1) matrix. The second-order bias of φ̂ can be written as

Let ϑ = (β′, φ)′. The second-order bias of the MLE of ϑ can be written as

where ξ = W^{-1}δ, with W given in (9). Consequently, B(ϑ̂) can be estimated from a generalized least squares regression on the auxiliary variable ξ. Note that the expression for the second-order bias correction of ϑ̂ involves the parameters of both the discrete and continuous components of the model.

Bias of Smooth Functions of φ
For the zero-or-one inflated beta regression model, a reparameterization of the precision parameter φ can be considered. Let σ = T(φ), where T(·) is a continuous, strictly monotone, twice-differentiable function. From the invariance property of the MLEs, we have σ̂ = T(φ̂). From a Taylor series expansion of T(φ̂) up to the third term in a neighbourhood of φ, we get

T(φ̂) = T(φ) + (φ̂ − φ) T′(φ) + (1/2)(φ̂ − φ)² T″(φ) + O_p(n^{-3/2}).

By taking the expected value, the second-order bias of σ̂ can be written as

B(σ̂) = T′(φ) B(φ̂) + (1/2) T″(φ) V(φ̂),

where B(φ̂) is the second-order bias of φ̂ given in (16), and V(φ̂) = K^{φφ} is the asymptotic variance of φ̂. In Table 1, different specifications of σ are presented along with the second-order bias of the corresponding MLE.
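The delta-method expression B(σ̂) = T′(φ)B(φ̂) + (1/2)T″(φ)V(φ̂) is straightforward to evaluate for the parameterizations in Table 1. The sketch below uses hypothetical values of B(φ̂) and V(φ̂) (in practice they would come from (16) and K^{φφ}) and works out the bias for T(φ) = log φ and T(φ) = 1/(φ + 1).

```python
def bias_sigma(t_prime, t_second, phi, bias_phi, var_phi):
    """Second-order bias of sigma_hat = T(phi_hat) via the Taylor
    expansion: B(sigma_hat) = T'(phi) B(phi_hat) + 0.5 T''(phi) V(phi_hat)."""
    return t_prime(phi) * bias_phi + 0.5 * t_second(phi) * var_phi

phi, b, v = 120.0, 5.0, 400.0   # hypothetical B(phi_hat) and V(phi_hat)

# T(phi) = log(phi):  T' = 1/phi, T'' = -1/phi^2
b_log = bias_sigma(lambda p: 1.0 / p, lambda p: -1.0 / p**2, phi, b, v)

# T(phi) = 1/(phi + 1):  T' = -1/(phi + 1)^2, T'' = 2/(phi + 1)^3
b_inv = bias_sigma(lambda p: -1.0 / (p + 1.0)**2,
                   lambda p: 2.0 / (p + 1.0)**3, phi, b, v)
print(b_log, b_inv)
```

Because T′ and T″ shrink rapidly for T(φ) = 1/(φ + 1) as φ grows, the bias of σ̂ under that parameterization is tiny even when B(φ̂) is large, which is consistent with the behavior described next.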
In Figure 1, the behavior of the relative bias B(σ̂)/σ as a function of φ is presented. Note that the relative bias of the MLEs corresponding to the parameterizations σ = φ and σ = 1/(φ + 1)² approaches a non-zero constant as φ increases. Also note that, for the parameterization σ = 1/(φ + 1), the relative bias behaves approximately as a constant function at zero, even for moderate sample sizes (for instance, n = 80). Finally, for the parameterization σ = log φ, the relative bias decreases to zero as φ grows. In light of these facts, the parameterization σ = 1/(φ + 1) can be recommended, because even for moderate values of n and small values of φ, the relative bias remains stable and close to zero.
t = 1, …, n. Here, h and g are logit link functions. The true parameter values are γ_0 = −0.5, γ_1 = 1.5, β_0 = 0.5, β_1 = 1.8, and φ = 120; the covariate values z_t and x_t are independently drawn from the U(0, 1) distribution. For this experiment, the sample sizes are n = 30, 60, and 90, and 5000 Monte Carlo replications are used. For each simulated sample, we fitted the regression model (19); that is, by maximizing the log-likelihood function, we obtained the estimates γ̂ = (γ̂_0, γ̂_1), ϑ̂ = (β̂_0, β̂_1, φ̂), and σ̂ = 1/(φ̂ + 1). In addition, from the results in Section 3, we calculated the corrected MLEs, γ̃ = (γ̃_0, γ̃_1), ϑ̃ = (β̃_0, β̃_1, φ̃), and σ̃. In the Monte Carlo experiment, we used the "multiply-with-carry" algorithm (GM) as the pseudorandom number generator, with period 2^60. The log-likelihood function was maximized with the BFGS method using analytical derivatives, which, in general, is the method with the best performance (Mittelhammer, Judge, & Miller, 2000, p. 199). All the simulations were programmed in the Ox matrix programming language (Cribari-Neto & Zarkos, 2003). For each sample size, the bias, relative bias, and root-mean-square error of the 5000 estimates were calculated. Tables 2 and 3 present the simulation results. Notice that the bias, relative bias, and mean-square error decrease as the sample size increases, as expected. In Table 2, we observe that the biases of the MLEs of the regression parameters that model the discrete component are non-negligible for small sample sizes. We note that the analytical correction improves the performance of the MLEs in terms of bias and relative bias when the sample size is small. For example, when n = 30, the relative biases are 0.1235 for γ̂_0 and 0.1028 for γ̃_0.
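The data-generating scheme of the experiment can be reproduced with the standard library alone: with probability α_t the response equals the inflation point, and otherwise it is drawn from a beta distribution with shape parameters μ_tφ and (1 − μ_t)φ. The sketch below is an illustration of this process, not the authors' Ox code; it assumes the inflation point is c = 1 and uses the stated true parameter values with logit links.

```python
import math
import random

def inv_logit(v):
    return 1.0 / (1.0 + math.exp(-v))

def draw_sample(n, gamma, beta, phi, rng):
    """Draw one sample from a one-inflated beta regression model:
    y_t = 1 with probability alpha_t, else y_t ~ Beta(mu_t*phi, (1-mu_t)*phi),
    with alpha_t = logit^{-1}(g0 + g1*z_t) and mu_t = logit^{-1}(b0 + b1*x_t)."""
    g0, g1 = gamma
    b0, b1 = beta
    sample = []
    for _ in range(n):
        zt, xt = rng.random(), rng.random()       # U(0,1) covariates
        alpha = inv_logit(g0 + g1 * zt)
        mu = inv_logit(b0 + b1 * xt)
        if rng.random() < alpha:
            yt = 1.0                               # inflated point c = 1
        else:
            yt = rng.betavariate(mu * phi, (1.0 - mu) * phi)
        sample.append((zt, xt, yt))
    return sample

rng = random.Random(2012)
data = draw_sample(60, (-0.5, 1.5), (0.5, 1.8), 120.0, rng)
print(sum(1 for _, _, y in data if y == 1.0), "ones out of", len(data))
```

Fitting the model to each such sample and averaging over replications yields Monte Carlo estimates of the bias and root-mean-square error reported in Tables 2 and 3.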
The MLEs and their corrected versions show similar performance in terms of root-mean-square error. We note that, in general, the corrected estimators offer only a negligible precision gain over the MLEs of the parameters that model the discrete component, even in small samples. Table 3 shows the results for the MLEs of the regression parameters of the continuous component of the model and their corrected versions. We note that the performance of these estimators is similar regardless of the sample size. For example, for n = 60, the mean of β̂_1 differs from the mean of β̃_1 only in the third decimal place. Likewise, the root-mean-square errors are close: for n = 60, √MSE = 0.1757 for β̂_1 and √MSE = 0.1748 for β̃_1. Further, note that the biases and relative biases are close to zero. This indicates that the MLEs of the regression parameters of the continuous component have good sampling properties; that is, the estimated values are close to the true parameter values even when the sample size is relatively small. These results are consistent with those obtained by Ospina et al. (2006) for beta regression models. The figures in Table 3 also indicate that the MLE of φ is markedly biased, with a positive estimated bias for all the sample sizes considered. The bias correction proposed in this article reduces this bias substantially. For example, for n = 60, the bias of φ̂ is 16.7394, while the bias of φ̃ is 0.8568. The bias correction is therefore very important, because if φ is overestimated, the variance of the response variable may be underestimated.

In hypothesis testing, this may lead the investigator to erroneously reject the hypothesis that the regression parameters are zero. We also notice that the corrected estimator φ̃ presents smaller root-mean-square errors than φ̂ for all the sample sizes considered. In other words, the corrected estimator is less biased and more accurate than the uncorrected MLE of φ. Therefore, we recommend the use of the bias-corrected estimator of φ in practical applications.
Finally, we note that the MLE of σ = 1/(φ + 1) has a smaller relative bias than that of φ. For instance, for n = 60, φ̂ has a relative bias of 0.1395, while the relative bias of σ̂ is −0.0683. To compare the accuracy of the MLE of σ with that of φ, it is more convenient to use the scaled root-mean-square error (√MSE/parameter). In all cases, the MLE of σ outperforms the MLE of φ. For example, for n = 60, the scaled root-mean-square error is √MSE/φ = 0.34371 for φ̂ and √MSE/σ = 0.2420 for σ̂. It is therefore more convenient to parameterize the zero-or-one inflated beta regression model in terms of σ rather than the precision parameter φ, since the MLE of σ is less biased and more accurate than the MLE of φ.

Concluding Remarks
In this article, we derived expressions for the second-order biases of the MLEs of the inflated beta regression model parameters. We showed that the second-order biases obtained from the Cox and Snell (1968) formula may be written in terms of generalized least squares regressions, which simplifies their computation. The resulting expressions allow one to construct analytically modified MLEs with reduced bias.
The simulation results show that the bias corrections of the MLEs for the regression parameters that model the discrete and continuous components of inflated beta regression models are effective in reducing bias. However, bias corrections are not imperative because the uncorrected MLEs are not markedly biased. On the other hand, the MLE of the precision parameter is highly biased, and therefore, we recommend the use of the bias correction obtained in this article. Also, almost unbiased estimators are obtained if the model is parameterized in terms of σ = 1/(φ + 1) and the bias correction proposed in this paper is employed.

Appendix
Cumulants and their derivatives with respect to γ

For R = 1, …, M, we have

By taking expected values, we have

where

for U = 1, …, M. From the global orthogonality between γ and (β′, φ)′, we obtain κ_{RS}^{(φ)} = κ_{RS}^{(u)} = 0. From (20) and (21), we have

Let W_0 = diag{w_{01}, …, w_{0n}} be the n × n diagonal matrix with t-th diagonal element given by

Hence,

Consequently, the Cox and Snell (1968) formula for the second-order bias of the MLE of the b-th element of γ̂ is

Let e_b be the b-th column of the M × M identity matrix. We have

where z_t is the column vector obtained from the t-th row of Z. Now, let δ_{γγ} represent the n × 1 vector obtained from the main diagonal of Z K^{γγ} Z′. We have

Cumulants and their derivatives with respect to β and φ

We define the quantities

We have

+ w_t a_t x_{ts} x_{tr} x_{tu},   κ_{φφφ} = −Σ_{t=1}^{n} (1 − α_t) s_t.