Objective Bayesian Analysis for the Complementary Exponential Geometric Model Applied to Cancer Data

Abstract In this paper we provide a reference Bayesian framework for a new two-parameter lifetime distribution with increasing failure rate, the complementary exponential geometric (CEG) distribution. To this end, we present some of the main properties of this model and its characteristics related to reliability analysis. A simulation study is performed to analyse the frequentist properties of credible intervals from the reference posterior distribution, along with the standard error and mean square error (MSE) of the estimates. The methodology is illustrated with a real data set on the time until the cure of cervix lesions that are precursors of cervical cancer. According to INCA (Cancer National Institute), cervical cancer stands as the fourth cause of death among women in Brazil. Together with breast cancer, it is one of the most common malignancies affecting women worldwide. For this reason, patients must be carefully evaluated for metastatic disease. These data were collected in the Woman Clinic located in Maringá city (Paraná State, Brazil).

Indeed, these advances are available in many countries that have approved the introduction of the HPV vaccine. However, gaps in knowledge about the causal role of HPV in cervical cancer and the benefits of preventing HPV infection may hamper the successful introduction of these technologies. The greater women's knowledge of HPV and its role in the development of cervical cancer, the greater will be the adherence to preventive measures.
The methodology presented in this paper is applied to a real study involving patients treated in the Woman Clinic located in Maringá city, Paraná State, Brazil. The data consist of the time until the cure of cervical intraepithelial neoplasia (CIN), in months, for 363 women followed up from 2000 to 2006. For each patient, we observed the maximum time that the initial lesions took to be considered totally cured (after specific diagnostics). We consider a new two-parameter lifetime distribution with increasing failure rate, the complementary exponential geometric distribution proposed by Louzada et al. (2011), which is complementary to the exponential geometric model proposed by Adamidis & Loukas (1998). The new distribution arises in a latent complementary risks scenario, where the lifetime associated with a particular risk is not observable; rather, we observe only the maximum lifetime value among all risks. This distribution is based on a generalization of the exponential distribution, which is a widely used lifetime distribution for modeling many problems in lifetime testing and reliability studies.
According to Bernardo (1979), in the quest for objective posterior distributions, several requirements have emerged which may reasonably be regarded as necessary properties of any proposed solution: generality, in that the procedure should be completely general; invariance (Jeffreys, 1946; Datta & Ghosh, 1995; Datta & Ghosh, 1996); consistent marginalization; and consistent sampling properties, in that the properties under repeated sampling of the posterior distribution must be consistent with the model (Neyman & Scott, 1948; Lane & Sudderth, 1984).
Under these assumptions, Bernardo (1979) introduced reference analysis, which was further developed by Berger and Bernardo (1992a, 1992b). The technique, according to those authors, appears to be the only available method to derive objective posterior distributions which satisfy all these desiderata. Moreover, even for moderate sample sizes, the information provided by the data should dominate the prior information because of the vague nature of the prior knowledge.
As an alternative to the reference prior, the Jeffreys prior is widely used in Bayesian analysis, with the important characteristic that it is invariant under injective transformations (Paulino et al., 2003). Despite this invariance property, the posterior distribution obtained with the Jeffreys prior may be improper (unlike the one obtained with the reference prior) and lead to an uninformative distribution. Also, the use of the Jeffreys rule in the multiparameter case is often inadequate. The assumption of prior independence between parameters of different natures, together with the separate use of the Jeffreys rule to specify the marginal distributions, may give results different from those obtained by the Jeffreys principle (Berger, 1985). So, in this paper parameter estimation for the complementary exponential geometric (CEG) distribution is considered from the reference Bayesian perspective (the MLEs of the CEG model in a classical context can be seen in Louzada et al. (2011)).
The paper is organized as follows. In Section 2, we present a brief introduction to the CEG model with some of its particularities and an overview of reference analysis with emphasis on the two-parameter case. In Section 3, inference for the CEG model is presented using the derived reference prior. After presenting the model and inference aspects, a simulation study is reported in Section 4. Finally, Section 5 illustrates the usefulness of the CEG model in a Bayesian context by studying the time until the cure of precursor lesions of cervical cancer, and Section 6 presents some conclusions.

The CEG Model
Proposed by Louzada et al. (2011), the CEG model was formulated to describe lifetimes with increasing failure rate. According to the authors, it is complementary to the exponential geometric model proposed by Adamidis and Loukas (1998) and arises in a latent complementary risks scenario, in which the lifetime associated with a particular risk is not observable. Instead, we observe only the maximum lifetime value among all risks. For more details on the latent risk problem, interested readers can refer to Basu and Klein (1982) and Louzada-Neto (1999).
Let $M$ be a random variable denoting the number of failure causes, and assume that $M$ has a geometric distribution with probability mass function
$$P(M = m) = \theta (1 - \theta)^{m-1}, \tag{1}$$
where $0 < \theta < 1$ and $m = 1, 2, \ldots$. Let us also consider $t_i$, $i = 1, 2, 3, \ldots$, realizations of a random variable $T_i$ denoting the failure time, i.e., the time-to-event due to the $i$th complementary risk, with $T_i$ having an exponential distribution with rate $\lambda$, given by
$$f(t_i \mid \lambda) = \lambda e^{-\lambda t_i}, \quad t_i > 0. \tag{2}$$
In the latent complementary risk scenario, the number of causes $M$ and the lifetime $t_i$ associated with a particular cause are not observable (latent variables), and only the maximum lifetime $Y$ among all causes is observed. So, we only observe the random variable
$$Y = \max(T_1, T_2, \ldots, T_M). \tag{3}$$
From these definitions the CEG model is built, and Proposition 2.1 follows.
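As a brief sketch of how the distribution of $Y$ follows from this construction, the cdf can be obtained by conditioning on $M$ and summing a geometric series. The derivation assumes the geometric probability function $P(M = m) = \theta(1-\theta)^{m-1}$, exponential lifetimes with rate $\lambda$, and (as is usual in this setting, although not stated explicitly above) that the $T_i$ are independent of each other and of $M$.

```latex
\begin{align*}
F(y \mid \theta, \lambda)
  &= \sum_{m=1}^{\infty} P(M = m)\, P\bigl(\max(T_1, \ldots, T_m) \le y\bigr) \\
  &= \sum_{m=1}^{\infty} \theta (1-\theta)^{m-1} \bigl(1 - e^{-\lambda y}\bigr)^{m} \\
  &= \frac{\theta \bigl(1 - e^{-\lambda y}\bigr)}{1 - (1-\theta)\bigl(1 - e^{-\lambda y}\bigr)}
   = \frac{\theta \bigl(1 - e^{-\lambda y}\bigr)}{\theta + (1-\theta) e^{-\lambda y}} .
\end{align*}
```

Differentiating with respect to $y$ then gives the density $\lambda \theta e^{-\lambda y} / [\theta + (1-\theta) e^{-\lambda y}]^{2}$.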
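Proposition. Let $Y$ be a nonnegative random variable denoting the lifetime of an individual in some population. The random variable $Y$ is distributed according to a CEG distribution with parameters $\lambda \in \Lambda$ and $\theta \in \Theta$, where $\Lambda = \{\lambda;\ \lambda \in (0, +\infty)\}$ and $\Theta = \{\theta;\ \theta \in (0, 1)\}$, if its probability density function (pdf) and cumulative distribution function (cdf) are given, respectively, by
$$f(y \mid \theta, \lambda) = \frac{\lambda \theta e^{-\lambda y}}{\left[\theta + (1-\theta) e^{-\lambda y}\right]^{2}}, \quad y > 0, \tag{4}$$
and
$$F(y \mid \theta, \lambda) = \frac{\theta \left(1 - e^{-\lambda y}\right)}{\theta + (1-\theta) e^{-\lambda y}}, \quad y > 0. \tag{5}$$
Proof. The construction of the CEG model and some specific properties can be found in Louzada et al. (2011).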
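The proposition can be checked numerically with a minimal sketch. It assumes the density $\lambda\theta e^{-\lambda y}/[\theta + (1-\theta)e^{-\lambda y}]^2$ and the cdf $\theta(1 - e^{-\lambda y})/[\theta + (1-\theta)e^{-\lambda y}]$. The function names, the seed and the parameter values are hypothetical choices made only for illustration. The script simulates the latent mechanism directly (a geometric number of exponential lifetimes, keeping the maximum) and compares the empirical cdf with the closed form.

```python
import math
import random


def ceg_pdf(y, theta, lam):
    """CEG density under the assumed parametrization (theta in (0, 1], rate lam > 0)."""
    w = math.exp(-lam * y)
    return lam * theta * w / (theta + (1.0 - theta) * w) ** 2


def ceg_cdf(y, theta, lam):
    """CEG cumulative distribution function under the same assumption."""
    w = math.exp(-lam * y)
    return theta * (1.0 - w) / (theta + (1.0 - theta) * w)


def simulate_latent_max(theta, lam, rng):
    """Draw Y = max(T_1, ..., T_M) with M ~ Geometric(theta) on {1, 2, ...}, T_i ~ Exp(rate=lam)."""
    m = 1
    while rng.random() >= theta:  # each trial "succeeds" with probability theta
        m += 1
    return max(rng.expovariate(lam) for _ in range(m))


if __name__ == "__main__":
    rng = random.Random(2011)  # arbitrary seed
    theta, lam, n = 0.3, 2.0, 20000
    draws = [simulate_latent_max(theta, lam, rng) for _ in range(n)]
    for y in (0.5, 1.0, 2.0):
        empirical = sum(d <= y for d in draws) / n
        print(f"y={y}: empirical cdf={empirical:.3f}, closed form={ceg_cdf(y, theta, lam):.3f}")
```

With $\theta = 1$ the functions reduce to the exponential density and cdf, which is a convenient sanity check.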
The behaviour of the probability density and cumulative distribution functions is shown, respectively, in the left and right panels of Figure 1. In reliability analysis, we are interested in two important characteristics of a random variable $Y$ that represents the lifetime: the reliability function $R(y \mid \theta, \lambda)$, which is the probability of an item not failing prior to time $y$ and is defined by $R(y \mid \theta, \lambda) = 1 - F(y \mid \theta, \lambda)$; and the hazard function, which can be loosely interpreted as the conditional probability of failure at time $y$, given survival up to time $y$ (Lawless, 2003). In what follows, we give the equations of the reliability and hazard functions; Figure 2 shows the behaviour of these functions for some specific values of the parameter $\theta$, with $\lambda = 2$ fixed.
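Let $Y$ be a nonnegative random variable denoting the lifetime of an individual in some population. The reliability function $R(y \mid \theta, \lambda)$ is written as
$$R(y \mid \theta, \lambda) = \frac{e^{-\lambda y}}{\theta + (1-\theta) e^{-\lambda y}},$$
where $\lambda \in \Lambda$ and $\theta \in \Theta$ are the scale and shape parameters. The hazard function, $h(y \mid \theta, \lambda)$, and the cumulative hazard function, $H(y \mid \theta, \lambda)$, are given by
$$h(y \mid \theta, \lambda) = \frac{\lambda \theta}{\theta + (1-\theta) e^{-\lambda y}}$$
and
$$H(y \mid \theta, \lambda) = -\ln R(y \mid \theta, \lambda) = \lambda y + \ln\left[\theta + (1-\theta) e^{-\lambda y}\right],$$
with $\lambda \in \Lambda$ and $\theta \in \Theta$.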
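The increasing shape of the hazard can be seen numerically. The sketch below assumes $R(y) = e^{-\lambda y}/[\theta + (1-\theta)e^{-\lambda y}]$, $h(y) = \lambda\theta/[\theta + (1-\theta)e^{-\lambda y}]$ and $H(y) = -\ln R(y)$. The function names are hypothetical, and the grid of $\theta$ values mirrors the one used in the simulation study, with $\lambda = 2$.

```python
import math


def ceg_reliability(y, theta, lam):
    """R(y) = 1 - F(y) under the assumed CEG parametrization."""
    w = math.exp(-lam * y)
    return w / (theta + (1.0 - theta) * w)


def ceg_hazard(y, theta, lam):
    """h(y) = f(y) / R(y); rises from lam*theta at y = 0 towards lam as y grows."""
    w = math.exp(-lam * y)
    return lam * theta / (theta + (1.0 - theta) * w)


def ceg_cumulative_hazard(y, theta, lam):
    """H(y) = -log R(y), written without the division to limit rounding error."""
    return lam * y + math.log(theta + (1.0 - theta) * math.exp(-lam * y))


if __name__ == "__main__":
    lam = 2.0
    for theta in (0.1, 0.3, 0.5):
        values = [round(ceg_hazard(y, theta, lam), 3) for y in (0.0, 0.5, 1.0, 2.0, 4.0)]
        print(f"theta={theta}: hazard on grid = {values}")
```

Under these formulas the hazard starts at $\lambda\theta$ and approaches $\lambda$, so smaller values of $\theta$ give a steeper increase.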

Reference Analysis
In this section we present reference Bayesian analysis, introduced by Bernardo (1979) and further developed by Berger and Bernardo (1992a, 1992b). Its declared objective is to specify a prior distribution such that, even for moderate sample sizes, the information provided by the data should dominate the prior information because of the "vague" nature of the prior knowledge.
An important feature of the Berger-Bernardo approach to constructing a non-informative prior is the different treatment of interest and nuisance parameters. When there are nuisance parameters (the typical case in this paper), one must establish an ordered parametrization with the parameter of interest singled out and then follow the procedure below.
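Theorem. Let $p(y \mid \theta, \lambda)$ be a probability model with two real-valued parameters $\theta$ and $\lambda$, where $\theta$ is the quantity of interest. Let $I(\theta, \lambda)$ be the corresponding $2 \times 2$ Fisher information matrix in terms of $\theta$ and $\lambda$, and let $V(\theta, \lambda) = I^{-1}(\theta, \lambda)$. Suppose that the joint posterior distribution of $(\theta, \lambda)$ is asymptotically normal with covariance matrix $V(\hat\theta, \hat\lambda)$, where $\hat\theta$ and $\hat\lambda$ are the corresponding consistent estimators of $\theta$ and $\lambda$. It follows that:
(i) the conditional reference prior of $\lambda$ given $\theta$ is
$$\pi(\lambda \mid \theta) \propto I_{22}^{1/2}(\theta, \lambda), \quad \lambda \in \Lambda(\theta);$$
(ii) if $\pi(\lambda \mid \theta)$ is not proper, a compact approximation $\{\Lambda_i(\theta),\ i = 1, 2, \ldots\}$ to $\Lambda(\theta)$ is required, and the reference prior of $\lambda$ given $\theta$ is
$$\pi_i(\lambda \mid \theta) = \frac{I_{22}^{1/2}(\theta, \lambda)}{\int_{\Lambda_i(\theta)} I_{22}^{1/2}(\theta, \lambda)\, d\lambda}, \quad \lambda \in \Lambda_i(\theta);$$
(iii) the sequence of marginal priors of $\theta$ can be obtained as
$$\pi_i(\theta) \propto \exp\left\{ \int_{\Lambda_i(\theta)} \pi_i(\lambda \mid \theta) \ln\left[ v_{11}^{-1/2}(\theta, \lambda) \right] d\lambda \right\}.$$
Proof. See a heuristic justification in Bernardo (2005).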
Corollary. If the nuisance parameter space $\Lambda(\theta) = \Lambda$ is independent of $\theta$, and the functions $v_{11}^{-1/2}(\theta, \lambda)$ and $I_{22}^{1/2}(\theta, \lambda)$ factorize in the form
$$v_{11}^{-1/2}(\theta, \lambda) = f_1(\theta)\, g_1(\lambda), \qquad I_{22}^{1/2}(\theta, \lambda) = f_2(\theta)\, g_2(\lambda),$$
then the reference prior relative to the ordered parametrization $(\theta, \lambda)$ is given by
$$\pi(\theta, \lambda) = f_1(\theta)\, g_2(\lambda),$$
and there is no need for compact approximation, even if the conditional reference prior $\pi(\lambda \mid \theta)$ is not proper.
Considering that the posterior distribution is asymptotically normal, the reference prior depends only on the Fisher information matrix. Here we derive the reference prior following the one-nuisance-parameter approach described above.
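The likelihood and log-likelihood functions of $(\theta, \lambda)$ based on an observed sample of size $n$, $y = (y_1, y_2, \ldots, y_n)$, from the CEG distribution (4) are given, respectively, by
$$L(\theta, \lambda \mid y) = \lambda^{n} \theta^{n} \exp\left(-\lambda \sum_{i=1}^{n} y_i\right) \prod_{i=1}^{n} \left[\theta + (1-\theta) e^{-\lambda y_i}\right]^{-2} \tag{10}$$
and
$$l(\theta, \lambda \mid y) = n \ln \lambda + n \ln \theta - \lambda \sum_{i=1}^{n} y_i - 2 \sum_{i=1}^{n} \ln\left[\theta + (1-\theta) e^{-\lambda y_i}\right]. \tag{11}$$
In order to find the Fisher information matrix, the first-order derivatives of the log-likelihood function (11) are given by
$$\frac{\partial l(\theta, \lambda \mid y_1, \ldots, y_n)}{\partial \theta} = \frac{n}{\theta} - 2 \sum_{i=1}^{n} \frac{1 - e^{-\lambda y_i}}{\theta + (1-\theta) e^{-\lambda y_i}}$$
and
$$\frac{\partial l(\theta, \lambda \mid y_1, \ldots, y_n)}{\partial \lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} y_i + 2 (1-\theta) \sum_{i=1}^{n} \frac{y_i e^{-\lambda y_i}}{\theta + (1-\theta) e^{-\lambda y_i}},$$
where the single-observation case corresponds to $n = 1$. The second-order derivatives of the log-likelihood function (11) are given by
$$\frac{\partial^2 l}{\partial \theta^2} = -\frac{n}{\theta^2} + 2 \sum_{i=1}^{n} \frac{\left(1 - e^{-\lambda y_i}\right)^2}{\left[\theta + (1-\theta) e^{-\lambda y_i}\right]^2}, \qquad \frac{\partial^2 l}{\partial \lambda^2} = -\frac{n}{\lambda^2} - 2 \theta (1-\theta) \sum_{i=1}^{n} \frac{y_i^2 e^{-\lambda y_i}}{\left[\theta + (1-\theta) e^{-\lambda y_i}\right]^2},$$
$$\frac{\partial^2 l}{\partial \theta\, \partial \lambda} = -2 \sum_{i=1}^{n} \frac{y_i e^{-\lambda y_i}}{\left[\theta + (1-\theta) e^{-\lambda y_i}\right]^2}.$$
Finally, the Fisher information matrix for the CEG model is given in Equation (14), where $\mathrm{Li}_2(\cdot)$ represents the polylogarithm function given by $\mathrm{Li}_2(z) = \sum_{k=1}^{\infty} z^{k}/k^{2}$. The inverse of the Fisher information matrix for a single observation, presented in (14), is given in Equation (15). From Corollary 2.2 and Equation (14), the joint reference prior density $\pi(\theta, \lambda)$ of the CEG model, with two real-valued parameters $\theta$ and $\lambda$, is given in Equation (16), where $\mathrm{Li}_2(\beta)$ is the polylogarithm function defined above.
Combining the likelihood function in (10) and the prior (16), the joint posterior distribution for $\theta, \lambda$ is given by
$$\pi(\theta, \lambda \mid y) \propto L(\theta, \lambda \mid y)\, \pi(\theta, \lambda).$$
In Appendix A we show that the joint posterior distribution is proper.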
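The log-likelihood and its first-order derivatives can be cross-checked numerically. The sketch below assumes the CEG density $\lambda\theta e^{-\lambda y}/[\theta + (1-\theta)e^{-\lambda y}]^2$ and compares the analytic score with a central finite-difference approximation. The function names, the toy data and the step size `h` are hypothetical illustration choices.

```python
import math


def ceg_loglik(theta, lam, ys):
    """Log-likelihood of an i.i.d. sample ys under the assumed CEG density."""
    n = len(ys)
    total = n * math.log(lam) + n * math.log(theta) - lam * sum(ys)
    for y in ys:
        total -= 2.0 * math.log(theta + (1.0 - theta) * math.exp(-lam * y))
    return total


def ceg_score(theta, lam, ys):
    """Analytic first-order derivatives (d/dtheta, d/dlam) of the log-likelihood."""
    n = len(ys)
    d_theta = n / theta
    d_lam = n / lam - sum(ys)
    for y in ys:
        w = math.exp(-lam * y)
        denom = theta + (1.0 - theta) * w
        d_theta -= 2.0 * (1.0 - w) / denom
        d_lam += 2.0 * (1.0 - theta) * y * w / denom
    return d_theta, d_lam


def numerical_score(theta, lam, ys, h=1e-6):
    """Central finite differences, used only to check the analytic score."""
    d_theta = (ceg_loglik(theta + h, lam, ys) - ceg_loglik(theta - h, lam, ys)) / (2.0 * h)
    d_lam = (ceg_loglik(theta, lam + h, ys) - ceg_loglik(theta, lam - h, ys)) / (2.0 * h)
    return d_theta, d_lam


if __name__ == "__main__":
    toy_data = [0.2, 0.7, 1.5, 3.0]  # hypothetical lifetimes
    print("analytic :", ceg_score(0.3, 2.0, toy_data))
    print("numerical:", numerical_score(0.3, 2.0, toy_data))
```

Agreement between the two outputs is a quick guard against sign errors before the second-order derivatives and the Fisher information are computed.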

Simulation Study
Louzada et al. (2011) performed a misspecification simulation study for the CEG model, in order to assess the extent of misspecification errors when testing the exponential geometric distribution against the complementary one for different sample sizes and censoring percentages. The authors found that it is usually possible to discriminate between the distributions even for small samples in the presence of heavy censoring by using the MLEs in a classical inference context (for sample sizes greater than 50).
Aiming to examine the finite sample properties of the estimators, this section presents the results of a Monte Carlo experiment in a Bayesian context. We use 1,000 Monte Carlo replications in all presented results. The sample sizes $n$ range from 50 to 1,000, with data generated according to a CEG distribution for each parameter value $\theta = (0.1; 0.3; 0.5)$ and $\lambda = 2$ fixed. The lifetimes of this model were generated by the inverse transformation of the cumulative distribution function (Ross, 2009), as follows.
Proposition. Let $F(y \mid \theta, \lambda)$, $y \in (0, +\infty)$, denote the cumulative distribution function of a CEG model. Define $Y = F^{-1}(U)$, where $U$ has a continuous uniform distribution over the interval $(0, 1)$. Then $Y$ is distributed as $F$, that is, $P(Y \le y) = F(y)$, $y \in (0, +\infty)$, and the lifetimes are generated as
$$y = \frac{1}{\lambda} \ln\left[\frac{\theta + (1-\theta) u}{\theta (1 - u)}\right],$$
where $u$ is a realization of $U$.
The posterior samples are subsequently obtained by the Metropolis-Hastings technique, through the MCMC procedure implemented in SAS 9.3 (Statistical Analysis System), using a single chain of length 50,000.
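A minimal sketch of this inverse-transform generator is given below. It assumes the CEG cdf $F(y) = \theta(1 - e^{-\lambda y})/[\theta + (1-\theta)e^{-\lambda y}]$, which inverts to $y = \lambda^{-1}\ln\{[\theta + (1-\theta)u]/[\theta(1-u)]\}$. The function names, the seed and the sample size are hypothetical, and Python's standard generator stands in for the one used in the original study.

```python
import math
import random


def ceg_cdf(y, theta, lam):
    """Assumed CEG cdf, used here only to verify the quantile function."""
    w = math.exp(-lam * y)
    return theta * (1.0 - w) / (theta + (1.0 - theta) * w)


def ceg_quantile(u, theta, lam):
    """Inverse of the assumed CEG cdf for u in [0, 1)."""
    return math.log((theta + (1.0 - theta) * u) / (theta * (1.0 - u))) / lam


def ceg_sample(n, theta, lam, rng):
    """Generate n lifetimes by applying the quantile function to uniform draws."""
    return [ceg_quantile(rng.random(), theta, lam) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(2009)  # arbitrary seed
    theta, lam = 0.3, 2.0
    sample = ceg_sample(20000, theta, lam, rng)
    theoretical_mean = -math.log(theta) / (lam * (1.0 - theta))
    print("sample mean     :", sum(sample) / len(sample))
    print("theoretical mean:", theoretical_mean)
```

The comparison with $-\ln\theta/[\lambda(1-\theta)]$ uses the expression for the expected lifetime that appears later in the application.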
A burn-in of 10,000 iterations was adopted in order to eliminate the effect of the initial values, resulting in a sample of size 40,000. The convergence of the chain was checked by the criterion proposed by Geweke (1992) for each simulated data set, and the average of the parameter estimates, their standard deviation (SD), the mean square error and the coverage probability of the 95% posterior credible intervals were obtained using the reference prior. The coverage probability of the posterior credible interval was used instead of that of the confidence interval, since we are interested in studying and applying this model in a Bayesian context.
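The sampling scheme can be sketched as a random-walk Metropolis-Hastings algorithm with burn-in and optional thinning. This is not the SAS PROC MCMC implementation used in the study. The function names, starting values, proposal step sizes and seed are hypothetical. In particular, the prior is passed in as a function, and the flat prior shown here is only a placeholder, not the reference prior derived in this paper. The log-likelihood assumes the CEG density $\lambda\theta e^{-\lambda y}/[\theta + (1-\theta)e^{-\lambda y}]^2$.

```python
import math
import random


def ceg_loglik(theta, lam, ys):
    """Log-likelihood under the assumed CEG density."""
    n = len(ys)
    total = n * (math.log(lam) + math.log(theta)) - lam * sum(ys)
    for y in ys:
        total -= 2.0 * math.log(theta + (1.0 - theta) * math.exp(-lam * y))
    return total


def flat_log_prior(theta, lam):
    """Hypothetical placeholder prior (flat); it is NOT the reference prior."""
    return 0.0


def metropolis_hastings(ys, log_prior, n_iter, burn_in, thin=1,
                        init=(0.5, 1.0), steps=(0.05, 0.15), seed=0):
    """Random-walk Metropolis-Hastings for (theta, lam).

    Returns the retained draws (after burn-in and thinning) and the acceptance rate.
    """
    rng = random.Random(seed)
    theta, lam = init
    log_post = ceg_loglik(theta, lam, ys) + log_prior(theta, lam)
    chain, accepted = [], 0
    for _ in range(n_iter):
        theta_new = theta + rng.gauss(0.0, steps[0])
        lam_new = lam + rng.gauss(0.0, steps[1])
        # Proposals outside (0, 1) x (0, inf) have zero posterior density: reject them.
        if 0.0 < theta_new < 1.0 and lam_new > 0.0:
            log_post_new = ceg_loglik(theta_new, lam_new, ys) + log_prior(theta_new, lam_new)
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            if math.log(1.0 - rng.random()) < log_post_new - log_post:
                theta, lam, log_post = theta_new, lam_new, log_post_new
                accepted += 1
        chain.append((theta, lam))
    return chain[burn_in::thin], accepted / n_iter


if __name__ == "__main__":
    rng = random.Random(3)
    # Simulated data with theta = 0.3, lam = 2 via the inverse cdf.
    ys = [math.log((0.3 + 0.7 * u) / (0.3 * (1.0 - u))) / 2.0
          for u in (rng.random() for _ in range(200))]
    draws, rate = metropolis_hastings(ys, flat_log_prior, n_iter=6000, burn_in=1000, thin=5)
    print("retained draws:", len(draws), "acceptance rate:", round(rate, 3))
    print("posterior mean of lam:", sum(d[1] for d in draws) / len(draws))
```

To use a different prior, one would replace `flat_log_prior` with a function returning its log density. The chain lengths here are kept short for speed and are far smaller than the 50,000 iterations used in the study.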
Table 1 shows the estimates obtained using the reference prior presented in Section 3. We can observe that the variances of the estimates and their mean square errors (MSE) become smaller as the sample size increases. Also, the coverage probabilities of the 95% two-sided credible intervals for the model parameters are close to the nominal coverage for large sample sizes, and they usually differ from the nominal coverage probability by less than 2% for moderate sample sizes (more than 100).
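The quantities reported in Table 1 can be computed from the Monte Carlo replications as in the following sketch. It assumes that each replication yields a point estimate and a 95% credible interval for a parameter. The function name and the toy inputs are hypothetical and do not reproduce values from the study.

```python
import statistics


def summarize_replications(estimates, intervals, true_value):
    """Mean, SD and MSE of the estimates, and coverage probability of the intervals."""
    mean_est = statistics.fmean(estimates)
    sd_est = statistics.stdev(estimates)
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    covered = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return {
        "mean": mean_est,
        "sd": sd_est,
        "mse": mse,
        "coverage": covered / len(intervals),
    }


if __name__ == "__main__":
    # Hypothetical estimates of lam and their credible intervals from four replications.
    estimates = [1.9, 2.1, 2.0, 2.2]
    intervals = [(1.5, 2.3), (1.8, 2.4), (2.05, 2.5), (1.9, 2.6)]
    print(summarize_replications(estimates, intervals, true_value=2.0))
```

The MSE combines the variance and the squared bias of the estimates, so a small SD together with a large MSE points to bias.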
Note that the simulation was designed to check the behaviour of the estimates for finite sample sizes. Although the bias is large and the coverage probability of the credible intervals for the model parameters is far from the nominal value for samples of size less than 100, we intend to apply the same inference method in a real study on cervical intraepithelial neoplasia with a sample of more than 300 patients. When we consider that specific sample size, n = 300, we observe that the bias is under control and the coverage probability is very close to the nominal value.

n      θ    Mean θ   Mean λ   SD θ    SD λ    MSE θ   MSE λ   CP θ    CP λ
...    ...  0.519    1.997    0.079   0.145   0.007   0.021   0.949   0.951
1,000  0.1  0.1022   1.997    0.011   0.072   0.000   0.005   0.948   0.950
1,000  0.3  0.306    1.998    0.033   0.090   0.001   0.008   0.952   0.947
1,000  0.5  0.5087   2.000    0.054   0.102   0.003   0.011   0.948   0.946

Cervical Intraepithelial Neoplasia Data
Recall the CIN data presented in the introductory section. In this section we illustrate the usefulness of the CEG distribution with the reference prior in modeling this real data set.
Firstly, a brief descriptive analysis was made. The minimum observed time was 1 month and the maximum was 47 months, approximately four years. The mean time to cure is 6.4 months, with a standard deviation of 5.099 months. The left panel of Figure 3 shows the TTT plot (Barlow & Campo, 1975), used to assess the possible shape of the hazard function. A concave TTT plot indicates an increasing hazard, which can be accommodated by a CEG distribution. After this initial analysis, the CEG model was fitted to the data. As the exponential model is a particular case of the CEG model when θ = 1, this model was fitted too. The posterior samples (for both models) were generated by the Metropolis-Hastings technique, as in the simulation study. A single chain of length 100,000 was considered for each parameter, discarding the first 20,000 iterations to eliminate the effect of the initial values; to avoid correlation problems, a lag of size 20 was used, resulting in a final sample of size 4,000. Table 2 shows the posterior summaries for the parameters and the 95% credible intervals under the reference prior. The convergence of the chains was verified by the Geweke criterion (Geweke, 1992), which is satisfied for both models, as shown in Table 3. The acceptance rates were also obtained. In order to compare these models, the deviance information criterion (DIC), which is a hierarchical modelling generalization of the AIC (Akaike information criterion) and BIC (Bayesian information criterion, also known as the Schwarz criterion), was obtained. This criterion is particularly useful in Bayesian model selection problems where the posterior distributions of the models have been obtained by Markov chain Monte Carlo (MCMC) simulation, which is our case.
The idea is that models with smaller DIC should be preferred to models with larger DIC, which is the case of the CEG model, with DIC = 2,006.797 against 2,075.498 for the exponential model.
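For reference, the DIC is commonly computed from MCMC output as $\mathrm{DIC} = \bar{D} + p_D$, where $D = -2 \log L$ is the deviance, $\bar{D}$ is its posterior mean and $p_D = \bar{D} - D(\bar\theta, \bar\lambda)$ is the effective number of parameters. The sketch below follows that standard definition. The function name and the toy inputs are hypothetical, and the exact computation performed by the software used in the study may differ in detail.

```python
def dic_from_draws(draws, loglik):
    """Return (DIC, p_D) from posterior draws and a log-likelihood function.

    draws  : list of parameter tuples, e.g. [(theta, lam), ...]
    loglik : function taking the parameters as positional arguments
    """
    deviances = [-2.0 * loglik(*d) for d in draws]
    mean_deviance = sum(deviances) / len(deviances)
    # Posterior mean of each parameter, computed column by column.
    posterior_means = [sum(column) / len(draws) for column in zip(*draws)]
    deviance_at_mean = -2.0 * loglik(*posterior_means)
    p_d = mean_deviance - deviance_at_mean
    return mean_deviance + p_d, p_d


if __name__ == "__main__":
    # Toy example: normal model with unit variance and a single observation y = 1.0.
    def toy_loglik(mu):
        return -0.5 * (1.0 - mu) ** 2

    toy_draws = [(0.9,), (1.1,), (1.0,), (0.8,), (1.2,)]  # hypothetical posterior draws
    print(dic_from_draws(toy_draws, toy_loglik))
```

For the comparison in the text, the same function would be called once per model with that model's retained draws and log-likelihood, and the model with the smaller value would be preferred.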
In addition to this criterion, it is possible to visually identify the model that best fits this data set. The middle panel of Figure 3 shows the survival curves estimated empirically and by the CEG and exponential models. Clearly, the estimated survival curve for the CEG model is closer to the empirical curve than that of the exponential model, which suggests that the CEG model fits better in this case. Furthermore, in the right panel of Figure 3 we can see the increasing behavior of the estimated hazard curve for the CEG model.
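The TTT plot shown in the left panel of Figure 3 can be computed with the scaled total-time-on-test transform, $G(r/n) = \left[\sum_{i=1}^{r} y_{(i)} + (n - r)\, y_{(r)}\right] / \sum_{i=1}^{n} y_i$, where $y_{(1)} \le \ldots \le y_{(n)}$ are the ordered lifetimes. The sketch below is a minimal version for complete (uncensored) data. The function name and the toy lifetimes are hypothetical and are not the CIN data.

```python
def scaled_ttt(ys):
    """Return the points (r/n, G(r/n)) of the scaled TTT transform for complete data."""
    ordered = sorted(ys)
    n = len(ordered)
    total = sum(ordered)
    points = []
    partial = 0.0
    for r, y in enumerate(ordered, start=1):
        partial += y  # running sum of the r smallest lifetimes
        points.append((r / n, (partial + (n - r) * y) / total))
    return points


if __name__ == "__main__":
    toy_lifetimes = [3.0, 1.0, 2.0]  # hypothetical values, in months
    for x, g in scaled_ttt(toy_lifetimes):
        print(f"r/n = {x:.3f}, G = {g:.3f}")
```

Points lying above the diagonal on a concave curve are the pattern associated with an increasing hazard.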
Figure 4 shows the marginal posterior densities for the unknown quantities θ and λ in the left and middle panels, respectively. The right panel shows the scatterplot used to analyze the convergence of the parameters from two distinct initial values. Considering the CEG distribution fit, the expected cure time for cervix lesions is $E(T) = -\ln(\hat\theta)/[\hat\lambda(1 - \hat\theta)] = 6.14$ months. The 95% credible interval for the expected cure time ranges from 4.79 to 8.30 months. Moreover, the modal value for the time until the cure is given by $M(Y) = \frac{1}{\hat\lambda} \ln\left(\frac{1 - \hat\theta}{\hat\theta}\right) = 3.87$ months, with a 95% credible interval ranging from 2.17 to 6.32 months.
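Credible intervals for such functions of the parameters can be obtained by evaluating the function at each retained posterior draw and taking empirical quantiles. The sketch below assumes the expressions $E(T) = -\ln\theta/[\lambda(1-\theta)]$ and $M(Y) = \lambda^{-1}\ln[(1-\theta)/\theta]$ quoted above. The function names, the simple order-statistic rule for the quantiles and the draws in the example are hypothetical; they are not the posterior sample of the study.

```python
import math


def ceg_mean(theta, lam):
    """Expected lifetime of the CEG model for theta in (0, 1)."""
    return -math.log(theta) / (lam * (1.0 - theta))


def ceg_mode(theta, lam):
    """Mode of the CEG density; the expression applies for theta < 0.5, otherwise the mode is 0."""
    if theta < 0.5:
        return math.log((1.0 - theta) / theta) / lam
    return 0.0


def equal_tailed_interval(values, level=0.95):
    """Equal-tailed interval from sorted draws, using a simple order-statistic rule."""
    ordered = sorted(values)
    n = len(ordered)
    alpha = 1.0 - level
    lower = ordered[math.floor((alpha / 2.0) * (n - 1))]
    upper = ordered[math.ceil((1.0 - alpha / 2.0) * (n - 1))]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical posterior draws of (theta, lam); real use would take the MCMC output.
    draws = [(0.10 + 0.001 * k, 0.40 + 0.0005 * k) for k in range(200)]
    means = [ceg_mean(t, l) for t, l in draws]
    modes = [ceg_mode(t, l) for t, l in draws]
    print("interval for the mean:", equal_tailed_interval(means))
    print("interval for the mode:", equal_tailed_interval(modes))
```

Because the interval is computed on the transformed draws, it reflects the joint posterior uncertainty in θ and λ rather than plugging interval endpoints into the formulas.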

Concluding Remarks
In this paper we presented the reference Bayesian inferential procedure for the CEG distribution. Reference Bayesian analysis is an alternative to the use of the Jeffreys prior. In contrast to the Jeffreys prior, the reference prior is shown to be permissible and therefore leads to a proper posterior distribution.
After some discussion of the reference prior, we carried out a simulation study in which the adequacy of the proposed inferential method was assessed for different sample sizes. The simulated results show better frequentist properties for sample sizes above 100.
In order to illustrate the usefulness and effectiveness of the Bayesian CEG model, a real data set on the time to cure of precursor lesions of cervical cancer was considered. The use of a complementary model was extremely important in this case because the times at which the individual lesions were cured are not known, only the final time (in this case the maximum time) at which all had been eliminated. Had we chosen a usual model for survival analysis, this important peculiarity would not have been taken into account.

Proof that the Posterior Distribution is Proper
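The joint posterior distribution is given by $\pi(\theta, \lambda \mid y) \propto L(\theta, \lambda \mid y)\, \pi(\theta, \lambda)$, where $L(\theta, \lambda \mid y)$ is the likelihood function and $\pi(\theta, \lambda)$ is the reference prior (16).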

Figure 1. Pdf and cdf of the CEG distribution with parameter λ = 2 fixed.

Figure 2. Survival and hazard functions of the CEG distribution with parameter λ = 2 fixed.

Figure 3. Left panel: TTT plot. Middle panel: estimated survival curves according to the CEG and exponential distribution fits, superimposed on the Kaplan-Meier empirical curve. Right panel: estimated hazard function according to the CEG distribution.

Table 1. Mean and SD of the parameter estimates, mean square error (MSE) and coverage probability for each combination of sample size and generated parameter values, using the reference prior.

Table 2. Posterior summary of the CEG model under the reference prior.

Table 3. Geweke criteria and acceptance rates.