Confidence Interval for Change Point in Hazard Rate With Staggered Entry

We consider the construction of a confidence region (interval) for a change point in the hazard rate of the patients' survival distribution when the patients enter the trial at random times. We show that the local likelihood ratio process converges weakly to a certain process and obtain the distribution of the maximum of that process, which does not depend on the change point and thus can be used to construct the confidence region for the change point. We also compare the limiting density function to the empirical density and examine the empirical coverage probability of the confidence interval by simulation. The Stanford Heart Transplant data are used for illustration.


Introduction
Assume Y_1, ..., Y_N are the survival times for the N patients and that they follow the density

f_Y(y) = λ_1 e^{-λ_1 y} if y < ν, and f_Y(y) = λ_2 e^{-λ_1 ν - λ_2 (y - ν)} if y ≥ ν,   (1)

where 0 < λ_1 < λ_2 and ν is the change point. This change point problem in hazard rate was first proposed in Miller (1960). We refer to Müller and Wang (1994) for a review and Chen and Gupta (2011) for an introduction.
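The two-piece structure of (1) gives a simple way to simulate from it: the hazard is constant at λ_1 before ν and at λ_2 afterwards, so a draw can be assembled from at most two exponential variates via the memoryless property. The sketch below is our own illustration (the function name is not from the paper):

```python
import random

def sample_change_point(lam1, lam2, nu, rng=random):
    """Draw one survival time Y from density (1): constant hazard lam1
    on [0, nu) and lam2 on [nu, infinity)."""
    y = rng.expovariate(lam1)       # lifetime if the hazard never changed
    if y < nu:
        return y                    # death before the change point
    # Survived past nu (probability exp(-lam1*nu)); by memorylessness the
    # residual lifetime beyond nu is Exp(lam2).
    return nu + rng.expovariate(lam2)
```

Under (1) the mean survival time is (1 − e^{−λ_1 ν})/λ_1 + e^{−λ_1 ν}/λ_2, which provides a quick sanity check on the sampler.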
In Kim et al. (2004), the testing problem is studied by showing the weak convergence of the log-likelihood ratio process on C(−∞, ∞). The same technique was extended to the Weibull case in Williams & Kim (2011, 2013a, 2013b). For construction of a confidence region or interval, one can use the MLE. Siegmund (1988) proposed inverting the log-likelihood ratio process to obtain a confidence region, then constructing a confidence interval by joining all the disconnected intervals, and concluded that there is practically no difference between the confidence region and the confidence interval. Using this approach, Loader (1991) obtained the confidence region under model (1) with an exponential random censoring mechanism.
In this article, we extend the model to the staggered entry case with Type I censoring. Assume that the patients arrive for treatment at times 0 < τ_1 < τ_2 < ⋯, following a Poisson process with known rate γ. Let N be the total number of patients who arrive in the time interval [0, T]. For i = 1, ..., N, the observations are

X_i = Y_i ∧ (T − τ_i),   δ_i = I{Y_i ≤ T − τ_i},

where I{A} denotes the indicator function of an event A, x ∧ ν = min(x, ν), and x_+ = max(x, 0). The log-likelihood function is

ℓ(λ_1, λ_2, ν) = K_1(ν) log λ_1 − λ_1 Σ_{i=1}^N (X_i ∧ ν) + K_2(ν) log λ_2 − λ_2 Σ_{i=1}^N (X_i − ν)_+,

where K_1(ν) = Σ δ_i I{X_i ≤ ν} and K_2(ν) = Σ δ_i I{X_i > ν} count the observed deaths before and after ν. Given ν, the maximum likelihood estimators of λ_1 and λ_2 are

λ̂_1(ν) = K_1(ν) / Σ_{i=1}^N (X_i ∧ ν),   λ̂_2(ν) = K_2(ν) / Σ_{i=1}^N (X_i − ν)_+,

respectively. So the profile log-likelihood is ℓ(ν) = ℓ(λ̂_1(ν), λ̂_2(ν), ν). The MLE of ν is obtained by maximizing ℓ(ν) with respect to ν; this must be done numerically.
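The profile step can be illustrated with a short numerical sketch. This is our own code with illustrative names, assuming the standard censored-exponential form of the likelihood: K_1(ν), K_2(ν) are the observed-death counts before and after ν, and Σ(X_i ∧ ν), Σ(X_i − ν)_+ are the corresponding total exposure times.

```python
import math

def profile_loglik(x, d, nu):
    """Profile log-likelihood l(nu): plug the conditional MLEs of
    (lam1, lam2) back into the censored-data log-likelihood.
    x[i] = observed time, d[i] = 1 if the death was observed."""
    k1 = sum(1 for xi, di in zip(x, d) if di and xi <= nu)   # deaths before nu
    k2 = sum(1 for xi, di in zip(x, d) if di and xi > nu)    # deaths after nu
    s1 = sum(min(xi, nu) for xi in x)                        # exposure before nu
    s2 = sum(max(xi - nu, 0.0) for xi in x)                  # exposure after nu
    if min(k1, k2) == 0 or min(s1, s2) <= 0:
        return -math.inf                                     # MLEs undefined here
    lam1_hat, lam2_hat = k1 / s1, k2 / s2
    # l(nu) = k1*log(lam1_hat) - lam1_hat*s1 + k2*log(lam2_hat) - lam2_hat*s2
    return k1 * (math.log(lam1_hat) - 1.0) + k2 * (math.log(lam2_hat) - 1.0)

def mle_change_point(x, d, grid):
    """Numerical maximization of l(nu) over a grid of candidate points."""
    return max(grid, key=lambda nu: profile_loglik(x, d, nu))
```

The grid search mirrors the remark that the maximization must be done numerically; the simplification λ̂·s = k in the return line follows from substituting each conditional MLE back into its own term.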
The main contribution here is to give a systematic method of constructing a confidence region/interval for ν by inverting the likelihood ratio process. For this, we need to prove the weak convergence of the local log-likelihood ratio process with a change point on D(−∞, ∞) and study the distribution of the maximum of the log-likelihood ratio process. We obtain its limit distribution, which is free of the change point ν. Thus, we can construct a confidence region of the form {ν : ℓ(ν) ≥ ℓ(ν̂) − c} by selecting c to satisfy the designated coverage probability.
In Section 3, we use Monte Carlo simulation to compare the limiting density function to the empirical density function constructed from the simulated data. We also discuss how to construct the confidence interval and comment on its empirical coverage probability at the nominal confidence level. In Section 4, we use the Stanford Heart Transplant data to illustrate our method. We show that the data set fits the model assumptions quite well.

Main Results
In notation, let →_D, →_p, and ⇒ denote convergence in distribution, convergence in probability, and weak convergence, respectively. For a fixed ν, define a local likelihood ratio process Z_γ(u) in terms of the processes K̃_i(u, ν) and T̃_i(u, ν) for i = 1, 2. We shall split the proof of the weak convergence of Z_γ(u) into several steps.
First, by denoting K̃_1(u, ν) and applying a two-term Taylor expansion, we can rewrite Z_γ(u). Similar to the proof of Proposition 1 in Kim, Woodroofe, and Wu (2002), we can show that the remainder term is negligible. Second, we show the weak convergence of K̃_1(u, ν) given in (3). Note that given N = n, the random vectors (X_1, δ_1), ..., (X_n, δ_n) are a permutation of i.i.d. random vectors with a common distribution function F_X(x) = P(X ≤ x) and the corresponding sub-distribution functions. Denoting by V_+(t) and V_−(t) two independent copies of a random process, we have

Lemma 1. For each fixed ν, K̃_1(u, ν) as in (3) converges weakly to a process K*_1(u, ν), where sgn(·) is the signum function.

Proof. For each u > 0, the result follows by using a Taylor expansion and (3), where β is as in (5), and by the Poisson Convergence Theorem; the case u < 0 follows essentially the same line as the u > 0 case. Therefore, K̃_1(u, ν) →_D K*_1(u, ν) for each u. The convergence of all finite-dimensional distributions may be similarly established. Finally, by Theorem 15.6 in Billingsley (1968), K̃_1(u, ν) is tight.
Third, we show the convergence of T̃_i(u, ν) as defined in (4). Let F_N(x) denote the empirical distribution function of F_X(x).

Lemma 2. With β as in (5), for each fixed ν and any positive number c, T̃_1(u, ν) converges to uβ uniformly in |u| ≤ c.

Proof. This follows from the uniform convergence of F_N(x), the facts E(N/(γT)) = 1 and Var(N/(γT)) → 0, and the continuity of F_X at ν.
Combining Lemmas 1 and 2 and using Slutsky's theorem, we thus have

Theorem 1. Z_γ(u) and Z*_γ(u) converge weakly to a process Y(u), with K*_1(u, ν) as in (4) and β as in (5).

The next theorem gives the limiting distribution of the supremum of the Y(u) process; its proof is given in Section 5.
Theorem 2. As x → ∞, the distribution function of the supremum of the Y(u) process can be approximated by (7), where κ(x) is given in (8).

Remark. The results are given for λ_1 < λ_2. For λ_1 > λ_2, similar results hold with only minor modifications in a few places: f_Y(y) in (1), F_X(x), and F̄_X(x) are replaced by their left-continuous versions, and K_1(ν) and K_2(ν) are replaced by their corresponding counterparts.

Simulation Study
As we have seen in Theorem 1, the local likelihood ratio process Z_γ(u) converges weakly to a process Y(u) whose supremum has a distribution function that does not depend on the change point ν; it depends only on the ratio of λ_1 and λ_2. In this section we compare the theoretical density and the empirical density curves generated from sup Z_γ(u), which in turn are based on data sets from (1) using three different ν values. Then we describe how to construct a confidence interval for ν and show the actual coverage probability of the confidence intervals for various combinations of change points and nominal coverage levels.
With T, λ_1, λ_2, and γ fixed, the data and density functions are generated in the following steps.
Step 1. Draw a random number N from Poisson(γT).
Step 2. Fix ν and draw a random sample Y_1, ..., Y_N from f_Y(y) as in (1).
For comparison, we generated 10,000 random samples for each of the change points ν = 0.3, 0.5, and 0.7 and obtained the density curve of ℓ(ν̂) − ℓ(ν). They are shown together with the (approximated) limiting density function. The plot shows that the densities are quite similar in shape and all three are close to the theoretical density function. Moreover, Kolmogorov–Smirnov goodness-of-fit tests indicate that the densities are not significantly different at the 5% level.
To construct a confidence interval for ν from a given random sample, fix a suitable confidence level 1 − α and find the 100(1 − α)th percentile p_{1−α} from (7). Then follow Steps 1 through 4 described above for a fixed ν to generate a random sample of (average) size 5,000 and do:

Step 5. Cover a sufficiently large interval (ν̂ ± 0.2, for example) containing the MLE ν̂ with a fine set of grid points ν_1, ..., ν_J and, for each j = 1, ..., J, calculate Δ_j = ℓ(ν̂) − ℓ(ν_j).
Step 6. With Δ_j = ℓ(ν̂) − ℓ(ν_j), find J_L = inf_{1≤j≤J} {j : Δ_j ≥ p_{1−α} and Δ_{j+1} < p_{1−α}} and J_R = sup_{1≤j≤J} {j : Δ_j < p_{1−α} and Δ_{j+1} ≥ p_{1−α}}. Then find C_L by linearly interpolating the two points (ν_{J_L}, Δ_{J_L}) and (ν_{J_L+1}, Δ_{J_L+1}) and finding the point crossing the horizontal line at p_{1−α}. Do the same for C_R using (ν_{J_R}, Δ_{J_R}) and (ν_{J_R+1}, Δ_{J_R+1}). Report (C_L, C_R) as the 100(1 − α)% confidence interval.

Table 1 shows the empirical coverage probabilities for 90%, 95%, and 99% confidence intervals based on 10,000 replications for each of the change points ranging from 0.4 to 0.7. The standard errors are 0.003 for the 90% level, 0.002 for 95%, and 0.001 for 99%. Apparently the coverage probabilities for the 90% confidence intervals are all higher than the nominal level; in particular, the 90% interval covers the true change point 92.3% of the time when ν = 0.4 and 91.9% when ν = 0.45. The excessive coverage probability of the 90% confidence interval is partially due to the approximation error in Theorem 2, since (7) holds only when x is large. Also, the jagged nature of the log-likelihood function seems to contribute to the excessive coverage.
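The interpolation in Steps 5 and 6 can be sketched as follows. This is our own illustration (function and variable names are not from the paper); it assumes the drops Δ_j = ℓ(ν̂) − ℓ(ν_j) have already been computed on the grid, and, following Siegmund's suggestion, it joins any disconnected pieces of the region into a single reported interval:

```python
def crossing(nu_a, d_a, nu_b, d_b, crit):
    """Linearly interpolate the nu at which the drop crosses crit."""
    t = (crit - d_a) / (d_b - d_a)
    return nu_a + t * (nu_b - nu_a)

def confidence_interval(nus, drops, crit):
    """Invert {nu : drop(nu) <= crit} from grid values, interpolating
    at the two outermost crossings (Steps 5-6)."""
    inside = [j for j, d in enumerate(drops) if d <= crit]
    if not inside:
        raise ValueError("grid does not meet the confidence region")
    jl, jr = inside[0], inside[-1]
    left = nus[jl] if jl == 0 else crossing(
        nus[jl - 1], drops[jl - 1], nus[jl], drops[jl], crit)
    right = nus[jr] if jr == len(nus) - 1 else crossing(
        nus[jr], drops[jr], nus[jr + 1], drops[jr + 1], crit)
    return left, right
```

Taking the outermost crossings is what converts a possibly disconnected confidence region into the single interval (C_L, C_R) reported in the paper.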

Example: Stanford Heart Transplant Data
To illustrate the results developed in this paper, we consider the heart transplant data reported in Crowley & Hu (1977). Out of the 103 patients who participated in the Stanford Heart Transplantation Program, we consider the 69 patients who received a heart transplant. Of those, 45 patients died between the operation and the closing date of April 1, 1974, and 24 patients were alive at the closing date, so their lifetimes are Type I censored. The published data contain the calendar dates of the transplant operation and of death, as well as a censoring indicator.
Recall that two underlying assumptions for our method are: (i) that patients enter the treatment following a Poisson process, and (ii) that the patients' survival times after the treatment (subject to Type I censoring) are distributed as in (2). We regard the surgery dates as the treatment times, and the first assumption can be easily checked by comparing the distribution of the inter-arrival times to the exponential distribution. The usual Kolmogorov–Smirnov test results in a p-value of 0.65, so the first assumption seems reasonable. The heart transplant data seem to fit the second assumption equally well. The empirical distribution function in Figure 2 strongly indicates the presence of a change point in the hazard at an early stage after the surgery. Not surprisingly, the risk of death is quite high initially (steep slope at the beginning of the curve), then it quickly diminishes. From the data it is estimated that the change occurs at approximately 68 days, and the hazard rates before and after the change point are estimated to be λ̂_1 = 0.0075 and λ̂_2 = 0.00075, which correspond to mean survival times of 132 days and 1,320 days, respectively.
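Assumption (i) can be checked with a few lines of stdlib code. The helper below is our own sketch (not the paper's code): it computes the Kolmogorov–Smirnov distance between the empirical CDF of the inter-arrival gaps and an exponential CDF whose rate is estimated from the data. Note that estimating the rate makes this a Lilliefors-type check, so standard KS p-values are only approximate.

```python
import math

def ks_statistic_exponential(times):
    """KS distance between the empirical CDF of the inter-arrival gaps
    of `times` (sorted arrival epochs) and the exponential CDF with the
    rate fitted as 1/(mean gap)."""
    gaps = sorted(t2 - t1 for t1, t2 in zip(times, times[1:]))
    rate = len(gaps) / sum(gaps)                 # fitted exponential rate
    n = len(gaps)
    d = 0.0
    for i, g in enumerate(gaps):
        f = 1.0 - math.exp(-rate * g)            # fitted exponential CDF
        # empirical CDF jumps from i/n to (i+1)/n at the i-th sorted gap
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d
```

A small statistic (relative to roughly 1.36/√n at the 5% level for the plain KS test) is consistent with Poisson arrivals, as found for the surgery dates in the data.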
We can also formally test for a change point. Following the test procedure described in Kim et al. (2002), we first rescale the whole observed interval to [0, 1]; the rescaled change point is 0.0298. The test statistic depends on the interval being tested, and for the particular interval [0.001, 0.03], the observed test statistic 6.622 is greater than the critical point 3.520, indicating that the test is significant at the 1% level. Notably, practically any subinterval containing the estimated change point is found to be highly significant.
To compute the confidence interval for the change point, we need the critical value of the distribution of the supremum of Y(u). From Theorems 1 and 2, for a given α we can select the critical point c from equation (7). Since the upper 5% critical point is 3.37, by inverting ℓ(ν̂) − ℓ(ν) and reading off the two end points of the region that fall below the critical value (see Figure 3), we obtain a 95% confidence interval of (65.5, 82.5) in days. The 90% and 99% confidence intervals are computed in the same way.

Proof of Theorem 2
Suppose λ_1 < λ_2. Let u* = u(1 − ν/T). Since ν < T, sgn(u) = sgn(u*) and the essential properties remain unchanged under this simple transformation, so with abuse of notation we may write the process in terms of u*. First consider the case u ≥ 0. For each positive integer i, let W_i denote the "inter-arrival time", i.e., the uncensored lifetime of the patients associated with the Poisson process K*_1(u, ν) as in (4). Then W_i is exponentially distributed with parameter λ_2 e^{−λ_1 ν} and X_i = λ_2 e^{−λ_1 ν} W_i ∼ Exp(1).
Therefore, by denoting δ = λ_2/λ_1, we can write the supremum over u ≥ 0 as U_+. Similarly, for u < 0 we obtain U_−, where the X′_i are independent copies of the X_i. To find the distribution function of U_−, consider the partial sums of (1 − δ)X′_i + log δ.
To find the distribution function of U_+, we use the memoryless property of the exponential distribution, which gives P(U_+ ≤ x) ≈ 1 − e^{−x}. The distribution of the supremum then follows from the independence of U_+ and U_−. If on the other hand λ_1 > λ_2, U_+ is changed to U*_+, which is similar to U_−; applying the same method to U*_+, its approximated distribution involves κ(x) as in (8). Likewise, U_− is changed to U*_−, with increments X′_i + (k − 1) log(λ_1/λ_2), and the approximated distribution of U*_− is the same as the approximated distribution of U_+, i.e., P(U*_− ≤ x) ≈ 1 − e^{−x}.

Conclusion
In this communication, we presented a method of constructing a confidence region (interval) for a change point in hazard rate with staggered entry by using the local log-likelihood ratio process. Using weak convergence, we are able to calculate the limit distribution and use it to obtain the confidence region for a given confidence level. There are many possible extensions for staggered entry; we mention only some recent developments. For estimation under regression models for the hazard rate, we refer to Pons (2002), Na et al. (2005), Dupuy (2006), Goodman et al. (2011), and Li et al. (2013). For nonparametric estimation, or estimation under nonparametric models, we refer to Antoniadis et al. (2000), Wu et al. (2003), and Gijbels & Gürler (2003). Kosorok & Song (2007) considered a model where the change occurs when a covariate crosses a threshold. More general models under censoring mechanisms are considered in Zhao et al. (2009), Rabhi & Asgharian (2017), and Wang et al. (2019). Sequential tests for possible multiple change points are considered in He et al. (2012) and Zhang et al. (2014). Computation-intensive techniques using Monte Carlo and bootstrap methods are considered by Achcar & Loibel (1998) and Pham & Nguyen (1990, 1993).