Estimation of Change-point and Post-change Parameters after Adaptive Sequential CUSUM Test in an Exponential Family

In this paper, we consider an adaptive sequential CUSUM procedure in an exponential family where the change-point and post-change parameters are estimated adaptively. It is shown that the adaptive CUSUM procedure is efficient at the first order. The conditional biases of the estimation for the change-point and post-change parameter are studied. Comparison with the classical CUSUM procedure in the normal case is made. Nile river flow and average global temperature data sets are used for demonstration.

However, when $\theta \neq \delta$, the procedure is no longer efficient. Three approaches have been used to increase efficiency. The first is to use the GLRT (generalized likelihood ratio test) by treating $\theta$ as an unknown parameter (Siegmund & Venkatramen, 1991). To overcome the memory problem, Lai (1995) considered the window-limited GLRT. The second is to use an integrated likelihood ratio by treating the post-change parameter as a nuisance parameter. However, explicit forms are typically difficult to obtain. The third, considered in this paper, is to use the adaptive CUSUM procedure by estimating the change-point and the post-change parameter adaptively (Draglin, 1990; Wu, 2005, 2015; Lorden & Pollak, 2005, 2008). The advantage of this approach is that the change-point and post-change parameters are easily identified after detection, since the basic form of the CUSUM procedure is kept.
Many forms of adaptive CUSUM control charts in on-line quality control have been proposed and discussed. Draglin (1998) suggested using the sample mean. Yakir et al. (1999) and Krieger et al. (2003) considered the linear post-change model. Capizzi and Mascrotto (2003) proposed an adaptive EWMA procedure. An adaptive Shiryayev-Roberts procedure using adaptive estimators is considered in Lorden and Pollak (2005). Yashchin (1995) and Jiang et al. (2008) used the EWMA as the adaptive post-change mean estimator. Han et al. (2010) proposed using the most recent observation as the estimator for the mean.
Our discussion is mainly focused on the change-point and post-change parameter estimation after detection under the exponential family model, which extends the results of Wu (2005) and Lorden and Pollak (2008). By treating the CUSUM procedure as a sequence of sequential tests, we can use the adaptive sequential tests (Robbins & Siegmund, 1974, 1975) by estimating the post-change parameters adaptively for each test.
More specifically, we use the notation from stochastic approximation. For observations $X_{k+1}, \ldots, X_n$, define recursively the mean estimator $\mu_{k+1,n}$, where $\mu_{k+1,k} = \mu_0$ is the initial mean value and $\gamma_{k,n}(t)$ is the gain defining the rate of convergence. The two most used gains are (i) (recursive moment estimation) $\gamma_{k,n} = 1/(t+n-k)$; and (ii) (exponentially weighted moving average) $\gamma_{k,n} = \gamma$, a constant. Correspondingly, the post-change parameter $\theta$ is estimated from the mean estimator.
The adaptive CUSUM procedure is defined as follows:
(i) Set $\nu = 0$, $\mu = \delta$, and
(iv) The procedure stops at
The change-point and post-change parameter are then estimated as
The discussion is organized as follows. In Section 2, we first present a nonlinear renewal theorem for the adaptive random walk, which gives asymptotic results for ARL$_0$ under the changed adaptive measure and also provides an adaptive importance sampling technique for simulating ARL$_0$. The first order result for ARL$_1$ is given by using a martingale structure related to the adaptive random walk, which shows that the adaptive CUSUM procedure is asymptotically efficient.
The biases for the change-point and post-change parameter estimation are studied theoretically in Section 3 by using the renewal property of the adaptive CUSUM process. A simulation comparison with the classical CUSUM procedure in the normal case, in terms of average delay detection time and bias of change-point estimation, is conducted in Section 4. Nile river flow and average global temperature data sets are used for illustration in Section 5.
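The procedure outlined above can be illustrated with a minimal sketch for the unit-variance normal case with pre-change mean 0. Several details are assumptions rather than the paper's definitions: the log-likelihood-ratio increment $\mu x - \mu^2/2$, the use of the estimate formed before the current observation, the restart of the estimate at $\delta$ whenever the statistic returns to zero, and the function name `adaptive_cusum` itself.

```python
def adaptive_cusum(xs, delta, t, d):
    """Sketch of an adaptive CUSUM for N(mu, 1) data with pre-change mean 0.

    Returns (N, nu_hat, mu_hat) at the first alarm, or None if no alarm.
    Assumed, not taken from the paper: increment form and restart rule.
    """
    T = 0.0      # CUSUM statistic
    nu = 0       # last time the statistic was at zero (change-point candidate)
    mu = delta   # current post-change mean estimate, started at delta
    for n, x in enumerate(xs, start=1):
        # The increment uses the estimate built from observations before x.
        T = max(0.0, T + mu * x - 0.5 * mu * mu)
        if T == 0.0:
            # The current sequential test ended below zero: restart it.
            nu, mu = n, delta
        else:
            # Recursive moment estimation with gain 1 / (t + n - nu).
            gain = 1.0 / (t + n - nu)
            mu = mu + gain * (x - mu)
        if T > d:
            return n, nu, mu
    return None


# Five in-control observations followed by a shift to 2.
print(adaptive_cusum([0.0] * 5 + [2.0] * 10, delta=1.0, t=0.5, d=4.8))
```

With the recursive moment gain, the estimate after $m$ observations in the current test equals $(t\delta + X_{\nu+1} + \cdots + X_{\nu+m})/(t+m)$, so $t$ acts as the weight given to the starting value $\delta$.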

A Nonlinear Renewal Theorem
We first present a nonlinear renewal theorem under the changed adaptive measure for an adaptive random walk, which helps to derive the first order result for ARL$_0$ and also provides an adaptive importance sampling technique for simulation.
By using Wald's likelihood ratio identity, we have
Example 1. (Recursive moment estimation) First, we notice that assumption (A1) is obviously satisfied. Second, by using the martingale property of the recursive structure,
Before the change occurs, the time epochs at which $T_n = 0$ constitute a sequence of renewal points for $T_n$. The renewal argument shows that which implies (1)
As $d \to \infty$, by using the renewal theorem and Wald's likelihood ratio identity repeatedly, we have

ARL$_1$
The most common measure for the operating characteristics of a detection procedure is the average out-of-control run length ARL$_1$. By using the same renewal argument as in Equation (1), we can show
To evaluate $E_\theta[N_0]$, we use a martingale with mean 0 under $P_\theta(\cdot)$ (Robbins & Siegmund, 1975). We rewrite it as
By using the martingale property, we get
The second term appears because of the adaptive estimation. The first order result can be obtained as follows. First,
Second, using the same technique, we have
On the other hand, given (see Wu (2004)). Under the boundedness of $c''(\theta)$ and the assumption that
It follows that at the first order,
Remark. For the recursive moment estimation,

Bias of $\hat\nu$ and $\hat\theta$
In this section, we study the biases of the estimators of the change-point and post-change mean in the recursive mean estimation case. The main ideas follow the lines of Srivastava and Wu (1999) and Wu (2004).

Bias of $\hat\nu$
From the renewal theorem, as $\nu \to \infty$, $(\nu - \nu_n, T_n)$ converges in distribution to $(L, M)$, where $L$ follows distribution
and, given $L = k$, $M$ follows the same distribution as $S_k$ given $S_1 > 0, \ldots, S_k > 0$ and $\mu_0 = \delta$. In particular, if $L = 0$, then $M = 0$. By splitting on whether $\hat\nu > \nu$ or $\hat\nu \le \nu$, we can write
The event $\{\hat\nu > \nu\}$ is asymptotically equivalent to $\{\tau_M < \infty\}$ with initial state $(L, M)$. Given $\hat\nu > \nu$, $\hat\nu - \nu$ is equivalent to $\tau_M$ plus the total length of the cycles of $T_n$ coming back to zero afterwards, with total expected length
On the other hand, given $\hat\nu < \nu$, $\nu - \hat\nu$ is asymptotically equal to $L$. Thus, we have the following result:

Normal Mean Shift
We compare the classical CUSUM procedure with the adaptive CUSUM procedure for detecting the mean shift in a normal model with unit variance.
For the classical CUSUM procedure, the design of $d$ can use the following simple and accurate approximation (Siegmund, 1985, Equation (2.56))
where
So we shall first simulate the ARL$_0$'s for the adaptive CUSUM procedure and then find the value of $d$ for the classical CUSUM procedure by matching the corresponding ARL$_0$'s.
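The matching step can be sketched as follows. The formula in `siegmund_arl` is the commonly quoted form of Siegmund's corrected diffusion approximation for a one-sided CUSUM with increments $X_i - k$ and threshold $h$; whether it coincides in scaling and notation with the display cited above is an assumption, and the function names and the bisection bracket are hypothetical.

```python
import math


def siegmund_arl(drift, h):
    """Approximate ARL of a one-sided CUSUM whose increments are N(drift, 1).

    Uses the overshoot-corrected threshold b = h + 1.166.
    """
    b = h + 1.166
    if abs(drift) < 1e-8:
        return b * b  # limiting value as the drift tends to zero
    z = 2.0 * drift * b
    return (math.exp(-z) + z - 1.0) / (2.0 * drift * drift)


def match_threshold(target_arl0, k, lo=0.0, hi=50.0, tol=1e-8):
    """Bisection for h such that the in-control ARL (drift = -k) hits the target."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if siegmund_arl(-k, mid) < target_arl0:
            lo = mid  # ARL too small: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Threshold whose approximate in-control ARL equals a simulated value of 500.
h = match_threshold(500.0, k=0.5)
print(h, siegmund_arl(-0.5, h))
```

Bisection is enough here because the approximate in-control ARL is increasing in the threshold.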
For $\delta = 1.0, 0.5$ and $t = 0.0, 0.5$, we let $d = 4.8$. Table 1 gives the simulated results for ARL$_0$, where we use the adaptive importance sampling technique by simulating ARL$_0$ as
where
The simulation is replicated 10,000 times. The results show that the effect of $t$ is not significant.
Finally, we simulate the biases for the change-point and post-change mean estimators. For the same designs given in Table 2, the simulation is replicated 5000 times and only those stopping times with $N > \nu$ are counted to calculate the conditional expectations. Also reported is the average delay detection time (ADT) as an alternative measure to ARL$_1$. By comparing Table 3 with Table 2, we see that there is very little difference between ARL$_1$ and ADT. Also, the biases of both the change-point estimator and the post-change mean estimator become larger as the post-change mean gets smaller.

Unknown Initial Mean
Let $\mu_0$ and $\mu$ be the pre-change and post-change means, which are unknown, and let $\mu - \mu_0 > 0$ be the change magnitude. We can update the estimate for $\mu_0$ after each sequential test when it goes below zero and track the change magnitude recursively when a new sequential test is formed. More specifically, with slightly different notation, let $\mu_0^{(0)} = \mu_0$ and $\delta_0^{(0)}$ be the assigned starting values for the pre-change mean and change magnitude. Define
where and if $N^{(1)} + \cdots + N^{(i)}$

Restricted Adaptive Estimations
In practical applications, the recursive post-change mean estimate may become negative. Robbins and Siegmund (1974) proposed using $\max(\delta, \mu_{k+1,n}(\delta, t))$ as the adaptive estimate, where $\delta$ is treated as the minimum shift amount to detect. Sparks (2000) and Jiang et al. (2008) proposed using a restricted exponentially weighted moving average as the adaptive estimate. More specifically, instead of using the sample mean we define
as the exponentially weighted moving average and use $\max(\delta, \mu_{k+1,n}(\delta, \beta))$ as the adaptive estimate. The EWMA as a control charting tool has been extensively studied in the literature, and an adaptive EWMA procedure can be seen in Capizzi and Mascrotto (2003). An advantage of EWMA estimation is that it gives the most current mean estimate for more flexible post-change mean structures. However, its convergence in probability under the adaptive change probability measure $P^*(\cdot)$ cannot be established.
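A minimal sketch of the restricted EWMA estimate is given below. It assumes the usual EWMA recursion with weight $\beta$ on the newest observation and starting value $\delta$, and it applies the floor at $\delta$ only to the reported estimate, not to the internal recursion; the function name `restricted_ewma` is hypothetical.

```python
def restricted_ewma(xs, delta, beta):
    """Return the sequence max(delta, EWMA) for observations xs.

    Assumptions: the EWMA starts at delta and beta weights the newest value.
    """
    mu = delta
    estimates = []
    for x in xs:
        mu = (1.0 - beta) * mu + beta * x   # unrestricted EWMA update
        estimates.append(max(delta, mu))    # floor at the minimum shift delta
    return estimates


print(restricted_ewma([0.0, 4.0], delta=1.0, beta=0.5))
```

Keeping the unrestricted value inside the recursion means a run of small observations is still remembered, while the estimate passed to the CUSUM statistic never drops below $\delta$.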

Detecting Slope Change
Suppose the means follow the model
Following the same idea as for the mean shift case, we define the adaptive estimator for $\beta$ based on $X_{k+1}, \ldots, X_n$ as
where $\beta_{k+1,k} = \beta_0$ by default. The CUSUM process can be defined as
where the adaptive change-point estimate is updated as $\nu_n = \nu_{n-1}$ if $T_n > 0$, and $\nu_n = n$ if $T_n = 0$. After an alarm is raised at $N$, the change-point is estimated as $\nu_N$ and the post-change slope is estimated as

Nile River Flow Data
The Nile river flow data from 1871 to 1970 are reproduced from Cobb (1978) (see also Wu (2005, p. 27)). The plot in Figure 1 shows an obvious decrease after year 1900.
(i) To use the adaptive CUSUM procedure, we use the first 20 observations, from year 1871 to 1890, as the training sample to estimate the pre-change mean and stdev as 1070 and 143, respectively. We standardize the data by letting $x_i = -(y_i - 1070)/143$ for $i = 1, 2, \ldots, 100$, where the negative sign is added in order to detect a decrease in mean. For $t = 0.5$ and $\delta = 0.5$ and 1.0 with $d = 30$, the adaptive CUSUM procedure gives $N = 52$ and $\hat\nu = 28$, which are the same as those obtained by the regular CUSUM procedure with known post-change mean (Wu [19]). Also, the standardized post-change mean is estimated as 1.63, which gives post-change mean $1070 - 143 \times 1.63 \approx 837$.
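The standardization and the back-transformation in step (i) amount to the following arithmetic. The form of the standardization is inferred from the analogous formula in step (ii), and the function names are hypothetical.

```python
def standardize(y, mean=1070.0, sd=143.0):
    """Sign-flipped z-score, so that a decrease in flow becomes an increase."""
    return -(y - mean) / sd


def to_flow_scale(x, mean=1070.0, sd=143.0):
    """Map a standardized post-change mean back to the original flow scale."""
    return mean - sd * x


# Estimated standardized post-change mean 1.63 on the original scale.
print(round(to_flow_scale(1.63), 2))
```

The back-transformed value rounds to 837, the post-change mean used as the new baseline in step (ii).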
(ii) To detect whether a second change occurs, we use observations 29 to 52 to calculate the mean and standard deviation as 837 and 149.5. So we standardize the data as $x_i = -(y_i - 837)/149.5$ for $i = 29, \ldots, 100$. For $t = 0.5$ and $\delta = 0.5$ and 1.0 with $d = 30$, no further change-point is detected by the adaptive CUSUM procedure. With change-point $\hat\nu = 28$, the global pre-change mean is 1097.75 and the post-change mean is 849.97. Note that we implicitly assume that the post-change variance is the same as the pre-change variance. Figure 1 also shows that the residuals have no significant correlation.

Figure 1: Nile river data

(iii) Similarly, to detect the third change-point, we use the delayed-detection data, observations 69 to 85, to fit the post-change model, which shows that a constant mean model is a better fit. The mean of these 17 observations is $-0.05294$ with stdev 0.1063. By standardizing the rest of the data starting from observation 69, the adaptive CUSUM procedure detects the third change-point at observation 97 (year 1976) with an alarm at 103 (year 1982).
(iv) Since no more change-points are detected, we can use the three change-points 30, 68, 97 to fit a global piecewise linear model. The lm() function in R is used to find the best fit, and the final mean is given as
$$\mu(t) = -0.3722 + 0.01194(t - 30) I_{[30 < t \le 68]} + 0.3280\, I_{[68 < t \le 97]} + (0.5118 + 0.01997(t - 97)) I_{[97 < t \le 134]}.$$
(v) The residual analysis shows that the AR(1) model fits the residuals well, with autocorrelation 0.2033 and stdev 0.1079. So the final fitted model is $x_t = \mu(t) + \epsilon_t$, where $\epsilon_t = 0.2033\,\epsilon_{t-1} + 0.1079\, z_t$, with $z_t$ being i.i.d. standard normal random variables. Figure 2 also gives the fitted model and the ACFs and normal Q-Q plots for the residuals before and after the correlation fitting. The analysis under the AR(1) model with the classical CUSUM procedure can be seen in Wu (2016).
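The fitted model in steps (iv) and (v) can be evaluated and simulated with a short sketch. It reads the indicator terms as additions to the baseline $-0.3722$, exactly as the formula is written, and starts the AR(1) noise at zero; both the starting value and the function names are assumptions.

```python
import random


def fitted_mean(t):
    """Piecewise fitted mean for observation number t (1 to 134)."""
    mu = -0.3722
    if 30 < t <= 68:
        mu += 0.01194 * (t - 30)
    elif 68 < t <= 97:
        mu += 0.3280
    elif 97 < t <= 134:
        mu += 0.5118 + 0.01997 * (t - 97)
    return mu


def simulate_series(n=134, phi=0.2033, sigma=0.1079, seed=0):
    """Simulate x_t = fitted_mean(t) + eps_t with AR(1) noise eps_t."""
    rng = random.Random(seed)
    eps = 0.0  # assumed starting value of the noise process
    xs = []
    for t in range(1, n + 1):
        eps = phi * eps + sigma * rng.gauss(0.0, 1.0)
        xs.append(fitted_mean(t) + eps)
    return xs


print(fitted_mean(80), fitted_mean(100))
```

On the middle segment the fitted level is $-0.3722 + 0.3280 = -0.0442$, close to the sample mean $-0.05294$ reported for observations 69 to 85.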

Figure 2: Global average temperature
the adaptive random walk behaves asymptotically like a conditional random walk with a random drift. Thus, no matter what the sign of the drift is, $P^*_0$

Table 2 gives the corresponding ARL$_1$ for several typical values of $\mu$ where E