Convergence of the Nelson-Aalen Estimator in Competing Risks


Introducing the Problem
Classical lifetime analysis focuses on the time until an event of interest occurs: for example, the time before death from a given cause (cancer, infectious disease, road accident, etc.), the duration of response to a treatment, or the time before a particular pathology develops. The lifetime of interest is modeled by a positive random variable $T$ whose distribution we want to estimate. In practice, $T$ often cannot be observed directly, for example when an individual leaves the study before the event of interest occurs. In that case, we only know that the duration $T$ between the start of the study and the event of interest is greater than the time spent in the study. This phenomenon is modeled by assuming that we observe the minimum $\min(T, C)$ of the variable of interest $T$ and a positive variable $C$, called the right-censoring variable. It will always be assumed that the indicator $\delta = \mathbb{1}_{\{T \le C\}}$ is also observed. Classical lifetime analysis then concerns methods for estimating the distribution of $T$ from a censored sample of the form $(\min(T_i, C_i), \delta_i = \mathbb{1}_{\{T_i \le C_i\}})$, $i = 1, \dots, n$. These models have been studied by several authors, notably Miller (Miller, 1982), Cox & Oakes (Cox & Oakes, 1984), Andersen & al. (Andersen & al., 1993), Kleinbaum (Kleinbaum, 1996), Klein & Moeschberger (Klein & Moeschberger, 1997) and Collett (Collett, 2003), to name just a few.
A first extension of the previous simple case considers situations in which there is no longer a single event of interest but several types of events, each due to a given risk. For more details, the interested reader may consult Satten & Datta (Satten & Datta, 1999), Datta & Satten (Datta & Satten, 2000), Latouche (Latouche, 2004), Belot (Belot, 2009), Njamen & Ngatchou (Njamen & Ngatchou, 2014) and Taylor & Peña (Taylor & Peña, 2014). In 2001, Peña & al. (Peña & al., 2001) studied the uniform convergence of the Nelson-Aalen and Kaplan-Meier nonparametric estimators and their asymptotic laws using the distribution function $G$. Stocker IV & Adekpedjou (Stocker IV & Adekpedjou, 2009) studied optimal goodness-of-fit tests for recurrent event data of the same type as Peña & al. (Peña & al., 2001). Finally, a practical comparison of the Kaplan-Meier and Nelson-Aalen methods in survival analysis was carried out by Saranya & Karthikeyan (Saranya & Karthikeyan, 2015).

Explore Importance of the Problem
The main quantity of interest in competing risks is the cause-specific risk (hazard) function of type $j$ ($j \in \{1, 2, \dots, m\}$), interpreted as the probability that an event of type $j$ occurs in an infinitesimal interval, given that this event has not yet occurred at the beginning of the interval. More generally, the competing risks model is a particular case of multistate models (Commenges, 1999) in which, from a "living" state, individuals can experience $m$ mutually exclusive causes of events (see Figure 1). The transition rates (or intensities) between states are the cause-specific risk functions. The sum of all these intensities is the overall risk of leaving the "living" state.
As mentioned earlier, the term "competitive (or competing) risks" belongs to the field of survival-time analysis, where "survival time" generally denotes the time until a particular event that is not necessarily death. For example, the event may be a relapse, in which case the survival time is a remission; or a recovery, in which case the survival time is the delay between diagnosis and cure. In the biomedical field, the two main objectives of survival-time analysis are as follows:
• In a therapeutic trial, to test the effectiveness of a new treatment by comparing the lifetimes it yields with those obtained under the usual treatment (or placebo).
• In an epidemiological study, to evaluate the prognostic value of one or more factors, either on the duration of survival or on the time to occurrence of a disease.
In either case, the models used and the methods are essentially the same.

Describe Relevant Scholarship
Competing risks have been widely discussed as a way of estimating the probability of a particular event in the presence of other events, or after the modification or elimination of another event (Collett, 2003). In demography, a competing risks situation arises when studying the probability that a couple marries or lives in concubinage, marriage being in competition with concubinage (Ghilagaber, 1998). In medical research, the situation is also frequent in various fields: in gynecology, where one studies the probability of giving birth naturally or by caesarean section, natural delivery being in competition with caesarean section (Com-nougue, 1999); in infectiology, where one studies the probability of dying or of contracting a nosocomial infection, death being in competition with nosocomial infection (Wolkewitz & al., 2003; Resche-Rigon & al., 2006); and in oncology, where one studies the probability of a cancer recurring or of dying from it, death being in competition with recurrence. Some authors find it useful to approach competing risks models as latent variable models. This is the case of Latouche (Latouche, 2004), who uses the model of Fine & Gray (Fine & Gray, 1999), a semi-parametric proportional hazards model similar in formulation to the Cox model, for the risk function associated with the cumulative incidence function proposed by Gray (Gray, 1988). Belot (Belot, 2009) proposes a model that combines the cause-specific approach to competing risks analysis with the excess rate method.
In 2014, Njamen & Ngatchou (Njamen & Ngatchou, 2014) adapted the stochastic processes developed by Aalen (Aalen, 1978a, 1978b) to the Nelson-Aalen and Kaplan-Meier (Kaplan & Meier, 1958) estimators in a competing risks context. They focused on the complete probability distributions of the failure times of individuals whose causes of failure are known, which led them to partition the individuals into subgroups according to cause. This technique allowed them to obtain new Nelson-Aalen and Kaplan-Meier type estimators together with their asymptotic properties.

State Hypotheses and Their Correspondence to Research Design
Many methods have been proposed to study competing risks (Tsiatis, 1998). In particular, the latent-event approach proposed by Sampford (Sampford, 1952) has often been used to characterize survival models and then competing risks models. This approach considers $m$ positive random variables $T_1, T_2, \dots, T_m$, each corresponding to the time of occurrence of one of the events considered, in a hypothetical situation where only that type of event can occur (the events being mutually exclusive). The time $T$ is then the minimum of the $T_j$, and the cause of the event, $\eta$, is a random variable taking values $j \in \{1, 2, \dots, m\}$. These latent times are generally not observable and should rather be regarded as a theoretical representation. They are nevertheless well suited to the first question raised in the competing risks context: what would happen if one of the causes of events could be eliminated (e.g., smoking-related mortality, through an awareness program)? The observations, by contrast, consist solely of realizations of the pair $(T, \eta)$ (possibly indexed by $i$ for the individual), and it has been shown that several joint distributions of the latent times, under different conditions of mutual independence, can lead to the same likelihood for the observations (Prentice & al., 1978). Finally, in clinical studies these data are commonly subject to a right-censoring process independent of $(T, \eta)$, which complicates estimation.
To analyze data in a competing risks context, two types of probabilities can be defined:
• The "gross" probability of an event of type $j$ in the presence of the other risks, also known as the cumulative incidence function or sub-distribution function of the type $j$ event, defined by
$$F_j(t) = P[T \le t,\ \eta = j], \quad t \ge 0.$$
• The "net" probability of an event of type $j$ in the situation where only this risk would affect the population.
For $t \ge 0$, the quantity $F_j(t)$ also represents the probability that an event of type $j$ occurs before time $t$ and that the other types of events have not yet occurred at that moment. The functions $F_j$ are improper distribution functions, since they do not tend to 1 at infinity, that is, $\lim_{t \to \infty} F_j(t) < 1$.
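A small simulation can make the improper nature of the $F_j$ concrete. The sketch below is illustrative only: it assumes two causes with independent exponential latent times and no censoring (assumptions made purely for the example), and the names `simulate_competing_risks` and `empirical_cif` are hypothetical helpers, not part of any cited method.

```python
import random

def simulate_competing_risks(n, rates=(0.5, 1.5), seed=1):
    """Return n pairs (T, eta): T = min of latent times, eta = index of the cause."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Hypothetical latent times, one per cause (independent exponentials).
        latent = [rng.expovariate(rate) for rate in rates]
        t = min(latent)
        eta = latent.index(t) + 1  # causes are numbered 1..m
        data.append((t, eta))
    return data

def empirical_cif(data, cause, t):
    """Empirical version of F_j(t) = P[T <= t, eta = j] (no censoring)."""
    return sum(1 for ti, eta in data if ti <= t and eta == cause) / len(data)

data = simulate_competing_risks(4000)
# Evaluate each empirical F_j "at infinity".
limit_1 = empirical_cif(data, 1, float("inf"))
limit_2 = empirical_cif(data, 2, float("inf"))
print(limit_1, limit_2, limit_1 + limit_2)
```

Under these assumptions each $F_j(\infty)$ equals the share of cause $j$ in the total rate (here 0.25 and 0.75), so neither limit reaches 1 while their sum does.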
A review of the literature shows that neither the net probabilities nor the joint distribution of the latent times are identifiable from the observations, unless one assumes, for example, that the latent times are mutually independent. This assumption, however, cannot be verified (see Tsiatis, 1975).
As in classical survival analysis, the pair $(T, \eta)$ may be censored by a positive variable $C$ with distribution function $G$.
We work under the assumption that $(T, \eta)$ and $C$ are independent. When censoring occurs, the duration of interest $T$ is greater than the observed censoring time $C$, but the exact value of $T$ is unknown. No information is then available on $\eta$, the risk associated with the duration $T$. To handle this situation in a competing risks context, Njamen & Ngatchou (Njamen & Ngatchou, 2014) propose that for each individual we observe instead the pair of variables
$$(Z = \min(T, C),\ \xi = \eta\delta).$$
Consider a sample $(Z_i, \xi_i)$, $i = 1, \dots, n$, of independent pairs distributed as $(Z, \xi)$, where $C$ is the positive censoring variable with distribution function $G$. By independence of $T$ and $C$, the random variables $Z_1, Z_2, \dots, Z_n$ are independent and identically distributed (i.i.d.) with distribution function $H$ given by the relation
$$1 - H(t) = (1 - F(t))(1 - G(t)).$$
Let $\tau_H = \sup\{x : H(x) < 1\}$ be the upper bound of the support of $H$ (which is the minimum of the upper bounds of the supports of $F$ and $G$). Note that no observation is possible beyond this point.
Under right censoring, the failure (or death) times of the censored individuals are not known to the experimenter. It is important to note that in most competing risks models, the functions characterizing the probability distributions of the variable of interest, as well as the marginal ones, are not always observable.
The questions to be addressed concern, in practice, the underlying functions corresponding to the different causes and the effects of covariates on the rates of occurrence of the competing risks. One difficulty is that the cause of failure of an individual under observation may be known only after an autopsy, whereas nothing is known about the individuals censored during follow-up.
Modeling this type of event is extremely difficult, and the proofs of the results obtained are often tedious. The introduction of martingales into this type of modeling is a great help. As in the general theory of stochastic processes, the concept of martingale occupies a leading position in the study of counting processes. In the mid-1970s, Aalen's martingale theory for counting processes provided a unified framework for the statistical methods of survival analysis. His counting-process approach uses an integral representation of censored-data statistics, which gives a simple and unified form for estimators, test statistics and regression methods. These methods use martingales to obtain simple expressions for complicated statistics and for the asymptotic distributions of tests and estimators.

Introduction
In survival analysis, we are interested in a group of individuals for each of whom an event of interest, often called failure or death, occurs after a duration called the lifetime or survival time. This event occurs at most once for each individual. Typical examples are the breakdown of electronic components in industrial reliability, the end of a strike or of a period of unemployment in economics, the completion of a specific task in a psychological experiment or, in medicine, the relapse or death of a patient, which gave the field its name. In most cases, the event of interest marks the transition from one state to another.
To determine the occurrence of the failure precisely, one must unambiguously define the time origin and the failure event, and choose a time scale. The time origin is not necessarily the same for all individuals and must be defined precisely for each of them. In most cases, time 0 is chosen as the time of a transition. For example, when age is measured, the time origin is birth. For a medical treatment trial, the natural time origin is the start of treatment; for the course of a disease, however, the date of contamination is unknown and the moment of diagnosis can be taken as the origin. This is a reasonable alternative, although it leads to approximations later on.
As for the time scale, the most common choice is clock time, but there are other possibilities, such as the mileage of a vehicle or the cumulative time of use of a system.
The statistical analysis of lifetimes studies the distributions of the times at which events occur, based on observations of durations and possibly of explanatory variables, made discretely or continuously over time.
We therefore denote by $T$ a positive random variable defined on a probability space $(\Omega, \mathcal{A}, P)$, representing the duration until an event of interest, the time origin being fixed in advance. In the medical field, this event may be the death, recovery or relapse of an individual; in economics, the loss of employment; in reliability, the first breakdown. In what follows, the duration $T$ is called the lifetime.
In the analysis of survival durations, each "individual" has a lifetime $T$ with density $f$, risk function $h$, distribution function $F$ and survival function $S = 1 - F$. We assume in the following that the random variables $T$ and $C$ are independent.

Counting Processes
Let $X$ be a failure time of interest such that $P[X < \infty] = 1$. The most basic quantities used to summarize and describe $X$ are the distribution and the hazard functions. The cumulative distribution function at time $t$, also called the lifetime distribution or the failure distribution, is given for $t \ge 0$ by
$$F(t) = P[X \le t].$$
The function $F$ is right-continuous, nondecreasing and satisfies $F(0) = 0$ and $F(\infty) = 1$. We denote by $F^-$ the left-continuous function obtained from $F$ in the following way:
$$F^-(t) = \lim_{s \uparrow t} F(s) = P[X < t].$$
The distribution of $X$ may equivalently be dealt with in terms of the survival function, which is given for $t \ge 0$ by
$$S(t) = 1 - F(t) = P[X > t].$$
The cumulative hazard function is defined for $t \ge 0$ by
$$\Lambda(t) = \int_0^t \frac{dF(s)}{1 - F^-(s)}.$$
If $F$ is continuous, the relation $\Lambda(t) = -\log(1 - F(t))$ is valid for all $t \ge 0$. We can then call $\Lambda$ the log-survival function.
If $F$ admits a derivative with respect to Lebesgue measure on $\mathbb{R}$, the probability density function exists and is defined for $t \ge 0$ by
$$f(t) = \frac{dF(t)}{dt} = \lim_{\Delta t \downarrow 0} \frac{P[t \le X < t + \Delta t]}{\Delta t}.$$
Heuristically, the function $f$ may be seen as the instantaneous probability of experiencing the event.
Under the same differentiability hypothesis, the hazard function exists and is defined for $t \ge 0$ by
$$\lambda(t) = \frac{f(t)}{1 - F(t)} = \lim_{\Delta t \downarrow 0} \frac{P[t \le X < t + \Delta t \mid X \ge t]}{\Delta t}.$$
For an extensive introduction to these notions, the reader is referred, e.g., to the books of Cox & Oakes (Cox & Oakes, 1984) and Kalbfleisch & Prentice (Kalbfleisch & Prentice, 1980).
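The link between the hazard function, the cumulative hazard and the survival function can be checked numerically. The sketch below assumes a Weibull lifetime purely as an illustration (the text does not single out any parametric family); the function names and parameter values are hypothetical.

```python
import math

def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))

def weibull_hazard(t, shape, scale):
    """lambda(t) = f(t) / (1 - F(t)) for the Weibull distribution."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def cumulative_hazard_numeric(hazard, t, steps=10000):
    """Approximate Lambda(t) = integral of the hazard over [0, t] (midpoint rule)."""
    dt = t / steps
    return sum(hazard((k + 0.5) * dt) for k in range(steps)) * dt

shape, scale, t = 1.5, 2.0, 3.0
numeric = cumulative_hazard_numeric(lambda s: weibull_hazard(s, shape, scale), t)
closed_form = -math.log(weibull_survival(t, shape, scale))  # Lambda(t) = -log S(t)
print(numeric, closed_form)
```

For a continuous distribution the two values agree up to the integration error, which is the relation $\Lambda(t) = -\log(1 - F(t))$ in numerical form.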
The main difficulty in the analysis of lifetime data is that the actual failure times of some individuals may not be observed. An observation is right-censored if it is known only to be greater than a certain value, the exact time being unknown. Let $C$ be the nonnegative r.v. with distribution function $G$ that stands for the censoring time of the individual.
As before, the nonnegative r.v. $X$ with distribution function $F$ denotes the failure time of the individual. If $X$ is censored, we observe $C$ instead of $X$, which tells us only that $X$ is greater than $C$. In any case, the observable r.v. consists of $T = \min(X, C)$ and $D = \mathbb{1}_{\{X \le C\}}$, where $\mathbb{1}_{\{\cdot\}}$ denotes the indicator function. The nonnegative r.v. $T$ stands for the observed duration, which may correspond either to the event of interest ($D = 1$) or to a censoring time ($D = 0$).
In the sequel, it is assumed that $X$ and $C$ are independent. Consequently, the random variable $T$ has the distribution function $H$ given by
$$1 - H(t) = (1 - F(t))(1 - G(t)).$$
For all $t \ge 0$, we define the following subdistribution functions of $H$:
$$H^{(0)}(t) = P[T \le t,\ D = 0]$$
and
$$H^{(1)}(t) = P[T \le t,\ D = 1].$$
It can be checked easily that for all $t \ge 0$,
$$H(t) = H^{(0)}(t) + H^{(1)}(t).$$
One has the following links between the functions $H^{(0)}$, $H^{(1)}$ and the functions $F$ and $G$:
$$H^{(0)}(t) = \int_0^t (1 - F(s))\, dG(s), \qquad H^{(1)}(t) = \int_0^t (1 - G^-(s))\, dF(s).$$
The cumulative hazard function of $X$ can be expressed for all $t \ge 0$ as
$$\Lambda(t) = \int_0^t \frac{dH^{(1)}(s)}{1 - H^-(s)}.$$
Let $(T_i, D_i)$, $i = 1, \dots, n$, be $n$ independent copies of the random vector $(T, D)$. Let $T_{1,n} \le T_{2,n} \le \dots \le T_{n,n}$ be the order statistics associated with the sample $T_1, \dots, T_n$. If there are ties between a failure time (or several failure times) and a censoring time, then the failure time(s) is (are) ranked ahead of the censoring time(s).
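The relations between $H$, its subdistribution functions and $F$, $G$ can be checked by simulation. The sketch below assumes exponential failure and censoring times (an illustrative choice, not one made in the text), for which $H$ and $H^{(1)}$ have simple closed forms; all names and rates are hypothetical.

```python
import math
import random

def simulate_observations(n, failure_rate=1.0, censoring_rate=0.5, seed=2):
    """Return n pairs (T, D) with T = min(X, C) and D = 1{X <= C}."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.expovariate(failure_rate)    # failure time X
        c = rng.expovariate(censoring_rate)  # independent censoring time C
        sample.append((min(x, c), 1 if x <= c else 0))
    return sample

failure_rate, censoring_rate, t = 1.0, 0.5, 0.5
sample = simulate_observations(5000, failure_rate, censoring_rate)
n = len(sample)

# Empirical H, H^(0) and H^(1) evaluated at t.
h_emp = sum(1 for time, _ in sample if time <= t) / n
h0_emp = sum(1 for time, d in sample if time <= t and d == 0) / n
h1_emp = sum(1 for time, d in sample if time <= t and d == 1) / n

# Closed forms under the exponential assumption.
total_rate = failure_rate + censoring_rate
h_true = 1 - math.exp(-failure_rate * t) * math.exp(-censoring_rate * t)
h1_true = (failure_rate / total_rate) * (1 - math.exp(-total_rate * t))
print(h_emp, h_true, h1_emp, h1_true)
```

The empirical values satisfy $H_n = H_n^{(0)} + H_n^{(1)}$ exactly, and they approach the closed forms as the sample size grows.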
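We define the empirical counterparts of $H^{(0)}$, $H^{(1)}$ and $H$ by
$$H_n^{(0)}(t) = \frac{1}{n}\sum_{i=1}^n \mathbb{1}_{\{T_i \le t,\ D_i = 0\}}, \quad H_n^{(1)}(t) = \frac{1}{n}\sum_{i=1}^n \mathbb{1}_{\{T_i \le t,\ D_i = 1\}}, \quad H_n(t) = \frac{1}{n}\sum_{i=1}^n \mathbb{1}_{\{T_i \le t\}}.$$
The Nelson-Aalen estimator of $\Lambda$ is then defined for $t \ge 0$ by
$$\Lambda_n(t) = \int_0^t \frac{dH_n^{(1)}(s)}{1 - H_n^-(s)} = \sum_{i=1}^n \frac{D_{i,n}\, \mathbb{1}_{\{T_{i,n} \le t\}}}{n - i + 1},$$
where $D_{i,n}$ denotes the indicator associated with $T_{i,n}$.
Within the framework of competing risks, let $T$ be a positive random variable and $C$ a censoring variable, and set $Z = T \wedge C$ and $\delta = \mathbb{1}_{\{T \le C\}}$. In this random censoring model, for a sample of individuals $i = 1, \dots, n$ subject to specific causes $j$ ($j = 1, \dots, m$), we observe the pair $(Z_i, \delta_i)$, where $Z_i = \min(T_i, C_i)$ and $\delta_i = \mathbb{1}_{\{T_i \le C_i\}}$ with $T_i = \min(\tau_{i1}, \dots, \tau_{im})$, and where $\tau_{ij}$ is the time at which individual $i$ is subject to cause $j$. In the counting process approach, we associate with each "individual" $i$ ($i = 1, \dots, n$) a pair of point processes $(K_i(t), L_i(t))$, $t \ge 0$, where $Z_i = T_i \wedge C_i$ is the minimum of the survival time and the censoring time, $\delta_i = \mathbb{1}_{\{T_i \le C_i\}}$ is the indicator of observation, and
$$L_i(t) = \mathbb{1}_{\{Z_i \ge t\}}.$$
The process $L_i(t)$ can also be written in the form
$$L_i(t) = \begin{cases} 1 & \text{if the individual is under observation and at risk at time } t, \\ 0 & \text{otherwise.} \end{cases}$$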
The process $(L_i(t), t \ge 0)$ is called the at-risk process: it indicates whether the subject is at risk just before time $t$ (i.e. the subject has not yet experienced the event at $t^-$).
The process $K_i(t)$ is defined by
$$K_i(t) = \mathbb{1}_{\{Z_i \le t,\ \delta_i = 1\}},$$
which is the indicator of the event of interest in the interval $]0, t]$.
The process $(K_i(t), t \ge 0)$ is right-continuous with left limits and is known simply as the counting process, because it counts the number of events observed up to time $t$. By convention, $K_i(0) = 0$.
The process $(L_i(t), t \ge 0)$ is left-continuous with right limits and is known as the risk process; it indicates whether an individual is liable to experience an event at time $t$.
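The processes $K_i$ and $L_i$ and the Nelson-Aalen estimator of this section can be sketched in a few lines. The code below is a minimal illustration, assuming the standard forms $K_i(t) = \mathbb{1}\{Z_i \le t, \delta_i = 1\}$ and $L_i(t) = \mathbb{1}\{Z_i \ge t\}$; the function names and the four-point sample are hypothetical. It computes the estimator twice, once from the order statistics and once as a sum of jumps of the counting process divided by the number at risk.

```python
def counting_process(z, delta, t):
    """K_i(t) = 1{Z_i <= t, delta_i = 1}."""
    return 1 if (z <= t and delta == 1) else 0

def at_risk_process(z, t):
    """L_i(t) = 1{Z_i >= t}."""
    return 1 if z >= t else 0

def nelson_aalen_order_stats(sample, t):
    """Sum of D_(i) / (n - i + 1) over ordered observations with T_(i) <= t."""
    # Failures are ranked ahead of censorings when times are tied.
    ordered = sorted(sample, key=lambda obs: (obs[0], -obs[1]))
    n = len(ordered)
    total = 0.0
    for rank, (time, d) in enumerate(ordered, start=1):
        if time > t:
            break
        total += d / (n - rank + 1)
    return total

def nelson_aalen_counting(sample, t):
    """Sum over event times s <= t of (number of events at s) / (number at risk at s)."""
    total = 0.0
    for s in sorted({z for z, d in sample if d == 1 and z <= t}):
        jumps = sum(counting_process(z, d, s) for z, d in sample if z == s)
        at_risk = sum(at_risk_process(z, s) for z, _ in sample)
        total += jumps / at_risk
    return total

# Toy sample of (observed time, indicator): one censored observation at time 3.
sample = [(2.0, 1), (3.0, 0), (5.0, 1), (7.0, 1)]
print(nelson_aalen_order_stats(sample, 5.0), nelson_aalen_counting(sample, 5.0))
```

The two versions coincide when no two failures share the same time; with tied failure times the order-statistic formula and the jump formula can differ slightly.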
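Recall that the risk function $h_i(t)$ is defined by
$$h_i(t) = \lim_{dt \downarrow 0} \frac{P[t \le T_i < t + dt \mid T_i \ge t]}{dt},$$
and for "small" $dt$ one can write
$$h_i(t)\, dt \approx P[t \le T_i < t + dt \mid T_i \ge t].$$
If $dK_i(t)$ denotes the increment of the process $K_i$ over $]t, t + dt]$ with $dt$ "small", then $dK_i(t)$ takes only two values, 0 (i.e. no event) or 1 (i.e. the occurrence of the event), and in this case we have
$$P[dK_i(t) = 1 \mid \mathcal{F}_{t^-}] = L_i(t)\, h_i(t)\, dt,$$
where $\mathcal{F}_t$, defined by
$$\mathcal{F}_t = \sigma\{K_i(s), L_i(s) : 0 \le s \le t,\ i = 1, \dots, n\},$$
is the natural filtration (all the information available at time $t$), and where the notation $dK_i(t)$ refers to the formal writing of the stochastic integral, made possible by the fact that $K_i(t)$ is an increasing process. Thus $dK_i(t)$ is a Bernoulli random variable with conditional mean
$$E[dK_i(t) \mid \mathcal{F}_{t^-}] = L_i(t)\, h_i(t)\, dt.$$
The σ-algebra $\mathcal{F}_{t^-}$, defined by
$$\mathcal{F}_{t^-} = \sigma\{K_i(s), L_i(s) : 0 \le s < t,\ i = 1, \dots, n\},$$
contains the information on the events that occurred strictly before time $t$. Thus, conditionally on $\mathcal{F}_{t^-}$, the random variable $dK_i(t)$ has a Bernoulli distribution with parameter $L_i(t)\, h_i(t)\, dt$ (cf. Fleming & Harrington, 1991, p. 25). Following Njamen & Ngatchou (Njamen & Ngatchou, 2014, p. 8), the resulting stochastic process, defined by
$$M_i(t) = K_i(t) - \int_0^t L_i(u)\, h_i(u)\, du,$$
is a martingale with respect to the filtration $(\mathcal{F}_t)$, and the quantity $\int_0^t L_i(u)\, h_i(u)\, du$ is the increasing process of $K_i(t)$. Indeed, we have
$$E[dM_i(t) \mid \mathcal{F}_{t^-}] = E[dK_i(t) \mid \mathcal{F}_{t^-}] - L_i(t)\, h_i(t)\, dt = 0.$$
For all $i = 1, \dots, n$, the processes
$$\lambda_i(t) = L_i(t)\, h_i(t)$$
and
$$\Lambda_i(t) = \int_0^t \lambda_i(u)\, du$$
are called the intensity process and the "cumulative intensity process", respectively, of the process $(K_i(t), t \ge 0)$. The process $(\Lambda_i(t), t \ge 0)$ is called the "predictable compensator" of $K_i(t)$ (it is determined by $\mathcal{F}_{t^-}$). Using the intensity process $\lambda_i(t)$, the conditional mean and variance of the Bernoulli random variable $dK_i(t)$ are respectively
$$E[dK_i(t) \mid \mathcal{F}_{t^-}] = \lambda_i(t)\, dt$$
and
$$\mathrm{Var}[dK_i(t) \mid \mathcal{F}_{t^-}] = \lambda_i(t)\, dt\, (1 - \lambda_i(t)\, dt) \approx \lambda_i(t)\, dt. \quad (5)$$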

Example of Calculation in Survival Analysis
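If the survival time $T$ follows an exponential distribution $\mathcal{E}(\theta)$ (with $\theta$ taken here as the rate parameter), then for every $t \ge 0$ we have
$$h(t) = \theta, \qquad \lambda(t) = \theta\, \mathbb{1}_{\{T \ge t\}}.$$
The cumulative intensity process is
$$\Lambda(t) = \int_0^t \theta\, \mathbb{1}_{\{T \ge u\}}\, du = \theta\, (T \wedge t).$$
In this example, the random variable $T$ is fully observed.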
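The exponential example lends itself to a quick numerical check of the martingale property. The sketch below assumes that $\theta$ is the rate parameter, so that the risk function is constant and the cumulative intensity of a fully observed lifetime is $\theta \min(T, t)$; the function name and the numerical values are hypothetical. It verifies that the difference between the counting process and its compensator has mean close to zero.

```python
import math
import random

def martingale_value(lifetime, t, theta):
    """M(t) = K(t) - theta * min(T, t) for a fully observed exponential lifetime T."""
    k = 1 if lifetime <= t else 0           # counting process K(t) = 1{T <= t}
    compensator = theta * min(lifetime, t)  # cumulative intensity at time t
    return k - compensator

theta, t, n = 2.0, 0.7, 20000
rng = random.Random(3)
lifetimes = [rng.expovariate(theta) for _ in range(n)]

mean_m = sum(martingale_value(x, t, theta) for x in lifetimes) / n
mean_k = sum(1 for x in lifetimes if x <= t) / n
print(mean_m, mean_k, 1 - math.exp(-theta * t))
```

Both the counting process and its compensator have expectation $1 - e^{-\theta t}$ at time $t$, which is why the empirical mean of $M(t)$ stays near zero.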
Remark 1 In the case of a left truncation $U$, the previous processes can still be defined. For example, the at-risk process becomes
$$L(t) = \mathbb{1}_{\{U < t \le T\}}.$$
In this paper, we work within the framework of competing risks, also called the multivariate framework, under a mechanism of random right censoring.

Estimation of the Function of the Cumulative Hazard Rate in Competitive Risk Context
Within the framework of our study, we consider:
• $\tau_1, \tau_2, \dots, \tau_m$, the continuous random variables representing the lifetimes under each of the $m$ competing risks;
• $J = \{1, 2, \dots, m\} \cup \{0\}$, the set of cause indices, where 0 corresponds to the state of functioning (or life) of the observed individual;
• $T = \min(\tau_1, \tau_2, \dots, \tau_m)$, the random variable of the event of interest;
• $\eta \in J$, the cause random variable, such that $\eta = j$ if $T = \tau_j$, for all $j \in \{1, 2, \dots, m\}$;
• $F$, the distribution function of $T$, and $S = 1 - F$, its survival function, so that $S(t) = P[T > t]$;
• $C$, the random variable of the right-censoring event;
• $\delta = \mathbb{1}_{\{T \le C\}}$, the censoring indicator, equal to 1 if the event is observed at $T$ and 0 otherwise;
• for technical reasons, $\xi = \eta\delta$, so that $\xi = j$ if $T \le C$ and $\eta = j$, and $\xi = 0$ if $T > C$.
Note that $\delta$ and $\xi$ are observable, whereas $\eta$ is observable only when $T$ is uncensored.
In the following, we assume that the censoring is non-informative, i.e. that the random variable $C$ is independent of $(T, \eta)$. The joint distribution of $(T, \eta)$ is completely specified by the cause-specific incidence distributions $F_j(t)$, defined by
$$F_j(t) = P[T \le t,\ \eta = j], \quad j = 1, \dots, m,$$
which are none other than the sub-distributions of failure from the specific cause $j$. For $t \ge 0$, the quantity $F_j(t)$ also represents the probability that an event of type $j$ occurs before time $t$ and that the other types of events have not yet occurred at this time. The functions $F_j$ are improper distribution functions, since they do not tend to 1 at infinity, that is, $\lim_{t \to \infty} F_j(t) < 1$.
When censoring occurs, the duration of interest $T$ is greater than the observed censoring time $C$, but the exact value of $T$ is unknown. No information is then available on $\eta$, the risk associated with the duration $T$. For each individual, we therefore observe the pair of variables $(Z = \min(T, C),\ \xi = \eta\delta)$.
Let $(Z_i, \xi_i)$, $i = 1, \dots, n$, be a sample of independent pairs distributed as $(Z, \xi)$, where $Z$ and $\xi$ are defined above and $C$ is a positive random variable with distribution function $G$. By independence of $T$ and $C$, the random variables $Z_1, Z_2, \dots, Z_n$ are independent and identically distributed with distribution function $H$ given by the relation
$$1 - H(t) = (1 - F(t))(1 - G(t)).$$
Let $\tau_H = \sup\{x : H(x) < 1\}$ be the upper bound of the support of $H$ (which is the minimum of the upper bounds of the supports of $F$ and $G$).
Note that no observation is possible beyond this point.
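Let the sub-distributions of the function $H$ be defined by
$$H^{(0)}(t) = P[Z \le t,\ \xi = 0]$$
and
$$H_j^{(1)}(t) = P[Z \le t,\ \xi = j], \quad j = 1, \dots, m.$$
For all $t \ge 0$, we have
$$H(t) = H^{(0)}(t) + H^{(1)}(t), \qquad H^{(1)}(t) = \sum_{j=1}^m H_j^{(1)}(t).$$
The relationship between the functions $H_j^{(1)}$, the sub-distributions $F_j$ and the distribution function $G$ is given for all $t \ge 0$ and $j = 1, \dots, m$ by
$$H_j^{(1)}(t) = \int_0^t (1 - G^-(s))\, dF_j(s).$$
We define the empirical versions of $H^{(1)}$ and $H_j^{(1)}$ for all $t \ge 0$ by
$$H_n^{(1)}(t) = \frac{1}{n}\sum_{i=1}^n \mathbb{1}_{\{Z_i \le t,\ \delta_i = 1\}}$$
and
$$H_{n,j}^{(1)}(t) = \frac{1}{n}\sum_{i=1}^n \mathbb{1}_{\{Z_i \le t,\ \xi_i = j\}}.$$
The cumulative hazard rate of specific cause $j$ ($j = 1, \dots, m$) is given by
$$\Lambda_j(t) = \int_0^t \frac{dF_j(s)}{1 - F^-(s)},$$
where $F$ is the distribution function defined by $F(t) = P[T \le t]$ (the probability of having left the initial state by time $t$), satisfying for all $t \ge 0$
$$F(t) = \sum_{j=1}^m F_j(t).$$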

Formalism of the Process
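In the case of random right censoring, let $(Z_1, \delta_1, \xi_1), \dots, (Z_n, \delta_n, \xi_n)$ be an $n$-sample of the observable triplet $(Z_i, \delta_i, \xi_i)$ with
$$Z_i = \min(T_i, C_i), \quad \delta_i = \mathbb{1}_{\{T_i \le C_i\}}, \quad \xi_i = \eta_i \delta_i, \quad T_i = \min(\tau_{i1}, \dots, \tau_{im}),$$
where $\tau_{ij}$ represents the duration for which individual $i$ is subjected to cause $j$. Then a Nelson-Aalen-type estimator of $\Lambda_j$ is given for $j = 1, \dots, m$ by (see e.g. Andersen & al., 1993)
$$\hat{\Lambda}_j(t) = \int_0^t \frac{J_j(s)}{Y_j(s)}\, dN_j(s), \quad (7)$$
where $J_j(t) = \mathbb{1}_{\{Y_j(t) > 0\}}$,
$$N_j(t) = \sum_{i=1}^n \mathbb{1}_{\{Z_i \le t,\ \xi_i = j\}}$$
is the process counting the number of failures from cause $j$ observed in the time interval $[0, t]$, and
$$Y(t) = \sum_{i=1}^n \mathbb{1}_{\{Z_i \ge t\}}$$
is the total number of individuals in the sample under observation who survive beyond time $t$. We denote by $Y_j(t)$ the number of individuals at risk of failing from the specific cause $j$ or of being censored.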
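A cause-specific Nelson-Aalen-type estimator can be sketched directly from observed pairs $(Z_i, \xi_i)$. The code below is a minimal illustration under a stated assumption: it uses the usual at-risk count $Y(s) = \#\{i : Z_i \ge s\}$ in the denominator, as in the classical estimator, and does not reproduce the subgroup-based count $Y_j$ of Njamen & Ngatchou. The function name and the toy sample are hypothetical.

```python
def cause_specific_nelson_aalen(sample, cause, t):
    """Sum over cause-j failure times s <= t of dN_j(s) / Y(s).

    sample: list of (Z_i, xi_i) with xi_i = 0 if censored, j if failure from cause j.
    Y(s) is taken as the number of individuals with Z_i >= s (an assumption).
    """
    event_times = sorted({z for z, xi in sample if xi == cause and z <= t})
    total = 0.0
    for s in event_times:
        d_n = sum(1 for z, xi in sample if z == s and xi == cause)  # jump of N_j at s
        y = sum(1 for z, _ in sample if z >= s)                     # number at risk at s
        total += d_n / y  # y >= d_n >= 1 here, so the indicator J equals 1
    return total

# Toy sample: causes 1 and 2, one censored observation (xi = 0) at time 3.
sample = [(1.0, 1), (2.0, 2), (3.0, 0), (4.0, 1), (5.0, 2)]
na_1 = cause_specific_nelson_aalen(sample, 1, 5.0)
na_2 = cause_specific_nelson_aalen(sample, 2, 5.0)
print(na_1, na_2, na_1 + na_2)
```

With this choice of at-risk set, the cause-specific estimates add up to the all-cause Nelson-Aalen estimate computed from the same sample.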

Nelson-Aalen Nonparametric Estimator
An important illustration of the counting process approach is the study of the properties of the Nelson-Aalen estimator $\hat{\Lambda}(t)$ of the cumulative intensity process (or cumulative risk) $\Lambda(t)$. The cumulative risk over the region where there is at least one observation is given, for all $j = 1, \dots, m$, by
$$\Lambda_j^*(t) = \int_0^t J_j^*(s)\, \lambda_j(s)\, ds. \quad (8)$$
The final Nelson-Aalen estimator of (8), analogous to (7) and relating to the subgroup $A_j$ of individuals failing from cause $j$, is given by
$$\hat{\Lambda}_j^*(t) = \int_0^t \frac{J_j^*(s)}{Y_j(s)}\, dN_j(s),$$
with $J_j^*(t) = \mathbb{1}_{\{Y_j(t) > 0\}}$ the indicator of the presence of at least one "individual" at risk, $Y(t)$ the total number of individuals in the observation sample who survive beyond time $t$, and $Y_j(t)$ the total number of individuals at risk of failing from the specific cause $j$ or of being censored. With this notation, $\hat{\Lambda}_j^*(t)$ estimates $\Lambda_j^*(t)$. Indeed,
$$\hat{\Lambda}_j^*(t) - \Lambda_j^*(t) = \int_0^t \frac{J_j^*(s)}{Y_j(s)}\, dM_j(s),$$
where the martingale $M_j$ (built from the martingales specific to each "individual" $i$) has zero expectation.
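Theorem 1 (Breuils, 2003, p. 25; Fleming & Harrington, 1991, p. 26). Let $T_i$ be an absolutely continuous lifetime and $C_i$ a censoring variable with arbitrary distribution, $i \in \{1, \dots, n\}$. Let $\lambda_i$ be the risk function associated with $T_i$. Set $Z_i = T_i \wedge C_i$, $\delta_i = \mathbb{1}_{\{T_i \le C_i\}}$ and
$$N_i(t) = \mathbb{1}_{\{Z_i \le t,\ \delta_i = 1\}}.$$
For $t \ge 0$, the process defined by
$$M_i(t) = N_i(t) - \int_0^t \mathbb{1}_{\{Z_i \ge u\}}\, \lambda_i(u)\, du$$
is a martingale if and only if
$$\lambda_i(t) = \frac{-\frac{\partial}{\partial u} P[T_i \ge u,\ C_i \ge t]\big|_{u = t}}{P[T_i \ge t,\ C_i \ge t]}$$
for $t$ such that $P(Z_i > t) > 0$.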
Proposition 1 (Njamen & Ngatchou, 2014) For a given $t \ge 0$ and a given $j \in \{1, \dots, m\}$, the stochastic process defined by
$$M_j(t) = N_j(t) - \int_0^t Y_j(s)\, \lambda_j(s)\, ds$$
is the martingale associated with the specific cause $j$.
The martingale $M_j(t)$ above represents the difference between the number of events observed on $[0, t[$, i.e. $N_j(t)$, and its increasing process (which is the number of events "predicted" by the model for the $i$-th individual). The process $\Lambda_j(t)$ is called the "predictable compensator" of $N_j(t)$, since it is determined by $\mathcal{F}_{t^-}$.

Results
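When at least one "individual" is at risk, the following theorem gives the pointwise convergence in probability of the Nelson-Aalen estimator obtained in Njamen & Ngatchou (Njamen & Ngatchou, 2014) in the competing risks context. This is the first fundamental result of this paper.
Theorem 3 For all $t \ge 0$, we have
$$\hat{\Lambda}_{jn}^*(t) - \Lambda_j^*(t) \xrightarrow{P} 0 \quad (n \to \infty).$$
Proof. The cumulative risk over the region where there is at least one observation is
$$\Lambda_j^*(t) = \int_0^t J_j^*(s)\, \lambda_j(s)\, ds$$
and its Nelson-Aalen estimator is given by
$$\hat{\Lambda}_{jn}^*(t) = \int_0^t \frac{J_j^*(s)}{Y_j^n(s)}\, dN_j(s).$$
On the other hand, we have
$$\hat{\Lambda}_{jn}^*(t) - \Lambda_j^*(t) = \int_0^t \frac{J_j^*(s)}{Y_j^n(s)}\, dM_j(s),$$
where
$$M_j(t) = N_j(t) - \int_0^t Y_j^n(s)\, \lambda_j(s)\, ds.$$
By Chebyshev's inequality and Th. 2.4.4, p. 72 of Fleming & Harrington (Fleming & Harrington, 1991), we obtain, for every $\varepsilon > 0$,
$$P\left[\left|\hat{\Lambda}_{jn}^*(t) - \Lambda_j^*(t)\right| > \varepsilon\right] \le \frac{1}{\varepsilon^2}\, E\left[\int_0^t \frac{J_j^*(s)}{Y_j^n(s)}\, \lambda_j(s)\, ds\right].$$
We apply the Glivenko-Cantelli theorem to $Y_j^n(t) = \sum_{i=1}^n \mathbb{1}_{\{Z_i \ge t\}}$ and we have
$$\sup_{t \ge 0} \left|\frac{Y_j^n(t)}{n} - S(t^-)\right| \longrightarrow 0 \quad \text{a.s.},$$
where $F$ is the distribution function of the survival times $T_i$, $G$ the distribution function of the $C_i$, and $S$ the survival function of $Z_i = T_i \wedge C_i$, given by
$$S(t) = P[Z_i > t] = (1 - F(t))(1 - G(t)).$$
In particular, $(1 - S(t^-))^n$ is the probability of having 0 individuals at risk at time $t$ among the $n$ individuals. Consequently, for $s$ such that $S(s^-) > 0$, we have
$$\frac{n\, J_j^*(s)}{Y_j^n(s)} \longrightarrow \frac{1}{S(s^-)} \quad \text{a.s.},$$
from which
$$\frac{J_j^*(s)}{Y_j^n(s)} \longrightarrow 0 \quad \text{a.s.}$$
By the monotone convergence theorem we deduce that
$$E\left[\int_0^t \frac{J_j^*(s)}{Y_j^n(s)}\, \lambda_j(s)\, ds\right] \longrightarrow 0.$$
Hence $\hat{\Lambda}_{jn}^*(t) - \Lambda_j^*(t) \xrightarrow{P} 0$ ($n \to \infty$).
On the other hand, since $1 - J_j^*(s) \ge 0$, we have
$$\Lambda_j(t) - \Lambda_j^*(t) = \int_0^t (1 - J_j^*(s))\, \lambda_j(s)\, ds \ge 0.$$
This completes the proof of the Theorem.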
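The pointwise convergence stated above can be illustrated by simulation. The sketch below is only an illustration under assumptions that are not in the text: two causes with independent exponential latent times, independent exponential censoring, and the usual at-risk count $\#\{i : Z_i \ge s\}$ in the estimator (not the subgroup-based $Y_j$). Under these assumptions the cumulative hazard of cause 1 is `rate * t`. All function names are hypothetical.

```python
import random

def simulate(n, rng, cause_rates=(0.5, 1.0), censoring_rate=0.5):
    """Return n pairs (Z, xi): xi = 0 if censored, else the index of the failure cause."""
    sample = []
    for _ in range(n):
        latent = [rng.expovariate(rate) for rate in cause_rates]
        t = min(latent)
        c = rng.expovariate(censoring_rate)
        xi = latent.index(t) + 1 if t <= c else 0
        sample.append((min(t, c), xi))
    return sample

def cause_specific_na(sample, cause, t):
    """Nelson-Aalen-type estimate at time t (continuous data, so ties are ignored)."""
    ordered = sorted(sample)
    n = len(ordered)
    total = 0.0
    for i, (z, xi) in enumerate(ordered):
        if z > t:
            break
        if xi == cause:
            total += 1.0 / (n - i)  # n - i individuals still have Z >= z
    return total

def mean_abs_error(n, reps, rng, t=1.0, cause=1, rate=0.5):
    """Average of |estimate - rate * t| over independent replications."""
    errors = [abs(cause_specific_na(simulate(n, rng), cause, t) - rate * t)
              for _ in range(reps)]
    return sum(errors) / reps

rng = random.Random(4)
mae = {n: mean_abs_error(n, 50, rng) for n in (100, 2000)}
print(mae)
```

The average absolute error at a fixed time shrinks as the sample size grows, which is what convergence in probability at a fixed $t$ predicts.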
The second fundamental result gives the uniform convergence in probability of the Nelson-Aalen estimator in the competing risks context.
Theorem 4 It is assumed that for all $t > 0$, $F(t) < 1$ and $Y_j^n(t)$ [...] $Y_j^n(u)$ [...]
We have by hypothesis, $\forall t > 0$, $Y_j^n(t)$ [...] $Y_j^n(u)$ [...] $Y_j^n(u)$ [...] Since $\varepsilon$ and $\eta$ are arbitrary, we deduce that [...] $Y_j^n(u)$ [...] Hence the result.
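Uniform convergence concerns the largest deviation over a whole interval rather than the deviation at one time point. The sketch below illustrates this under assumptions that are not in the text: independent exponential latent times and censoring, and the usual at-risk count in the estimator. Because the estimate is a step function and the target `rate * u` is increasing, the supremum over $[0, t]$ is attained just before a jump, right after a jump, or at $t$. All names are hypothetical.

```python
import random

def simulate(n, rng, cause_rates=(0.5, 1.0), censoring_rate=0.5):
    """Return n pairs (Z, xi): xi = 0 if censored, else the index of the failure cause."""
    sample = []
    for _ in range(n):
        latent = [rng.expovariate(rate) for rate in cause_rates]
        t = min(latent)
        c = rng.expovariate(censoring_rate)
        sample.append((min(t, c), latent.index(t) + 1 if t <= c else 0))
    return sample

def sup_error(sample, cause, t, rate):
    """sup over u in [0, t] of |estimate(u) - rate * u| for the step-function estimate."""
    ordered = sorted(sample)
    n = len(ordered)
    current, worst = 0.0, 0.0
    for i, (z, xi) in enumerate(ordered):
        if z > t:
            break
        if xi == cause:
            worst = max(worst, abs(current - rate * z))  # just before the jump
            current += 1.0 / (n - i)                     # jump of size 1 / (number at risk)
            worst = max(worst, abs(current - rate * z))  # right after the jump
    return max(worst, abs(current - rate * t))           # deviation at the end point t

rng = random.Random(5)
avg_sup = {}
for n in (100, 2000):
    values = [sup_error(simulate(n, rng), 1, 1.0, 0.5) for _ in range(30)]
    avg_sup[n] = sum(values) / len(values)
print(avg_sup)
```

In this setting the average maximal deviation over $[0, 1]$ decreases with the sample size, in line with a uniform convergence statement.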