Consistency bands for the mean excess function and application to a graphical goodness-of-fit test for financial data

In this paper, we use the modern setting of functional empirical processes and recent techniques on uniform estimation for nonparametric objects to derive consistency bands for the mean excess function in the i.i.d. case. We apply our results to the modelling of financial data, in particular the Dow Jones database, to see how well Generalized Hyperbolic distribution models fit monthly data.


Introduction
Let X be a random variable defined on a probability space (Ω, A, P), let F be its distribution function with upper endpoint x_F = sup{x ∈ R : F(x) < 1}, and let F̄ = 1 − F be its survival function.
Throughout the paper we suppose that E|X| < ∞. The mean excess function e(u) of X is defined (see, e.g., Kotz and Shanbhag [6], Hall and Wellner [5], Guess) by

e(u) = E(X − u | X > u), u < x_F. (1.1)

A natural way to estimate the mean excess function e(u) is the plug-in method, that is, replacing the survival function in (1.1) by its empirical counterpart, as did Yang [3]. Now consider a sequence X_1, X_2, ... of independent copies of X. The plug-in estimator of e(u), for n ≥ 1, is

e_n(u) = Σ_{i=1}^n (X_i − u) I_{[X_i > u]} / Σ_{i=1}^n I_{[X_i > u]}, (1.2)

where I_{[X > u]} = 1 if X > u and 0 otherwise.
For notational convenience, we denote P_X(f_u) = E f_u(X) and P_X(g_u) = E g_u(X), where f_u(x) = x I_{[x > u]}, g_u(x) = I_{[x > u]}, and P_X is the probability law of X.
We also denote by P_n the empirical measure associated with the sample X_1, ..., X_n. We have P_n(f_u) = (1/n) Σ_{i=1}^n X_i I_{[X_i > u]} and P_n(g_u) = (1/n) Σ_{i=1}^n I_{[X_i > u]}.
Formulae (1.1) and (1.2) lead to

e_n(u) = (P_n(f_u) − u P_n(g_u)) / P_n(g_u),

which is well defined whenever u < X_{n,n}, where X_{n,n} = max_{1≤i≤n} X_i.
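As an illustration, the plug-in estimator (1.2) can be coded in a few lines. This is a minimal sketch (the function and variable names are ours, not the paper's); the sanity check uses the memoryless property of the standard exponential law, for which e(u) = 1 for every u.

```python
# Sketch of the plug-in estimator e_n(u) of (1.2); names are illustrative.

def mean_excess(sample, u):
    """Empirical mean excess e_n(u) = sum (X_i - u) 1[X_i > u] / sum 1[X_i > u].

    Defined only for u below the sample maximum X_{n,n}; returns None otherwise.
    """
    excesses = [x - u for x in sample if x > u]
    if not excesses:          # u >= X_{n,n}: the estimator is undefined
        return None
    return sum(excesses) / len(excesses)

# For a standard exponential law, e(u) = 1 for every u (memorylessness),
# so e_n(u) should be close to 1 for thresholds well inside the sample.
import random
random.seed(1)
sample = [random.expovariate(1.0) for _ in range(100_000)]
print(mean_excess(sample, 0.5))   # close to 1
```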
One of the most important motivations for studying the mean excess function comes from extreme value theory (EVT). Indeed, e(u) is linear in the threshold u when F is a Generalized Pareto distribution (GPD), which yields quite a powerful graphical test for such distributions. By using Vapnik-Chervonenkis (VC) classes and entropy-number techniques, we establish that the empirical mean excess function e_n(u) converges almost surely and uniformly: for any u_1 less than the upper endpoint of the distribution F,

sup_{u ≤ u_1} |e_n(u) − e(u)| → 0 a.s. as n → ∞.
Next, by using the modern theory of functional empirical processes, mainly exposed in [10], we prove that the empirical mean excess function also converges weakly, that is,

{√n (e_n(u) − e(u)), u ∈ I} →_w {G(h_u), u ∈ I},
where G is a Gaussian process and {h_u, u ∈ I} is a family of functions, both of which will be made precise later.
Furthermore, using Talagrand's inequality (see [8]) and the techniques of Mason et al. (see [9]), we arrive at our main achievement: consistency bands for the mean excess function e(u). Precisely, we establish that for any interval I = [u_0, u_1], with u_1 less than the upper endpoint of F, and for any ε > 0, we have, for n large,

P( e_n(u) − E_n/√n < e(u) < e_n(u) + E_n/√n, u ∈ I ) > 1 − ε,

where (E_n)_{n≥1} is a non-random sequence of real numbers made precise in Theorem 3, and where F satisfies a very mild condition.
These results allow us to set up a graphical goodness-of-fit test based on the empirical mean excess function and to apply this test to Dow Jones data. We find that the Generalized Hyperbolic family of distributions generally fits financial data well.
In the remainder of the text, we detail these outlined results, prove them, carry out simulation studies about them, and finally apply them to financial data.
The paper is organized as follows. We state uniform almost sure (a.s) convergence results in Section 2 and finite-distribution and functional normality theorems in Section 3. Section 4 is devoted to setting a.s consistency bands for the mean excess function. In Section 5, simulation studies and data driven applications using Dow Jones data are provided. We finish the paper by a concluding section.
Before we go any further, it is worth mentioning that, in the sequel, all the suprema taken over u < u_1 are measurable, since the functions of u that we consider below are left or right continuous. This means that we are in the pointwise-measurability setting. Thus, even when we use the results and concepts of [10], we need neither outer nor inner integrals, nor convergence in outer probability.

Almost Sure Convergence
In this section we are going to prove the uniform almost sure convergence of the empirical mean excess function by using Vapnik-Chervonenkis (VC) classes and bracketing numbers.
Theorem 1. Let X_1, X_2, ... be i.i.d. copies of X with E|X| < ∞. Then for any u_1 < x_F,

sup_{u ≤ u_1} |e_n(u) − e(u)| → 0 a.s. as n → ∞.

Proof. We observe that F_1 = {g_u, u < x_F} is a class of monotone real functions with values in [0, 1]. By Theorem 2.7.5 in [10], the bracketing number N_[](ε, F_1, L_r(Q)) is finite (bounded by exp(K/ε) for every probability measure Q, any real r ≥ 1, and a constant K that depends only on r). Since E|g_u(X_1)| < ∞ for u < x_F, F_1 is a functional Glivenko-Cantelli class in the sense of Theorem 2.4.1 in [10], meaning that

sup_{u < x_F} |P_n(g_u) − P_X(g_u)| → 0 a.s. as n → ∞. (2.1)

The class F_2 = {f_u, u < x_F} satisfies the uniform entropy condition 2.4.1 in [10]. Then F_2 is a Donsker class and hence a Glivenko-Cantelli class, that is,

sup_{u < x_F} |P_n(f_u) − P_X(f_u)| → 0 a.s. as n → ∞. (2.2)

To finish, fix u_1 < x_F. Then for u ≤ u_1 and n large enough, we have

e_n(u) − e(u) = (P_n(f_u) − u P_n(g_u)) / P_n(g_u) − (P_X(f_u) − u P_X(g_u)) / P_X(g_u). (2.3)

Set

ε_n = sup_{u ≤ u_1} |P_n(f_u) − P_X(f_u)| and δ_n = sup_{u ≤ u_1} |P_n(g_u) − P_X(g_u)|. (2.4)

From (2.1) and (2.2) above, we have ε_n → 0 a.s. and δ_n → 0 a.s. as n → ∞. Now for u ≤ u_1 we have P_X(g_u) ≥ P_X(g_{u_1}) > 0 and, from (2.4), the right-hand side of (2.3) is bounded, uniformly in u ≤ u_1, by a quantity tending to zero almost surely, which gives the announced uniform a.s. convergence.

Asymptotic normality of e_n(u)

In this section, we are concerned with weak laws of the empirical mean excess process as a stochastic process. Hereafter {G(g), g ∈ G} denotes a centered Gaussian functional stochastic process with variance-covariance function

Γ(g_1, g_2) = ∫ (g_1(x) − E g_1(X_1)) (g_2(x) − E g_2(X_1)) dF(x).

Theorem 2. Let X_1, X_2, ... be i.i.d. rv's with common finite second moment. Put I = [u_0, u_1], with u_0 < u_1 < x_F, and define, for t ∈ R, the functions f_u(t) = t I_{[t > u]} and g_u(t) = I_{[t > u]}. Then the functional empirical processes {G_n(g_u), u ∈ I} and {G_n(f_u), u ∈ I} weakly converge respectively to {G(g_u), u ∈ I} and {G(f_u), u ∈ I} in ℓ^∞(I).
Before we give the proof, we need this lemma.
Lemma 1. Let g be a finite measurable function defined on R such that E g(X_1)^2 < ∞. Let u_0 < u_1 < x_F. Define, for any fixed v ∈ R and δ > 0, Let, for a fixed n ≥ 1 and u ∈ R, Since, for all (u, v) ∈ R^2, we have it follows that α is finite. So for any ε > 0, we can find ū ∈ [v − δ, v[ such that, Now, let δ > 0. Define, for any p ≥ 1, and consider u_j(p) = u_j = v − δ + jδ/p, j = 0, ..., p.
Let us prove that, for ε > 0, lim For each p ≥ 1, let j be such that u_{j−1}(p) ≤ u ≤ u_j(p).
We have, We get from (3.1) max For a fixed n ≥ 1, R_j(p) → 0 as p → ∞, since the sequence of intervals (]ū, u_j(p)])_{p≥1} decreases to the empty set as p → ∞.
Next, consider the collection of points {u_j(ℓ), 0 ≤ j ≤ p, 1 ≤ ℓ ≤ p} and denote the set of its distinct values by {u_j, 1 ≤ j ≤ m(p)}. We still have |u_j − u_{j−1}| ≤ δ/p. And we surely have, for any ε > 0, By construction, max for any fixed v > 0 and any η > 0, We observe that T_1, T_2, ..., T_{m(p)} are partial sums of i.i.d. centered random variables, so that (T_j^4) forms a submartingale. By the maximal inequality for submartingales, for any fixed p, Since the right-hand side does not depend on p, we get by (3.2) Notice that T(n, u, δ) is a sum of n i.i.d. centered random variables, with variance and fourth moment as follows. Simple computations give (see Appendix 7.1 for a simple proof) By putting these facts together, we arrive at We finally get This completes the proof of the lemma.
Proof of Theorem 2.
By Theorem 2.7.5 in [10] applied to F_1, and by the fact that F_2 is a Vapnik-Chervonenkis class, condition (2.5.1) is satisfied for both F_1 and F_2; thus F_1 and F_2 are Donsker classes.
This may be used in a simple manner to get Denote the functional empirical process, for any real function g, by Recall that for any Donsker class G, the functional stochastic process {G_n(g), g ∈ G} converges in law to a centered Gaussian stochastic process {G(g), g ∈ G} whose variance-covariance function is We have, as n → ∞, We find Since F_1 is a Donsker class, sup We finally have At this step, we want to prove that In view of Theorem 1.5.7 in [10], it is enough to prove that lim_{δ→0} sup_{u∈I} lim sup Here, we apply Lemma 1 to the measurable functions g(x) = x and g(x) = 1.
In both cases, we inspect the assumptions of the lemma and see that if g(x) = x, we get g(x) ≤ g(u_1) = u_1 for any u_0 ≤ x ≤ u_1, and thus If g(x) = 1, the result is obvious.
We can therefore apply Lemma 1 and get, Next, we use the following expansion for (u, v) ∈ I^2: We get
and P_X(f_u).
We obtain P^{−1} Thus, by using these bounds and (3.3), it follows that These quantities go to zero whenever F is continuous, and hence uniformly continuous on I. Putting all these facts together and using (3.3) yields Finally This completes the proof.
Now we are going to concentrate on consistency bands for the mean excess function.

Consistency bands
Now, we may use the uniform bands of functional empirical processes based on Talagrand's inequality (see [8]) and the new methods introduced by Mason et al. [9] to obtain consistency bands for the mean excess function, as follows.
Theorem 3. Let X_1, X_2, ... be i.i.d. random variables with finite second moment. Put I = [u_0, u_1], with −∞ < u_0 < u_1 < x_F. We suppose that F is continuous and satisfies the validity condition (4.1). Then for any ε > 0, there exists n_0 such that for n ≥ n_0,

P( e_n(u) − E_n/√n < e(u) < e_n(u) + E_n/√n, u ∈ I ) > 1 − ε.

The proof of this theorem is rather technical, so we postpone it to Appendix Subsection 7.2.1, where we also state the fundamental Talagrand inequality.
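For illustration, the band of Theorem 3, of the form e_n(u) ± E_n/√n with (E_n) the non-random sequence announced in the introduction, is straightforward to evaluate once E_n is available. The sketch below takes E_n as an input parameter (an assumption of this illustration, since the exact expression of E_n is only given later); the helper names are ours.

```python
import math

def empirical_mean_excess(sample, u):
    # plug-in estimator e_n(u); None when no observation exceeds u
    exc = [x - u for x in sample if x > u]
    return sum(exc) / len(exc) if exc else None

def consistency_band(sample, thresholds, E_n):
    """Band e_n(u) -/+ E_n/sqrt(n), per threshold u, as in Theorem 3."""
    n = len(sample)
    half = E_n / math.sqrt(n)
    band = {}
    for u in thresholds:
        e_n = empirical_mean_excess(sample, u)
        if e_n is not None:
            band[u] = (e_n - half, e_n + half)
    return band

import random
random.seed(2)
sample = [random.expovariate(1.0) for _ in range(10_000)]
band = consistency_band(sample, [0.0, 0.5, 1.0], E_n=3.0)
```

In a goodness-of-fit check, the candidate model's theoretical mean excess curve would then be compared with these lower and upper envelopes over I.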
Remark: The validity condition (4.1) is quite weak and is satisfied by most of the usual continuous distribution functions. Indeed, if F is absolutely continuous with respect to the Lebesgue measure with density f, we get this by using the mean value theorem.

The mean excess function is useful in several respects.

• First, it can be used to distinguish heavy-tailed distributions from light-tailed ones. An increasing mean excess function e(u) indicates a heavy-tailed distribution, and a decreasing mean excess function e(u) indicates a light-tailed distribution. The exponential distribution has a constant mean excess function and is considered a medium-tailed distribution.
Thus the plot of the mean excess function tends to infinity for heavy-tailed distributions, decreases to zero for light-tailed distributions, and remains constant for an exponential distribution.
• Secondly, it can be used for tail estimation with the help of the generalized Pareto distribution, which can model the tails of another distribution. Let F_u(x) be the excess distribution over the threshold u, defined by

F_u(x) = P(X − u ≤ x | X > u).

By using Theorem 7.20 in [1], a natural approximation of F_u is a generalized Pareto distribution GPD(ξ, β), whose mean excess function is given by

e(u) = (β + ξu) / (1 − ξ), ξ < 1.

If the empirical mean excess function plot looks linear, we can fit a GPD(ξ, β) model whose parameters can be estimated by the linear least squares method: given data {(u_1, y_1), ..., (u_n, y_n)}, where u_i = X_i and y_i = e_n(u_i), i = 1, ..., n, we fit the regression line y = â + ŝ u with

ŝ = Σ_{i=1}^n (u_i − ū)(y_i − ȳ) / Σ_{i=1}^n (u_i − ū)^2, â = ȳ − ŝ ū,

where ū and ȳ are the sample means of the observations on u and y, respectively, and then set ξ̂ = ŝ/(1 + ŝ) and β̂ = â(1 − ξ̂).
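The least-squares step can be sketched as follows. Since the standard GPD mean excess function e(u) = (β + ξu)/(1 − ξ) is affine with slope ξ/(1 − ξ) and intercept β/(1 − ξ), a straight line y = a + s·u fitted to the pairs (u_i, e_n(u_i)) can be inverted into ξ = s/(1 + s) and β = a(1 − ξ). The function name and the sanity-check values below are ours, not the paper's.

```python
# Hedged sketch: least-squares fit of GPD parameters from a linear
# mean excess plot, inverting slope s = xi/(1-xi) and intercept
# a = beta/(1-xi).

def fit_gpd_from_mean_excess(us, ys):
    n = len(us)
    u_bar = sum(us) / n
    y_bar = sum(ys) / n
    s = sum((u - u_bar) * (y - y_bar) for u, y in zip(us, ys)) / \
        sum((u - u_bar) ** 2 for u in us)
    a = y_bar - s * u_bar
    xi = s / (1.0 + s)
    beta = a * (1.0 - xi)
    return xi, beta

# Sanity check on an exactly linear mean excess function with
# xi = 0.25 and beta = 1, i.e. e(u) = (1 + 0.25*u)/0.75:
us = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [(1 + 0.25 * u) / 0.75 for u in us]
print(fit_gpd_from_mean_excess(us, ys))   # ~ (0.25, 1.0)
```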
As far as we are concerned, our goal is to estimate the mean excess function by consistency bands.
In the remainder of this section, we rely on the empirical mean excess function (emef for short) to construct a graphical goodness-of-fit test.
In the first step, we consider a large set of distributions for which we draw the average emef. That is, we fix a distribution function and draw n = 6000 samples from it, each of size 4000. Next we compute the average of the n = 6000 empirical mean excess functions.
The graphs of these average empirical mean excess functions will serve as benchmarks in the following sense: any other sample having a similar emef will suggest such an underlying distribution.
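The benchmark-curve protocol above can be sketched in a few lines. This is a scaled-down illustration (far fewer and smaller replications than the paper's 6000 samples of size 4000, so that it runs quickly); the function names are ours.

```python
# Hedged sketch of the benchmark (average emef) protocol, scaled down.
import random

def emef(sample, u):
    exc = [x - u for x in sample if x > u]
    return sum(exc) / len(exc) if exc else None

def average_emef(draw_sample, us, n_rep=200, sample_size=500, seed=0):
    """Average the empirical mean excess functions of n_rep samples."""
    rng = random.Random(seed)
    sums = {u: 0.0 for u in us}
    counts = {u: 0 for u in us}
    for _ in range(n_rep):
        sample = [draw_sample(rng) for _ in range(sample_size)]
        for u in us:
            e = emef(sample, u)
            if e is not None:
                sums[u] += e
                counts[u] += 1
    return {u: sums[u] / counts[u] for u in us if counts[u]}

# Benchmark curve for the standard exponential: flat at level 1.
curve = average_emef(lambda rng: rng.expovariate(1.0), us=[0.0, 0.5, 1.0])
```

A sample whose emef tracks one of these benchmark curves then suggests the corresponding underlying distribution.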
We will use, as a special guest, the generalized hyperbolic (Gh for short) family of distribution functions. Nowadays, this family is very important in financial modeling.
In a second step, we will use the obtained graphs as benchmarks for real data.
In this paper, we focus on monthly returns and log-returns of Dow Jones data. We will see that these data strongly suggest a Gh model.
This section, beyond financial data, shows how to use the emef for goodness-of-fit testing purposes. It opens a great variety of applications for different types of data.

Usual distributions.
To assess the performance of our estimator, we present a simulation study. We draw simulated emefs for standard distributions and next for the Gh family of distribution functions.

5.2.1. Emef for standard distributions. We consider some simple models listed in Table 1 below, where the parameters used are specified; the emef figures corresponding to each choice are displayed.

5.2.2.
Generalized hyperbolic models. Next, we consider the emefs for the Gh models. We need some definitions. The Lebesgue density function of the one-dimensional Gh distribution is given by

gh(x; λ, α, β, δ, µ) = a(λ, α, β, δ) (δ^2 + (x − µ)^2)^{(λ−1/2)/2} K_{λ−1/2}(α √(δ^2 + (x − µ)^2)) exp(β(x − µ)),

where

a(λ, α, β, δ) = (α^2 − β^2)^{λ/2} / (√(2π) α^{λ−1/2} δ^λ K_λ(δ √(α^2 − β^2)))

is a norming constant that makes the curve area equal to 1, and K_λ is the modified Bessel function of the third kind with index λ.
The roles of the parameters λ, α, β, δ, and µ are as follows: α > 0 determines the shape, 0 ≤ |β| < α the skewness, µ ∈ R is a location parameter, and δ > 0 serves for scaling. The parameter λ ∈ R specifies the order of the Bessel function K_λ that appears in the Gh density and is used to obtain different subclasses of the Gh distribution.
In the following, we summarise the different admissible domains for the parameters. An important aspect of the Gh family is that it embraces many special cases, such as the Hyperbolic (λ = 1), Student-t (λ < 0), Variance Gamma (λ > 0), and Normal Inverse Gaussian (NIG) (λ = −0.5) distributions.
It also nests the Generalized Inverse Gaussian (GIG) distribution, defined by only the three parameters λ, α, and β. An Inverse Gaussian (IG) distribution is a GIG distribution with λ = −0.5, and a Gamma (Γ) distribution is also a GIG distribution with β = 0.
All of these have been used to model financial returns and log-returns.

Graphical test.
We are now in a position to use the emef graphs already drawn as goodness-of-fit tools.
The emefs of the Normal Inverse Gaussian (NIG) and Student-t distributions are not monotonic functions: they decrease and then increase, as do the emefs of returns data. For this reason, we fit them to both monthly returns and log-returns from the Dow Jones database (see Figures 17, 19, 21, and 23).
The Dow Jones database consists of several companies such as AXP (American Express Company), CSCO (Cisco Systems), DAX, CAT, IBM and so on. Each one has 5 series of values: opening (op) values, closing (cl) values, minimum (min), maximum (max), and volume (vol) values.
We select the AXP and CSCO companies and consider returns and log-returns of their values, as shown in Table 3. Then we construct their emef plots and their fitted counterparts.
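Since the Dow Jones series themselves are not reproduced here, the fitting step can be illustrated on synthetic "returns" drawn from the NIG subfamily via `scipy.stats.norminvgauss`. This is a hedged sketch: scipy uses its own shape parameterization (a, b), which differs from the (λ, α, β, δ, µ) notation above, and the parameter values are arbitrary.

```python
# Hedged sketch of the NIG fitting step on synthetic returns; in the
# paper this is applied to AXP and CSCO monthly (log-)returns instead.
import numpy as np
from scipy.stats import norminvgauss

rng = np.random.default_rng(0)
returns = norminvgauss.rvs(a=2.0, b=0.3, loc=0.0, scale=0.01,
                           size=3000, random_state=rng)

# Maximum-likelihood fit of the four NIG parameters to the data.
a_hat, b_hat, loc_hat, scale_hat = norminvgauss.fit(returns)
```

The fitted model's mean excess curve can then be drawn next to the data's emef plot for the graphical comparison described in this section.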
Estimated parameters and the emefs are given in Table 4.

Table 4. Emefs of fitted Gh distributions for DAX and CSCO companies data.

5.2.4.
Commentaries. In view of Figure 10 and Figure 17, we can say that the Student-t distribution fits well the opening and minimum value returns of the American Express Company (AXP), whereas the NIG distribution fits well the maximum and closing log-return values of the Cisco Systems company (CSCO), in view of Figure 11 and Figure 19.

Conclusion
In this paper, we have established asymptotic confidence bands for the mean excess function by using a functional empirical process approach. Then we applied these bands to the fitting of Gh distributions to Dow Jones financial data. It is a known fact that these distributions fit financial data well, since they embrace a major part of the classical distributions.
We remarked that the Student-t and NIG distributions are good candidates for fitting returns and log-returns data, owing to their semi-heavy tails.

7.2.
Proofs of the uniform asymptotic consistency bounds. We use the setting and notation of [10]. Further, let ξ_1, ξ_2, ... be a sequence of independent Rademacher random variables, independent of X_1, X_2, ..., and let G_m be the functional empirical process indexed by the class of functions F.
Inequality (Talagrand). Let F be a pointwise measurable class of functions satisfying ‖f‖_∞ ≤ M < ∞ for all f ∈ F and some constant M. Then for all t > 0 we have

P( max_{1≤m≤n} ‖√m G_m‖_F ≥ A_1 ( E‖Σ_{i=1}^n ξ_i f(X_i)‖_F + t ) ) ≤ 2 [ exp(−A_2 t^2 / (n σ_F^2)) + exp(−A_2 t / M) ],

where σ_F^2 = sup_{f∈F} Var(f(X)) and A_1, A_2 are universal constants.
The lemma below, due to Einmahl and Mason [9], is very helpful for obtaining bounds on the quantity E‖Σ_{i=1}^n ξ_i f(X_i)‖_F when the class F has a polynomial covering number. Assume that there exists a finite-valued measurable function G, called an envelope function, which satisfies G(x) ≥ sup_{f∈F} |f(x)| for all x ∈ R. Set N(ε, F) = sup_Q N(ε √(Q(G^2)), F, d_Q), where the supremum is taken over all probability measures Q on R for which 0 < Q(G^2) := ∫ G^2(y) Q(dy) < ∞ and d_Q is the L_2(Q)-metric. As usual, N(ε, F, d_Q) is the minimal number of balls {g : d_Q(g, f) < ε} of d_Q-radius ε needed to cover F. Here is the device of Einmahl and Mason [9].
Lemma 3 (Einmahl-Mason [9]). Let F be a pointwise measurable class of bounded functions such that, for some constants β > 0, ν > 0, C > 1, σ ≤ 1/(8C) and a function G as above, the four conditions (A.1)-(A.4) hold. Then we have, for some absolute constant A,

E‖Σ_{i=1}^n ξ_i f(X_i)‖_F ≤ A √(ν n σ^2 log(β/σ)).

The class F is pointwise measurable, since it suffices to take F_0 to be the subclass indexed by u ∈ I ∩ Q, where Q is the set of rational numbers.
We have σ^2 To check (A.2), consider any probability measure Q on R. We get, for (u, v) ∈ I^2 with u ≤ v, By a classical result in probability on R, for any given 0 < ε < 1, we may cover [u_0, u_1] by at most m intervals (⌈x⌉ stands for the smallest positive integer greater than or equal to x). Let C = (m + 1)ε; we have m < Cε^{−1}.
For any u ∈ [u_0, u_1], there exists i ∈ {1, ..., m} such that we have where C_F = A M_1 log M_1, since all the conditions of Lemma 3 are checked.
Now we are going to apply inequality (7.2), first for the class of functions In this case M_1 = 2 since (x) = 1 for any u_0 ≤ x ≤ u_1, and Let ε > 0, n_1 ≥ 2 log 2, and t_0 be such that exp(−A_2 t_0^2 n_1) ≤ ε/8, exp(−A_2 t_0) ≤ ε/8, and t_0 < √n_1.