Rebuild Normality in Chinese Financial Markets with Range-Based Bipower Variance

Realized range and bipower variance are two important improvements on the increasingly popular high-frequency realized estimators in financial markets. This paper examines a new type of estimator, realized range-based bipower variance, and studies it empirically with high-frequency data from the Shanghai Composite Index (SHCI) and the Shenzhen Synthesis Index (SZSI). The results show that: 1) this estimator combines the merits of both realized range and realized bipower variance. It is as efficient as the realized range estimator while remaining a consistent estimator of the integrated variance of Chinese financial markets’ fluctuations; 2) after standardization by realized range-based bipower variance, the distributions of the daily returns of SHCI and SZSI are no longer skewed or highly leptokurtic. The fat tails and high peak of daily returns are largely eliminated by this high-frequency variance estimator, and the standardized distributions of these returns are nearly normal; 3) comparative studies show that, among four types of realized volatility estimators, range-based bipower variance is the best at rebuilding normality in Chinese financial markets. These findings mean that, when measuring the volatility or fluctuations of financial assets, using this new estimator can improve the performance of financial practices such as pricing and risk management. One feasible way to extend this paper is to consider co-estimators of related assets and examine their impact on the dynamics of volatilities.


Introduction and Literature Review
The literature using range as a proxy for volatility is extensive. According to the definition of Chou (2005), range is the distance between the highest and the lowest prices of an asset over a fixed sampling period. Mandelbrot (1963) uses range to test the long-run dependence of asset prices. Parkinson (1980) argues that the natural logarithm of stock prices roughly follows a random walk with a constant diffusion parameter equal to the variance of returns. He also compares the variance of the range estimator with that of the traditional return estimator and finds that the range estimator is about five times more efficient. Beckers (1983) incorporates the information in closing prices into the range estimator and adjusts the estimator of Parkinson (1980); his empirical study of 208 stocks and options shows that the range estimator performs much better than the traditional return estimator. Intuitively, ranges contain more information than closing prices. Wiggins (1991) shows that, compared with the return estimator, the range estimator suffers from downward bias and that its efficiency is badly damaged by outliers. Once outliers are removed from the sample, the range estimator retains its high efficiency. Furthermore, Andersen and Bollerslev (2001) find that the range estimator yields a higher R-squared than the traditional return estimator in Mincer-Zarnowitz (MZ) regressions when realized volatility is used as the proxy for the true underlying volatility.

Theory with High Frequency Data in Financial Markets
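For convenience, the sample interval is normalized to one. Each interval $[t, t+1]$ is divided evenly into subintervals $i = 1, 2, 3, \ldots, M$ of length $\Delta$, so the sampling frequency is $M = [1/\Delta]$. Within the $i$-th subinterval of length $\Delta$, the last, highest and lowest logarithmic prices are denoted $p_{t,i}$, $H_{t,i}$ and $L_{t,i}$ respectively, so that the intraday return is $r_{t,i} = p_{t,i} - p_{t,i-1}$ and the intraday range is $s_{t,i} = H_{t,i} - L_{t,i}$. Andersen and Bollerslev (2002) defined realized variance as the sum of squared returns on financial assets:

$$RV_t = \sum_{i=1}^{M} r_{t,i}^2 \tag{1}$$

Realized range replaces the squared returns with scaled squared ranges:

$$RR_t = \frac{1}{\lambda_{2,I}} \sum_{i=1}^{M} s_{t,i}^2 \tag{2}$$

where $\lambda_{r,I}$ is the $r$-th order moment of the range calculated from $I$ observations of a standard Brownian motion over the unit interval. Even though there is no closed-form expression for it, as the sampling frequency tends to infinity the statistic converges to the following expression:

$$\lambda_r = \frac{4}{\sqrt{\pi}} \left(1 - \frac{4}{2^r}\right) 2^{r/2}\, \Gamma\!\left(\frac{r+1}{2}\right) \zeta(r-1), \quad r \ge 1$$

where $\Gamma(\cdot)$ and $\zeta(\cdot)$ are the Gamma function and Riemann's zeta function respectively. In particular, $\lambda_1 = \sqrt{8/\pi}$ and $\lambda_2 = 4 \ln 2$ (Parkinson, 1980). Barndorff-Nielsen and Shephard (2004), in turn, defined bipower variation as:

$$RB_t = \frac{\pi}{2} \sum_{i=1}^{M-1} |r_{t,i}|\,|r_{t,i+1}| \tag{3}$$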
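The three estimators described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names and the toy log prices are hypothetical, and the realized range uses the asymptotic constant 4 ln 2 in place of a finite-sample moment.

```python
import math
import numpy as np

# Asymptotic second moment of the range of a standard Brownian motion
# (Parkinson, 1980). Assumption: used in place of a finite-sample constant.
LAMBDA_2 = 4.0 * math.log(2.0)


def realized_variance(log_close):
    """Sum of squared intraday log returns; log_close holds M + 1 log prices."""
    r = np.diff(np.asarray(log_close, dtype=float))
    return float(np.sum(r ** 2))


def realized_range(log_high, log_low):
    """Sum of squared intraday log ranges, divided by LAMBDA_2."""
    s = np.asarray(log_high, dtype=float) - np.asarray(log_low, dtype=float)
    return float(np.sum(s ** 2) / LAMBDA_2)


def bipower_variation(log_close):
    """(pi / 2) times the sum of products of adjacent absolute log returns."""
    a = np.abs(np.diff(np.asarray(log_close, dtype=float)))
    return float((math.pi / 2.0) * np.sum(a[1:] * a[:-1]))


# Hypothetical toy day with M = 4 intervals (all values are log prices).
log_close = [0.00, 0.01, -0.01, 0.00, 0.02]
log_high = [0.012, 0.010, 0.001, 0.021]
log_low = [-0.001, -0.012, -0.011, -0.001]

print(realized_variance(log_close))
print(realized_range(log_high, log_low))
print(bipower_variation(log_close))
```

In practice each function would be applied day by day to the intraday bars, giving one variance estimate per trading day.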
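To obtain the theoretical properties of the above three estimators, the logarithmic price of the asset $p(t)$ is assumed to follow a continuous-time jump-diffusion process:

$$dp(t) = \mu(t)\,dt + \sigma(t)\,dB(t) + \kappa(t)\,dq(t) \tag{4}$$

where $\mu(t)$ is the drift, $\sigma(t)$ is a stochastic volatility process that is strictly positive and càglàd, and $B(t)$ is a standard Brownian motion. $\kappa(t)$ is a random jump process giving the jump sizes, and $q(t)$ is a counting process indicating the occurrence times of the stochastic jumps, with a time-varying intensity.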
The asymptotic properties of realized variance were addressed by a series of papers on quadratic variation, such as ABDE (2001), ABL (2002) and Barndorff-Nielsen and Shephard (2005). Their research concluded that formula (1), denoted $RV_t$, converges in probability to the quadratic variation of the above process as the sampling frequency $M$ approaches infinity:

$$RV_t \xrightarrow{p} \int_t^{t+1} \sigma^2(s)\,ds + \sum_{t < s \le t+1} \kappa^2(s) \tag{5}$$

where $\sigma(s)$ is the volatility and $\kappa(s)$ the jump size of the logarithmic price process in formula (4). Without jumps, realized variance converges to the integrated variance of the logarithmic price process according to formula (5). If jumps exist, realized variance is not a consistent estimator of integrated variance. Further, Dijk and Martens (2007) pointed out that the variance of formula (2) is about 5 times smaller than that of formula (1); that is, realized range is more efficient than realized variance. However, Christensen and Podolskij (2006) concluded that if the asset's logarithmic price follows formula (4), then formula (2), denoted $RR_t$, converges in probability to:

$$RR_t \xrightarrow{p} \int_t^{t+1} \sigma^2(s)\,ds + \frac{1}{\lambda_{2,I}} \sum_{t < s \le t+1} \kappa^2(s) \tag{6}$$

where $\lambda_{2,I}$ is the second moment of the range of a standard Brownian motion used to scale formula (2). As a result, realized range is not a robust estimator of financial markets' volatility in the presence of jumps, even though it is efficient. On the other hand, Barndorff-Nielsen and Shephard (2004) build a large-sample theory for realized power variation based on formula (4). The bipower variation of formula (3), denoted $RB_t$, converges in probability to the variance of the continuous part of formula (4) as the sampling frequency approaches infinity:

$$RB_t \xrightarrow{p} \int_t^{t+1} \sigma^2(s)\,ds \tag{7}$$

According to formula (7), bipower variation is a consistent estimator of the integrated variance of asset prices and is robust to the presence of jumps and other noise.
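The different sensitivity of realized variance and bipower variation to a jump can be illustrated with a small deterministic example. The sketch below is only a finite-sample illustration under assumed inputs: the 48 returns, the jump size and the helper names `rv` and `bv` are all hypothetical.

```python
import math
import numpy as np


def rv(returns):
    """Realized variance: sum of squared returns."""
    return float(np.sum(np.square(returns)))


def bv(returns):
    """Bipower variation: (pi / 2) * sum of adjacent absolute-return products."""
    a = np.abs(returns)
    return float((math.pi / 2.0) * np.sum(a[1:] * a[:-1]))


# Hypothetical day: 48 five-minute returns of equal size and alternating sign.
base = 0.001 * np.array([1.0 if i % 2 == 0 else -1.0 for i in range(48)])

# Same day with a single large jump replacing one interior return.
jumped = base.copy()
jumped[24] = 0.02

rv_shift = rv(jumped) - rv(base)  # absorbs the full squared jump
bv_shift = bv(jumped) - bv(base)  # only cross terms with two small neighbours
print(rv_shift, bv_shift)
```

Because the jump enters bipower variation only through products with its small neighbouring returns, its contribution shrinks as the sampling interval shrinks, whereas the squared jump stays in realized variance.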

Realized Range Based Volatility Estimator and Its Consistency
To improve estimation efficiency under the price process of formula (4), realized range and bipower variation are combined into a new volatility estimator, range-based realized bipower variation, following the research of Christensen and Podolskij (2007). First, the price range is used instead of returns as the proxy for the asset's volatility; then bipower variation is constructed to preserve consistency. The resulting estimator is more efficient than realized variance while remaining a consistent estimator of the integrated variance of asset prices. Range-based realized bipower variance is defined as:

$$RBV_t = \frac{1}{\lambda_{1,I}^2} \sum_{i=1}^{M-1} s_{t,i}\, s_{t,i+1} \tag{8}$$

where $s_{t,i}$ is the range of the logarithmic price in the $i$-th intraday interval and, like the range estimator proposed by Christensen and Podolskij (2006), the sum is scaled by a moment of the range of a standard Brownian motion, here the first moment $\lambda_{1,I}$. The above formula converges in probability to the integrated variance of formula (4) as the sampling frequency approaches infinity:

$$RBV_t \xrightarrow{p} \int_t^{t+1} \sigma^2(s)\,ds$$

Thus, compared with the other two improvements of realized variance, formula (8) is not only more efficient but also a consistent estimator of the variance of financial market prices.
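A minimal Python sketch of the range-based bipower estimator follows. It assumes the asymptotic first moment of the range, sqrt(8/pi), as the scaling constant instead of a finite-sample value, and the function name and the constant-range toy day are hypothetical.

```python
import math
import numpy as np

# Asymptotic first moment of the range of a standard Brownian motion,
# sqrt(8 / pi) (Parkinson, 1980). Assumption: no finite-sample correction.
LAMBDA_1 = math.sqrt(8.0 / math.pi)


def range_bipower_variation(log_high, log_low):
    """Sum of products of adjacent intraday log ranges over LAMBDA_1 squared."""
    s = np.asarray(log_high, dtype=float) - np.asarray(log_low, dtype=float)
    return float(np.sum(s[1:] * s[:-1]) / LAMBDA_1 ** 2)


# Hypothetical day: 48 intervals, each with a log range of 0.002.
log_low = np.zeros(48)
log_high = log_low + 0.002

rbv = range_bipower_variation(log_high, log_low)
print(rbv)
```

With identical ranges the sum has 47 adjacent products, so the result equals 47 × 0.002² × π/8, which is a convenient sanity check on the scaling.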

Results
Estimators based on high-frequency data are seriously affected by market microstructure noise, so the choice of data frequency is very important for the performance of realized variance. ABDE (2001) propose a sparse sampling technique to minimize the errors caused by microstructure noise. Although this approach can reduce errors, it does not use market information efficiently. Hence an optimal sampling frequency is needed, one that minimizes the errors while still exhausting market information. Bandi and Russell (2006) as well as ZMA (2006) choose the optimal sampling frequency by minimizing the mean squared error. They compare this approach with equidistant sampling intervals that contain dependent microstructure noise, and find that a five-minute sampling frequency is optimal for avoiding market microstructure noise. The five-minute sampling frequency is therefore commonly used in high-frequency data research.
This paper selects 5-minute data for the Shanghai Composite Index (SHCI) and the Shenzhen Synthesis Index (SZSI). The sample period is from June 02, 2010 to May 26, 2013, a total of 720 trading days, each with 4 trading hours and therefore 48 intervals. $I = 48$ and $T = 720$ mean that, for each index, the data set contains 34,560 intervals, and each interval has its own opening, highest, lowest and closing prices. These are used to construct the estimators in equations (1), (2), (3) and (8).
In the evaluation and comparison of volatility estimates, according to Andersen and Bollerslev (1998), realized variance usually serves as the benchmark for evaluating the estimation accuracy of models with different parameters. However, Dijk and Martens (2007) argue that the unconditional distribution of standardized returns is the basic evaluation criterion when comparing different kinds of realized variance estimators. Generally speaking, a realized variance estimator should eliminate the non-normal characteristics of the original returns as far as possible and bring the unconditional distribution of standardized returns close to the normal distribution. Figure 1 presents QQ plots of the series in Table 1 against the normal distribution for the two main indexes, SHCI and SZSI, while Table 1 presents descriptive statistics of their unconditional raw and standardized returns.
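The evaluation criterion described above, standardizing daily returns by a variance estimate and then checking normality, can be sketched as follows. This is an illustrative sketch only: the function names and the five-day inputs are hypothetical, and the moments are the usual population (divide-by-n) versions used in the Jarque-Bera statistic.

```python
import numpy as np


def standardize(daily_returns, daily_variance):
    """Divide each daily return by the square root of that day's variance."""
    r = np.asarray(daily_returns, dtype=float)
    v = np.asarray(daily_variance, dtype=float)
    return r / np.sqrt(v)


def skew_kurt_jb(x):
    """Sample skewness, kurtosis and Jarque-Bera statistic of a series."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    skew = np.mean(d ** 3) / m2 ** 1.5
    kurt = np.mean(d ** 4) / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return float(skew), float(kurt), float(jb)


# Hypothetical inputs: five daily returns and matching variance estimates.
returns = [-0.02, -0.01, 0.0, 0.01, 0.02]
variances = [1e-4] * 5

z = standardize(returns, variances)
skew, kurt, jb = skew_kurt_jb(z)
print(skew, kurt, jb)
```

Under normality the skewness is close to 0, the kurtosis close to 3 and the Jarque-Bera statistic small, so lower values after standardization indicate that the variance estimator has absorbed more of the non-normality.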

Discussions
Volatility is one of the core problems in many financial practices, but because it is unobservable it is often modeled as a latent variable, and its dynamics are regarded as an important source of the high kurtosis and skewness of the unconditional distributions of asset returns. Rebuilding the normality of these unconditional distributions is therefore very important for solving many financial problems concisely.
In this paper, the range-based realized variance is constructed with sparse sampling techniques to filter jumps and noise in high-frequency data while retaining its high efficiency in estimating asset volatilities. The empirical study then uses high-frequency data from the Shanghai and Shenzhen markets and takes the two main indexes, the Shanghai Composite Index (SHCI) and the Shenzhen Synthesis Index (SZSI), as the main objects of study. The results show that range-based realized bipower variation does combine the merits of bipower variation and realized range: it can eliminate the non-normal characteristics of the original returns and bring the unconditional distribution of standardized returns close to the normal distribution.
These findings mean that, when measuring the volatility or fluctuations of financial assets, using the new estimator can improve the performance of financial practices such as pricing and risk management. With the development of information technology, high-frequency data are increasingly available, and developing an accurate and robust estimator of asset volatility is increasingly important. One feasible way to extend this paper is to consider co-estimators of related assets and examine their impact on the dynamics of volatilities.

Table 1. Descriptive statistics of unconditional raw and standardized returns
Note. The mean and standard deviation in Table 1 are 1000 times the original values; J-B is the Jarque-Bera statistic, with the P value in parentheses.
In eliminating the skewness of the original daily returns, realized bipower variation (RB) and realized range-based bipower variance (RBV) perform much better than realized variance (RV) and realized range (RR) for both SHCI and SZSI. The skewness of unconditional returns standardized by RB and RBV is significantly smaller than that of returns standardized by RV and RR. In addition, the skewness of SZSI's returns standardized by RB and RBV is almost 0 (-0.002 and -0.005), which means their skewness is essentially eliminated; (4) the Jarque-Bera statistic is used to test whether a distribution is close to the normal distribution. The Jarque-Bera statistics of the original daily returns of SHCI and SZSI reject the null hypothesis that they follow a normal distribution, with P values of 0. The P values of the Jarque-Bera statistics of RBV-standardized returns are the highest in both markets, which indicates that RBV has better comprehensive properties of "Effective & consistent" than the other realized volatilities; (5) the Ljung-Box statistic with 12 lags in the table also shows that, while standardizing by volatility removes the high kurtosis and skewness of returns, the dynamics of the raw returns are retained for further modeling. Appropriate structural models such as ARFIMA are still useful in capturing the dynamics of returns.
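For readers who want to reproduce the serial-correlation check, the Ljung-Box Q statistic can be computed directly from sample autocorrelations. The sketch below is a minimal version under stated assumptions: the function name and the six-point toy series are hypothetical, and no P value is computed (under the null of no autocorrelation, Q is compared with a chi-squared distribution whose degrees of freedom equal the number of lags).

```python
import numpy as np


def ljung_box(x, lags=12):
    """Ljung-Box Q statistic over the first `lags` sample autocorrelations."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    denom = np.sum(d ** 2)
    q = 0.0
    for k in range(1, lags + 1):
        # Lag-k sample autocorrelation of the demeaned series.
        rho_k = np.sum(d[k:] * d[:-k]) / denom
        q += rho_k ** 2 / (n - k)
    return float(n * (n + 2) * q)


# Hypothetical series with strong negative first-order autocorrelation.
series = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
q1 = ljung_box(series, lags=1)
print(q1)
```

For the daily series discussed here one would call `ljung_box(returns, lags=12)` on both the raw and the standardized returns and compare the resulting statistics.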