Improving Volatility Forecasting with Realized Semivariance: Evidence from Large Intraday Data Sets in China

Realized semivariance is reported to be more informative than realized variance. Inspired by Chou (2005), this paper employs a new modeling approach for the realized semivariance in order to better capture the asymmetry of volatility in financial markets. Using high frequency data from the Shanghai stock market in China, the empirical results, based on four types of volatility proxies (squared daily returns, daily high-low price ranges, realized variance, and realized range), consistently indicate that this model sharpens the forecast power of existing GARCH type volatility models. Mincer-Zarnowitz regression and four loss functions are employed to assess the out-of-sample forecasts.


Introduction
Volatility has long been a standard tool for measuring risk in financial markets. It plays a key role in asset pricing, portfolio allocation, and risk management. In recent years, as transaction data have become increasingly widely available, great interest has been drawn to the use of high frequency data for measuring and forecasting volatility. This approach is known as realized volatility. One advantage of this emerging nonparametric approach is that it fully exploits intraday information and delivers an observable proxy for volatility. It therefore makes direct modeling of volatility possible and avoids the complicated estimation procedures required by approaches that treat volatility as unobservable, namely the GARCH type and stochastic volatility models.
Barndorff-Nielsen, Kinnebrock and Shephard (2008) introduced a new measure of the variation of asset prices based on high frequency data. It is called realized semivariance (RS), and it is reported to be more informative than the simple realized variance. Inspired by Chou (2005), we adopt the same methodology in this paper for the realized semivariance in order to better capture asymmetry in financial markets. Intuitively, combining this modeling approach with realized semivariance should sharpen the forecast power of existing volatility models. This is confirmed in our empirical study through a comparison of four GARCH-type models for non-negative series, which were proposed by Engle (2002) and are known as Multiplicative Error Models (MEM).
We employ Shanghai composite index data at one-minute frequency to obtain our daily and realized volatility estimators. Mincer-Zarnowitz (MZ) regression is a widely accepted method for model comparison. According to Engle (2005), different volatility proxies contain different information about volatility. Therefore, we use six volatility proxies, at both daily and high frequency, as the measured volatility in the MZ equation: squared daily returns, absolute daily returns, daily high-low price ranges, realized variance, realized range, and realized bipower variation. They consistently indicate that our modeling approach sharpens the forecast power of GARCH type models for non-negative series. Besides the MZ equation, we use four loss functions from Hansen and Lunde (2005) as criteria for assessing the forecasting ability of the models. For the one-step-ahead out-of-sample tests, we also employ an expanding window estimation procedure to simulate estimation as it would actually proceed when the data are updated. The in-sample, out-of-sample, and expanding window predictions all consistently confirm our intuition that this modeling approach, combined with realized semivariance, sharpens the forecast power of GARCH type models for non-negative series.
The rest of this paper is structured as follows. Section 2 discusses the theory of realized volatility and semivariance. Section 3 presents our empirical findings. Section 4 compares the models. Section 5 concludes.

Realized Volatility, Realized Semivariance and the Model
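Realized variance estimates the ex-post variance of asset prices over a fixed time interval. Since we carry out our empirical analysis in trading time, we define the realized variance of day $t$ as the sum of squared intraday returns:
$$RV_t = \sum_{j=1}^{n} r_{t,j}^2, \qquad r_{t,j} = P_{t,j} - P_{t,j-1},$$
where $P_{t,j}$ denotes the $j$-th intraday (log) price observation of day $t$ and $n$ is the number of intraday returns. Although the data arrive in our database at irregular points in time, according to Barndorff-Nielsen, Hansen, Lunde, and Shephard (2006) these irregularly spaced observations can be regarded as equally spaced observations on a new time-changed process in the same stochastic class. Thus there is no intellectual loss in initially considering equally spaced returns. In arbitrage free markets, $P$ is often assumed to follow a semimartingale process. Then, as the number of observations within one day increases, $RV_t$ converges to:
$$RV_t \xrightarrow{p} \int_{t-1}^{t} \sigma_s^2 \, ds,$$
where $dP_t = \mu_t \, dt + \sigma_t \, dW_t$, $W$ is a Brownian motion, $\mu$ is a locally bounded predictable drift process, and $\sigma$ is a càdlàg volatility process, all adapted to a common filtration $\mathcal{F}_t$.
Barndorff-Nielsen, Kinnebrock and Shephard (2008) introduced a new measure of variation called realized semivariance. These estimators are determined solely by the one-sided (upward or downward) moves in high frequency asset prices and are defined as:
$$RS_t^{+} = \sum_{j=1}^{n} r_{t,j}^2 \, 1_{\{r_{t,j} \ge 0\}}, \qquad RS_t^{-} = \sum_{j=1}^{n} r_{t,j}^2 \, 1_{\{r_{t,j} \le 0\}},$$
where $1_{\{\cdot\}}$ is the indicator function, taking the value 1 if the argument is true and 0 otherwise. If $P$ is a semimartingale without jumps, as above, then there is no difference between $RS_t^{+}$ and $RS_t^{-}$ in the limit. Both converge to:
$$RS_t^{\pm} \xrightarrow{p} \frac{1}{2} \int_{t-1}^{t} \sigma_s^2 \, ds.$$
Under in-fill asymptotics, denote the jumps in the process $P$ by:
$$\Delta P_s = P_s - P_{s-}.$$
Then the realized variance of $P$ converges to:
$$RV_t \xrightarrow{p} \int_{t-1}^{t} \sigma_s^2 \, ds + \sum_{t-1 < s \le t} (\Delta P_s)^2,$$
and the downward and upward realized semivariances converge to different limits:
$$RS_t^{-} \xrightarrow{p} \frac{1}{2} \int_{t-1}^{t} \sigma_s^2 \, ds + \sum_{t-1 < s \le t} (\Delta P_s)^2 \, 1_{\{\Delta P_s \le 0\}}, \qquad RS_t^{+} \xrightarrow{p} \frac{1}{2} \int_{t-1}^{t} \sigma_s^2 \, ds + \sum_{t-1 < s \le t} (\Delta P_s)^2 \, 1_{\{\Delta P_s \ge 0\}}.$$
From the above, it follows that $RV_t = RS_t^{+} + RS_t^{-}$.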
However, since the two components of $RV_t$ can be distinguished, treating them separately must be more informative than mixing them together.
For the purpose of measuring volatility, we also introduce two other realized measures here. The first is the realized range, proposed by Christensen and Podolskij (2005) and Martens and van Dijk (2007). This estimator is inspired by the idea of Parkinson (1980) that a range-based variance estimator is much more efficient than a return-based estimator. It is indeed reported to be more efficient and less contaminated by microstructure noise in empirical studies. It is defined as follows:
$$RR_t = \frac{1}{4 \ln 2} \sum_{j=1}^{n} \left( \ln H_{t,j} - \ln L_{t,j} \right)^2,$$
where $H_{t,j}$ and $L_{t,j}$ are the high and low prices within the $j$-th intraday interval of day $t$. In a driftless martingale process, this estimator also converges to quadratic variation. For the estimation of one day's volatility, the driftless martingale assumption is usually not a bad one. The second is the realized bipower variation, proposed by Barndorff-Nielsen and Shephard (2002). It is defined as:
$$BV_t = \mu_1^{-2} \sum_{j=2}^{n} \lvert r_{t,j} \rvert \, \lvert r_{t,j-1} \rvert,$$
where $\mu_1 = \sqrt{2/\pi}$ is a normalization factor and $r_{t,j}$ is the $j$-th intraday return of day $t$. In a semimartingale process with finite jumps, realized bipower variation converges to the integrated variance rather than the quadratic variation.
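The realized measures discussed in this section can be illustrated with a minimal Python sketch. It is not the code used in the empirical study: the function names and the example returns are hypothetical, the semivariances use strict inequalities (zero returns contribute nothing either way), and the realized range uses the Parkinson scaling $1/(4 \ln 2)$, which is assumed here.

```python
import math

def realized_measures(returns):
    """Return RV, RS+, RS- and bipower variation for one day of intraday returns."""
    r = [float(x) for x in returns]
    rv = sum(x * x for x in r)                    # realized variance
    rs_plus = sum(x * x for x in r if x > 0)      # upside realized semivariance
    rs_minus = sum(x * x for x in r if x < 0)     # downside realized semivariance
    mu1 = math.sqrt(2.0 / math.pi)                # E|Z| for a standard normal Z
    bv = sum(abs(a) * abs(b) for a, b in zip(r[1:], r[:-1])) / mu1 ** 2
    return {"RV": rv, "RS+": rs_plus, "RS-": rs_minus, "BV": bv}

def realized_range(highs, lows):
    """Realized range from per-interval high and low prices (Parkinson scaling assumed)."""
    return sum(math.log(h / l) ** 2 for h, l in zip(highs, lows)) / (4.0 * math.log(2.0))

# Hypothetical intraday log returns for a single trading day.
example_returns = [0.0012, -0.0008, 0.0005, -0.0021, 0.0003, 0.0010, -0.0004]
measures = realized_measures(example_returns)
```

In this toy example the downside semivariance exceeds the upside one because the largest move of the day is negative, while the two components still add up to RV.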
Inspired by Chou (2005), his model can be naturally extended, with a small modification, to model the upward and downward realized semivariances. Following Engle and Gallo (2006), we call this model the Asymmetric Multiplicative Error Model for Realized Semivariance (AMEM-RS). In the following empirical study, we compare the out-of-sample volatility forecasting power of four different models: MEM-RV, MEM-RV with lagged return, AMEM-RS, and AMEM-RS with lagged return.
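As a rough illustration of how such a model produces forecasts, the sketch below filters a non-negative series through a GARCH(1,1)-style multiplicative error recursion, $\lambda_t = \omega + \alpha x_{t-1} + \beta \lambda_{t-1}$, with an optional lagged-return term. The functional form, the name `mem_filter`, the initialisation at the sample mean, and all parameter values are assumptions made for illustration. They are not the estimated specification of this paper.

```python
def mem_filter(x, omega, alpha, beta, lagged_returns=None, gamma=0.0):
    """Conditional means of an assumed MEM(1,1) for a non-negative series x.

    Returns len(x) + 1 values; the last one is the one-step-ahead forecast.
    With a lagged-return term the recursion is not guaranteed to stay positive.
    """
    lam = [sum(x) / len(x)]  # start the recursion at the sample mean (assumption)
    for t in range(len(x)):
        nxt = omega + alpha * x[t] + beta * lam[t]
        if lagged_returns is not None:
            nxt += gamma * lagged_returns[t]
        lam.append(nxt)
    return lam

# Hypothetical daily upside and downside semivariances, modelled separately.
rs_plus = [2.0, 1.5, 3.0, 2.5]
rs_minus = [2.5, 4.0, 1.0, 3.5]
forecast_plus = mem_filter(rs_plus, omega=0.2, alpha=0.3, beta=0.6)[-1]
forecast_minus = mem_filter(rs_minus, omega=0.2, alpha=0.4, beta=0.5)[-1]

# One way to obtain an RV forecast from the two sides is to add them up.
rv_forecast = forecast_plus + forecast_minus
```

Allowing different parameters for the two sides is what lets this kind of model respond asymmetrically to upward and downward moves.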

Empirical Results
To calibrate our modelling approach, we employ high frequency Shanghai composite index data. The data contain observations from January 1, 2007 to January 4, 2013. After deleting days with unavailable or insufficient information, we have 1570 days of observations at one-minute frequency. The data are from the Wind database. Table 1 gives the descriptive statistics of the raw data and of the daily estimators obtained from them. In order to compare models in terms of their prediction accuracy, we need proper proxies for the underlying unobservable true volatility. According to Engle and Gallo (2006), there is still no consensus about a "true" or "best" measure of volatility, and "many ways exist to measure and model financial asset volatility". Here we employ six measures of asset volatility for our model comparison. Three of them are ordinary daily measures: absolute daily returns, the daily Parkinson high-low range estimator, and the most commonly used squared daily returns. Their descriptive statistics are given in Table 1. The other three are realized volatility measures: realized variance, realized range, and realized bipower variation, computed at the most commonly used 5-minute frequency. Table 2 gives their descriptive statistics together with those of RS+ and RS-. Figure 1 presents the time series of RS+ against RS-. These two components of realized variance look very different from each other, so a separate model for each of them is warranted and may yield additional information.
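The construction of the daily proxies and the move from one-minute to 5-minute returns can be sketched as follows. This is a hedged illustration rather than the paper's data pipeline: the function names and price values are hypothetical, returns and ranges are taken in logs and multiplied by 100, incomplete trailing blocks are dropped, and the Parkinson estimator is assumed to take its usual form, the squared log range divided by $4 \ln 2$.

```python
import math

def aggregate_returns(returns, k=5):
    """Sum consecutive blocks of k one-minute log returns into k-minute returns.

    Any incomplete block at the end of the day is dropped (an assumption).
    """
    n_full = len(returns) // k
    return [sum(returns[i * k:(i + 1) * k]) for i in range(n_full)]

def daily_proxies(prev_close, high, low, close):
    """Daily volatility proxies from daily prices, in percent units."""
    r = 100.0 * math.log(close / prev_close)   # daily log return
    hl = 100.0 * math.log(high / low)          # daily log high-low range
    return {
        "squared_return": r * r,
        "abs_return": abs(r),
        "parkinson_range": hl * hl / (4.0 * math.log(2.0)),
    }

# Hypothetical daily prices for one trading day.
proxies = daily_proxies(prev_close=100.0, high=102.0, low=99.0, close=101.0)
```

Summing log returns within each block is equivalent to sampling the log price every five minutes, so the realized measures can be computed directly from the aggregated series.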
Figure 1. Upside and downside realized semivariance
In order to better incorporate the leverage effects of lagged returns, we estimate four models in this section: MEM-RV, MEM-RV with lagged returns, AMEM-RS, and AMEM-RS with lagged returns. We employ the simplest GARCH form, GARCH (1, 1), for all four models, which is already adequate in most applications according to Bollerslev, Chou, and Kroner (1992). Table 3 presents the estimated parameters of the four models.

Models Comparison
Following Hansen and Lunde (2005), we use four of their loss functions as criteria for model comparison. The first two loss functions are standard ones. QLIKE, proposed by Bollerslev (1994), is also called the Gaussian quasi-maximum likelihood function; its formulation makes clear that it originates from the likelihood function of the GARCH model. R2LOG, proposed by Pagan and Schwert (1990), is intended to penalize asymmetry in volatility forecasting. Unlike the quadratic loss function, it is a proportional loss function. We focus on the out-of-sample comparisons in order to find models that are useful for prediction in practice. In Table 4, $r^2$, $|r|$, range, realized volatility, realized range, and realized bipower variation are used as the measured volatility (MV) to judge the performance of the fitted values from the four models of the previous section. With most loss functions, the lagged realized semivariance model (RS-Lag) performs better than the other forecasted volatilities (FV).
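A minimal sketch of these evaluation criteria is given below, under stated assumptions. The two "standard" losses are assumed to be mean squared error and mean absolute error, QLIKE is written as the mean of $\ln FV + MV/FV$, and R2LOG as the mean of $[\ln(MV/FV)]^2$. The Mincer-Zarnowitz regression is taken to be an ordinary least squares regression of MV on a constant and FV. The function names and input values are hypothetical.

```python
import math

def loss_functions(mv, fv):
    """Four losses comparing measured volatility (mv) with forecasts (fv).

    Both inputs must be strictly positive for QLIKE and R2LOG to be defined.
    """
    n = len(mv)
    pairs = list(zip(mv, fv))
    return {
        "MSE": sum((m - f) ** 2 for m, f in pairs) / n,
        "MAE": sum(abs(m - f) for m, f in pairs) / n,
        "QLIKE": sum(math.log(f) + m / f for m, f in pairs) / n,
        "R2LOG": sum(math.log(m / f) ** 2 for m, f in pairs) / n,
    }

def mincer_zarnowitz(mv, fv):
    """OLS of mv on a constant and fv; returns (intercept, slope, r_squared)."""
    n = len(mv)
    mean_m = sum(mv) / n
    mean_f = sum(fv) / n
    sxx = sum((f - mean_f) ** 2 for f in fv)
    syy = sum((m - mean_m) ** 2 for m in mv)
    sxy = sum((f - mean_f) * (m - mean_m) for m, f in zip(mv, fv))
    slope = sxy / sxx
    intercept = mean_m - slope * mean_f
    return intercept, slope, sxy * sxy / (sxx * syy)

# Hypothetical measured volatilities and forecasts.
mv = [2.0, 4.0, 6.0, 8.0]
fv = [1.0, 2.0, 3.0, 4.0]
losses = loss_functions(mv, fv)
intercept, slope, r_squared = mincer_zarnowitz(mv, fv)
```

An unbiased forecast would give an intercept near 0 and a slope near 1 in the regression. In this toy example the forecasts are perfectly correlated with the measured values but systematically too low, so the regression $R^2$ is 1 while the slope is 2 and every loss is strictly positive.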

Conclusion
Volatility is at the core of many financial practices, but its asymmetry is often overlooked in arbitrage and risk management, even though downside volatility is clearly not equivalent to upside volatility in these fields. Modelling the two sides of volatility separately is more informative than mixing them together.
In this paper, we use a new modelling approach for the realized semivariance with high frequency data from Chinese financial markets. The empirical study shows that, when measured by six different volatility proxies, the realized semivariance (RS) performs better than the traditional realized volatility estimator (RV).
These findings suggest that, when measuring the volatility or fluctuations of financial assets, using our new estimator can improve the performance of financial practices such as pricing and risk management. With the development of information technology, high frequency data are increasingly available, and developing accurate and robust estimators of asset volatility is increasingly important. One feasible extension of this paper is to incorporate the correlation between upside and downside realized semivariances.

Table 1. Descriptive statistics of raw and daily data
Note. Raw returns, daily returns and range are all multiplied by 100; squared returns and absolute returns are, respectively, the squared value and absolute value of daily returns.

Table 2. Descriptive statistics of realized estimators

Table 3. MEM type models for realized volatility and semivariance
Note. Model selection is based on AIC and BIC. Numbers in parentheses are the standard deviations, and stars refer to significance levels of 10% (*), 5% (**) and 1% (***).

Table 4. Out-of-sample forecasting comparisons with different loss functions
Note. For the RS models, the upside RS and downside RS forecasts are combined to synthesize an RV forecast. Models are compared by their forecasting errors under four types of loss functions and six types of "true volatility" measurements.

Table 5. In-sample forecasting comparisons with different loss functions