Extreme Value Volatility Estimators and Realized Volatility of Istanbul Stock Exchange: Evidence from Emerging Market

This paper evaluates the forecasting performance of alternative models for one-day-ahead forecasts of the volatility of the BIST-30 index (the major index of Borsa Istanbul, the Istanbul Stock Exchange, containing 30 blue-chip stocks). Realized volatility is used as the relevant benchmark for the evaluation of forecasts. We document evidence that realized volatility is a less noisy estimator than the daily squared return benchmark and explains more of the variation in volatility. In addition, the benefits of using extreme value estimators as volatility proxies are discussed. It is empirically demonstrated that the extreme value estimators are 5 to 8 times more efficient than historical volatility measures. The use of extreme value estimators with simple forecasting models provides better short-term forecasts than GARCH-based volatility forecasts because of the higher efficiency of extreme value estimators.


Introduction
A traditional approach to volatility forecasting is to use historical volatility, which, however, is not reliable since volatility tends to vary over time. For this reason, volatility is mostly forecast using either option implied volatility or simple time series models (e.g., exponential smoothing, autoregressive models). A more sophisticated group of time series models includes the ARCH and GARCH family.
Option implied volatility has the benefit of reflecting the market's expectation of future volatility, but it is a model-based approach. The forecasting power of implied volatility depends critically on the efficiency of the option market and on the use of a correct option-pricing model; a violation of either may lead to a non-negligible bias in implied volatility estimates. In any case, the organized options market in Turkey is not yet liquid enough to provide reliable volatility forecasts.
The results are also not conclusive as to whether ARCH (GARCH) class models outperform simpler models. While some earlier studies (Tse, 1991; Tse & Tung, 1992; Figlewski, 1997) report poor forecasts from ARCH (GARCH) class models, several other articles report results more in favor of GARCH models (Brailsford & Faff, 1996; Srinivasan, 2011).
The conflicting views partially stem from a deeper theoretical issue. Andersen and Bollerslev (1998) argued that the apparent failure of GARCH models to provide good forecasts may be due to a failure to correctly specify the true volatility against which forecasting performance is measured. They suggested using realized volatility as the benchmark instead of the ex-post daily squared return, which had traditionally served as the benchmark. Daily realized volatility is based on the cumulative squared returns computed from intraday data.
A separate strand of the literature emphasizes the use of extreme value (EV) estimators. Extreme value (range) estimators use additional information such as the high, low and opening prices, while the classical volatility estimators are based on close-to-close or open-to-close prices. It is argued that high/low prices contain more information regarding volatility than open/close or close/close prices. Empirical results suggest that variances measured by extreme value estimators efficiently approximate the true daily variance (Akay et al., 2010), and the estimators have been shown to be highly efficient (Shu & Zhang, 2006). Martens and Van Dijk (2007) find that the extreme value estimators are five times more efficient than other volatility proxies. Corrado and Truong (2007) further show that the intraday high-low price range provides volatility forecasts as efficient as the high-quality implied volatility indices published by the Chicago Board Options Exchange (CBOE).
The out-of-sample volatility forecasts of the BIST-30 and BIST-100 indices of the Istanbul Stock Exchange have received considerable attention, with numerous articles written on the subject in the last decade or so. Some of the reported work generates out-of-sample forecasts of BIST-30 or BIST-100 volatility on the basis of relatively simple GARCH(1,1) or E-GARCH(1,1) models (Gökçe, 2001; Korkmaz & Aydın, 2002; Sarıoğlu, 2006; Atakan, 2009; Güris & Saçaklı, 2011). Other studies employ a wider class of GARCH models, including quite sophisticated ones (Mazıbaş, 2005; Akgün & Sayan, 2007; Bildirici & Ersin, 2009; Gökbulut et al., 2011; Er & Fidan, 2013). It is quite surprising, however, that none of these published works uses either realized volatility as a benchmark or the EV estimators, despite their widespread application in the international academic literature. This paper attempts to fill this gap by analyzing a wide range of naive forecasting models as well as GARCH models to forecast BIST-30 daily volatility. The performance of the out-of-sample forecasts is evaluated against the theoretically more relevant realized volatility benchmark. Another goal is to examine the possible contribution of extreme value estimators to out-of-sample forecasts, with a special focus on the efficiency gains expected from these alternative measures. Another novel aspect of this article is the use of Mincer-Zarnowitz (Mincer & Zarnowitz, 1969) regressions for the evaluation of forecasting performance. This evaluation standard has not previously been used in the literature on BIST-30 or BIST-100 volatility forecasting.
The rest of the article is structured as follows. Section 2 briefly defines realized volatility and the extreme value estimators and presents the forecasting models used in this article. Empirical results are discussed in Section 3, and the last section (Section 4) summarizes and concludes.

Realized Volatility and Volatility Proxies
The theoretical justification for realized volatility is well known in the finance literature. Assuming that logarithmic asset prices follow a continuous-path semimartingale, the following stochastic differential equation is frequently used to describe the price dynamics (Andersen et al., 2005):

$$d\log S_t = \mu_t\,dt + \sigma_t\,dW_t \qquad (1)$$

where $W_t$ is a Wiener process, $\mu_t$ is a drift term and $\sigma_t$ is the instantaneous volatility. Given (1), the one-period continuously compounded rate of return can be formally defined as

$$r_t = \log S_t - \log S_{t-1} = \int_{t-1}^{t} \mu_s\,ds + \int_{t-1}^{t} \sigma_s\,dW_s \qquad (2)$$

where the integrals are evaluated between $t-1$ and $t$. The corresponding integrated volatility, which theoretically is equal to the actual volatility, is then defined as

$$IV_t = \int_{t-1}^{t} \sigma_s^2\,ds \qquad (3)$$

Despite its appeal as the correct measure of otherwise latent volatility, integrated volatility is not directly observable, since return calculations and volatility measurements are necessarily restricted to discrete time intervals. Empirically, the realized volatility is derived by summing the $1/h$ intraday squared returns:

$$RV_t = \sum_{i=1}^{1/h} r_{t-1+ih}^2 \qquad (4)$$

where the sum runs from 1 to $1/h$ and $r_{t-1+ih}^2 = [\log S_{t-1+ih} - \log S_{t-1+(i-1)h}]^2$ is the squared return over the $i$-th intraday interval of length $h$. $RV_t$ converges uniformly to $IV_t$ as $h \to 0$ by virtue of the theory of quadratic variation (Andersen & Bollerslev, 1998).
The concept of realized volatility relies on the fact that the sum of increasingly finely sampled high-frequency squared returns, defined as in (4), approaches $IV_t$, which theoretically is the true measure of volatility. However, market microstructure frictions (e.g., bid-ask bounce, discreteness, irregular trading) put a limit on the number of observations that can usefully be sampled per unit of time. Thus there is a trade-off between the measurement error and the microstructure-induced bias. The 5-minute frequency is regarded in the literature as the highest frequency at which market microstructure effects are minimal. This paper also adopts this approach.
The volatility proxies used in this paper are defined as follows. The daily return $R_t$ is defined as the first difference of the logs of successive closing prices:

$$R_t = \ln P_{C,t} - \ln P_{C,t-1} \qquad (5)$$

where $P_{C,t}$ is the closing price of trading day $t$ and $P_{C,t-1}$ is the closing price of trading day $t-1$. The square of $R_t$ is the daily squared return mentioned above. An alternative to this close-to-close historical volatility (denoted as HV(CC)) is the open-to-close historical volatility (denoted as HV(OC)), which is based on the return between the opening and closing prices of the same trading day.

To construct the 5-minute returns, the last price recorded before the relevant 5-minute time mark is used and the difference between successive log prices is calculated by

$$R_{t,d} = \ln P_{C,t,d} - \ln P_{O,t,d}$$

where $P_{C,t,d}$ is the index value at the end of the 5-minute mark $d$ on trading day $t$, while $P_{O,t,d}$ is the index value at the beginning of the 5-minute mark $d$ on trading day $t$. The square of $R_{t,d}$ is the 5-minute volatility, and the sum of all observed 5-minute volatilities in trading day $t$ yields the realized volatility of day $t$.

The principal disadvantage of the above-mentioned historical volatility measures is that they ignore other available information that may contribute to estimator efficiency (Garman & Klass, 1980). It is argued that the highest and lowest prices observed during a session contain more information regarding volatility. Engle and Gallo (2006) showed that the spread between the highest and lowest prices of a daily price series is a function of the volatility observed during the day and that its use can lead to improved volatility estimates.
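The construction of realized volatility from 5-minute returns, and of the close-to-close proxy from daily closes, can be illustrated with a minimal Python sketch. The function names and the sample prices are hypothetical and are not taken from the paper's data set; the sketch also assumes that consecutive sampled prices stand in for the begin/end values of each 5-minute interval.

```python
import math

def log_returns(prices):
    """Differences of successive log prices."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

def realized_volatility(intraday_prices):
    """Sum of squared intraday log returns for one trading day."""
    return sum(r * r for r in log_returns(intraday_prices))

def squared_daily_return(prev_close, close):
    """Close-to-close historical volatility: the squared daily log return."""
    r = math.log(close) - math.log(prev_close)
    return r * r

# Hypothetical index values sampled at successive 5-minute marks.
prices = [100.0, 100.4, 99.9, 100.6, 100.2]
rv = realized_volatility(prices)
hv_cc = squared_daily_return(99.8, prices[-1])
```

Because the realized measure accumulates many small squared returns while the close-to-close proxy squares a single daily return, the former is expected to be the less noisy of the two.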
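There are three commonly used extreme value measures of volatility in the literature, which are also used in this paper. The first one is the Parkinson (PK) estimator (Parkinson, 1980), defined as

$$\sigma_{PK,t}^2 = \frac{(H_t - L_t)^2}{4\ln 2}$$

where $H_t$ and $L_t$ are the log-transformed highest and lowest prices of day $t$. Garman and Klass (1980), noting the downward bias of the PK estimator, propose the following alternative GK estimator, which also uses the opening and closing prices of the day:

$$\sigma_{GK,t}^2 = \tfrac{1}{2}(H_t - L_t)^2 - (2\ln 2 - 1)(C_t - O_t)^2$$

where $O_t$ and $C_t$ are the log-transformed opening and closing prices of day $t$.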
Both the PK and GK estimators assume a driftless price process, which may lead to overestimation of volatility when security prices (index values) exhibit a distinct trend. Rogers and Satchell (1991) offered an estimator that allows for drift. Their estimator (RS), which is expected to be more efficient than the others in the presence of drift, is defined as

$$\sigma_{RS,t}^2 = (H_t - C_t)(H_t - O_t) + (L_t - C_t)(L_t - O_t)$$

with $H_t$, $L_t$, $O_t$ and $C_t$ the log-transformed high, low, opening and closing prices of day $t$. The main justification for using the PK, GK and RS estimators is their higher efficiency. The relative efficiency of these extreme value estimators with respect to historical volatility can be measured by

$$\text{Eff} = \frac{\operatorname{Var}(\sigma_{ev}^2)}{\operatorname{Var}(\sigma_{hv}^2)}$$

where $\sigma_{ev}^2$ is the variance measured by one of the EV estimators above and $\sigma_{hv}^2$ is the close-to-close (or open-to-close) historical volatility. A value lower than 1 indicates higher efficiency of the EV estimator with respect to historical volatility.
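For concreteness, the three range-based estimators can be sketched in Python using their standard textbook forms. The function names and the sample open/high/low/close values are hypothetical; the formulas are the commonly cited versions of the Parkinson, Garman-Klass and Rogers-Satchell estimators and are assumed, not confirmed, to match the exact variants used in this paper.

```python
import math

def parkinson(o, h, l, c):
    """Parkinson (1980): squared log range scaled by 4*ln(2)."""
    return math.log(h / l) ** 2 / (4.0 * math.log(2.0))

def garman_klass(o, h, l, c):
    """Garman and Klass (1980): adds the open-to-close move to the range."""
    hl = math.log(h / l)
    co = math.log(c / o)
    return 0.5 * hl ** 2 - (2.0 * math.log(2.0) - 1.0) * co ** 2

def rogers_satchell(o, h, l, c):
    """Rogers and Satchell (1991): allows for a non-zero drift."""
    H, L, O, C = (math.log(x) for x in (h, l, o, c))
    return (H - C) * (H - O) + (L - C) * (L - O)

# Hypothetical open, high, low and close values for one trading day.
day = (100.0, 101.5, 99.2, 100.8)
estimates = {
    "PK": parkinson(*day),
    "GK": garman_klass(*day),
    "RS": rogers_satchell(*day),
}
```

All three functions take the same four arguments so that they can be swapped freely as the volatility proxy fed into a forecasting model.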

Data and Forecasting Models
This study uses high-frequency BIST-30 data provided by Matriks Corp. The data set reports the value of the BIST-30 index at the beginning and at the end of each five-minute interval, as well as the highest and lowest index values observed in these 5-minute intervals. The BIST-30 index observations cover the period from January 29, 2010 to September 3, 2012, with a total of 648 observations (Note 2). The first 350 observations are used for the estimation of the forecasting models, while the remaining 298 observations are reserved for forecasting and the evaluation of out-of-sample forecasting accuracy.
There is a wide variety of models for forecasting volatility. The ones chosen in this article are those most widely used by practitioners. They include the random walk (RW) model, the moving average (MA) model, the Exponential Smoothing (ES) model, the Exponentially Weighted Moving Average (EWMA) model and the Autoregressive (AR) model (Note 3).
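Since the model equations are not stated in the text, the following Python sketch shows one common textbook form of four of the naive models. The recursions, function names and parameter names (`theta`, `lam`, `q`) are assumptions made for illustration and may differ from the specifications actually estimated; the AR model is omitted because it requires a regression step.

```python
def random_walk(history):
    """RW: tomorrow's volatility equals today's observed proxy."""
    return history[-1]

def moving_average(history, q):
    """MA(q): mean of the last q observed proxy values."""
    window = history[-q:]
    return sum(window) / len(window)

def exponential_smoothing(history, theta):
    """ES: weighted average of the previous forecast and the latest value."""
    forecast = history[0]
    for obs in history[1:]:
        forecast = theta * forecast + (1.0 - theta) * obs
    return forecast

def ewma(history, q, lam):
    """EWMA(q): smooths the q-period moving average instead of the raw value.

    Requires at least q observations.
    """
    forecast = sum(history[:q]) / q
    for end in range(q + 1, len(history) + 1):
        window = history[end - q:end]
        forecast = lam * forecast + (1.0 - lam) * sum(window) / q
    return forecast

# Hypothetical daily volatility proxy values (e.g. a range-based estimate).
proxy = [1.2e-4, 0.9e-4, 1.5e-4, 1.1e-4, 1.3e-4]
next_day = {
    "RW": random_walk(proxy),
    "MA(2)": moving_average(proxy, 2),
    "ES": exponential_smoothing(proxy, 0.9),
    "EWMA(2)": ewma(proxy, 2, 0.9),
}
```

Each function maps a history of daily proxy values to a one-day-ahead forecast, so the same code can be run on historical volatility or on any of the extreme value estimators.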
A primary aim of this article is to check the possible contribution of the extreme value estimators (PK, GK and RS) to out-of-sample forecasting accuracy. To this end, we first generated out-of-sample forecasts from the naive models mentioned above using the close-to-close version of historical volatility (Note 4). In the second stage, the out-of-sample forecasts are generated by the same naive models but with the close-to-close version of historical volatility replaced by the open-to-close version, since this version is more directly comparable with the EV estimators, whose definitions are based on the open-to-close prices of a trading day.
In the third stage, the out-of-sample forecasts are generated in a similar manner but using the PK definition of volatility as the relevant proxy in the naive models. The same process is then repeated with the GK estimator and then with the RS estimator in place of the PK estimator. The benefit of this approach is two-fold. First, it shows whether the use of EV estimators as volatility proxies instead of historical volatility improves out-of-sample forecasting accuracy. Second, it enables us to identify the best EV estimator in terms of forecasting performance. In light of the theoretical arguments above, the benchmark used to evaluate the accuracy of the one-day-ahead out-of-sample forecasts is the daily realized volatility.
GARCH-based forecasts are also included given the extensive use of these models in the literature. The GARCH models have the additional benefit of capturing the fat-tail characteristics and the volatility mean reversion. The benefit of combining the EV estimators with GARCH models is also analyzed.
We estimated two different versions of the E-GARCH model. The first is an E-GARCH(1,1) with a normal distribution for returns, and the second is an E-GARCH with a Student's t-distribution for returns (Note 5). The BIST-30 index returns display skewness; the original GARCH models cannot cope with such skewness, and forecasts derived from them may be biased. A model that can generate skewed time-series patterns is the E-GARCH model (Wei, 2002) (Note 6).
The possible contribution of the EV estimators to the E-GARCH model is examined by including the variances estimated by the PK, GK and RS estimators as exogenous regressors in the conditional variance equation. In the resulting specification, $\sigma_{Z,t-1}^2$ is the volatility estimated for day $t-1$ using the EV or historical volatility estimator $Z$, where $Z$ = PK, GK, RS, HV(CC) or HV(OC). The significance of its coefficient $\theta$ indicates whether any of these estimators, taken one at a time, contains additional information for forecasting conditional volatility. The optimal estimation set for E-GARCH(1,1) is determined by examining the stability of the coefficients and checking for violations of the stationarity conditions. Furthermore, an ARCH-LM test shows that there is no ARCH effect in the residuals of the E-GARCH model. The one-day-ahead forecasts of volatility are then generated from the optimal set using a moving window (Note 7).
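As an illustration only, a standard E-GARCH(1,1) variance equation augmented with an exogenous volatility regressor can be written as follows. Only $\theta$ and $\sigma_{Z,t-1}^2$ come from the text; the remaining symbols ($\omega$, $\alpha$, $\beta$, $\gamma$, $z_t$) follow the usual Nelson-type parameterization and are assumptions, so the exact form estimated in the paper may differ.

```latex
\ln \sigma_t^2 = \omega + \beta \ln \sigma_{t-1}^2
  + \alpha \left( |z_{t-1}| - \mathrm{E}|z_{t-1}| \right)
  + \gamma z_{t-1}
  + \theta \, \sigma_{Z,t-1}^2 ,
\qquad z_{t-1} = \frac{\varepsilon_{t-1}}{\sigma_{t-1}}
```

In this form a non-zero $\gamma$ captures the asymmetric response to positive and negative shocks, and a statistically significant $\theta$ would signal that the lagged proxy $Z$ adds information beyond past returns.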
A final remark is in order. The E-GARCH approach uses the close-to-close historical return, so the volatility forecasts generated by E-GARCH must be compared with the volatility of the entire day. The realized volatility, which is the benchmark used to measure forecast accuracy, must therefore also be defined for the entire day (Vipul & Jacob, 2007). To deal with this problem, we scaled up the realized volatility by the ratio of the sum of daily close-to-close historical volatility to the sum of open-to-close historical volatility observed in the relevant in-sample period.
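The scaling step described above can be sketched as a single ratio applied to every realized volatility observation. The function name and the toy inputs are hypothetical.

```python
def scale_realized_volatility(rv, hv_cc_in_sample, hv_oc_in_sample):
    """Rescale open-hours realized volatility to a whole-day measure.

    The factor is the in-sample sum of close-to-close historical volatility
    divided by the in-sample sum of open-to-close historical volatility.
    """
    ratio = sum(hv_cc_in_sample) / sum(hv_oc_in_sample)
    return [ratio * x for x in rv]

# Toy numbers: whole-day volatility is 1.5 times the open-hours volatility.
scaled = scale_realized_volatility([1.0, 2.0], [3.0, 3.0], [2.0, 2.0])
```

A ratio above one reflects the extra variation that arrives while the session is closed, which the intraday measure cannot see.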
A major purpose of the paper is to test the possible gain in efficiency provided by EV estimators and the possible improvement in out-of-sample forecasting accuracy when these alternative estimators are used. The EV estimators, however, may be more biased than classical volatility estimators despite their higher efficiency. The gain in efficiency is measured in this paper by the mean squared error (MSE) and mean absolute error (MAE) over the in-sample observation period (the first 350 observations). The bias is measured by the Mean Bias (MB) and the Mean Relative Bias (MRB). Standard deviation is preferred to variance in the construction of these measures since the latter involves fourth moments.
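A minimal Python sketch of the four in-sample measures is given below. It assumes the usual definitions (errors taken as estimator minus benchmark, with both converted to standard deviations first); the text does not spell these formulas out, so the function name and the exact form are illustrative.

```python
import math

def error_measures(estimates, benchmark):
    """MSE, MAE, MB and MRB of a variance estimator against a benchmark.

    Both series are variances; they are converted to standard deviations
    before the errors are formed (an assumption based on the text).
    """
    sd_est = [math.sqrt(x) for x in estimates]
    sd_bench = [math.sqrt(x) for x in benchmark]
    errors = [e - b for e, b in zip(sd_est, sd_bench)]
    n = len(errors)
    return {
        "MSE": sum(err * err for err in errors) / n,
        "MAE": sum(abs(err) for err in errors) / n,
        "MB": sum(errors) / n,
        "MRB": sum(err / b for err, b in zip(errors, sd_bench)) / n,
    }

# Toy example: the estimator overshoots the benchmark on both days.
measures = error_measures([4.0, 9.0], [1.0, 4.0])
```

Dividing the MSE (or MAE) of historical volatility by that of an EV estimator then gives a "times more efficient" figure of the kind quoted in the results.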
Root Mean Squared Error (RMSE) and Median Absolute Percentage Error (MDAPE) measure the out-of-sample forecasting accuracy of the suggested models. MDAPE is used because it is relatively more immune to outlier effects, and extreme outliers seem to be a characteristic feature of the BIST-30 volatility series (see Figures 1 and 2). In addition, the out-of-sample forecast accuracy of the E-GARCH models is also tested by Mincer-Zarnowitz regressions of the form

$$\sigma_{k,t}^2 = \alpha + \beta\,\sigma_{f,t}^2 + \varepsilon_t$$

where $\sigma_{k,t}^2$ is the historical volatility or realized volatility ($k$ = RV or HV(CC)) and $\sigma_{f,t}^2$ is the forecasted volatility.
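The two accuracy measures and the Mincer-Zarnowitz regression can be sketched in plain Python as follows. The function names are hypothetical, and the regression is ordinary least squares only; the Newey-West standard errors and the Wald test on the slope mentioned in the notes are not reproduced here.

```python
import math
import statistics

def rmse(actual, forecast):
    """Root mean squared forecast error."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

def mdape(actual, forecast):
    """Median absolute percentage error (as a fraction, not in percent)."""
    return statistics.median(abs((a - f) / a) for a, f in zip(actual, forecast))

def mincer_zarnowitz(actual, forecast):
    """OLS of actual volatility on a constant and the forecast.

    Returns (intercept, slope, R squared); an unbiased and efficient
    forecast would give an intercept near 0 and a slope near 1.
    """
    n = len(actual)
    mx = sum(forecast) / n
    my = sum(actual) / n
    sxx = sum((x - mx) ** 2 for x in forecast)
    sxy = sum((x - mx) * (y - my) for x, y in zip(forecast, actual))
    beta = sxy / sxx
    alpha = my - beta * mx
    ss_res = sum((y - alpha - beta * x) ** 2 for x, y in zip(forecast, actual))
    ss_tot = sum((y - my) ** 2 for y in actual)
    return alpha, beta, 1.0 - ss_res / ss_tot

# Toy data in which the actual values are exactly 1 + 2 * forecast.
alpha, beta, r2 = mincer_zarnowitz([3.0, 5.0, 7.0], [1.0, 2.0, 3.0])
```

Because the median ignores the size of the largest errors, MDAPE is less affected than RMSE by a handful of extreme volatility days.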

Empirical Results
The comparative charts of realized volatility (RV) and close-to-close historical volatility of the BIST-30 index are presented in Figure 1 in an easy-to-interpret bar chart format. Figure 1 clearly demonstrates that realized volatility is a significantly less noisy estimator of BIST-30 volatility than historical volatility. Extreme outliers are detected in both series. Figure 2 plots the EV estimators and compares them with RV. The EV estimators are less noisy than historical volatility, but not as much as RV, providing further evidence in favor of realized volatility as the relevant benchmark.

EV estimators, though more efficient, are known to be more biased than historical volatility measures. The mean bias (MB) and mean relative bias (MRB) reported in Table 2 show that this is indeed the case for BIST-30 volatility. The MRB points to a higher bias for the GK and RS estimators but a lower bias for the PK estimator. There is evidently an impressive gain in efficiency when the EV estimators are used, but this advantage comes at the cost of a higher negative bias. The PK estimator, though less efficient than the GK or RS estimators, is more immune to the bias problem. In fact, the best trade-off between efficiency and bias is achieved by the PK estimator (it is 5.5 times more efficient on the basis of MSE with a negative bias of only 25% according to MB) (Note 9).

The RMSE measure is sensitive to extreme outliers, which are clearly observed in the case of BIST-30 (see Figure 1). When the Median Absolute Percentage Error (MDAPE), which is more immune to outlier effects, is used, the benefit of using EV estimators becomes more visible (see Table 4). The results suggest the superiority of naive models over the EGARCH model. The best models in terms of forecasting performance are the MA and EWMA models, in addition to the ES model using the PK and GK estimators. Similar results indicating better performance of naive models were also reported in previous studies (Balaban, 1999; Balaban et al., 2006). The poor performance of the EGARCH model may be due to the noise of squared return innovations and/or the stale information content of the long data series required for GARCH estimation (Vipul & Jacob, 2007).
The substantial improvement in the forecasting performance of naive models is due to the efficiency gain provided by the EV estimators. Table 5 extends the efficiency analysis to the one-day-ahead forecasts generated by the competing models. The entries in this table report the percentage gain in the efficiency of one-day-ahead forecasts when EV estimators are used instead of historical volatility. The efficiency gain is measured by the change in MSE and MAE (Note 11). The results are not conclusive if MSE is used, owing to outlier effects. When the MAE measure, which is more robust to outliers, is used, however, a clear picture emerges. There is a significant efficiency gain in almost all the naive models when EV estimators are used. A gain in efficiency ranging between 15% and 40% can be observed regardless of which EV estimator is used, with the best results achieved by the GK estimator. The Wilcoxon test confirms the statistical significance of the efficiency gains for almost all entries at the 1% significance level.
The comparatively worse forecasting results of the EGARCH model are not due to the use of an inappropriate, noisy benchmark such as the daily squared return. Following the approach initiated by Andersen and co-authors (Andersen & Bollerslev, 1998; Andersen et al., 2001), this paper uses the more robust realized volatility as the proper benchmark. That realized volatility is indeed the more appropriate benchmark is demonstrated by the Mincer-Zarnowitz regressions whose results are reported in Tables 6 and 7 (Note 12). The daily E-GARCH forecasts explain only 8% of the variation when evaluated against historical volatility (Table 6). The $R^2$ rises to 0.45 (Table 7) when the dependent variable is realized volatility, suggesting that the E-GARCH model is able to explain nearly half of the variation when the benchmark is realized volatility. The probability of the beta coefficient being equal to one also rises substantially. The Mincer-Zarnowitz regressions clearly show that the poorer performance of the EGARCH models is not due to a measurement problem such as an inappropriate evaluation benchmark but rather stems from the inherent inability of the GARCH models to capture the ex-post return variation (Note 13).

Conclusion
The paper addressed certain novel issues regarding BIST-30 volatility forecasting, employing alternative measures and techniques that were mostly neglected in previous research. These include the use of realized volatility as the relevant benchmark, the use of extreme value estimators and the evaluation of GARCH forecasts by Mincer-Zarnowitz regressions.
Our results indicate that realized volatility is a more relevant and less noisy benchmark than the daily squared return, despite the fact that the latter was commonly adopted as the benchmark in the previous BIST-30 volatility literature. In addition to the strong theoretical arguments favoring the use of realized volatility as a benchmark, this paper presents empirical evidence showing that realized volatility is a less noisy volatility proxy than the daily squared return. The Mincer-Zarnowitz regressions show that the E-GARCH forecasts explain nearly half of the variation in realized volatility but only 8% of the variation in the daily squared return.
A second contribution of the paper is the empirical evaluation of extreme value volatility estimators such as the Parkinson, Garman-Klass and Rogers-Satchell estimators. The literature suggests that the EV estimators are more efficient than historical volatility measures, though they are more biased. The empirical results show that the use of EV estimators instead of historical volatility indeed presents a very favorable trade-off between efficiency and bias for BIST-30 volatility. The EV estimators are 5 to 8 times more efficient.
The use of EV estimators instead of historical volatility also leads to higher efficiency in the one-period-ahead forecasts generated from naive forecasting models, improving the short-term forecasts of these models. The contribution of EV estimators to forecasting performance ranges between 15% and 70% for different models when evaluated on the basis of the MDAPE measure.

Notes
Note 1. See French (1980) for an extensive discussion of the issue.
Note 2. The Istanbul Stock Exchange maintains two separate sessions during the day. Due to holidays or other reasons, trading is limited to only one session on some days. These half days are excluded from our data set in order to ensure consistency with normal days.
Note 3. The relevant equations of these models are not stated explicitly for the sake of brevity. The lags used in the MA model are restricted to ten and the lags of the AR model are restricted to five in the published results, since the coefficients beyond these lags were insignificant. The smoothing parameters θ and λ of the ES and EWMA models start at an initial value of 0.01 and are then increased in steps of 0.01 until they reach 1. The values of θ and λ that minimize the MAE in these runs are selected as the optimal values. Since the forecasting accuracy of the AR model is sensitive to the chosen in-sample window, we experimented with two different windows of 250 and 350 observations respectively. The AR forecasts are generated by taking the average of the forecasts from these two windows.
Note 4. The use of the close-to-close version is due to its definition as the classical volatility indicator and also to the fact that this definition was frequently referred to in previous research on BIST-30 volatility forecasts.
Note 5. The reason for using only E-GARCH models in this paper is two-fold. The first reason is brevity: the primary purpose of the paper is not to compare the relative performance of different GARCH models. Second, E-GARCH models are the best models to employ given the characteristics of BIST-30 returns and volatility mentioned above. We also encountered serious stationarity problems and negative coefficients when using T-GARCH and other GARCH models. FIEGARCH is also a popular approach, but estimation of these models requires an arbitrary truncation of infinite lags, leading to a biased mean.
Note 6. With E-GARCH models it is also possible to capture the asymmetric impact of news, with negative shocks having a greater impact than positive shocks of equal magnitude, so that volatility clustering is better captured. In addition, the log form of the E-GARCH model allows the parameters to be negative without the conditional variance becoming negative, which is usually a problem in empirical GARCH estimation (Walsh & Tsou, 1998).
Note 7. Estimation sets including fewer observations (e.g., 150 observations) led to serious stationarity problems. The sets that best satisfied the stationarity conditions were those including 250 and 350 observations. The first one-day-ahead forecast was generated using the past 250 observations. This process is then repeated by dropping the oldest observation and adding the newly observed value to forecast the values of successive days. The same procedure is used in the second stage with the past 350 observations. Given the sensitivity of the results to the chosen window, we preferred to report the average values obtained from these two different sets.
Note 8. The daily return series (in log form) and all the volatility estimators are stationary, as expected, on the basis of Augmented Dickey-Fuller and Phillips-Perron tests. The null hypothesis of a unit root is rejected both for constant and trend at the 1% level of significance. The table of unit root tests is not reported for the sake of brevity but is available upon request.
Note 9. This means that a 1 unit gain in efficiency is achieved by accepting only 0.045 units of bias. In fact, it is even less biased than the historical volatility (on the basis of MRB), thus presenting an optimal trade-off between higher efficiency and lower bias.
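The figure quoted in Note 9 appears to follow from dividing the PK estimator's mean bias of about 25% by its 5.5-fold efficiency gain on the basis of MSE. This reading is an assumption, since the note does not show the computation:

```latex
\frac{\text{bias}}{\text{efficiency gain}} = \frac{0.25}{5.5} \approx 0.045
```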
Note 10. The forecasting performances of the MA(q), EWMA(q) and AR(q) methods are presented only for lags of 2, 5 and 10 for brevity.
Note 11. A positive number indicates that there is an efficiency gain in the one-day-ahead forecasts attributable to EV estimators. The numbers in parentheses give the statistical probability of the null hypothesis. The null hypothesis is the equality of EV-based efficiency and historical-volatility-based efficiency in the forecasts.
Note 12. Table 6 shows the results of regressing the actual volatility, measured as close-to-close historical volatility, on a constant and the one-day-ahead out-of-sample forecasts. Table 7 shows the results of regressing the realized volatility on a constant and the E-GARCH forecasts. A relatively good forecasting fit requires a zero intercept and a beta coefficient close to one, whereas $R^2$ measures the model's explanatory power. Standard errors are constructed as in Newey-West to deal with possible heteroskedasticity of the residual series. The column labeled Wald shows the probability of the regression's beta coefficient being equal to one.
Note 13. Though not reported for brevity, we also checked the ability of the models to correctly predict the one-day-ahead direction of change (whether volatility will increase or decrease by the next day). All the models have a success ratio over 50%, which implies that the models are able to correctly predict the direction of change in more than 50% of the cases in the 298 iterative one-day-ahead forecasts. The success ratio ranges between 50% and 64%. The use of EV estimators instead of historical volatility leads to an improvement in the correct assessment of the next day's direction.

Figure 1. Realized Volatility (RV) and Close-to-Close Historical Volatility (HV(CC)) between 29.01.2010 and 03.09.2012

An alternative to the close-to-close historical volatility (denoted as HV(CC)) is the open-to-close historical volatility. An open-to-close return is defined as

$$R_t^{OC} = \ln P_{C,t} - \ln P_{O,t}$$

where $P_{C,t}$ is the closing price of trading day $t$ and $P_{O,t}$ is the opening price of day $t$. The square of the open-to-close return yields the open-to-close historical volatility (denoted as HV(OC)). The close-to-close volatility includes the effect on volatility of information arriving while the session is closed, whereas the open-to-close volatility restricts the definition of volatility to the price movement during the open hours of the session (Note 1).
Table 1 reports the mean, median, standard deviation, skewness, kurtosis and Jarque-Bera statistics of the daily return, historical volatility, realized volatility and EV estimators. The daily mean of realized volatility (0.000157) is considerably less than that of close-to-close historical volatility (0.000185). The daily standard deviation of realized volatility (0.000164) is only 34.7% of the daily standard deviation of close-to-close historical volatility and nearly half of the daily standard deviation of open-to-close historical volatility (0.000335). Realized volatility is clearly a less noisy estimator than the historical volatility estimators. The EV estimators have higher daily standard deviations than realized volatility, but their standard deviations are still lower than those of the historical volatility measures, as expected. All the volatility estimators display significant positive skewness and a highly leptokurtic structure given the kurtosis values in Table 1. The assumption of a normal distribution is clearly rejected both for returns and for all volatility estimators on the basis of Jarque-Bera statistics at the 1% significance level (Note 8).

Note. In the table, RV refers to daily realized volatility. R is the daily close-to-close return. PK, GK and RS are the extreme value (EV) estimators of Parkinson (1980), Garman and Klass (1980), and Rogers and Satchell (1991) respectively. HV(CC) is close-to-close historical volatility and HV(OC) is open-to-close historical volatility. JB is the Jarque-Bera test for normality. *, **, *** refer to the 10%, 5% and 1% significance levels respectively.

Table 2 below deserves special attention. This table compares the efficiency and bias of the EV estimators with those of the open-to-close historical volatility. The p-values of the Wilcoxon test are also reported in parentheses. The null hypothesis is that the two estimators selected for comparison are equally efficient (or biased). The MSE and MAE show very clearly the higher efficiency of the EV estimators relative to open-to-close historical volatility. In fact, they are several times more efficient. On the basis of MSE, the Parkinson (PK) estimator, for example, is 5.5 times more efficient than the historical volatility. The Wilcoxon test clearly rejects the null hypothesis of equal efficiency at the 1% significance level. The GK estimator is the most efficient estimator: it is 8.37 times more efficient than the historical volatility. The higher efficiency of all the EV estimators is statistically confirmed at the 1% significance level. A further check using the MAE criterion, which is more immune to outliers, also validates the results above. GK is 2.45 times more efficient than historical volatility, while PK is 2.09 times and RS is 2.3 times more efficient. The statistical significance of the higher efficiency under the MAE criterion is confirmed by the Wilcoxon test for all the EV estimators.

Table 2. Performance of volatility estimators
Note. HV(OC) shows open-to-close daily historical volatility and PK, GK, RS refer to the extreme value estimators of Parkinson (1980), Garman and Klass (1980), and Rogers and Satchell (1991) respectively. The p-values in parentheses, based on the Wilcoxon signed-rank test, show the statistical significance of the bias and efficiency of the EV estimators as compared to open-to-close historical volatility. MSE, MAE, MB and MRB represent Mean Squared Error, Mean Absolute Error, Mean Bias and Mean Relative Bias, and the values are multiplied by $10^4$. The first 350 daily observations of the data set are used in the analysis.

Table 3 below presents the forecasting performance of the different models on the basis of RMSE and shows the possible contribution (or lack of it) of using EV estimators in these models (Note 10). The best forecasting performance is achieved by the AR(1) and ES models. The random walk, MA, EWMA and EGARCH methods are the worst performers. A modest gain in out-of-sample forecasting accuracy is observed if EV estimators are used as volatility proxies instead of close-to-close (or open to close) historical volatility. The use of the PK definition of volatility instead of historical volatility leads to a 15.1% improvement in forecasting accuracy, and modest gains can be observed in other models as well. The use of EV estimators as exogenous regressors in the EGARCH model also leads to a minor improvement in forecasting performance (a 5.9% improvement when PK is used and a 7.8% improvement when GK is used).
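The RMSE comparison can be sketched for two of the simple models. The random walk and exponential smoothing recursions below are generic textbook forms; the smoothing constant, the initialization and the toy volatility series are assumptions made for illustration and are not the specifications or data used in the paper.

```python
import math


def random_walk_forecasts(series):
    """One-day-ahead forecasts: the forecast for day t is the value on day t-1."""
    return list(series[:-1])


def exponential_smoothing_forecasts(series, alpha):
    """One-day-ahead forecasts f[t+1] = alpha * x[t] + (1 - alpha) * f[t],
    initialized with the first observation (an assumed convention)."""
    forecasts = [series[0]]
    for x in series[1:-1]:
        forecasts.append(alpha * x + (1.0 - alpha) * forecasts[-1])
    return forecasts


def rmse(forecasts, actuals):
    """Root mean squared forecast error."""
    squared = [(f - a) ** 2 for f, a in zip(forecasts, actuals)]
    return math.sqrt(sum(squared) / len(squared))


# Toy volatility proxy (hypothetical values).
proxy = [1.0, 2.0, 3.0, 2.0]
actuals = proxy[1:]  # each forecast targets the next day's value

print(rmse(random_walk_forecasts(proxy), actuals))
print(rmse(exponential_smoothing_forecasts(proxy, alpha=0.5), actuals))
```

In the paper's setting the forecasts are evaluated against realized volatility, and the volatility proxy fed into the model is either historical volatility or one of the EV estimators; the sketch uses a single series for both purely to keep the example short.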

Table 3. Performance of Volatility Forecasting Methods (RMSE)
Note. … Exponential Generalized Autoregressive Conditional Heteroscedastic Model respectively. EGARCHT indicates the Exponential Generalized Autoregressive Conditional Heteroscedastic Model with Student t distribution (Student t EGARCH). "NO" means that extreme value volatility in the EGARCH models is ignored. The table reports the RMSE of each model. Numbers are multiplied by 10^2.

Table 4
volatility and to 0.3605 when GK is used (pointing to gains of 59.8% and 70.8% respectively). Substantial gains are also achieved in the other models, with the exception of the EGARCH model. The use of EV estimators as exogenous regressors in the EGARCH model does not have a significant effect (an MDAPE value of 0.5237 with PK compared to an MDAPE value of 0.5251 with historical volatility).
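As a minimal sketch, the MDAPE criterion and the percentage gains quoted above can be computed as follows. The definition of the gain as the relative reduction in the error measure is an assumption, and the function names and toy forecasts are hypothetical.

```python
from statistics import median


def mdape(forecasts, actuals):
    """Median absolute percentage error, expressed as a fraction."""
    return median(abs(f - a) / a for f, a in zip(forecasts, actuals))


def percentage_gain(baseline_error, new_error):
    """Assumed definition: relative reduction of the error measure, in percent."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy forecasts and realized values (hypothetical).
print(mdape([1.5, 2.0, 8.0], [1.0, 2.0, 4.0]))

# EGARCH values quoted in the text: 0.5251 with historical volatility, 0.5237 with PK.
print(percentage_gain(0.5251, 0.5237))
```

Under this definition the EGARCH figures quoted in the text correspond to a gain of roughly 0.27%, which illustrates why the effect is described as not significant.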

Table 4. Performance of Volatility Forecasting Methods (MDAPE)
Note. PK, GK, and RS are the extreme value estimators of Parkinson (1980), Garman and Klass (1980), and Rogers and Satchell (1991). HV(CC) shows close to close historical volatility and HV(OC) is open to close daily historical volatility. "RW", "MA", "EWMA" and "ES" refer respectively to the Random Walk Model, Moving Average Model, Exponentially Weighted Moving Average Model and Exponential Smoothing Model. AR(i) and EGARCH show the Autoregressive Ordinary Least Squares Model (i time periods back) and the Exponential Generalized Autoregressive Conditional Heteroscedastic Model respectively. EGARCHT indicates the Exponential Generalized Autoregressive Conditional Heteroscedastic Model with Student t distribution (Student t EGARCH). "NO" means that extreme value volatility in the EGARCH models is ignored. The table reports the Median Absolute Percentage Error of each model.

Table 5. Extreme-Value (EV) Estimators Relative to Historical Volatility (MSE and MAE)
Note. PK, GK, and RS indicate the extreme value estimators of Parkinson (1980), Garman and Klass (1980), and Rogers and Satchell (1991). MSE and MAE are Mean Squared Error and Mean Absolute Error. "RW", "MA", "EWMA" and "ES" refer to the Random Walk Model, Moving Average Model, Exponentially Weighted Moving Average Model and Exponential Smoothing Model respectively. AR(i) and EGARCH show the Autoregressive Ordinary Least Squares Model (i time periods back) and the Exponential Generalized Autoregressive Conditional Heteroscedastic Model respectively. EGARCHT indicates the Exponential Generalized Autoregressive Conditional Heteroscedastic Model with Student t distribution (Student t EGARCH). "NO" means that extreme value volatility in the EGARCH models is ignored. The entries in the table are percentage gains in the efficiency of the EV estimators compared to open to close historical volatility. The p-values in parentheses, based on the Wilcoxon Signed Rank test, indicate the statistical significance of the efficiency of the extreme value estimators as compared to open to close historical volatility.

Table 7. Results of Mincer-Zarnowitz regression with realized volatility
Note. Independent variables are a constant and the forecasted volatility. Wald (β=1) is the p-value of the F-statistic. In the table, t statistics are shown in parentheses. *, **, *** refer to the 10%, 5% and 1% significance levels respectively.
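The regression described in the note can be sketched as an ordinary least squares fit of realized volatility on a constant and the forecast, RV_t = α + β·F_t + e_t, followed by a test of β = 1. The sketch below uses plain OLS with conventional standard errors and returns only the Wald F-statistic, not its p-value; whether the paper uses robust standard errors is not stated in this section, and the function name and toy data are assumptions.

```python
import math


def mincer_zarnowitz(realized, forecast):
    """OLS of realized volatility on a constant and the forecast.
    Returns alpha, beta, R-squared and the Wald F-statistic for beta = 1."""
    n = len(realized)
    x_bar = sum(forecast) / n
    y_bar = sum(realized) / n
    sxx = sum((x - x_bar) ** 2 for x in forecast)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(forecast, realized))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    residuals = [y - alpha - beta * x for x, y in zip(forecast, realized)]
    ssr = sum(e ** 2 for e in residuals)
    sst = sum((y - y_bar) ** 2 for y in realized)
    se_beta = math.sqrt(ssr / (n - 2) / sxx)  # conventional OLS standard error
    return {
        "alpha": alpha,
        "beta": beta,
        "r2": 1.0 - ssr / sst,
        "wald_F": ((beta - 1.0) / se_beta) ** 2,  # single restriction: F = t^2
    }


# Toy forecasts and realized values (hypothetical).
result = mincer_zarnowitz(
    realized=[1.1, 1.9, 3.2, 3.8, 5.0],
    forecast=[1.0, 2.0, 3.0, 4.0, 5.0],
)
print(result)
```

An unbiased forecast would give α close to 0 and β close to 1, and a higher R-squared indicates that the forecast explains more of the variation in realized volatility.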