The Determinants of Sovereign Bond Yields in the EMU: New Empirical Evidence

This paper investigates the determinants of sovereign bond yields in ten Economic and Monetary Union (EMU) countries (five core economies and five peripheral countries) over the period 2001-2015. To this end, we apply a two-step methodology based on (i) a principal component analysis of the countries’ yields, aimed at splitting our sample into sub-periods, and (ii) a random forest model, enhanced with a variable selection process based on simulated annealing, to investigate the determinants of bond yields in each identified sub-period. Our analysis indicates that macroeconomic fundamentals (especially the unemployment rate, the inflation rate and the government debt-to-GDP change rate) are the main variables driving sovereign bond yields in all the countries analyzed, both core and peripheral. In contrast, bond yields do not seem to be strongly influenced by global indicators over the whole sample period.


Introduction
The high costs of the economic policies carried out to resolve the 2008 US sub-prime crisis have brought significant increases in the fiscal imbalances of many Economic and Monetary Union (EMU) countries. In turn, this has provoked a sovereign debt crisis in some peripheral EMU economies (especially Greece, Portugal and Ireland), which were forced to resort to financial rescue packages from the European Central Bank and faced a number of credit rating downgrades (Afonso et al., 2015). In this framework, many empirical studies have investigated the determinants of sovereign bond yields in the EMU to shed light on the sovereign debt crises in these countries. Indeed, according to traditional term structure models, yields should be determined exclusively by three factors, namely the interest rate, the risk of default and the expected loss in case of default (Liu et al., 2009). However, these models do not take into account other possible determinants of government bond yields that could be relevant to explaining why the EMU peripheral countries were affected by the sovereign debt crisis (Boumparis et al., 2015; Bratis et al., 2015; Bernal et al., 2016; Dufrénot et al., 2016; Ho, 2016). With variables that measure financial linkages and regional/global shocks, it is possible to distinguish the effects of the economic situation and of the interconnections among countries on bond yields (Gómez-Puig & Sosvilla-Rivero, 2016); these are also known as “fundamentals-based determinants” (Calvo & Reinhart, 1996; Kaminsky & Reinhart, 2000). Moreover, some studies suggest that bond yields are driven by the behavioral reactions of investors or market participants, not just by economic fundamentals (Masson, 1999; Mondria & Quintana-Domeneque, 2013). Benchmark stock market indices, economic policy uncertainty indices, rating announcements from rating agencies, political announcements, announcements of economic indicators/information and events are incorporated into models to better understand the determinants of yields (Brutti & Sauré, 2015; Apergis, 2015; Baum et al., 2016). While these studies generally conclude that sovereign bond yields stem from both fundamentals and market sentiment variables, Pragidis et al. (2015) do not confirm this.
Starting from this framework, this study follows Gómez-Puig and Sosvilla-Rivero (2016) by setting two main variable categories:
1) Market sentiment variables that proxy the behavior of participants in the economy in relation to existing information. These variables ultimately shape future expectations.
2) Macroeconomic fundamentals variables that proxy the financial and economic linkages that may adversely affect several countries simultaneously.
Within the above two categories, we have then defined two sub-categories of variables: country-specific and global.

Data
Ten EMU countries (Austria, Belgium, France, Germany, Greece, Ireland, Italy, the Netherlands, Portugal and Spain) were selected, and the determinants of their sovereign bond yields were assessed. The countries with serious sovereign debt problems are known as “PIIGS” (Portugal, Ireland, Italy, Greece, Spain): they are considered to be EMU peripheral countries and exhibit potential default risks. In contrast, the EMU core countries (Austria, Belgium, France, Germany and the Netherlands) exhibit more solid economies: they struggle to support problematic members and are considered to be risk-free by investors.
The yields of the EMU sovereign bonds with a 10-year maturity were used as dependent variables. We initially considered 28 global variables as determinants of the bond yields. To lower the dimension of the problem and to obtain more reliable results, we dropped some variables with a correlation higher than 0.9. The remaining global variables were then combined with each country-specific data set, thus obtaining a final database of 17 variables. These are summarized in Table 1, and their full descriptions can be found in Appendix I.
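The correlation screen described above can be illustrated with a minimal Python sketch. This is not the authors' code (the paper reports working in R). The names `pearson`, `drop_correlated` and the toy columns are hypothetical, and two choices are assumptions: the absolute correlation is used, and of two highly correlated variables the one encountered first is kept.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(columns, threshold=0.9):
    """Keep a column only if |r| with every already-kept column is <= threshold."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return kept

# Toy data, made up for illustration only.
data = {
    "var_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "var_b": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly a multiple of var_a
    "var_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(data)
print(list(selected))
```

In this toy case `var_b` is discarded because it is almost collinear with `var_a`, while `var_c` is retained.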
Observations are monthly, from 2001:01 to 2015:12, for a total of 179 observations for each variable for each country (Note 1).

Methodology
The empirical examination was carried out in two steps: 1) a principal component analysis (PCA) was used to uncover the latent relationships among the countries' yields, and sub-periods were then detected according to changes in these relationships; and 2) a random forest (RF) model enhanced with a simulated annealing (SA) algorithm was used to investigate the determinants of bond yields in each sub-period determined in the previous step.
We then interpreted the relative importance of each variable for each country analyzed in each identified sub-period.

PCA
Many economic and financial data series are related to each other. Although these relationships can sometimes be revealed by statistical models, in many cases it is not possible to find out the true connections and transmissions.
For this reason, analysts use several methods to discover latent relationships. One of the well-known and successful methods is PCA, which is an unsupervised technique for finding patterns in data sets. The primary objective of PCA is to reduce dimensions, especially in cases of limited observations with many variables. PCA creates principal components (PCs) composed of normalized linear combinations of variables; each component supplies information about relationships and can be independently examined. These relationships can also be used as proxies of integration among variables. We used the PCA findings to understand how this integration changes through time for the bond yields and thereby for the EMU countries.
In the presence of integrity/diversity shifts, different characteristics of the relationships in different periods may affect the impact levels of the determinant variables. A determinant with a relatively strong influence in an idiosyncratic period will lower the chance of distinguishing other influencing determinants over the remaining periods. For a more reliable analysis, dividing a long and heterogeneous horizon into more homogeneous periods allows a less biased examination and a better understanding of the determinants of bond yields.
Before creating PCs, it is necessary to ensure that the variables used in the PCA are stationary. For this reason, we started our investigation by checking whether the bond yield series have unit roots with the Augmented Dickey-Fuller test (Dickey and Fuller, 1979). Then, the first principal component (PC1) was constructed by calculating the loading values on each variable. Following James et al. (2000), the combination of loading values on PC1 can be written as

$$z_{n1} = \phi_{11} x_{n1} + \phi_{21} x_{n2} + \dots + \phi_{p1} x_{np}, \qquad \sum_{j=1}^{p} \phi_{j1}^{2} = 1,$$

where $z_{n1}$ are the scores of the sample observations on PC1, $x_{nj}$ is the value of variable $j$ for observation $n$, and $\phi_{j1}$ is the loading of variable $j$ on PC1. Thus, PC1 allowed us to explain most of the variance over the variables used.
The second principal component (PC2) was found in a similar way to PC1. However, since it must be uncorrelated with $Z_1$, an orthogonality constraint between the directions of $\phi_1$ and $\phi_2$ was added to the optimization problem. This constraint is expressed in terms of the loading vectors, which collect the loadings of the variables on PC $k$, as follows:

$$\phi_k = (\phi_{1k}, \phi_{2k}, \dots, \phi_{pk})^{T}, \qquad \phi_1^{T} \phi_2 = 0.$$

Following this procedure, it was also possible to find the remaining PCs.
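As a rough illustration of how PC1 loadings and scores arise, the following Python sketch standardizes a small data matrix and extracts the first component by power iteration. It is a sketch under stated assumptions, not the procedure used in the paper: the function names, the toy matrix and the use of power iteration (rather than a full eigen-decomposition) are all illustrative choices, and the starting vector is assumed not to be orthogonal to the leading direction.

```python
from math import sqrt

def standardize(rows):
    """Center each column and scale it to unit sample variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def first_component(rows, iterations=500):
    """Return (loadings, scores) of PC1 via power iteration on X'X."""
    n, p = len(rows), len(rows[0])
    phi = [1.0 / sqrt(p)] * p                      # unit-length starting vector
    for _ in range(iterations):
        scores = [sum(r[j] * phi[j] for j in range(p)) for r in rows]            # z = X phi
        new = [sum(rows[i][j] * scores[i] for i in range(n)) for j in range(p)]  # X'z
        norm = sqrt(sum(v * v for v in new))
        phi = [v / norm for v in new]              # keep the loadings normalized
    scores = [sum(r[j] * phi[j] for j in range(p)) for r in rows]
    return phi, scores

# Toy matrix (made up): columns 0 and 1 move together, column 2 does not.
raw = [[1.0, 1.1, 0.5], [2.0, 1.9, 0.1], [3.0, 3.2, 0.4], [4.0, 3.9, 0.3], [5.0, 5.1, 0.2]]
rows = standardize(raw)
phi, z = first_component(rows)
# Share of total variance explained by PC1 (total variance equals p after standardizing).
share = (sum(v * v for v in z) / (len(rows) - 1)) / len(phi)
print(phi, share)
```

The two co-moving columns receive loadings of the same sign, which is the pattern read as integration among yields in the text.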
Although components can be interpreted in many ways, we focused specifically on the examination of co-integrated behavior among variables to reveal any changes. In this way, we were able to identify different sub-periods and assess their differentiation with further analysis. We employed the Pruned Exact Linear Time (PELT) method proposed by Killick and Eckley (2013) and Killick et al. (2012) to identify structural breaks, that is, changes in the mean of the score values (i.e., those values associated with integrity) provided by the PCA.
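To make the change-point step more concrete, the following Python sketch implements a PELT-style search for changes in the mean. It is only a sketch: the segment cost (within-segment sum of squared deviations), the penalty value and the name `pelt_mean` are assumptions made for illustration, and the paper's own settings are not reported in this section.

```python
def pelt_mean(y, penalty):
    """Return change points (segment end indices) that minimize the sum of
    within-segment squared deviations plus `penalty` per change point."""
    n = len(y)
    s1, s2 = [0.0], [0.0]                  # cumulative sums of y and y^2
    for v in y:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(a, b):
        """Squared deviations of y[a:b] from its own mean."""
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)                 # best[t]: optimal penalized cost of y[:t]
    best[0] = -penalty
    last = [0] * (n + 1)                   # last[t]: start of the final segment
    candidates = [0]
    for t in range(1, n + 1):
        values = [best[a] + cost(a, t) + penalty for a in candidates]
        k = min(range(len(values)), key=values.__getitem__)
        best[t], last[t] = values[k], candidates[k]
        # Pruning step: drop start points that can no longer be optimal.
        candidates = [a for a, v in zip(candidates, values) if v - penalty <= best[t]]
        candidates.append(t)

    change_points = []
    t = n
    while last[t] > 0:                     # backtrack through the stored starts
        change_points.append(last[t])
        t = last[t]
    return sorted(change_points)

# Toy score series (made up): the mean shifts after 20 observations.
scores = [0.1, -0.1] * 10 + [5.1, 4.9] * 10
print(pelt_mean(scores, penalty=2.0))
```

A larger penalty yields fewer detected breaks, so the number of change points found in practice depends on this choice.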

RF and SA Enhancement
After the sub-periods were determined, we ran an RF model for each sub-period defined in the previous step by using the variables described in Section 3.1. RF belongs to the “tree-based” methods, which recursively subdivide the data into smaller groups by means of binary splits and select the split with the lowest error in explaining the response. In a classic decision tree, the best split is made at the top of the tree, and splitting continues until a predetermined number of splits is reached. However, the resulting tree may be too complex to retrieve useful information; a highly dominant variable may inhibit the possibility of considering other predictors and may ultimately result in a biased output. High variance-related problems may also emerge, hindering the possibility of generalizing the learning process. RF allows researchers to overcome most of the aforementioned problems by selecting a random subset of predictors every time a split in a tree is considered (James et al., 2000). With RF, the trees are de-correlated and diversity is maintained throughout the process, allowing low-bias predictions. Two tuning parameters determine RF performance, namely (i) the size of each predictor subset and (ii) the number of trees to be grown.
In this study, we used one third of the predictors for each subset and set the total number of trees to 200. The randomForest function from the randomForest library (Liaw & Wiener, 2002) was used. This tool reports each predictor's explanatory power as the increase in the mean squared error of predictions when the corresponding variable is excluded from the subset. This measure was, therefore, used as the indicator of the influence of the determinant variables on bond yields.
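The idea of ranking predictors by the increase in mean squared error can be sketched in Python as follows. This is an illustration only: no random forest is fitted here, `predict` stands for any already-fitted model and is hypothetical, and the variable is neutralized by shuffling its column, which is one common way of measuring such an increase and is assumed here as a stand-in for the measure described above.

```python
import random

def mse(predict, rows, target):
    """Mean squared error of `predict` over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, target)) / len(rows)

def permutation_importance(predict, rows, target, n_repeats=20, seed=0):
    """Average increase in MSE after shuffling one column at a time."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, target)
    increases = []
    for j in range(len(rows[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)            # break the link between column j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            total += mse(predict, shuffled, target) - baseline
        increases.append(total / n_repeats)
    return increases

# Toy setting (made up): the target depends on column 0 only.
rows = [[float(i), float((7 * i) % 10)] for i in range(10)]
target = [2.0 * r[0] for r in rows]
model = lambda r: 2.0 * r[0]               # hypothetical fitted model; ignores column 1
importance = permutation_importance(model, rows, target)
print(importance)
```

Column 0 obtains a positive importance, whereas column 1, which the model never uses, obtains exactly zero.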
Model enhancement was carried out using an SA algorithm. SA provides a variable selection process for the data set and aims to improve the success of the RF learning process. Indeed, low and/or non-optimal learning performance may arise with high-dimensional and complex data sets, such as those used in our investigation. A search algorithm such as SA may significantly reduce the number of variables to be considered by the RF model, thus improving the explanation success achieved (Seyedhosseini et al., 2016; Lwin & Qu, 2013; Wang et al., 2010). The SA algorithm is inspired by the controlled cooling of metals. In practice, the material is heated to a high temperature; then, the temperature is gradually lowered to obtain a minimum-energy crystalline structure. By analogy, the SA algorithm begins with a high value of the T parameter, which is the system state indicator, and as the algorithm runs, T is gradually lowered to a predetermined termination level. A solution, which represents the selected variables, is a randomly initialized vector whose number of cells equals the number of determinant variables. Each cell corresponds to a specific variable and holds a binary value (1 or 0). A cell with a value of 1 indicates that the corresponding variable is used in the RF, while 0 indicates exclusion. The selected variables are used for learning, and the variance explained is used as the performance measure of the solution. At each state, for a predetermined number of times, a new solution (neighbor) is created by flipping a random cell's value to its opposite (1 to 0 and vice versa). If the neighbor solution has a better performance, it is accepted as the current solution. To maintain the exploration ability of SA, worse performances are also considered. In this case, the neighborhood movement can still be accepted, depending on the following probability:

$$P = \exp\left(-\frac{\Delta C}{T}\right),$$

which is known as the “Metropolis acceptance criterion” (Zapfel et al., 2009). This criterion lets the algorithm avoid a local optimum trap. Here, $\Delta C$ is the difference in performances between the current solution and the neighbor multiplied by $-1$, so that $\Delta C$ is positive when the neighbor performs worse. This is because the original criterion is meaningful for minimization problems, while a sign change is necessary for maximization problems.
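The selection loop described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the cooling schedule, the parameter values and the function `toy_score` are assumptions, and in the paper the score of a mask would be the variance explained by an RF fitted on the selected variables.

```python
import math
import random

def simulated_annealing(score, n_vars, t_start=1.0, t_end=0.01, cooling=0.9,
                        moves_per_temp=20, seed=0):
    """Maximize score(mask) over binary masks; return (best_mask, best_score)."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_vars)]   # random initial selection
    current_score = score(current)
    best, best_score = current[:], current_score
    temperature = t_start
    while temperature > t_end:
        for _ in range(moves_per_temp):
            neighbor = current[:]
            i = rng.randrange(n_vars)
            neighbor[i] = 1 - neighbor[i]                   # flip one cell
            neighbor_score = score(neighbor)
            # Sign flipped because the score is maximized, not minimized.
            delta_c = -(neighbor_score - current_score)
            # Metropolis rule: always accept improvements, sometimes accept worse moves.
            if delta_c <= 0 or rng.random() < math.exp(-delta_c / temperature):
                current, current_score = neighbor, neighbor_score
                if current_score > best_score:
                    best, best_score = current[:], current_score
        temperature *= cooling                              # gradual cooling
    return best, best_score

def toy_score(mask):
    """Hypothetical stand-in for RF variance explained: variables 0 and 2 help,
    and every selected variable carries a small cost."""
    return mask[0] + mask[2] - 0.2 * sum(mask)

best_mask, best_value = simulated_annealing(toy_score, n_vars=6)
print(best_mask, best_value)
```

At high temperatures worse masks are accepted fairly often, which lets the search leave poor regions; as the temperature falls, the loop behaves increasingly like a greedy search.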

Empirical Findings
We implemented all models and algorithms in the “R” statistical computing software (version 3.3.1) via the RStudio interface. The following sub-sections describe, in detail, the results achieved in each step of our investigation.

PCA Results
From the unit-root test, we found that none of the yields were stationary at the 95% confidence level. We then calculated logarithmic first differences of all bond yields, which produced stationary series (that is, the yields behave as I(1) variables). We then performed the PCA on the resulting data set.
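The transformation applied to the yields can be written in a few lines of Python. The sketch below assumes strictly positive yields, and the numbers are made up for illustration; `log_diff` is a hypothetical helper name.

```python
from math import log

def log_diff(series):
    """Logarithmic first differences: log(y_t) - log(y_{t-1})."""
    return [log(b) - log(a) for a, b in zip(series, series[1:])]

yields = [4.0, 4.2, 4.1, 3.9]      # toy values, not data from the paper
changes = log_diff(yields)
print(changes)
```

Differencing shortens the series by one observation, and each element approximates the relative change in the yield from one month to the next.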
In Figure 1, the first two PCs for bond yields are shown, and the cumulative proportion of variance explained by up to 10 PCs is given. Together, PC1 and PC2 explain 88% of the variance (72% for PC1 and 16% for PC2). Apart from Greece, a high correlation of bond yields can be seen in the loadings on PC1. In contrast, PC2 corresponds to the differentiation of bond yields (loadings above 0 belong to the core countries and those below 0 to the peripheral countries). It is possible to say that this component refers to the economic power of a country, with Germany and Greece being on opposite sides. This is in line with the actual situation.
Scatterplots of the scores of the observations on PC1 and PC2 are shown in Figure 2. PC1 dynamics do not change before the 150th month (left plot), which is roughly 2.5 years before 2015:12, suggesting that high co-integration was maintained for a very long period. On the other hand, after the 100th month (approximately 2009:05) on the right plot, differentiation emerges, indicating some breakpoints in the EMU countries' systemic integrity throughout the period. Using the PELT method, 11 change point locations were detected for PC1 (Figure 3, left plot), and 5 points were detected for PC2 (Figure 3, right plot).
At TP1, an adequate score was achieved with an explanation rate of 92% by PC1, so PC1 alone is sufficient to describe this sub-period. All bond yields show a strong correlation during this sub-period. Thus, TP1 can be defined as the “pre-crisis period”.
The first two PCs do not have enough explanation power for TP2, so a deeper examination is needed. Explanation rates and loading scores of the first four PCs can be seen in Table 2. During this period, the first two PCs suggest a core and peripheral duality. With a priori knowledge, it is clear that investors perceive Germany and the Netherlands to be secure economies, while Spain, Italy and Greece are considered the riskiest. PC3 and PC4 of TP2 display more information. PC3 represents the further divergence between the core and peripheral groups. PC4 points to Greece's position in particular. We therefore concluded that this period is highly associated with changes in the dynamics of the EMU countries and can be addressed as the “crisis period.”
A sufficient explanation rate (82%) was achieved with the first two PCs for TP3. All strong economies tend to move together. PC2 loadings show that countries whose economies suffer from sovereign debt problems are moving in the same direction, but when both PC1 and PC2 loadings are considered, the integrity among peripheral economies is not as high as that of the core economies. The crisis impact is distinguishable: this period still refers to a de facto crisis phenomenon, and combined with TP2, it can be considered the crisis period.
Last, for TP4, the first two PCs substantiate the re-integration of the core and peripheral countries (except Greece) with a high explanation power (82%). Observable distinctions still exist among groups on PC2. The most notable point is Greece's position compared to that of the other countries on PC1 and PC2, and this comparison deserves more emphasis. Excluding Greece, this period can therefore be considered the “post-crisis period.”

RF and SA Results
After running each model 20 times, we calculated the average importance values of the indicators and used the rate of variance explained as the performance measure of the RF model. Table 3 shows that the explanations for both the pre-crisis and crisis periods are adequate for all countries, while the performance for the post-crisis period is not satisfactory. This may be due to the lower number of observations over this period, which, in view of the large number of variables, may have caused an overfitting problem. In this framework, a variable selection technique is a suitable tool for overcoming a high dimensionality problem by excluding the irrelevant variables from the model. The SA algorithm was hence implemented as a variable selection approach to obtain an optimal or near-optimal subset of predictors. In Table 4, the enhanced learning performances are reported. The table shows a substantial improvement for the post-crisis period. Paired-samples t-tests and non-parametric Mann-Whitney tests were also conducted to assess the significance of the difference in explanation performances. At the 1% level of significance, the hypothesis of equal mean explanation performances of the RF model and the SA-RF model was rejected. This indicates that the results from our enhanced model can be relied on.
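For readers who wish to see how a paired-samples comparison of this kind is computed, the following Python sketch derives the t statistic from paired performance values. The inputs are made-up toy numbers, not results from Table 3 or Table 4, and `paired_t` is a hypothetical helper; the resulting statistic would be compared with a t distribution with n - 1 degrees of freedom.

```python
from math import sqrt

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [b - a for a, b in zip(before, after)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))   # sample std. deviation
    return mean / (sd / sqrt(n)), n - 1

# Toy variance-explained values (hypothetical) for a plain and an enhanced model.
plain = [0.50, 0.55, 0.60, 0.52]
enhanced = [0.70, 0.72, 0.85, 0.71]
t_stat, dof = paired_t(plain, enhanced)
print(t_stat, dof)
```

A large positive statistic indicates that the enhanced model's performance is systematically higher across the paired runs.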

Empirical Findings on the Determinants of the Sovereign Bond Yields in the EMU
In this section, we finalize the results achieved from our investigation by assessing the relative importance of the determinants of bond yields for any country in any identified sub-period.

The Pre-Crisis Period
The importance of each variable in the pre-crisis period is reported in Table 5.

Core countries
The EMU's unemployment rate, the EMU's inflation rate, the US's unemployment rate and the country's unemployment rate are found to be the main influencers for Belgium. Apart from the EMU's inflation, Germany was affected by the same influencers, with the US dollar playing a major role. These results are also identical to those of the Netherlands. For France, US unemployment and financial corporations' debts had relatively minor effects when compared to the country's unemployment. Together with the EMU's unemployment and the financial corporations' debts, the Brent oil price was one of the primary influencers for Austria.
Overall, these results suggest that during the pre-crisis period, macroeconomic fundamental variables were strong determinants of the bond yields of the core countries, with the EMU's unemployment rate playing a major role.

Peripheral countries
The unemployment rate was the primary influencer for Ireland, followed by the EMU's inflation, US unemployment and the EMU's industrial production. Spain's bond yields were mainly affected by the US dollar, the EMU's inflation, the country's inflation, the unemployment and the government debt to GDP ratio change rates. Five remarkable influencers were found for Italy: the country's unemployment rate, the US dollar, the EMU's inflation rate, the EMU's unemployment rate and the Brent oil price. In the case of Portugal (where a market sentiment indicator was found to be important), bond yields were influenced by the consumer confidence indicator, the country's and US's unemployment rates and the EMU's inflation rate. Consequently, for the pre-crisis period, the countries' unemployment rate and the EMU's unemployment rate were found to be the two most significant variables that affected the bond yields in the countries analyzed, followed by the US's unemployment rate and the EMU's inflation rate. In contrast, the Kansas City Financial Stress Index appears to be the only market sentiment variable that exceeded the average (39.408) of the grand total of importance values.

The Crisis Period
Table 7 reports the importance of each variable during the crisis period.

Core countries
During the crisis period, the fundamental variables that affected the core countries during the pre-crisis period still preserved their relevance. For Belgium, the EMU's inflation and unemployment rates, the US dollar and the US's unemployment rate were in the foreground. The EMU's unemployment rate was crucial for the Netherlands and was a determinant for all the core countries. The bond yields of Germany, France, the Netherlands and Austria were all influenced by their own country's unemployment rates. Another primary influencer for the bond yields in both Germany and Austria was the US's unemployment rate. The US dollar affected the bond yields of France, the Netherlands and Austria. Austria's GDP change rate was another relevant variable for the country. The economic sentiment indicator of the Netherlands seems to be the only market sentiment variable that had a notable effect among core economies.

Peripheral countries
The country's unemployment rate, the GDP change and the US's unemployment rate represented the three primary influencers for Ireland, with an increased impact compared to the pre-crisis period. Greek bond yields showed higher exposure to the country's economic sentiment indicator, the EMU's inflation rate and the country's government debt to GDP ratio change. The country's inflation rate, the country's government debt to GDP change, the US dollar, the EMU's inflation rate and the EMU's industrial production rate were the most significant influencers of bond yields for Spain, with the Kansas City Financial Stress score being a relevant market sentiment indicator. The country's unemployment rate and the EMU's inflation rate were important for the bond yields of Italy and Portugal. The world GDP's change rate, the US dollar and the US's and EMU's unemployment rates were other noteworthy determinants for Italy. For Portugal's bond yields, the country's GDP change rate, the consumer confidence indicator and the Kansas City Financial Stress score were in the foreground.
Overall, the EMU's unemployment and inflation rates were the biggest influencers on bond yields for the countries analyzed, followed by the countries' unemployment rates. Also, the US dollar had a growing effect on yields. This may indicate that global spillover effects on bond yields were slightly stronger than country-specific ones. Moreover, during this period, one more market sentiment variable (the country's ESI) exceeded the average (39.886) of the grand total of importance values. Fundamental variables seem to become more effective during the crisis period for all countries except Portugal, Germany and France (Table 8). The most notable increases are observed for Ireland, the Netherlands and Austria, while, in contrast, a significant decrease is observed for Portugal.

The Post-Crisis Period
Finally, the importance of each variable during the post-crisis period is reported in Table 9.

Core countries
The EMU's unemployment rate is the most significant determinant for all of the core countries, excluding Belgium, which is also the only core economy affected by the Kansas City Financial Stress score. The country's and world's GDP change rates, the EMU's inflation rate and the financial corporations' debt were other important variables affecting the bond yields in Belgium. Germany appeared to be connected to the US-related indicators during this period, with the Netherlands only bound to the US dollar and Austria to US unemployment. The country's unemployment rate is the primary influencer for Germany and France, and it still influences the Netherlands, although less than the EMU's unemployment rate. The European Brent oil price is the last notable variable affecting bond yields in France. Other significant variables are the country's economic sentiment indicator and the financial corporations' debt for the Netherlands, and the country's GDP change rate for Austria.

Peripheral countries
The country's unemployment rate is the common influencer for bond yields for all countries except for Spain.
Ireland and Portugal are affected by the country's GDP change rate, while the US's unemployment rate is particularly significant for Ireland. Compared to the crisis period, therefore, many variables now exert a lower impact on the bond yields for all countries, although with a similar average (39.132) of the grand total of the importance values. The EMU's and the country's unemployment rates represent the two most relevant variables affecting the bond yields, exactly as in the pre-crisis period. The Kansas City Financial Stress Index still has an influence, while, instead of the country's ESI, the EMU's ESI seems to play a more relevant role. The world's GDP change rate is another determinant exerting a conspicuous influence.
Overall, the fundamental variables continue to be the leading indicators (Table 10). The most notable increases can be observed for Greece and Spain, while a significant decline may be seen in the Netherlands and Austria.

Conclusions
Understanding the determinants of sovereign bond yields can be particularly relevant for policymakers to draw worthwhile hedging strategies and design the most suitable crisis-management policies. At the same time, it may be useful for investors, who can derive relevant information for orienting their investment strategies. In this framework, this paper has contributed to the empirical literature on sovereign bond yields by investigating their determinants in the case of 10 EMU countries (5 core economies and 5 peripheral countries or “PIIGS”) for the period 2001-2015. Our analysis indicates that macroeconomic fundamentals (especially the unemployment rate, the inflation rate and the government debt to GDP change rate) are the main variables responsible for the bond yields in all the countries analyzed, both core and peripheral. These variables can, therefore, be conceived as the main representatives of the economic health of a country and may reveal its vulnerability to international speculators (mainly institutional investors). In other words, by emphasizing the economic fragility of countries, these variables act as signs for attracting bear market speculative attacks that, in turn, generate financial tensions upon the public accounts. In contrast, the sovereign bond yields in the EMU do not seem to be strongly influenced by the global indicators (e.g., the US's unemployment rate, the Brent oil price) during the whole sample period. Another interesting finding of our analysis is that previously active linkages between the US and the EMU have diminished through time, ultimately leading the core and peripheral countries to a position detached from the global economy. This finding provides further evidence that, due to the structural and political weaknesses of the EU, the EMU has been under the speculative attack of investors, attracted especially by those countries with significant structural deficiencies.
In light of this, our results can be particularly useful for designing more tailored policy interventions and strategies in the EMU, given that understanding the determinants of bond yields can also be of great importance for overcoming the debt crisis affecting many EMU countries. It is worth noting that the economic literature has suggested other possible determinants of bond yields in addition to those considered in this paper, such as a change in risk aversion or an update in creditors' beliefs about the likelihood of a sovereign default. Therefore, further lines of research could assess such additional determinants so that policymakers can be fully informed about the potential externalities from a sovereign default. Finally, our findings could prompt a detailed analysis of the risk of contagion resulting from speculative strategies of institutional investors, above all in those countries or areas with structural weaknesses that may significantly suffer from a speculative attack, as happened in several EMU countries in recent years.

Figure 1. The first two PCs for bond yields and the cumulative proportion of variance explained by PCs. Source: Own elaboration.

Figure 2. Scatterplots of the score values of the observations

Figure 3. Change points in means of score values of observations on PC1 and PC2. Source: Own elaboration.

Figure 4. The first two PCs for bond yields and the cumulative proportion of variance explained by PCs (TP1). Source: Own elaboration.

Figure 5. The first two PCs for bond yields and the cumulative proportion of variance explained by PCs (TP2). Source: Own elaboration.

Figure 6. The first two PCs for bond yields and the cumulative proportion of variance explained by PCs (TP3). Source: Own elaboration.

Figure 7. The first two PCs for bond yields and the cumulative proportion of variance explained by PCs (TP4). Source: Own elaboration.

Table 2. Loadings on the first four PCs of TP2 and their explanation rates

Table 3. Means of the variances explained by the RF model

Table 4. Means of the variances explained by the SA-RF model

Table 5. Variables' importance during the pre-crisis period

Table 6 reports the selection rates of variable categories, showing that fundamental variables were more effective for all countries analyzed during the period.

Table 6. Selection rates of variables' categories during the pre-crisis period. Source: Own elaboration.

Table 7. Variables' importance during the crisis period

Table 8. Selection rates of variables' categories during the crisis period

The country consumer confidence indicator is noteworthy for both Spain and Portugal, while the country's inflation rate, the US dollar, the EMU's inflation rate and the financial corporations' debt are other highly influential variables for Spain. The European Brent oil price was a relevant influencer for both Italy and Portugal. The country's inflation rate, the EMU's inflation rate and the US dollar are other notable variables for Italy. Finally, the EMU's inflation rate and the Kansas City Financial Stress score are remarkable determinants for Portugal.

Table 9. Variables' importance during the post-crisis period. Source: Own elaboration.

Table 10. Selection rates of variables' categories during the post-crisis period