Beyond Beta-Convergence: Convergence in Differences and its Application to the Russian Regions

The purpose of this paper is to propose a new empirical model capable of highlighting some aspects of cross-economy convergence that cannot be captured by the popular beta-convergence and sigma-convergence models. The idea is to analyse the growth of the economies as a function of the distance between the observed output per capita and the average output per capita within the sample, separating the behaviour of the poorest and the richest economies. After specifying the model, I apply it to the case of the Russian regions over the period 1995-2015 using the fixed-effect estimator. The results show that, despite the existence of a significant beta-convergence process, there is a lack of convergence in differences. When the differences between regional and national output per capita are negative, a positive and significant relationship between growth and levels emerges. This relationship turns out to be negative and non-significant when the differences are positive, thus denoting weak non-linearity between the growth rate and the level of output per capita. Similar results are found for labor productivity.


Introduction
Nowadays the issue of economic growth and convergence is no longer as debated as it was in the second half of the previous century. Attention to it in the economic literature peaked in the 1980s and 1990s. We can distinguish two main groups of economic growth theories: the exogenous and the endogenous growth models. These theories have been extensively tested using cross-country and cross-regional data for several time periods, thus giving rise to the convergence controversy (Note 1).
The neoclassical (or exogenous) theories of economic growth, based on the pioneering paper of Solow (1956), predict long-run convergence in levels of output per capita between economies with the same structural characteristics (absolute convergence), thus implying that all the economies within the sample will reach the same steady state level. If the structural characteristics are not homogeneous, the economies will converge towards different steady state levels (conditional convergence) (Note 2), but their growth rates will not differ in the long-run equilibrium. The neoclassical theories of growth are based on two key assumptions: decreasing returns to scale of physical capital and an exogenous technology level (Romer, 2006). A large number of so-called "endogenous growth theories", often referred to as "AK models", have been developed since the 1960s (Arrow, 1962; Uzawa, 1965; Nordhaus, 1969; Shell, 1973; Romer, 1986; Lucas, 1988; Barro, 1990; Kremer, 1993; Jones, Manuelli, & Stacchetti, 2000). One of the most important innovations was to allow for the existence of increasing returns to scale of physical capital (Dixit & Stiglitz, 1975; Skiba, 1978; Nishimura, 1983; Romer, 1986).
The most common ways to empirically test the neoclassical growth theories are the so-called "unconditional beta-convergence" and "conditional beta-convergence" models.
The unconditional beta-convergence model, first introduced by Baumol (1986), can be represented by equation 1. The existence of a beta-convergence process implies that the coefficient on the initial level of output per capita is negative and statistically significant. From an economic point of view, the beta-convergence process implies that the poorest economies grow, on average, faster than the richest economies (Note 3).
In order to control for the heterogeneity between economies, many scholars have adopted an extended version of the above-mentioned model, called "conditional beta-convergence". It can be represented by equation 2, which extends equation 1 with a vector of variables containing the conditioning factors (Note 4).
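As a point of reference, one common textbook form of the two specifications is sketched below. The notation ($y_{i,0}$, $y_{i,T}$, $\alpha$, $\beta$, $\gamma$, $X_i$, $\varepsilon_i$) is an assumption made here for illustration and may differ from the notation of equations 1 and 2.

```latex
% Assumed textbook form of the unconditional model (equation 1):
% average growth of economy i over T periods regressed on its initial log output per capita.
\frac{1}{T}\,\ln\!\left(\frac{y_{i,T}}{y_{i,0}}\right) = \alpha + \beta \ln y_{i,0} + \varepsilon_i

% Assumed textbook form of the conditional model (equation 2):
% the same regression augmented with a vector X_i of conditioning factors.
\frac{1}{T}\,\ln\!\left(\frac{y_{i,T}}{y_{i,0}}\right) = \alpha + \beta \ln y_{i,0} + \gamma' X_i + \varepsilon_i
```

In both forms, beta-convergence corresponds to $\beta < 0$.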
The empirical models testing for the endogenous growth theories are often based on the conditional beta-convergence model, with the inclusion of factors which are considered exogenous by the neoclassical growth theories (Note 5).
Nevertheless, due to the complex nature of the phenomenon under analysis, the literature is still far from providing robust results that can be considered generally valid. Indeed, Mankiw, Phelps, and Romer (1995) state that the payoff resulting from endogenous growth theories lacks clarity. Furthermore, the same contribution argues that physical capital is even more relevant than human capital in explaining cross-country differences in output per capita, since knowledge can spread more easily between economies.
From the empirical perspective, several sources of bias arise in growth regressions (Aghion & Durlauf, 2005; Acemoglu, 2009). Panizza and Presbitero (2013) state that almost all empirical studies in the field of economic growth provide results which could be biased, mainly because of endogeneity.
In the most recent contributions on economic growth, the GMM estimators developed by Arellano and Bond (1991) and Blundell and Bond (1998) have been widely employed in order to address endogeneity. Nevertheless, the GMM approach, although it enjoys several advantages over other approaches, could lead to unreliable results due to the problem of "over-instrumentation", which in turn leads to weak instruments (Roodman, 2009).
Furthermore, Roodman (2009) suggests using a more straightforward fixed-effect estimator when the time series are sufficiently long, since the dynamic panel bias becomes insignificant.
In addition to the beta-convergence model, the so-called "sigma-convergence model" has gained relevance over the years. According to the definition of Sala-i-Martin (1996), a group of economies experiences sigma-convergence if the dispersion of their real GDP per capita tends to decrease over time. He also highlights the direct relationship between beta and sigma-convergence: if poor economies grow faster than rich economies, the dispersion of real GDP per capita within the sample will decrease (Note 6). Thus, a necessary condition for the existence of sigma-convergence is the existence of beta-convergence, but the existence of beta-convergence does not automatically imply sigma-convergence.
Regardless of the choice of econometric technique, I believe that both the beta and the sigma-convergence models, although capable of providing very useful information, are characterized by some limitations so far ignored in the economic literature.
Firstly, they are very sensitive to the presence of outliers, also because this kind of analysis is generally carried out with a limited sample size (equal to the number of economies, regardless of the length of the time span).
Secondly, the average causal effect measured by the beta coefficient might lead one to conclude that, within the sample, there is a significant beta-convergence process, and hence that the economies have been converging toward the same steady state level. This conclusion is often imprecise. The reason is that behaviour may differ considerably among the economies whose output per capita is below the national average: the beta-convergence process can be entirely driven by economies whose income level is below the national average but which are not the poorest. In such a situation, both sigma-convergence and beta-convergence can coexist with persistent differences in output per capita levels.
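The statement that beta-convergence is necessary but not sufficient for sigma-convergence can be illustrated with a minimal sketch. The helper names (`beta_slope`, `sigma`) and the three-economy log-output figures are hypothetical and chosen only for illustration; dispersion is measured here, by assumption, as the population standard deviation of log output.

```python
from statistics import mean, pstdev

def beta_slope(initial_logs, final_logs):
    """OLS slope of growth (final minus initial log output) on initial log output."""
    growth = [f - s for s, f in zip(initial_logs, final_logs)]
    x_bar, g_bar = mean(initial_logs), mean(growth)
    cov = sum((x - x_bar) * (g - g_bar) for x, g in zip(initial_logs, growth))
    var = sum((x - x_bar) ** 2 for x in initial_logs)
    return cov / var

def sigma(logs):
    """Cross-sectional dispersion: population standard deviation of log output."""
    return pstdev(logs)

# Made-up log output per capita for three economies.
initial = [1.0, 2.0, 3.0]
catch_up = [2.0, 2.5, 3.0]   # poorer economies grow faster and dispersion shrinks
leapfrog = [4.0, 3.0, 2.0]   # poorer economies grow faster but overtake the richer ones

for label, final in (("catch-up", catch_up), ("leapfrog", leapfrog)):
    print(label, beta_slope(initial, final), sigma(initial), sigma(final))
```

In both made-up scenarios the slope is negative, but only in the first does the dispersion fall; in the second the ranking is reversed while the dispersion is unchanged.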
For the above-mentioned reasons, the purpose of this paper is to propose a new empirical model capable of analysing the process of convergence/divergence in differences in terms of output per capita within a sample of economies. This kind of analysis is not an alternative to the popular beta and sigma-convergence models, but it can be carried out alongside them in order to better understand the convergence dynamics within a sample of economies in a given period of time. The mathematical framework and the empirical construction of the model are presented in Section 2.
ijef.ccsenet.org International Journal of Economics and Finance Vol. 12, No. 10; 2020
After the construction of the model, an empirical application is carried out for the case of the Russian regions over the period 1995-2015. The process of convergence between the Russian regions represents an interesting case study, since the Russian Federation is known in the literature as one of the countries characterized by the highest internal disparities at the regional level (Benini & Czyzewski, 2007; Badunenko & Tochkov, 2010). Furthermore, analysing cross-regional rather than cross-country convergence offers some advantages in terms of the homogeneity of several macroeconomic characteristics, which can be considered region-invariant within the same country (Note 7). In a cross-country analysis, by contrast, such observable characteristics would have to be controlled for.
The studies focusing on the process of convergence/divergence between the Russian regions are characterized by relevant differences in terms of methodology, time intervals and variables taken into account. Furthermore, the scarcity of variables covering the entire Russian territory and the years of transition from a socialist to a market economy has been an obstacle to in-depth empirical analysis. Nevertheless, there are general robust findings which describe well the evolution of income disparities between the Russian regions, mainly measured by beta-convergence models. Indeed, there is clear common evidence of beta-divergence between the Russian regions during the first part of the transition period (Popov, 2001; Dolinskaya, 2002; Fedorov, 2002; Berkovitz & Dejong, 2002; Carluer, 2005; Benini et al., 2007; Badunenko et al., 2010), whereas evidence of beta-convergence has been found in the "Putin era" (Akhmedjonov, Chi, & Izgi, 2013; Oshchepkov, 2015), a process which has intensified in particular during the last decade (Vakulenko, 2016; Durand-Lasserve et al., 2018; Kaneva & Untura, 2019).
In a recent contribution, Lehmann, Oshchepkov, and Silvagni (2020), in the spirit of the neoclassical growth theories, estimated growth equations including human capital and migration as conditioning factors. They find convergence over the period 1996-2017, with interregional migration playing a decisive role, whereas no significant impact is found for human capital.

Convergence in Differences
The idea of strong convergence in differences can be represented mathematically by equation 3, which is stated in terms of the real output per capita of the i-th economy at time t, the output per capita of the economy with the highest output per capita at time t, and the average output per capita within the sample at time t. The process of strong convergence implies that all the economies within the sample converge to the same steady state level.
Weak convergence in differences can be represented by equation 4, which is stated in terms of the growth rate of real output per capita. Weak convergence allows for the existence of economy-specific steady state levels. Strong convergence implies weak convergence, but not vice versa.
It is interesting to note that neither beta-convergence nor sigma-convergence implies convergence in differences. Indeed, persistence in differences can coexist both with above-average growth of a subset of the poorest economies and with an aggregate reduction in income dispersion. Instead, the presence of convergence in differences implies both beta and sigma-convergence. Therefore, convergence in differences is the strongest condition of convergence. Although it could be too demanding in the real world, its analysis can provide useful insights about the convergence dynamics and the behaviour of the economies on their balanced growth paths.
Empirically, it is not straightforward to test equations 3 and 4. I therefore propose the following empirical specification:
$$ g_{i,t} = \alpha + \beta \, |D\_Y_{i,t}| + \gamma' X_{i,t} + \mu_i + \tau_t + \varepsilon_{i,t} \qquad (5) $$
where $g_{i,t}$ is the growth rate of output per capita, the variable $|D\_Y_{i,t}|$ represents the absolute value of the difference between the observed level of output per capita and the average level of output per capita within the sample, and $X_{i,t}$ is a vector of control variables. The dummies $\mu_i$ and $\tau_t$ represent, respectively, the time-invariant and temporal effects, whereas the indexes $i$ and $t$ denote, respectively, the i-th economy and the t-th period.
The coefficient $\beta$ tells us how responsive growth is to the distance between the average and the observed level of output per capita. Nevertheless, equation 5 is not our main focus, since the variable |D_Y| pools observations whose output level is below and above the output mean (Note 8). The following two equations are more useful.
$$ g_{i,t} = \alpha + \beta \, |D\_Y_{i,t}| \cdot I(y_{i,t} < \bar{y}_{t}) + \gamma' X_{i,t} + \mu_i + \tau_t + \varepsilon_{i,t} \qquad (6) $$
The variables in equation 6 are the same as in equation 5, with the difference that here the indicator function $I(y_{i,t} < \bar{y}_{t})$ appears, which takes the value one for observations whose level of output per capita is below the national average in a given year and zero otherwise. The coefficient on the interaction between the variable |D_Y| and the indicator function measures the average change in the growth rate associated with a unit increase in the absolute difference between the observed and the average output per capita. Since equation 6 takes into account only the observations for which output per capita is below the average, we can refer to it as convergence from below.
$$ g_{i,t} = \alpha + \beta \, D\_Y_{i,t} \cdot I(y_{i,t} \geq \bar{y}_{t}) + \gamma' X_{i,t} + \mu_i + \tau_t + \varepsilon_{i,t} \qquad (7) $$
Equation 7 is the same as equation 6, with the only difference that the indicator function takes the value one for observations whose level of output per capita is above or equal to the average output per capita in a given year and zero otherwise. Furthermore, since the interaction term ensures that the difference between observed and average output per capita is non-negative, there is no need to employ the absolute value. We can refer to this equation as convergence from above.
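The construction of the regressors used in the three specifications can be sketched as follows. This is only an illustration under stated assumptions: the function name `difference_regressors`, the field names (`abs_d`, `d_below`, `d_above`) and the three-region data are hypothetical, and the sample average is taken to be the unweighted mean across regions in each year.

```python
def difference_regressors(outputs_by_region):
    """Build distance-from-average regressors.

    outputs_by_region: dict mapping region name -> list of output per capita,
    one value per year (balanced panel assumed).
    """
    regions = list(outputs_by_region)
    n_years = len(outputs_by_region[regions[0]])
    rows = []
    for t in range(n_years):
        # Assumption: unweighted cross-regional mean in year t.
        avg = sum(outputs_by_region[r][t] for r in regions) / len(regions)
        for r in regions:
            d = outputs_by_region[r][t] - avg
            rows.append({
                "region": r,
                "year_index": t,
                "abs_d": abs(d),                       # pooled regressor
                "d_below": abs(d) if d < 0 else 0.0,   # |D| times indicator(below average)
                "d_above": d if d >= 0 else 0.0,       # D times indicator(at or above average)
            })
    return rows

# Made-up output per capita for three regions over two years.
rows = difference_regressors({"A": [10.0, 12.0], "B": [20.0, 24.0], "C": [60.0, 54.0]})
for row in rows:
    print(row)
```

Each row contributes to the "from below" regressor or to the "from above" regressor, never to both, which is what allows the two groups to have different slopes.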
We can say that the process of convergence in differences is strong if the coefficient on the interaction term is positive and statistically significant in equation 6 and negative and statistically significant in equation 7. Indeed, the combination of these two conditions implies that the differences between the average output per capita and the observed output per capita decrease both from below and from above. Such a condition is quite strong, since it implies that the economies within the sample converge toward the same or a similar steady-state level of output per capita. If only one of the two above-mentioned conditions is met, we can say that the process of convergence in differences is weak.
If the coefficient in equation 6 is negative and statistically significant, it means that only the richer among the poorest economies converge toward the average output per capita. If the coefficient in equation 7 is positive and statistically significant, it means that average growth increases as the output per capita level increases, thereby violating the neoclassical assumptions.

If the coefficients in equations 6 and 7 are not statistically significant, it means either that the economies have reached their steady state level or that the output level does not explain variations in the growth rate.
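The decision rule described above can be summarised in a short sketch. The function name `classify_convergence` and the numerical inputs are hypothetical; significance is passed in as a boolean because the choice of test and threshold is left to the analyst.

```python
def classify_convergence(coef_below, sig_below, coef_above, sig_above):
    """Classify convergence in differences from the two interaction coefficients.

    coef_below / sig_below: coefficient from the "from below" equation and
    whether it is statistically significant.
    coef_above / sig_above: same for the "from above" equation.
    """
    from_below = sig_below and coef_below > 0   # growth rises with distance below the mean
    from_above = sig_above and coef_above < 0   # growth falls with distance above the mean
    if from_below and from_above:
        return "strong"
    if from_below or from_above:
        return "weak"
    return "none"

# Hypothetical coefficient values, for illustration only.
print(classify_convergence(0.2, True, -0.1, True))
print(classify_convergence(0.2, True, -0.1, False))
print(classify_convergence(-0.2, True, -0.1, False))
```

The last call corresponds to the case of a significant negative coefficient from below combined with a non-significant coefficient from above, which meets neither condition.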
It is necessary to note that splitting the analysis into the estimation of two separate equations, although it can provide a more detailed picture, implies a loss of observations within each of the two models.
Nevertheless, an important advantage of the convergence in differences model over the classical beta-convergence model is that, by exploiting the panel structure, the former uses N × T observations whereas the latter uses only N observations, where N is the number of economies and T the number of periods. Indeed, in such a context each observation is an economy in a given period, which yields T observations for each country/region. It further implies that we can control for the behaviour of each economy at each point in time.
Following the general suggestions of Islam (1995) and Roodman (2009) for the choice of the estimator for the growth regressions, I will estimate the equations 5, 6 and 7 through the fixed-effect estimator.
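As an illustration of what the fixed-effect estimator does with economy and period effects, the sketch below applies the two-way within transformation to a balanced panel with a single regressor. It is a simplified stand-in, not the estimation carried out in this paper: the helper names (`two_way_demean`, `within_slope`) and the noise-free data are hypothetical, controls and robust standard errors are omitted, and a dedicated econometrics package would normally be used.

```python
def two_way_demean(panel):
    """Subtract economy means and period means, add back the grand mean (balanced panel)."""
    n, t = len(panel), len(panel[0])
    row_means = [sum(row) / t for row in panel]
    col_means = [sum(panel[i][j] for i in range(n)) / n for j in range(t)]
    grand = sum(row_means) / n
    return [[panel[i][j] - row_means[i] - col_means[j] + grand for j in range(t)]
            for i in range(n)]

def within_slope(y, x):
    """Slope of y on x after removing economy and period effects."""
    yd, xd = two_way_demean(y), two_way_demean(x)
    num = sum(a * b for ry, rx in zip(yd, xd) for a, b in zip(ry, rx))
    den = sum(b * b for rx in xd for b in rx)
    return num / den

# Made-up balanced panel: 3 economies, 4 periods, true slope 0.5, no noise.
economy_effects = [1.0, -2.0, 0.5]
period_effects = [0.0, 0.3, -0.1, 0.2]
x = [[float((i + 1) * (j + 1)) for j in range(4)] for i in range(3)]
y = [[economy_effects[i] + period_effects[j] + 0.5 * x[i][j] for j in range(4)]
     for i in range(3)]

slope = within_slope(y, x)
print(slope)
```

Because the economy and period effects are removed by the transformation, the slope is recovered regardless of the values chosen for those effects.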

Empirical Application to Russian Regions
In this section, taking advantage of the recent dataset built by Mirkina (2017), I will apply the methodologies described in the previous section to the case of Russian regions in order to analyse the process of convergence between them over the period 1995-2015. The dataset includes time series for 82 Russian regions (namely Republics, Krais, Oblasts, Federal Cities, Autonomous Oblasts, Autonomous Okrugs) (Note 9).
After a descriptive analysis, I will estimate beta-convergence and convergence in differences models for both Gross Regional Product (GRP) per capita and labor productivity.
The definitions of the variables are attached in the Appendix.

Beta and Sigma-Convergence
In order to estimate equation 1 for both output per capita and labor productivity, I employ as explanatory variables not only the initial levels of the variables, but also the absolute value of the difference between the regional level and the national mean of output per capita (|D_Y_95|) and labor productivity (|D_LP_95|) in the year 1995. Consistent with the main literature, the estimates show a significant process of beta-convergence between the Russian regions over the period 1995-2015, in terms of both GRP per capita and labor productivity. Nevertheless, the relatively low R² values suggest that the growth of the regions depends on further factors which are not included in Table 1.
The estimates are the same whether the initial levels or the differences of the variables are used, with only small variations in the values of the intercepts.
In the graph we can observe the dynamics of beta and sigma-convergence of output per capita and labor productivity over the period 1995-2015. In all cases, the beta-coefficients are estimated taking the year 1995 as the initial period. We can notice some interesting characteristics: a) GRP per capita and labor productivity follow almost the same dynamics; b) the process of beta-convergence has intensified since the post-crisis period; c) the intense process of beta-convergence was not accompanied by a process of sigma-convergence, whose levels have been quite persistent since 2007, with a slight increase in the last three years of the sample.
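The remark that only the intercept changes between the two sets of estimates can be illustrated for the case of a signed deviation from the sample mean, which is a pure shift of the regressor. The sketch below uses made-up numbers and a hypothetical helper `ols`; it does not reproduce the estimates in Table 1, and the property does not carry over automatically once an absolute value is taken.

```python
def ols(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    return y_bar - slope * x_bar, slope

# Made-up initial levels and growth rates for four economies.
initial_level = [1.0, 2.0, 3.0, 4.0]
growth = [0.9, 0.7, 0.6, 0.2]

# Signed difference from the sample mean: the regressor shifted by a constant.
sample_mean = sum(initial_level) / len(initial_level)
signed_diff = [v - sample_mean for v in initial_level]

a_level, b_level = ols(initial_level, growth)
a_diff, b_diff = ols(signed_diff, growth)
print(a_level, b_level)
print(a_diff, b_diff)
```

The two slopes coincide, while the intercept of the second regression equals the mean growth rate, since the shifted regressor has mean zero.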

Convergence in Differences
In this section I provide the results of the convergence in differences models. First of all, it is worth analysing the phenomenon graphically as a preliminary step.
Figure 3. Output per capita growth and differentials between regional and national output per capita (yearly observations)
In Figure 3 we can observe the relationship between output growth and the distance between regional and national output per capita. The observations located above the horizontal red line, which represent the majority of the distribution, experienced positive growth in a given year. The observations to the right (left) of the vertical red line represent the regions whose output per capita is, in a given year, above (below) the national mean.
Four main insights emerge from the graph: a) the majority of the observations lie to the left of the vertical red line, implying that relatively few observations are far above the national average; b) among the observations whose level of output is above the average, there are some outliers which push the national mean upward; c) among the observations whose output level is below the national mean, there is a clear positive relationship between output and growth; d) conversely, among the observations whose output level is above the national mean, the relationship between growth and output appears to be negative.
Despite the strong evidence of both beta and sigma-convergence among the Russian regions, it appears that the relationship between growth and output level is non-linear. Indeed, contrary to the predictions of the neoclassical growth models, being among the poorest does not imply having the highest growth rates. In this case, the neoclassical growth hypothesis seems to be verified only among the group of observations whose output levels are above the national average. Therefore, the existence of a significant process of beta-convergence is strongly affected by the presence of outliers whose levels are much above the national mean and whose growth rates are very close to zero.
It is also interesting to notice that the highest growth rates are mostly linked to output values close to the national mean.
Figure 4. Boxplot of the difference between Regional Output Per Capita and National Output Per Capita
The box-plot in Figure 4 highlights the presence of a considerable number of outliers in the right tail of the distribution.
It is now interesting to analyse the estimates obtained from equations 5, 6 and 7, applied to both output per capita and labor productivity.
Robust standard errors in parentheses; *** p<0.01, ** p<0.05, * p<0.1.
The estimated coefficients are almost identical between the output per capita and the labor productivity regressions, but the former have lower explanatory power.
The variables |D_Y| and |D_LP| are not statistically significant, which indicates that the differences between the regional levels of output per capita and labor productivity and their national averages do not explain variation in the growth rates of output per capita and labor productivity.
The estimated coefficients turn out to be negative and statistically significant when the interaction terms |D_Y_NEGATIVE| and |D_LP_NEGATIVE| are considered: therefore, there is no convergence in differences from below. Indeed, the negative estimated coefficients of the two variables imply that, within the group of the poorest and least productive regions, the relationship between levels and growth is positive, which suggests a process of divergence in differences from below.
When the richer and more productive regions are taken into account, the estimated coefficients are not statistically significant. Economically, this result could be due to the fact that the richest and more productive regions have reached their steady state levels over the years (Note 10).
It is now worth estimating the model with the addition of the control variables. The control variables I chose approximate the main conditioning factors suggested by the pioneering literature on convergence (Note 11). Even though the model specification differs from the conditional beta-convergence models proposed by the main literature, I believe that the conditioning factors employed in the conditional beta-convergence model are likely to improve the performance of the convergence in differences models. With the addition of control variables, there are no dramatic changes in the estimates or in their significance. Therefore, the estimates of the variables of interest seem to be robust.
The significance of the variable |D_LP_NEGATIVE| increases and, in general, the controls are statistically significant only when the poorer and less productive observations are considered.
There are therefore enough elements to state that being among the poorest does not imply performing better. The highest performances occur when the regions have an output level close to the national mean, denoting a possible non-linear impact of the level of output per capita and labor productivity on growth.

Concluding Remarks
The methodology implemented in this paper and its application to the case of the Russian regions bring to light some interesting insights. Firstly, the process of beta-convergence between the Russian regions over the period 1995-2015 was not accompanied by a process of convergence in differences: on the contrary, divergence in differences occurred, consistent with the slight increase in the sigma measure over the last 5 years of the time interval. Secondly, the relationship between output growth and output level is non-linear: large distances between regional per capita output and average per capita output (both from below and from above) are linked to the lowest growth rates, whereas growth rates increase as the distances decrease. Thirdly, this non-linear relationship between per capita output growth and per capita output level could be due to non-linearity in the returns to scale of physical capital: low levels of output per capita can be connected to decreasing returns to scale, middle levels to increasing returns to scale and high levels to decreasing returns to scale, depending on the evolution of the production structure of the regions. Fourthly, once regional output per capita exceeds national output per capita, the growth rate does not depend on the level of output, which suggests that in such a situation the regions have reached their steady state level.
An important advantage of the convergence in differences regressions is the capability of considering the growth performance of each economy in the sample at each point in time (and, therefore, at each level of output per capita). Indeed, the behaviour of a single economy is likely to differ over time, and such differences reveal interesting growth dynamics.
Thanks to this approach, we have detected a non-linear behaviour of output per capita and labor productivity growth in a convergence framework.
The sample size and the way the core independent variables are constructed mitigate the endogeneity bias, therefore justifying the use of the fixed-effect estimator. Such a choice is strengthened by the likely presence of time invariant effects which differ across the groups, considering that Russia, given its high number of regions and its size, is characterized by strong internal structural heterogeneity. Nevertheless, different estimators applied to the equations 5, 6 and 7 can perform well when different space and time features of the sample are taken into account.