Linear Relationship between CO2 Emissions and Economic Variables: Evidence from a Developed Country and a Developing Country

Multiple linear regression (MLR) analyses have been used to explain various linear relationships between CO2 emissions and economic variables. Previous research has suggested that the predictors of CO2 emissions can vary depending on a country's level of economic development and the region in which it is located. This paper investigates the linear relationships between CO2 emissions and related economic variables using MLR analyses for the UK and Malaysia. Unlike typical MLR analyses, which directly identify the most prevalent predictors, this analysis includes an F-test to check the linearity of the relationships, regression equations, and an error analysis to validate the robustness of multiple linear regression as a predictive tool. The linear relationship for the UK data is shown to outperform that for the Malaysian data, with agriculture and transport being the most effective predictors for the UK and Malaysia data respectively. The most effective predictor from the linear relationship would provide valuable information for policy makers and environmental management authorities on potential causes of CO2 emissions.


Introduction
Over the past century there has been a dramatic increase in economic activity all over the world. One side effect of this increase is the escalation of CO2 emissions, which have increased exponentially in the past few decades. Almost 30 billion tons of CO2 enter the atmosphere each year as a result of various human activities (Goodall, 2007). The effect of higher CO2 concentrations on the world population cannot be taken lightly. It has been reported that CO2 emissions are responsible for 58.8% of the greenhouse effect (World Bank, 2007). These effects may pose major environmental threats. An increase in CO2 emissions could have disastrous environmental consequences such as droughts, storms, floods and other environmental calamities. According to one study, global sea level rose by 10-20 cm during the twentieth century (Mukhtar et al., 2004). Not only has sea level risen, but sea temperature has also been reported to be rising. Spence (2005) reported that global CO2 emissions have increased by 30% and the world temperature has risen by 0.3-0.6 degrees Celsius. These environmental instabilities are said to be caused by a combination of intangible and tangible variables such as population growth, economic growth, energy consumption, industrial activities and CO2 emissions. Therefore, the intertwined relationships between these variables and CO2 emissions have attracted the attention of many concerned parties. Developments in these relationships have heightened the need for empirical research.
Much research has been conducted using various scientific methods across regions to investigate these relationships. Lizano and Gutierrez (2007), for example, employed a non-parametric frontier approach to model the relationships among population, gross domestic product (GDP), energy consumption and CO2 emissions. Moran and Gonzalez (2006) proposed a combination of sensitivity analysis and linear programming methods that could identify the main productive linkages between CO2 emissions and human activities. Their case study in Spain suggested that CO2 emissions were related to productive relationships within economic activities. Freitas and Kaneko (2011) analyzed the decoupling of CO2 emissions and economic growth in Brazil. They examined the occurrence of decoupling between the growth rates of economic activity and CO2 emissions from energy consumption in Brazil from 2004 to 2009. This decoupling was highlighted when economic activity and CO2 emissions moved in opposite directions in 2009. Ang (2009) explored the determinants of CO2 emissions in China using aggregate data for more than half a century. The results indicated that CO2 emissions in China are negatively related to research intensity, technology transfer and the absorptive capacity of the economy to assimilate foreign technology. The findings also indicated that more energy use, higher income and greater trade openness tend to cause more CO2 emissions.
The relationships between CO2 emissions and economic variables have also been investigated from a comparative perspective. Besides research on a specific region, a handful of comparative studies have covered different nations and different levels of economic development. Su et al. (2009), for example, analyzed CO2 emissions embodied in trade and investigated the effects of sector aggregation. They conducted empirical studies using data from China and Singapore, in which the energy-related CO2 emissions embodied in exports were estimated at different levels of sector aggregation. Martínez-Zarzoso and Maruotti (2011) investigated the impact of urbanization on CO2 emissions in developing countries from 1975 to 2003. Their study corroborates the existing literature by examining the effect of urbanization while taking into account dynamics and the presence of heterogeneity in the sample of countries. The results showed that an inverted-U shaped relationship between urbanization and CO2 emissions does exist. However, cross-comparisons of the relationships between CO2 emissions and economic variables in developed and developing countries have received less attention. It has been hypothesized that developed countries are more efficient than developing countries in terms of CO2 emissions. Rosa and Tolmasquim (1993), for example, proposed an analytical model to compare energy efficiency indices and CO2 emissions in developed and developing countries. The index of CO2 emissions was about ten times higher in Brazil than in the USA, Japan and Germany. This analysis shows that the efficiency of CO2 emissions seems to differ significantly between developed and developing countries.
The development status of a country depends heavily on its economic development. A developed country is a country which has a highly developed economy and advanced technological infrastructure relative to other, less developed nations. There are no conclusive criteria for evaluating the development status of a country. However, the three most typical criteria for evaluating economic development status are GDP, per capita income, and level of industrialization. A developing country, in contrast, is defined as a nation with a low living standard and an undeveloped industrial base (Sullivan et al., 2003). Clearly, developing and developed countries are distinguished solely on the basis of economic variables. It is therefore imperative to investigate how this segregation of nations based on economic variables may affect efficiency in CO2 emissions. This paper aims to investigate the linear relationship between CO2 emissions and economic variables for a developed country and a developing country. Data on CO2 emissions and related variables from Malaysia and the United Kingdom (UK) are employed to investigate the relationship. The rest of this paper is structured as follows. Section 2 gives a brief theoretical background on the linear relationship between predictors and response variables. Section 3 presents the analysis and results using data from the developed and developing countries. Section 4 concludes.

Predictors and Response Variables in Linear Relationship
A linear relationship between two variables is typically explained by a linear regression model. This model gives a straight-line relationship between two variables (Mann, 2007). Linear regression was the first type of regression analysis to be studied rigorously and to be used extensively in many practical applications. The main reason for this popularity is that models which depend linearly on their variables are easier to fit than models which are non-linearly related to their variables, and the statistical properties of the resulting estimates are easier to determine. The functional form most frequently used for expressing the linear relationship is given by
$$Y' = a + bX \tag{1}$$
where $Y'$ is the predicted value of the $Y$ variable for a selected $X$ value. The constant $a$ is the $Y$-intercept, the estimated value of $Y$ when $X = 0$. Put another way, $a$ is the estimated value of $Y$ where the regression line crosses the $Y$-axis. The constant $b$ is the slope of the line, or the average change in $Y'$ for each change of one unit (either increase or decrease) in the independent variable $X$. Equation (1) describes the relationship between one response variable ($Y'$) and one predictor. However, in many practical applications, more than one predictor can be related to one response variable. The linear relationship between one response variable and several predictors is explained by multiple linear regression.
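As an illustration of how the intercept a and slope b of the straight-line model can be obtained, the following minimal sketch fits the line by ordinary least squares. The text does not name an estimation method, so least squares is an assumption here, and the function name `fit_simple_linear` and the demo data are hypothetical.

```python
def fit_simple_linear(x, y):
    """Return (a, b) for the least-squares line Y' = a + b*X."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of X and cross-deviations of X and Y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx            # slope: average change in Y' per unit of X
    a = mean_y - b * mean_x  # intercept: value of Y' when X = 0
    return a, b


# Illustrative (made-up) data generated exactly from Y = 1 + 2X.
x_demo = [0.0, 1.0, 2.0, 3.0]
y_demo = [1.0, 3.0, 5.0, 7.0]
a_demo, b_demo = fit_simple_linear(x_demo, y_demo)
```

Because the demo points lie exactly on a line, the fitted intercept and slope recover the generating values; with real data the line only approximates the observations.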
Multiple regression analysis has been viewed as a way to describe the relationship between a response variable and several predictors. In multiple linear regression, additional predictors (denoted $X_1, X_2, \ldots$ and so on) are used to help better explain or predict the response variable ($Y$). The general descriptive form of a multiple linear equation is shown in Equation (2). The number of predictors is represented by $k$, which can be any positive integer.
$$Y' = a + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k \tag{2}$$
where $a$ is the intercept, the estimated value of $Y$ when all the $X$'s are zero, and $b_j$ is the amount by which $Y'$ changes when that particular $X_j$ increases by one unit with all other values held the same. The subscript $j$ can assume values between 1 and $k$, the number of predictors.
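The multiple linear form with an intercept and k slope coefficients can be sketched in the same way. The example below is a minimal illustration, assuming ordinary least squares solved through the normal equations (the text does not state which fitting procedure or software was used). The function name `fit_multiple_linear` and the demo data are hypothetical.

```python
def fit_multiple_linear(rows, y):
    """Return [a, b_1, ..., b_k] for Y' = a + b_1*X_1 + ... + b_k*X_k.

    rows: list of observations, each a list of k predictor values.
    y:    list of response values, one per observation.
    """
    n = len(rows)
    k = len(rows[0])
    m = k + 1
    # Design matrix with a leading column of ones for the intercept a.
    design = [[1.0] + list(r) for r in rows]
    # Normal equations (X^T X) beta = X^T y, stored as an augmented matrix.
    aug = [
        [sum(design[t][i] * design[t][j] for t in range(n)) for j in range(m)]
        + [sum(design[t][i] * y[t] for t in range(n))]
        for i in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, m + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][m] / aug[i][i] for i in range(m)]


# Illustrative (made-up) data generated exactly from Y = 2 + 3*X_1 - 1*X_2.
rows_demo = [[1, 0], [0, 1], [1, 1], [2, 1], [3, 5]]
y_demo = [5, 1, 4, 7, 6]
coef_demo = fit_multiple_linear(rows_demo, y_demo)  # [a, b_1, b_2]
```

Solving the normal equations directly keeps the sketch dependency-free; for larger or ill-conditioned problems a numerical library routine would normally be preferred.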
The terms response variable and predictor are used throughout the next section to describe the relationship between CO2 emissions and economic variables.

Analyses and Results
Data on CO2 emissions and associated economic variables from Malaysia and the UK were used to test the relationships using multiple linear regression. This study uses CO2 emissions data for the period 1990 to 2010 for the UK and 1981 to 2005 for Malaysia. Different periods were considered because of limitations in retrieving secondary data. Data for the response variable and predictors were retrieved from the official website of the World Bank (2012).
The predictors for the UK data are energy supply, business, transport, population, agriculture, industrial process and waste management. They are labeled $x_1, x_2, \ldots, x_7$ respectively. The predictors for the Malaysia data are fuel mix, transport, GDP, and population. These predictors are labeled $x_{M1}$, $x_{M2}$, $x_{M3}$, and $x_{M4}$.
CO2 emissions is the response variable, denoted $\tilde{y}$. The relationships between the predictors and the response variable are estimated using multiple linear regression. The results are divided into four subsections to accommodate a comprehensive analysis of the linear relationships.

Excluded Predictors
A decision was made on which predictors could be excluded from the Malaysian model. Table 1 shows that the predictors fuel mix and population can be removed from the full model because their significance values are greater than 0.05. These two predictors are not significant at the 0.05 level. The remaining predictors were used in the next three analyses: to determine the contributions of the predictors towards CO2 emissions, to test the linearity of the model, and to construct the multiple regression equations.

Analysis of Variance (ANOVA) and Coefficient of Determination
The selected predictors were used to model CO2 emissions using multiple linear regression. Table 3 shows summary results of the linear regression model for the Malaysia data. The coefficient of determination and the adjusted R squared are 0.477 and 0.430 respectively, which indicates that about 48% of the variation in CO2 emissions is explained by transport and GDP. The analysis of variance indicates that the p-value for the F test statistic is 0.01, which provides strong evidence against the null hypothesis of no linear relationship. In other words, there is a linear relationship between the predictors and CO2 emissions for the Malaysia data.
A similar analysis was carried out to obtain the linear regression for the UK data. The regression statistics and F-test are shown in Table 4. The coefficient of determination and the adjusted R squared are 0.991 and 0.989 respectively, which indicates that about 99% of the variation in CO2 emissions is explained by business, energy supply, transport and agriculture. The analysis of variance indicates that the p-value for the F test statistic is reported as 0.000, which provides strong evidence against the null hypothesis. There is a linear relationship between the predictors and CO2 emissions for the UK data.
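The relationship between the coefficient of determination and the adjusted R squared reported above can be checked with a short sketch. It assumes one observation per year, giving 25 observations for Malaysia (1981 to 2005) and 21 for the UK (1990 to 2010). The text does not state the sample sizes, so these counts are an assumption, and the function names are hypothetical. The F statistic uses the standard identity in terms of R squared, with no claim about the p-values in Tables 3 and 4.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    k = n_predictors
    return (r_squared / k) / ((1.0 - r_squared) / (n_obs - k - 1))


# Assumed sample sizes: annual data, 1981-2005 (Malaysia) and 1990-2010 (UK).
malaysia_adj = adjusted_r_squared(0.477, n_obs=25, n_predictors=2)
uk_adj = adjusted_r_squared(0.991, n_obs=21, n_predictors=4)
```

Under these assumptions both values agree with the reported adjusted R squared figures (0.430 and 0.989) to within rounding of the inputs.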

Multiple Regression Equations
Regression coefficients for the retained predictors for the Malaysia data are shown in Table 5. The regression equation shows that $x_{M2}$ is the best predictor. Hence, transport is the most effective predictor of CO2 emissions in Malaysia.
Regression coefficients for the UK data were computed in a similar fashion. The four retained predictors are energy supply, business, transport, and agriculture. The regression coefficients for these predictors are shown in Table 6. The regression equation shows that $x_5$ is the most effective predictor for the UK data. That is, of the four retained predictors, agriculture is the most effective.

Prediction Errors
Analysing the linear relationship would be incomplete without investigating the performance of the predicted values against the actual values. The predicted values of CO2 emissions are computed using the regression equations, and their deviations from the actual values are measured. These deviations are called prediction errors. The mean absolute percentage error (MAPE) and the root mean square error (RMSE) are used to measure the prediction errors. They are computed using the following two equations.
$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( A_t - F_t \right)^2}$$
where $A_t$ is the actual value, $n$ is the number of data points and $F_t$ is the predicted value.
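The two error measures can be sketched as follows, assuming the standard definitions of MAPE (expressed in percent) and RMSE, with actual values A_t and predicted values F_t. The function names and the demo values are hypothetical and are not taken from the UK or Malaysia data.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, predicted))


def rmse(actual, predicted):
    """Root mean square error, in the same units as the data."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, predicted)) / n)


# Illustrative (made-up) actual and predicted values.
actual_demo = [100.0, 200.0, 400.0]
predicted_demo = [110.0, 180.0, 400.0]
mape_demo = mape(actual_demo, predicted_demo)
rmse_demo = rmse(actual_demo, predicted_demo)
```

MAPE is scale-free, which makes it convenient for comparing models fitted to series of different magnitude, while RMSE penalizes large deviations more heavily.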
Errors for the UK model and the Malaysia model are summarized in Table 7.

Conclusions
An important element in analyzing CO2 emissions and related variables is a method that can take multiple variables into account and whose relationships are easy to interpret. The model should reflect the contribution of each predictor towards CO2 emissions. Apart from identifying the best predictor, the method is also able to provide evidence on the efficiency of the predictors for two countries with different economic development profiles and regions. In this paper, multiple linear regression was used to identify the most effective predictor of CO2 emissions for the United Kingdom and Malaysia data. Furthermore, this paper also contributed to the identification of the better model for CO2 emissions. The approach has successfully offered predictions of CO2 emissions for two countries with different socioeconomic profiles. The CO2 emissions prediction model of the United Kingdom outperformed the model of Malaysia. The results indicate that the chosen predictors of CO2 emissions in the United Kingdom were the better predictors. On the other hand, the predictors of CO2 emissions in Malaysia were not sufficient to be considered good predictors. Therefore, it is suggested that several new predictors be considered in predicting CO2 emissions in Malaysia. Future research could not only consider the number of predictors but also be extended to searching for the best model for the prediction of CO2 emissions.

Table 1.
Excluded predictors for the Malaysia data

A similar analysis was carried out to decide which predictors can be excluded from the UK model. At a significance level of 0.05, the predictors population, industrial process and waste management can be removed from the full model. The significance values, t values and predictors are shown in Table 2.

Table 3.
Linearity test for Malaysia data

Table 4.
Linearity test for the UK data

Table 5.
Regression coefficients for Malaysia data

The t test shows that the p-values for the model intercept and the coefficients fall in the rejection region, that is, they are less than 0.05. This result provides strong evidence against the null hypothesis. The p-values associated with the transport and GDP predictors are less than 0.05. These results indicate that at least one of the predictors is useful for predicting CO2 emissions. From the values of the regression coefficients, the multiple regression equation can be written as

Table 6.
Regression coefficients for the UK data

Table 7.
Errors of the UK model and Malaysia model

The errors show that the UK model has smaller errors than the Malaysia model. This indicates that the predictors of the UK model perform better than the predictors of the Malaysia model in predicting CO2 emissions.