Causes and Effects of Cost Overrun on Construction Project in Bahrain: Part 2 (PLS-SEM Path Modelling)

In Part 1 of this paper, cost overrun factors associated with the construction industry in Bahrain were identified, and risk maps were developed based on a survey and actual construction project records. This part of the paper adopts a structural equation modelling (SEM) approach to assess the effects of cost overrun factors on project cost in Bahrain. SEM is the graphical equivalent of a mathematical representation used to study the relationship between a dependent variable and explanatory variables. It is regarded as an extension of standardized regression modelling and is an important tool for estimating causal relationships between factors. The data collected from the questionnaire and actual projects in Part 1 were modelled and analyzed using SmartPLS v3.0 software. Results showed that approximately 60% of the variance in cost overrun was explained by the factors identified in Part 1. The global goodness-of-fit (GoF) index of the developed model was 0.591, indicating that the model has sufficient power to explain the relationship between the identified factors and cost overrun.


Introduction
Part 1 of this paper focused on identifying and ranking the causes of cost overrun; however, it did not substantively estimate the causal relationships among the construction cost factors. Hence, this study adopted structural equation modelling (SEM) to assess the causes of cost overrun. SEM is a second-generation multivariate data analysis method that can test theoretically supported linear and additive causal models. With SEM, it is possible to visually examine the relationships that exist among variables of interest (cost overrun factors) in order to prioritize resources and make decisions. Latent variables, which are underlying variables that cannot be observed directly and are hard to measure, can be estimated using SEM, which makes it well suited to research problems of this kind. There are several distinct approaches to SEM, such as partial least squares (PLS), which focuses on the analysis of variance and can be carried out using SmartPLS software. PLS is a soft modelling approach to SEM that makes no assumptions about data distribution. Thus, PLS-SEM is a good tool when the following situations are encountered: the sample size is small, applications have little available theory, predictive accuracy is paramount, and correct model specification cannot be ensured (Bacon, 1999; Hwang et al., 2010; Wong, 2010). A PLS-SEM path model is formed by two sub-models: the structural (inner) model and the measurement (outer) model. The structural model describes the relationships between the latent variables (cost overrun and cost overrun groups). The measurement model, in turn, describes the relationships between each latent variable and its block of manifest variables (cost overrun groups and cost overrun factors).

Literature Review
Project success is usually measured by schedule, budget and quality. Broadly, various risks can affect these three basic dimensions of project success. Such risks may be of various kinds and depend on many factors, owing to the uniqueness, complexity and dynamic nature of construction activities. These risks can cause losses that lead to increased costs, time delays and reduced project quality. One of the challenges facing the construction field is how to assess the risk of cost overruns and deliver projects within budget. Many studies in the literature have examined risks in the construction industry; notable among them are the following.
Zou et al. (2007) used a systematic and holistic approach to identify risks and to analyze their likelihood of occurrence and their impacts on the development of construction projects from project stakeholder and life cycle perspectives. The research identified twenty major risk factors and found that these risks are mainly related to (in ranking order) contractors, clients and designers, with few related to governmental bodies, subcontractors/suppliers and external issues. It also found that these risks spread through the whole project life cycle and that many occur in more than one phase, with the construction stage being the riskiest phase, followed by the feasibility stage.
Chileshe and Fianko (2012) explored elements of risk by using an opinion survey approach to collect data from 103 professionals in the Ghanaian construction industry. Significant differences were found between the perceptions of the sub-groups regarding the likelihood of occurrence of threat risks in five categories: construction method; price inflation; exceptional weather; ground conditions and site contamination; and poor communication among the project team. Significant differences between the sub-groups were found in most categories. Nevertheless, there was complete agreement among the three stakeholders (clients, contractors and consultants) regarding the ranking of the financial risk factor "delay in payment" and the economic risk factor "inflation".
Cantarelli (2011), in the Netherlands, discussed the impact of the implementation phase on cost overrun and concluded that the longer the implementation phase, the higher the cost overrun; although the pre-construction phase was significantly shorter than the construction phase, it had the highest influence on cost overruns.
Tipili et al. (2014) conducted a survey to identify the likelihood of occurrence and degree of impact of risk factors on construction projects within the Nigerian construction industry. The results were analyzed using descriptive statistics and analysis of variance (ANOVA), and exposure rating levels were subsequently determined, which enabled the categorization of the probability impact score into low, medium and high levels. The study indicated a disparity in the ranking of the degree of occurrence and impact among the groups. Based on the composite of risk factors, cost-related and time-related risks were found to be the most likely to occur and to have the most impact on projects, whereas the environmental risk factor was found to be a low risk, since it had the least likelihood of occurrence and the lowest impact score.
Sharma and Goyal (2014) carried out a systematic literature review on cost overrun factors and project cost risk assessment for construction projects. They found several factors related to cost overrun and classified them into 11 groups according to the sources of cost overrun. They also discussed the use of fuzzy set theory (FST), a branch of modern mathematics used to model vagueness. FST was used to model uncertainties that involve human intuitive thinking and was presented as a vital solution for assessing risk in the construction industry. The main objective of modelling risk and uncertainty in estimating and forecasting construction cost was to analyse the effect of the associated uncertainty on the cost estimating process in order to obtain more realistic estimates.
Compared with traditional methods of data analysis, PLS-SEM is regarded as an extension of standardized regression modelling and is an important tool for estimating causal relationships between factors. PLS-SEM performs better than other multivariate techniques, including multiple regression, path analysis and factor analysis, in analyzing cause-effect relations between latent constructs. Shanmugapriya and Subramanian (2016) investigated the key factors for safety improvement in construction using PLS-SEM. A model for improving safety performance was developed based on the European Foundation for Quality Management (EFQM) model, which provides a framework for organizational management systems to focus on the implications of leadership factors and process factors in improving safety performance within an organization. Abdul Rahmana et al. (2014) studied significant factors affecting construction waste generation; to minimize construction waste generation, a PLS-SEM model was developed to determine the level of significance of each factor's contribution to construction waste.
PLS-SEM modelling has been deployed in many fields, such as behavioural sciences (Bass et al., 2003), marketing (Henseler et al., 2009), groups and organization research (Sosik et al., 2009), management information systems (Chin et al., 2003), and business strategy (Hulland, 1999). In the construction industry, only a limited number of studies have used the PLS-SEM method, mostly to investigate the factors influencing the performance of construction organizations or to test the effects of different factors on construction projects. This paper, in contrast, analyzes the risks of cost overrun to ascertain whether the findings can lead to more accurate budget estimates in the future by making more realistic allowances for the identified cost overrun risks, and to open the door for further research in Bahrain on the relationships between measures of the contingency percentage and construction project variables.

Methodology
The research required a sample of construction projects that was appropriate for this area of investigation. The data sample had to be large enough to allow statistical analyses of cost overrun factors and project costs. Forty building construction projects were selected based on project type and the availability of estimated and actual cost data. A structured questionnaire survey was used to support and enhance the limited amount of actual project data obtained. The survey material was first identified and examined through a relevant literature review and through advice sought from experienced construction practitioners. The survey included 74 respondents, made up of 42 contractors, 11 clients and 21 consultants (refer to Part 1 of the paper for more information). Both actual and survey data were used to construct a model using the partial least squares approach to structural equation modelling. The model was then evaluated and validated.

Partial Least Squares Structural Equation Modelling (PLS-SEM)
The PLS path modelling method was developed by Wold (1982). Every PLS path model is formed by two sub-models: the structural (inner) model and the measurement (outer) model. The structural model describes the relationships between the latent variables (cost overrun and cost overrun groups). The measurement model, in turn, describes the relationships between each latent variable and its block of manifest variables (cost overrun groups and cost overrun factors). The PLS algorithm aims at estimating the values of the latent variables; it is essentially a sequence of regressions in terms of weight vectors. The weight vectors obtained at convergence satisfy fixed-point equations. The basic PLS algorithm, as suggested by Lohmöller (1989), includes the following two stages. Stage 1 is the iterative estimation of latent variable scores, consisting of a four-step iterative procedure that is repeated until convergence is obtained or the maximum number of iterations is reached. Stage 2 is the estimation of the path coefficients (Figure 1). The minimum sample size required depends on the maximum number of arrows pointing at a latent variable, as specified in the SEM. There are 8 groups of cost overrun factors, one of which consists of 10 variables; therefore, the minimum sample size required is 91. The model was developed using 114 samples, which is acceptable according to Marcoulides and Saunders (2006).
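The two stages described above can be illustrated with a minimal sketch. It is not the SmartPLS implementation: it assumes reflective (Mode A) outer weights and a centroid inner weighting scheme, runs on synthetic data, and all names (`pls_path_model`, `links`, `target`) are hypothetical.

```python
import numpy as np


def standardize(a):
    """Scale to zero mean and unit variance (column-wise for 2-D input)."""
    return (a - a.mean(axis=0)) / a.std(axis=0)


def pls_path_model(blocks, links, target, max_iter=300, tol=1e-7):
    """blocks: list of (n, p_j) indicator arrays, one per latent variable.
    links: symmetric 0/1 matrix marking connected latent variables.
    target: index of the endogenous latent variable."""
    blocks = [standardize(b) for b in blocks]
    weights = [np.ones(b.shape[1]) for b in blocks]
    # Stage 1: iterative estimation of latent variable scores.
    for _ in range(max_iter):
        # Step 1: outer approximation of the scores.
        scores = np.column_stack(
            [standardize(b @ w) for b, w in zip(blocks, weights)])
        # Step 2: inner weights (centroid scheme: sign of the correlation).
        inner_weights = np.sign(np.corrcoef(scores, rowvar=False)) * links
        # Step 3: inner approximation (proxy built from linked neighbours).
        proxies = scores @ inner_weights
        # Step 4: outer weights (Mode A: covariance of indicator and proxy).
        new_weights = [b.T @ proxies[:, j] / b.shape[0]
                       for j, b in enumerate(blocks)]
        change = max(np.max(np.abs(nw - w))
                     for nw, w in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break
    scores = np.column_stack(
        [standardize(b @ w) for b, w in zip(blocks, weights)])
    # Stage 2: path coefficients by least squares on the final scores.
    predictors = [j for j in range(len(blocks)) if links[j, target]]
    x, y = scores[:, predictors], scores[:, target]
    paths = np.linalg.lstsq(x, y, rcond=None)[0]
    r2 = 1.0 - np.sum((y - x @ paths) ** 2) / np.sum(y ** 2)
    return paths, r2, scores


# Synthetic illustration only: two factor groups and one cost overrun measure.
rng = np.random.default_rng(0)
n = 114
f = rng.normal(size=(n, 2))
group_a = 0.8 * f[:, [0]] + 0.6 * rng.normal(size=(n, 3))
group_b = 0.8 * f[:, [1]] + 0.6 * rng.normal(size=(n, 3))
overrun = (0.5 * f[:, 0] + 0.3 * f[:, 1]
           + 0.5 * rng.normal(size=n)).reshape(-1, 1)
links = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]])
paths, r2, scores = pls_path_model([group_a, group_b, overrun], links, target=2)
print("path coefficients:", paths, "R2:", r2)
```

The centroid scheme and Mode A weighting were chosen here only because they are the simplest to write down; the source does not state which options were used in SmartPLS.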

Model Development
To assess the effect of causative factors on cost overrun as a hierarchical conceptualization, a reflective construct was adopted. A hierarchical model based on the groups and factors identified in Part 1 of this paper, showing their relations to the endogenous latent variable (cost overrun), is shown in Figure 2. In the inner model, each group of cost overrun factors (exogenous latent variable) is represented by a circle and linked to the cost overrun circle (endogenous latent variable), where the groups are the cost overrun indicators. The outer model was built by linking the indicators (cost overrun factors) to their latent variables (groups); each group indicator is represented by a rectangle. A two-step process was adopted to evaluate the validity of the PLS model and to ensure that the strength of each factor is reliable and consistent. The PLS path model validation steps are:
• outer model (measurement model) evaluation with regard to the reflective factors' reliability, convergent validity and discriminant validity.
• inner model (structural model) evaluation in respect of variance accounted for, path estimates and the predictive relevance of the inner model's explanatory variables for the endogenous latent variable.
This sequence ensures that the reliability and validity of the measures of constructs are ascertained before attempting to draw conclusions about the nature of the relationships between constructs (Aibinu et al., 2011).

Measurement Model Assessment
a) Factor reliability and convergent validity: Individual factor reliability is the extent to which measurements of the latent variables, measured with a multiple-factor scale, reflect mostly the true score of the latent variables relative to the error (Hulland, 1999). It is assessed through the correlations of the factors with their respective latent variables. To evaluate individual item reliability, the standardized loadings were assessed. According to Hulland (1999), factors with loadings of less than 0.4 should be dropped.
Convergent validity is the measure of internal consistency, which ensures that the factors assumed to measure a particular construct actually measure it and not another construct (Hulland, 1999). Composite Reliability (CR) scores, Cronbach's alpha and Average Variance Extracted (AVE) tests were used to determine the convergent validity of the measured constructs. The cut-off values for AVE, CR and Cronbach's alpha were 0.5, 0.7 and 0.6, respectively. It can be seen from Table 1 that not all of the indicators have reliability values (loadings) larger than the minimum acceptable level of 0.4 (Hulland, 1999); therefore, these indicators have to be eliminated.
The discriminant validity of the measurement was evaluated by analysing the average variance extracted, based on the criterion that "a construct should share more variance with its measures than it shares with other constructs in the model" (Fornell and Larcker, 1981; Aibinu et al., 2011). This can be examined by comparing the AVE a construct shares with itself against the variance it shares with other constructs. For a construct to have discriminant validity, the AVE it shares with itself should be greater than the variance it shares with other constructs. Latent variable correlations were calculated with SmartPLS software and are shown in Table 2. The square roots of the AVE on the diagonal of the matrix are higher than the off-diagonal values in the model. This confirms that all the variables represent their constructs and that discriminant validity is well established.
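The convergent and discriminant validity checks described in this section can be sketched as follows. The loadings and the correlation in the usage lines are hypothetical illustrations, not values from Table 1 or Table 2, and the function names are assumptions; the formulas are the standard ones for standardized reflective loadings.

```python
import math


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 over itself plus the summed error variances."""
    total = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return total * total / (total * total + error)


def drop_weak_indicators(loadings, minimum=0.4):
    """Keep only indicators at or above the minimum loading (Hulland, 1999)."""
    return [l for l in loadings if l >= minimum]


def fornell_larcker_holds(ave_by_construct, correlations):
    """True if sqrt(AVE) of both constructs exceeds every pairwise correlation.

    correlations maps (construct_a, construct_b) to their correlation."""
    for (a, b), r in correlations.items():
        smallest_root = math.sqrt(min(ave_by_construct[a], ave_by_construct[b]))
        if abs(r) >= smallest_root:
            return False
    return True


# Hypothetical construct with one weak indicator (0.35) that gets dropped.
kept = drop_weak_indicators([0.8, 0.7, 0.6, 0.35])
ave = average_variance_extracted(kept)
cr = composite_reliability(kept)
print(f"AVE = {ave:.3f} (cut-off 0.5), CR = {cr:.3f} (cut-off 0.7)")

# Hypothetical pair of constructs with a correlation of 0.40.
valid = fornell_larcker_holds({"A": 0.601, "B": 0.500}, {("A", "B"): 0.40})
print("discriminant validity:", valid)
```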

Structural Model Assessment
The structural model can be assessed by testing the explained variance of the endogenous latent variable and the path coefficients, also termed beta (β) values, of each path; the R² of the endogenous latent variable is used to assess the explained variance. The R² value of the endogenous latent variable can be classified as substantial (0.26), moderate (0.13) or weak (0.02) (Cohen, 1992). The R² of the endogenous latent variable (cost overrun) is 0.593, which means that the 8 latent variables (groups) explain 59.3% of the variance in cost overrun. This indicates that the developed model has substantial power to explain the relationship between the groups of cost overrun factors and cost overrun. In assessing the path coefficients, the beta values of all structural paths are compared; the higher the path coefficient, the stronger the effect on the endogenous latent variable. Table 3 shows that CSM has the highest coefficient value of 0.203. This means that CSM shares a high proportion of variance with, and has a large effect on, cost overrun. The second major construct affecting cost overrun is DDF, with a path coefficient of 0.178.
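As a small illustration of how the Cohen (1992) thresholds quoted above are applied, the following sketch classifies an R² value. The function name and the "negligible" label for values below 0.02 are assumptions added for completeness.

```python
def classify_r2(r2):
    """Label explanatory power using the thresholds quoted from Cohen (1992)."""
    if r2 >= 0.26:
        return "substantial"
    if r2 >= 0.13:
        return "moderate"
    if r2 >= 0.02:
        return "weak"
    return "negligible"  # label assumed here; not named in the text


# R² of the cost overrun construct as reported in the text.
label = classify_r2(0.593)
print(label)
```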
Further, the significance of the path coefficients was tested by calculating t-values using a non-parametric bootstrap procedure in SmartPLS, which provides confidence intervals for all parameter estimates and forms the basis for statistical inference. In general, the bootstrap technique provides an estimate of the shape, spread and bias of the sampling distribution of a specific statistic. Bootstrapping treats the observed sample as if it represents the population. The procedure creates a large, pre-specified number of bootstrap samples (e.g., 5,000).
The PLS results for all bootstrap samples provide the mean value and standard error for each path model coefficient. This information permits a t-test to be performed for the significance of path model relationships (Henseler et al., 2009). Table 3 summarizes the path results, the corresponding t-values, and the estimated p-value associated with each t-value, calculated from a bootstrap run using 5,000 bootstrap samples. It can be observed that all the paths have a significant effect on cost overrun, except the human resources and environmental constructs.
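The bootstrap logic can be sketched as below. This is a simplified stand-in, not the SmartPLS procedure: it resamples rows and re-estimates ordinary least-squares coefficients on standardized synthetic scores rather than re-running the full PLS algorithm, and it uses a normal approximation for the two-sided p-value. The function name and data are hypothetical.

```python
import math
import numpy as np


def bootstrap_t_values(x, y, n_boot=5000, seed=0):
    """Return estimates, bootstrap standard errors, t-values and p-values."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimate = np.linalg.lstsq(x, y, rcond=None)[0]
    draws = np.empty((n_boot, x.shape[1]))
    for b in range(n_boot):
        # Resample observations with replacement, keeping the sample size.
        idx = rng.integers(0, n, size=n)
        draws[b] = np.linalg.lstsq(x[idx], y[idx], rcond=None)[0]
    std_error = draws.std(axis=0, ddof=1)
    t_values = estimate / std_error
    # Two-sided p-value under a normal approximation (an assumption here).
    p_values = np.array([math.erfc(abs(t) / math.sqrt(2)) for t in t_values])
    return estimate, std_error, t_values, p_values


# Synthetic standardized scores: the first predictor matters, the second does not.
rng = np.random.default_rng(1)
n = 114
x = rng.normal(size=(n, 2))
y = 0.5 * x[:, 0] + rng.normal(size=n)
x = (x - x.mean(axis=0)) / x.std(axis=0)
y = (y - y.mean()) / y.std()

estimate, std_error, t_values, p_values = bootstrap_t_values(x, y)
for name, est, t, p in zip(["first", "second"], estimate, t_values, p_values):
    print(f"{name}: beta = {est:.3f}, t = {t:.2f}, p = {p:.4f}")
```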

Model Evaluation
Since PLS makes no distributional assumptions for parameter estimation, the evaluation of a PLS model is based on prediction-oriented measures that are non-parametric (Chin, 1998). The PLS model is mainly evaluated by its goodness of fit (GoF) (Tenenhaus et al., 2005). GoF was employed to judge the overall fit of the model; it is the geometric mean of the average communality (measured here by the AVE) and the average R². It represents an index for validating the PLS model globally, as a compromise between the performance of the measurement model and that of the structural model. The GoF index is bounded between 0 and 1, with GoF-small = 0.1, GoF-medium = 0.25 and GoF-large = 0.36 as cut-off values for the global validation of a PLS model, as adopted by Akter et al. (2011). The following equation was adopted to calculate GoF (Akter et al., 2011):

$$\mathrm{GoF} = \sqrt{\overline{\mathrm{AVE}} \times \overline{R^{2}}} \qquad (1)$$

The calculations are:

$$\overline{\mathrm{AVE}} = (0.601 + 0.500 + 0.556 + 0.641 + 0.585 + 0.590 + 0.572 + 0.673)/8 = 0.5897$$

$$\overline{R^{2}} = 0.593$$

$$\mathrm{GoF} = \sqrt{0.5897 \times 0.593} = \sqrt{0.3497} = 0.591$$

For this model the GoF index was 0.591, which exceeds the cut-off value of 0.36 for a large GoF. This shows that the model has substantial explanatory power.
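The GoF calculation can be reproduced with a few lines of code, using the eight AVE values and the single R² value reported in the text. The function name is an assumption.

```python
import math


def goodness_of_fit(ave_values, r2_values):
    """Geometric mean of the average AVE and the average R²."""
    mean_ave = sum(ave_values) / len(ave_values)
    mean_r2 = sum(r2_values) / len(r2_values)
    return math.sqrt(mean_ave * mean_r2)


# Values reported in the text for the eight groups and for cost overrun.
ave_values = [0.601, 0.500, 0.556, 0.641, 0.585, 0.590, 0.572, 0.673]
r2_values = [0.593]

gof = goodness_of_fit(ave_values, r2_values)
print(round(gof, 3))  # 0.591, above the GoF-large cut-off of 0.36
```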

Model Contribution
The model can be used in construction management as a key business tool to improve construction performance by determining the risks that are present during the delivery of construction projects in Bahrain and lead to significant project cost overruns, and by explaining how each risk correlates with the total amount of cost overrun. This could help construction firms to transform their data into decision support systems for business improvement and to gain competitive advantage. The results can also be applied to the problem of final cost estimation of construction projects, through the allocation of contingency percentages based on the effect of each risk on the total cost overrun. Table 4 can be used not only to prioritize groups of risks but also to provide a better understanding of the contingency management of a project, thus supporting better risk contingency weightings in budget estimates.

Environmental factors (EV) -0.03 0.00

Conclusions and Recommendation
This study investigated various factors affecting cost overrun using the partial least squares approach to structural equation modelling. The data were collected in Bahrain through actual project data and a structured survey. The study provided a novel causal relationship model for the causative factors affecting construction cost overrun. The fitness level of the model is 0.591, which indicates that it has substantial explanatory power. In other words, power analysis has shown that the model can be used to assess the impact of hypothesized relationships among the factors and cost overrun. The project management and contract administration, non-human resources, and contractor's site management factors were major contributing causes of cost overrun.
Estimating procedures for construction projects should take into account the potential cost impacts resulting from each of the risk factors identified in this paper:
• To avoid frequent design changes, detailed feedback from similar previously constructed projects needs to be considered, site surveys and measurements must be assured, and maintenance engineers should be part of the design team from the start to eliminate most of the changes.
• A precise review of the contractor's drawings, specifications and procurement documents to find deficiencies or conflicts early will provide significant savings in cost and time.
• Prioritizing cost overrun risk factors in projects leads to better risk contingency weightings in budget estimates.
Further research could be undertaken to identify whether any project variables (size, type, delivery process, etc.) are related to the accuracy of project cost contingency, in order to produce more accurate project cost estimates.

Figure 1. The Flowchart for the PLS Algorithm

Figure 2. Conceptual Hierarchical Model of Cost Overrun Factors

Figure 3. PLS-SEM Results (Iteration 1) (Shaded factors)

Most of the Composite Reliability (CR) scores, Cronbach's alpha values and Average Variance Extracted (AVE) values were unacceptable. The indicators with loadings of less than 0.4 were removed, and a second iteration was carried out. In the second iteration (Figure 4), all indicators have individual indicator reliability values that are larger than the minimum acceptable level of 0.4, and some are close to the preferred level of 0.7. Cronbach's alpha values are shown to be larger than 0.4, so moderate to high levels of internal consistency reliability have been demonstrated among all eight reflective latent variables. The AVE values of the latent variables in the tested model vary from 0.500 to 0.673, which meets or exceeds the cut-off value of 0.500. Thus, the constructs are considered satisfactory, with evidence of adequate reliability and convergent validity.

Figure 4. PLS-SEM Results (Iteration 2)

b) Discriminant validity of constructs: Discriminant validity indicates the extent to which a given construct is different from other constructs (Hulland, 1999).

Table 1. Results Summary for the Reflective Outer Models (individual factor reliability and convergent validity)

Table 3. Mean, SD, T-Statistics and P Values of Path Coefficients (Inner Model)

Table 4. Groups of Cost Overrun Effects on Total Project Cost Overrun (Model Results)