A Model Selection Procedure for Stream Re-Aeration Coefficient Modelling

Model selection is finding wide application in many modelling and environmental problems. However, applications of model selection to re-aeration coefficient studies are still limited. The current study explores the use of model selection in re-aeration coefficient studies by combining several suggestions from numerous authors on the interpretation of data in re-aeration coefficient modelling. The model selection procedure applied in this research made use of the Akaike information criterion; measures of agreement such as percent bias (PBIAS), Nash-Sutcliffe Efficiency (NSE) and the RMSE-observations standard deviation ratio (RSR); and graph analysis in selecting the best performing model. An algorithm prescribing a generic model selection procedure is also provided. Out of ten candidate models used in this study, the O'Connor and Dobbins (1958) model emerged as the top performing model in its application to data collected from River Atuwara in Nigeria. The suggested process could save software and model developers considerable time and resources, which would otherwise be spent investigating and developing new models. The procedure is also well suited to selecting a model in situations where no particular model enjoys overwhelming support from the observed data.


Introduction
Reaeration coefficient (k2) modelling, as a relatively new and specialized field of study, has evolved over a period of ninety years through contributions by researchers from different parts of the world (Palumbo & Brown, 2013; Omole, 2012; Gayawan et al., 2009; Ye et al., 2008; Longe & Omole, 2008). This has resulted in the development of hundreds of k2 models, often through processes that cost large sums of money, labour and time (Wang et al., 2013). Model developers agree that considerable resources can be saved by comparing existing models and selecting the most representative from a pool of carefully compiled models (Palumbo & Brown, 2013; Wang et al., 2013; Omole et al., 2013; Ritter & Munoz-Carpena, 2013). Indeed, some developed countries have provided guidance on the simulation and assessment of water quality in their respective environments by specifying certain models that have been found useful, thus setting the pace for developing countries to follow suit (Wang et al., 2013). In furtherance of this, hydrologic modellers have arrived at a consensus on the following modelling issues:
i.
ii. That the use of the coefficient of determination (R²) and common error statistics such as standard error (SE) and normalized mean error (NME) is not sufficient for evaluating the performance of k2 models (Palumbo & Brown, 2013; Ritter & Munoz-Carpena, 2013; Moog & Jirka, 1998).
iii. That in the process of evaluating models prior to selection, both graphical and error statistics should be considered (Harmel et al., 2014). It is also widely accepted that statistical evaluation of models must include both absolute error and dimensionless error indices in the analysis of goodness of fit (Omole et al., 2013; Moriasi et al., 2007; Harmel et al., 2014; LeGates and McCabe, 1999).
Hydrologic model developers, however, are yet to reach a consensus on the exact procedure to be adopted in the process of model selection. There is also no unanimity in the interpretation of some of the results from their analyses. In their article, Omole et al. (2013) proposed the use of the corrected Akaike Information Criterion (AICc) in comparing the capacity of models to interpret data from River Atuwara. The current study takes a step further by quantitatively integrating graphic analysis into the procedure for model selection.

Theoretical Framework
The starting point in the model selection process is the short-list of candidate models. This should be compiled carefully to avoid wasted effort, and the basis of selection should be objective, grounded in researcher experience and scientific markers. This is because AIC only selects the most representative model out of the candidate models; that does not necessarily make the most representative model (among the candidates) the best model for the data (Johnson & Omland, 2004). Information criteria should, in themselves, be sufficient to select the best model. However, when no single model provides overwhelming evidence of representation of the real data, it becomes necessary to conduct further statistical and graphic analysis, as proposed by Johnson & Omland (2004). Overwhelming support is defined as w_i > 0.9 (Johnson & Omland, 2004), where w_i is the information criterion (IC) weight of model i obtained from a given set of candidate models. In the current study, both AICc and BIC were used for comparison purposes, even though AICc would have been sufficient since all the models have the same parameters, namely velocity and hydraulic radius. If some of the models had included other known k2 parameters such as slope, temperature, Froude number, time and/or discharge, then BIC would be more appropriate because it penalizes model complexity more heavily than AIC, favouring parsimony. AICc and BIC are defined by equations 1 and 2 respectively (Omole et al., 2013; Burnham & Anderson, 2004; Johnson & Omland, 2004):

AICc = -2 ln L(θ̂ | y) + 2p + 2p(p + 1)/(n - p - 1)    (1)

BIC = -2 ln L(θ̂ | y) + p ln(n)    (2)

where n = sample size, p = number of free parameters, y = observed data and L(θ̂ | y) = maximized likelihood of the model given the data.

Following the IC analysis, statistical analysis using measures of agreement was done. Ordinarily, based on the recommendation of Royall (1997), only the candidate model with the highest weight, (w_i)_max, together with any other candidate models having w_i ≥ 10% of (w_i)_max, should be considered for further statistical tests. In this study, however, all the models were considered for both measures of agreement and graphic analysis, since no model showed distinct performance at any of the stages of analysis.
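For models fitted by least squares with normally distributed errors, the -2 ln L term in equations 1 and 2 reduces to n ln(RSS/n), which gives the computational forms sketched below. This is an illustrative sketch only; the function names and the example IC values are hypothetical, not results from this study.

```python
import math

def aicc(rss, n, p):
    # Corrected AIC for a least-squares fit:
    # rss = residual sum of squares, n = sample size, p = free parameters
    aic = n * math.log(rss / n) + 2 * p
    return aic + (2 * p * (p + 1)) / (n - p - 1)

def bic(rss, n, p):
    # BIC penalizes each parameter by ln(n) instead of 2
    return n * math.log(rss / n) + p * math.log(n)

def akaike_weights(ic_values):
    # w_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2),
    # where delta_i = IC_i - min(IC)
    best = min(ic_values)
    rel = [math.exp(-(ic - best) / 2.0) for ic in ic_values]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical AICc values for three candidate models:
w = akaike_weights([102.4, 103.1, 110.8])
overwhelming = max(w) > 0.9  # Johnson & Omland (2004) threshold
```

When `overwhelming` is false, as here, the procedure continues to the measures of agreement and graphic analysis described next.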
The measures of agreement used for this study are percent bias (PBIAS), NSE and RSR. They are defined as (Moriasi et al., 2007):

PBIAS = 100 × Σ(O_i − S_i) / Σ O_i    (3)

NSE = 1 − Σ(O_i − S_i)² / Σ(O_i − Ō)²    (4)

RSR = RMSE / STDEV_obs = √Σ(O_i − S_i)² / √Σ(O_i − Ō)²    (5)

where O_i = observed data, S_i = simulated data and Ō = mean of observed data.

Next is the graphic analysis. Each model was plotted as simulated data against observed data, and the most visually representative model was allocated the highest weight of 10 (out of 10 candidate models), while the least representative model received the lowest weight of 1. The same allocation of a highest weight of 10 to the best performing model was done at each stage of the IC and measure of agreement analyses. At the end of the analytical process (as detailed in the appendix), the average of all the weights was found for each model.
The model with the highest score (in percent) emerged as the most representative model out of the ten candidate models.
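The three measures of agreement can be computed directly from paired observed and simulated k2 values. The sketch below is illustrative; the sample data are hypothetical, not River Atuwara measurements.

```python
import math

def pbias(obs, sim):
    # Percent bias: 0 is ideal; negative values indicate net overestimation
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)

def nse(obs, sim):
    # Nash-Sutcliffe Efficiency: 1.0 is a perfect fit
    mean_o = sum(obs) / len(obs)
    err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    var = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - err / var

def rsr(obs, sim):
    # RMSE-observations standard deviation ratio: 0 is ideal
    # (the 1/n factors in RMSE and STDEV cancel, so sums suffice)
    mean_o = sum(obs) / len(obs)
    rmse = math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)))
    stdev = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    return rmse / stdev

# Hypothetical observed and simulated k2 values (per day):
obs = [1.2, 1.8, 2.4, 3.0]
sim = [1.1, 1.9, 2.2, 3.3]
```

Each candidate model would be scored with these three functions against the same observed series before weights are allocated.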
Data used for analysis in this study were obtained during the rainy season (high stream velocity, depth and dilution) in July 2009, while data for the dry season (dry weather flow) were obtained in January 2010.
For the purpose of this study, the candidate models and the justification for their short-listing are presented in Table 1.

Information Criteria (IC) Analyses
Results of the AICc and BIC analyses performed on the models listed in Table 1 are presented in Figures 1 and 2.
The model with the lowest IC value is the most preferred. The models were therefore ranked in order of IC value, with the lowest IC value receiving the highest weight. AICc and BIC were in agreement regarding the order of weights of the candidate models for each data set. The Agunwamba et al. (2007) model had the highest weight allocation for the dry season data, while the Bansal model (Bowie et al., 1985) emerged as the most preferred model for the rainy season. The rankings of the other models for each season are displayed in Figures 1 and 2 respectively.
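The ranking rule described above (lowest IC value receives the highest weight) can be sketched as follows; the model labels and IC values are hypothetical placeholders, not the study's actual results.

```python
def rank_weights(ic_by_model):
    # Lowest IC -> highest weight (= number of candidates); highest IC -> 1
    ordered = sorted(ic_by_model, key=ic_by_model.get)  # best (lowest IC) first
    n = len(ordered)
    return {model: n - rank for rank, model in enumerate(ordered)}

# Hypothetical AICc values for three candidates:
w = rank_weights({"Model A": 45.2, "Model B": 41.7, "Model C": 48.9})
```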

Measure of Agreement Analyses
Since the IC analysis did not give overwhelming support to any of the models considered in the study, it became necessary to conduct further analysis using recommended absolute and dimensionless error statistics, in accordance with the recommendations of Johnson & Omland (2004). Results of the measure of agreement analyses are presented in Figures 3 to 8. Percent bias (PBIAS) is a measure of how accurately a model interprets observed data. The ideal PBIAS value is zero; thus, the closer a model's PBIAS value is to zero, the better. However, a negative value indicates model overestimation, and such values were discounted. The PBIAS values obtained for the dry and rainy seasons are shown in Figures 3 and 4 respectively. Thus, in the allocation of weights to the best performing models, all models with PBIAS below zero were given zero weights, while the other models were ranked accordingly. For the dry season data, only five of the models were successful, with the Baecheler & Lazo (1999) model having the optimum PBIAS value; for the rainy season, the Bennet & Rathburn (1972) model was optimal. The Nash-Sutcliffe Efficiency (NSE), which is a dimensionless error statistic, measures the variance between noise and information in simulation problems. Values between 0.0 and 1.0 are acceptable, with NSE values closer to 1.0 preferred. The results of the NSE tests for both the dry and rainy seasons are presented in Figures 7 and 8. They show that the model with the best output among the candidate models for the dry season is the Omole & Longe (2012) model, while the best model for the rainy season is the Owens et al. (1964) model.
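The PBIAS weight-allocation rule described above (models with negative PBIAS score zero; the rest are ranked by closeness to zero) might be implemented as in this sketch, with hypothetical model labels and values:

```python
def pbias_weights(pbias_by_model, n_candidates):
    # Negative PBIAS (overestimation) -> weight 0; remaining models ranked
    # by PBIAS closeness to zero, the best receiving the highest weight
    weights = {m: 0 for m in pbias_by_model}
    eligible = sorted((m for m, v in pbias_by_model.items() if v >= 0),
                      key=lambda m: pbias_by_model[m])
    for rank, m in enumerate(eligible):
        weights[m] = n_candidates - rank
    return weights

# Hypothetical PBIAS values (%) for three of the ten candidates:
w = pbias_weights({"M1": 2.5, "M2": -4.0, "M3": 0.8}, n_candidates=10)
```

Whether excluded models should receive zero or the lowest remaining ranks is a design choice the paper does not spell out; the zero-weight reading used here follows the wording above.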

Graphic Analysis
The plots of all the models against observed data for both the dry and rainy seasons were examined (see Figure 9 for the dry season), and their goodness of fit is summarized in Table 2. A summary of the results of all three analyses was obtained by summing the weights obtained from each analysis and finding the cumulative average. This was used to rank the models in order of performance (column 8 of Table 3). The process suggested that the O'Connor and Dobbins (1958) model is the preferred model among the candidates. The selection of the O'Connor and Dobbins model is plausible for a few reasons. Butts et al. (1970, p. 7) believe the model was developed from a more general theory than most other models. The model also finds wide applicability because it was designed for rivers with depths between 0.3 and 9.14 m and sluggish velocities ranging between 0.15 and 0.49 m/s (Omole et al., 2013, p. 87). River Atuwara had an average dry weather depth of 1.03 m and a dry weather flow velocity of 0.22 m/s, which places it within the constraints of the O'Connor and Dobbins (1958) model.
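The cumulative averaging step can be sketched as follows, assuming one weight table per analysis stage; the model labels and weights are hypothetical, not the study's Table 3 values.

```python
def overall_ranking(weight_tables):
    # Average each model's weights across the analyses and rank best-first,
    # mirroring the cumulative-average column of Table 3
    models = list(weight_tables[0])
    avg = {m: sum(t[m] for t in weight_tables) / len(weight_tables)
           for m in models}
    return sorted(models, key=avg.get, reverse=True)

# Hypothetical weights from the three analysis stages:
ranking = overall_ranking([
    {"M1": 10, "M2": 7,  "M3": 4},   # information criteria
    {"M1": 8,  "M2": 10, "M3": 5},   # measures of agreement
    {"M1": 9,  "M2": 6,  "M3": 10},  # graphic analysis
])
```

The first element of `ranking` is the model the procedure would select.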

Conclusion
The model selection procedure used in this paper was based on a combination of suggestions by different authors on the subject. The study applied statistical tools (information criteria and measures of agreement) and graphical tools to rank the capacity of ten different models to predict observed stream data (Appendix). The procedure produced a top performing model, which in this case was the O'Connor and Dobbins (1958) model. When compared with the Jha et al. (2001) model, which was the recommended model in Omole et al. (2013), it can be seen that the Jha et al. (2001) model was preferred when only statistical tests were applied. However, when statistical and graphic analyses were quantitatively combined, the outcome differed. The procedure described in this research is appropriate for model selection in situations where there is no clear evidence of support for the observed data by any particular model among competing candidates.
Although the original proponents of information criteria regarded them as self-sufficient model selection tools, this study has demonstrated that information criteria may not be the ultimate model selection tool, as the different tests ranked the models differently. It is therefore recommended that re-aeration coefficient modelling scientists and software programmers research further into means of compiling qualified candidate models in order to obtain more reliable results.

Figure 1. AICc and BIC values for Dry season

Figure 4. PBIAS for Rainy season
Figure 7. NSE for Dry season

Figure 9. Plot of observed and simulated k2 values for dry season (reproduced with permission from Omole and Longe, 2012)

Table 1. Candidate models

Table 2. Graphic Goodness of fit for the two data sets

Table 3. Order of model performance in the different analyses