Modelling Spatial Distribution of the Carob Tree ( Ceratonia siliqua L . ) in Azilal Province , Morocco

Factors determining forest species distribution include, in addition to external factors such as human interference and environmental management strategies, also soil and hydrological characteristics and climate conditions in any given areas. Modelling distribution has practical application in forest conservation and management, and help decision makers to develop strategies aimed at forest restoration, development of mountainous areas and the continuous and sustainable provision of forest-related services. Species distribution modelling (SDM) can be used for predicting species distribution based on tree presence records and on a number of environmental predictors. In this study we used MaxEnt for niche modelling in predicting carob (Ceratonia siliqua L.) trees spatial distribution in the Province of Azilal in Morocco. The results obtained show that a large area of the mountain regions is suitable for the expansion of Ceratonia siliqua stands. These findings will help decision makers in forest planning to better identify suitable sites for carob tree plantations and assess the potential of the exiting populations.


Introduction
The carob tree (Ceratonia siliqua L.) is an angiosperm belonging to the Fabaceae family which can reach fifteen meters in height and live over 200 years (Ait Chitt et al., 2007).Due to its inherent qualities and multiple uses, the carob tree has been cultured and exploited for many millennia with records dating back 4000 BC.More recently, due to its high tolerance to drought, the carob tree has been used to restore marginal semi-arid and arid areas in numerous areas around the Mediterranean basin (Ozturk et al., 2010;Osorio et al., 2011, Bakry et al., 2013).
The carob tree plays an important role from economic, ecological and social stand.The entire plant (i.e.leaves, flowers, fruits, wood, bark and roots) is in high demand and hence heavily exploited.Parts of the plant are suitable as food for human as well as forage for livestock.The tree also serves for other uses including for ornamental purposes, industry, carpentry, beekeeping and traditional medicines (Batista et al., 1996;Tous et al., 1996;Barracosa et al., 2007).
In the Kingdom of Morocco, data and information related to the distribution of the carob tree is scares, scattered and rather inaccurate.The estimated national coverage area of 30 000 ha quoted in the literature appears as a grossly underestimated value (Ait Chitt et al., 2007).Carob stands are widely distributed and found in the provinces of Agadir, Essaouira, Azilal, Beni Mellal, Meknes, Taza and El Hoceima.They consist of natural sparse formations or artificial plantations.Natural carob stands fit in the order of Pistacio-Rhamnenalia (Achhal et al., 1980), which includes matorrals, clear wooded or shrubby groups in association with olive (Olea europea), mastic (Pistacia lentiscus), cedar (Junepurus phoenicea) and argan (Argania spinosa) trees.The carob tree is spontaneous and remarkably present in the thermo-and meso-Mediterranean zones: semi-arid to sub-humid bioclimate except in the very arid areas (Sidina et al., 2009;El Kahkahi et al., 2014).
The carob is qualified as a mid-slope tree.Environmental factors limiting its distribution in Morocco are principally the absolute minimum temperature (>3°C) and altitude with its maximum reach at around 1 150m and exceptionally up to 1 600m (Gharnit et al., 1996).A limited number of studies have discussed the influence of these factors on the spatial distribution of the carob through empirical research.This paper is an attempt to highlight this issue.The Province of Azilal was selected as study area for the spatial distribution of carob populations.
Moroccan carob stands are found mostly in marginal and hilly terrains.Based on current Moroccan forest regulations, local inhabitants have the right to collect wood from the forests under the jurisdiction and management of the state to be utilized as fuel.These activities are difficult to monitor and control and they often lead to exploitation levels that often exceed the carrying capacities of the forests.Nonetheless, as the carob is a prized fruit tree, it is generally more protected by the local inhabitant compared to other native species as it contributions to local income.Furthermore, the carob tree plays a crucial role in protecting soils and regulating the water-cycle.In addition, carob stands are also home to a large number of other organisms hence increasing local biodiversity.Thus, the conservation of existing carob stands and promoting new plantations would almost certainly contribute to combat desertification as well as to improve local biodiversity.There is therefore a need to adequately assess carob land suitability in Morocco through modelling of its current spatial distribution.To achieve this goal, the use of species distribution modelling (SDM) is recommended as a useful tool for characterizing the natural distribution of carob tree within their current range particularly as SDM has a practical application in the development of forest management strategies.Species distribution modelling, or ecological niche modelling (Peterson, 2006), also provides valuable information required in support of nature conservation actions (Lemos et al., 2014).Furthermore, SDM contributes to a better understanding of landscapes variations as a result of climate changes and/or human activities (Saatchi et al., 2008).The principle of SDM is to link species locations with the environmental characteristics in order to predict species occurrence likelihood (response function) and to assess the contribution of each environmental variable to that function (Austin et al., 2006).
In SDM, many statistical models could be used (Hegel et al., 2010;Franklin, 2009).In addition to classical regression methods, machine learning based modeling is widely used including: Artificial Neural Networks (Ripley 1996); Maximum Entropy MaxEnt (Phillips et al., 2004); Random Forest (Lahssini et al., 2015); Classification and Regression Trees CART (Breiman et al., 1984).MaxEnt algorithm is the most popular and widely used and qualified as most efficient in handling complex interactions between response and predictor variables (Elith et al., 2011;Elith et al., 2006) and less sensitive to small sample sizes (Wisz et al., 2008).
The present study aims to highlight the likely occurrence of the carob tree in the Azilal Province, using maximum entropy and GIS-based modelling.

Study Area
This study concerns Azilal Province which extends over an area of 10 758 km² (Figure 1).Approximately 80% of province's terrain is mountainous with peaks exceeding 1 000 m in height.Annual rainfall level varies between 300 and 750 mm.The region is frequently exposed to storms with resulting heavy flood events, particularly during summer and fall.Summer months are usually very torrid due to the southwestly winds known locally as "Chergui" and with temperatures often exceeding 40°C.The climate is Mediterranean, semi-continental, with an arid bioclimate in the lower altitudes, while semi-arid to sub-humid fresh to cold as the altitude increases.
The orographic, lithological and bioclimatic conditions are quite diverse.Thus, forest ecosystems covers almost 35.4% of the province territory or an area of about 354 430 ha.Among the leading plant species, carob formations thrive at the mid slopes (locally called "Dir").

Modelling Approach
Based on species presence data and environmental predictor maps, spatial distribution modelling was used to predict the likelihood distribution for a species' occurrence as function of environmental limitations.Further, MaxEnt software was used to process the existing maps of environmental factors and the collected species occurrence data.Carob tree populations' occurrence data were recorded in Azilal Province (-7°20' 31°20', -5°50' 32°40') through field surveys along 600 km of roads and in areas where the carob tree is expected to be found (either in disturbed environments or forest formations).A global positioning system (GPS) device (Trimble Juno) was used to record spatial location (longitude and latitude) for each population/tree.In case of specific carob population, the registered location was the latitude and longitude taken from the centre of the population.587 points was recorded (Figure 2).Nineteen bioclimatic variables (Table 1) were extracted from WorldClim database (http://www.world-clim.org;Hijmans et al., 2005) to characterize occurrence area.These predictors represent biologically meaningful variables for characterizing species distribution (Lehmann et al., 2011).
In addition, as the carob is qualified as a mid-slope tree, Topographic Wetness Index (TWI) was also used.It is a steady state wetness index which quantifies the topographic control on hydrological processes (Sørensen et al., 2006).The Topographic Wetness Index is a function of slope and upstream contributing area per unit width.The index was described as highly correlated with several soil attributes such as horizon depth, silt percentage, organic matter content, and phosphorus (Moore et al., 1993).The index is defined in equation 1, where "a" is the local upslope area and "tan(b)" is the local slope in radians.

TWI=ln(a/tan(b))
Equation Species distribution modelling tools are used to predict the most suitable areas for a species and infer likelihood of its presence in regions where no systematic data are available (Elith & Burgman, 2002).They can also assess the potential expansion of introduced species in newly colonized areas (Jimenez-Valverde et al., 2011;Jeschke & Strayer, 2008), estimate the future range of a species under climate change (Sinclair et al., 2010) or assist in reserve management (Thorn et al., 2009).Furthermore, linking carob tree occurrence in some well-known locations with environmental data from their landscape and projecting it onto the geographical space (Deblauwe et al., 2008;Elith et al., 2006;Elith et al., 2011) allows to assume the likelihood of carob distribution within the study area.
The principle of maximum entropy, used in the scope of this work, has its origins in information theory (Shannon, 1948) and is incorporated commonly in a Bayesian framework (Jaynes, 2003), where inference using probability distributions requires a prior distribution to represent the current state of knowledge.MaxEnt utilizes a predictive algorithm to maximize the entropy of the observed sample distribution (i.e.species locations), maximize the distribution of the background sample, and minimize the relative entropy of the ratio between these two distributions (Elith et al., 2011).MaxEnt is able to create complex models from presence-only data with low sample sizes (Dudík et al., 2007;Phillips, Anderson & Schapire, 2006;Phillips & Dudık, 2008;Phillips, Dudík & Schapire, 2004;Wisz et al., 2008).
MaxEnt algorithm is implemented within the MaxEnt platform (Phillips et al., 2006).The software version used was 3.3.3k.It is a standalone Java program.Elith et al. (2011) provide an explanation of the algorithm (and software) geared towards ecologists.The MaxEnt software modelling outputs include assessments of model performance, tabulated contributions of each covariate to the models, and mapped probabilities of species at each pixel in the study area (the logistic output).
Furthermore, the free and open source geographic information system, QGIS (http://www.qgis.org),was used to prepare the data, to calculate TWI (using geoprocessing plugin and SAGA extension) and to compile the MaxEnt results.R (R Core Team, 2014) was used to analyse the results and to get zonal statistics.

Data preparation
Environmental predictors were assembled into raster brick of values.Furthermore, a location file of observed points on which carob tree exists across the study area was established.Prior to conducting analysis in MaxEnt, all locational and predictor grids were clipped to the same geographic extent and interpolated to a common resolution and to same geographic dimensions.As MaxEnt requires same pixel size of all rasters, a common spatial resolution of a 90 m per pixel was chosen.In addition, only one occurrence sampling point per 90 m pixel was used for model building and evaluation in order to reduce sampling and autocorrelation bias (Webber et al., 2011).Elith et al. (2011) found that even in the case of correlated variables, MaxEnt performs better than most of other modelling methods.Therefore, all described covariates were retained for the final model.Jackknifing was used to assess relative importance of each variable.

Model building
As recommended by Phillips & Dudık (2008), default settings for MaxEnt software were validated over a range of tests.Used settings have other adaptations (Galleti et al., 2013) such as:

•
Response curves were produced in order to evaluate the model's performance as function of each environmental variable.
• Jackknife test determined the prediction power of each variable.For each variable, model has been trained with that variable as the sole variable and without that variable in order to assess its relative contribution to the model.

•
Cross-validation was used to examine the variability in model building.The leaved part concerned 15% of the sample size.
• Different random seed was selected for each run.The sample data were divided randomly into different partitions for training and testing, and different random subsets of background samples were selected.

•
As noticed by the same authors, adding data for observed sample locations to the background data raised model performance; all samples were added to the background.

Model performance
Area under the curve.The receiver-operating characteristic (ROC) plot's area under the curve (AUC) is a threshold-independent measure of model prediction accuracy.The AUC of the ROC plot is constructed by plotting the false-positive error rate (1-Specificity) on the x-axis versus the true positive rate (Sensitivity) along the y-axis for every probability value predicted by the model.The AUC is the sum of the area occurring under the ROC curve (Hanley & McNeil, 1982).According to Araújo et al. (2005), the model is excellent where: AUC > 0.90, good: 0.80 < AUC ≤ 0.90; acceptable: 0.70 < AUC ≤ 0.80; bad: 0.60 < AUC ≤ 0.70; and invalid for values ranging between 0.50 < AUC ≤ 0.60.In addition, AUC may be used for fine tuning model settings by comparing results based on different sample sizes and predictors (Franklin, 2009).
Model gain.The algorithm within MaxEnt generates the optimum distribution that satisfies all constraints.Each unique model begins with a uniform distribution, which guarantees the maximum possible entropy for that model, and over a set number of iterations the algorithm increases the probability of correctly predicting the locations of the phenomena modelled.The increase in probability is displayed as the model gain, which is calculated as the log of the number of grid cells minus the average of the negative log probabilities of the sample locations (log-loss).The gain increases with each iteration of the model until it falls below the convergence threshold or until the model performs the maximum number of iterations allowed.Smaller log-loss values indicate a higher likelihood of suitable conditions necessary for species occurrence (Phillips & Dudık, 2008).
Predicted distribution map.MaxEnt software produces a map that displays the probability of carob occurrence as a value from 0 to 1 for each pixel (logistic output).From this logistic output, three thresholds were evaluated.The first evaluation was made for a threshold of 0.50 because it is an intuitive cut-off point for high and low probability of occurrence.The second is based on transition between the most suitable areas to unsuitable area with levels of: 0.2, 0.4, 0.6 and 0.8.The third allows a comparison of area strictly unsuitable with those with a small probability.Furthermore, maps realism has been verified with the help of forest managers who worked a long time on the study area.

Results
The main output of SDM is a continuous probability map showing the most suitable area for Ceratonia siliqua L.
(Figure 3).As presented on the receiver operating characteristic (ROC) curve (Figure 4), the AUC value is 0.926 which indicates excellent model predictions of carob tree distribution.This value falls in the upper range of the possible AUC.Thus, the adjusted MaxEnt model is excellent and able to discriminate clearly between random points and the environment associated with locations that are the most suitable for Ceratonia siliqua.
Using the predefined thresholds sets, the continuous probability map could be filtered to show the likely area of carob tree distribution.Based on these thresholds and for the most intuitive cut-off of 0.5, Ceratonia siliqua is predicted to occur in an environment that consists of a cumulative area of about 1 006 km 2 .
Contributions for each predictor were assessed during the process of model building in order to evaluate the gain in model performance with and without each covariate, providing a measure of its relative importance.In carob tree distribution modelling, the contribution of the various predictors differed considerably in both rank and magnitude.Indeed, temperature variables provide high contributions, followed by the rainfall during the warmest quarter.Five covariates provided negligible contributions: TWI, rainfall during the coldest quarter and during the wettest quarter, and annual mean rainfall.Moreover, according to the Jackknife test (Figure 5), the environmental variable which shows the highest gain when used as the sole variable is the Temperature Annual Range (BIO5-BIO6).This predictor seems to have the most useful information for model building.On the other hand, when the temperature seasonality and the mean diurnal range are omitted the model gain decreases.These two predictors seem to hold the most information that is missing from other variables.These statements shows that the two later predictors are necessary for MaxEnt model building in order to achieve a good fit to the training data.Further, the temperature annual range variable, when used as sole variable, gives comparatively better results than other predictors.
Analysing observed pixels value for suitable area (with an occurrence probability > 0.5) and for unsuitable area (with an occurrence probability < 0.1) shows that carob tree distribution varied throughout different bioclimatic locations.Summary statistics are given in Appendix A. The most suitable area seems to be confined to an area with a higher minimum temperature during coldest quarter (BIO 6 and BIO 11) and a low rain during the warmest quarter.Furthermore, compared to unsuitable locations, suitable areas are also characterized by: -a relative homogeneity for predictors' values -for each predictors, the standard deviation is lower compared to unsuitable area, except for TWI where the two values are similar (1.8 for suitable vs. 1.53 for unsuitable);  Where: preds_multi_bio.1 to preds_multi_bio.19= are respectively the BIOCLIMATIC predictors BIO1 to BIO 19 (defined in Table 1); twi_sampled_def_2 = topographic wetness index.

Discussion
The occurrence likelihood map of the carob tree (Ceratonia silique) in the Province of Azilal, Morocco, has cell values ranging from 0.1% to 100%.Reclassifying these values according to the thresholds leads to the identification of a suitability map.The AUC evaluation leads to an excellent predictive ability of the model which is concurred by forest managers through their positive feedback on the suitability map (using a threshold of 0.5).Moreover, the trained model could be extrapolated to the whole of Morocco (data not shown due to the lack of samples covering the whole of Morocco).
Identification of environmental covariates that are noteworthy for their substantial modelling contributions shows that temperature predictors (temperature range: max T° of warmest season -min T° of coldest season, T° seasonality, Min T° of the coldest month) in addition to rainfall levels during the warmest quarter were the most important bioclimatic variables related to Ceratonia siliqua occurrence in our niche modelling analysis.These four bioclimatic variables are related to the presence/absence of frost which is a limiting factor and to reduced rains during the drought season.Soil fertility has not been used in the modelling analysis due to the lack of information.The TWI, which is closely related to soil fertility, does not have a high contribution to the fitted model further indicating that the carob tree is indifferent to soil composition.
The highlighted predictors interfere with forest establishment and expansion.This is not surprising, since carob stands occur on hilly and mountainous sites which are arid and warm.Thus, the absence of a cold season (frost) during the year is considered as the main factor enabling carobs' establishment and development in the Province of Azilal.

Conclusion
This study provides the first predicted potential map for spatial distribution of carob tree in Morocco.The use of MaxEnt modelling shows an exceptional performance in predicting accurately the carob tree distribution in the Province of Azilal, using a few number of occurrence records of carob tree.The application of this approach can be an effective tool for forest conservation, monitoring and management in the context of climate change.
The carob is a multifunctional tree, appreciated by rural communities, due to its products of high value.The expansion of the carob plantations is highly recommended by local communities and the forest administration.Therefore, carob stands should be expanded in suitable areas with a human interference and on marginal lands.
This study can help decision maker to identify suitable land to implement carob plantation programs and optimize their effort in order to ensure forest conservation and restoration.

Figure 1 .
Figure 1.Study area in Morocco

Figure 2 .
Figure 2. Localization of sampling points within study area

Figure 3 .
Figure 3. Map of the logistic output showing the probability of occurrence for Ceratonia siliqua in Azilal Province

-
higher values of BIO1, Bio 6 and Bio 11; -lower value of BIO 18; and for the TWI, mean and median values are higher.It means that pixels with higher TWI values are likely to become saturated before those of lower values.Furthermore, negative TWI values are abundant and seem to be among the driest locations.

Figure 5 .
Figure 5. Variable importance in modelling process assessed through Jackknifing

1 Table 1 .
Predictors used in the modelling process