Spatial Mapping of Soil Properties Using Geostatistical Methods in the Ghazvin Plains of Iran

Geostatistical interpolation is widely used to map spatial variability in physical and chemical properties of soil, such as organic matter content, particle density; and pH. Geostatistical interpolation is a branch of applied science which predicts spatial concentrations at unknown locations at a study area by incorporating limited measured data, which is a major advantage over classical statistics. Although many studies applied geostatistical interpolation in agricultural land, there are still gaps in knowledge in selecting suitable models to map soil properties on a large geographical location. The objectives of this paper were to examine and to map the spatial distribution of the soil physico-chemical properties, including electric conductivity (EC), pH, sodium absorption ratio (SAR), organic matter (OM), percentage of sand, silt and clay, bulk density (ρb), saturate percentage (SP), and mean weight diameter (MWD), at 800 hectares of agro-industrial land at Sharifabad, Qazvin, Iran. The soil samples were collected in total 275 points in a regular grid (100 × 100m) over the study area. The exploratory statistical analysis was applied on the collected data for understanding the distribution of the dataset. The silt content, clay content and OM data showed normal frequency distribution, and the pH data show near to normal frequency distribution. The remaining soil properties data, including SAR, EC, SP, MWD, sand content and bulk density showed log-normal distribution, which was identified by the normality test of Kolmogorov-Smirnov with an error probability of 1%. The spatial characteristics of the dataset were assessed by semivariogram models in GS+ and GIS 10.3 software. Among the four different semivariogram models, namely linear, exponential, Gaussian and spherical, the best performing model was chosen following the highest R2 and lowest error values. The predictive geostatistical interpolation maps for each variable were drawn using ordinary kriging model.


Introduction
The soil properties such as pH, cation exchange capacity, calcium carbonate and organic matter show spatial variation across a study area; therefore, soil properties are studied across the world as it has application in the planning and management of industrial and agricultural land (Wang, Gertner, Parysow, & Anderson, 2000).In general, the soil properties are similar to the adjacent sampling sites compared to the distant sampling sites (Wang et al., 2000).Many studies applied classical statistical method to quantify spatial characteristics in soil properties (Salehi, Safaei, Esfandiarpour-Borujeni, & Mohammadi, 2013;Yemefack, Rossiter, & Njomgang, 2005).However, soil physico-chemical characteristics often exhibit spatial dependency, which cannot be captured by classical statistical methods (Burrough, 1993;Lin, Wheeler, Bell, & Wilding, 2005).To overcome this issue, many researchers applied geostatistical interpolation methods in estimating the spatial variability in soil properties (Cambardella et al., 1994;Webster & Oliver, 2007).
Geostatistical interpolation models are very effective tool in estimating spatial variations of soil properties at large geographic areas (Sokouti & Mahdian, 2011).In theory, the geostatistical models in corporate limited number of field measurement data to a large geographical area and estimate concentration surface of the variables over the whole study area (Sokouti & Mahdian, 2011).The application of geostatistical interpolation models to soil properties' spatial dataareimportant for accurate soil management (Goovaerts, 1999); (Sokouti & Mahdian, 2011).
The kriging is a well-established geostatistical interpolation model which is based on a logic of weighted moving average (Theodossiou & Latinopoulos, 2006).Mohammadi et al. (2000) applied both kriging and linear regression models in the Iran to quantify spatial distribution of several soil properties, including electrical conductivity, saturation percent, sodium absorption ratio and its percentage.The study concluded that the kriging was the preferred method in estimating spatial distribution of soil properties compared to linear regression (Mohammadi, 2000).Similarly, the kriging method was applied in many previous studies to map depth and thickness of soil materials (Bourennane, King, & Couturier, 2000;Marinoni, 2003;Penížek & Borůvka, 2006) as well as characterize soil textures (Jang, Chen, & Kuo, 2013).
In China, Sun, Zhou, and Zhao (2003) observed spatial variations in soil properties, including pH, organic matter, available phosphorus and available potassium, with a significant spatiotemporal variation in phosphorus contents in soil compared to pH levels.Wang, Zhang, & Huang (2009) found that spatial variability in soil carbon content was linked to the regional topography and soil types.Ghasemi et al. (2003) estimated the effect of landform and land-use parameters to the spatial variability of physico-chemical properties of soil at Khuzestan, Iran.The study found an increase in soil salinity from the north to south of the sampled area.The soil properties showed temporal and spatial variations due to the inefficient soil management, fertilization issues, and crop rotation (Quine & Zhang, 2002;Yemefack et al., 2005).
Overall, the application of geostatistical interpolation techniques isan important method for accurate soil management and understanding spatial variations in the soil properties.This study aimed to conduct a comprehensive investigation on the spatial variation of chemical properties of soil at Qazvin province, Iran.The study area consisted of agricultural and industrial land with an area of approximately 800 hectors.In total 275 soil samples were collected across the study area for the laboratory analysis.The soil samples were analysed to quantify the concentration of six soil chemical properties, including pH, electrical conductivity (EC), cation exchange capacity (CEC), percentage of calcium carbonate (CCE), sodium absorption ratio (SAR), and organic matter (OM)), and the recorded data were stored for the further analysis.At the end, the recorded spatial data were incorporated in to geostatistical interpolation models to generate predictive spatial map.

Study Design
The study was carried out in the northern plains of Sharifabadcity at Qazvin province, Iran, consisting of an area of 800 hectares of agricultural and industrial lands.The geographical coordinate of the study area lies between 50°41′E and35°50′N to36°19′N.The minimum and maximum annual average temperature at the study location are 2 °C and 18 °C, respectively, and July and August are the warmest and January and February are the coldest months of a year.The average elevation of the study area is 1250 meters above mean sea level.Figure 1 shows the study area and 275 soil sampling points.

Sample Collection
The aerial photos of the study area were used to identify and separate the different landscapes and land forms.In total 275 soil samples were collected across the study area with a uniform 100×100 meters grid and the topsoil depths at each sampling point were ranged between of 0 to 15 cm.On average 3 kilograms of the soil samples were collected at each sampling location.

Laboratory Analyses
The collected soil samples at each sampling location were first air-dried and then passed through a two-millimetre sieve.The electrical conductivity of the soil samples was measured by electrodes inserted directly into soil water which was extracted from a Lysimeter.The soil pH was measured by pH meter, with the mixing ratio of the soil sample to water was 1:1 (Rhoades, 1982).The sodium content in the soil solution was measured using a flame photometer, and calcium and magnesium using complexometry method.The SAR of each the soil samples was calculated using the equation (1). (1) The CEC was calculated using sodium acetate at pH 8.2 (Rhoades, 1982).The CCEwas recorded by pressure Calcimeter (Olsen, Sommers, & Page, 1982), and the OM in the soil samples was calculated using Walkley and black methods (Lal, 2004).

Classical Statistical Analysis
The exploratory statistical analysis was performed using SPSS 21 software.The distribution of the collected data was evaluated using the significance test of skewness (Zar, 1984).When the recorded data were lacking normal frequency distribution, the distribution was tested against logarithmic scale.

Variogram Analysis
A variogram can characterize the spatial variability of random variables between two locations.In practice, an experimental variogram,γ(h), is used to calculate pairs of data separated by a vector, h (Isaaks & Srivastava, 1989).
where N(h) is the number of data pairs, z(u i )is the random variable, and h is the lag distance.
In this study, the experimental variogram were fitted by a theoretical models, namely spherical, exponential and Gaussian models and the nugget effect (ci), sill (c), and range (a) were recorded (Cheng et al., 2007).When the variogram value in a theoretical model levels out at a certain lag distance, the beginning distance is the range (Fig. 2).The nugget effect represents the variogram value greater than zero at zero lag distance, which is attributed to the measurement errors or spatial sources of variation at distances smaller than the sampling interval.
When the variogram model reaches the range, the variogram value minus the nugget effect is called the sill (Isaaks & Srivastava, 1989).The functions of the spherical (eq.3), exponential (eq.4), and Gaussian (eq. 6) models are defined as follows (Isaaks & Srivastava, 1989): When lag distance is short, the spherical, exponential, and Gaussian variogram models typically exhibit linear, fast, and slow increases, respectively.This study selected the final variogram models based on least-squares (R 2 ) and mean squire error (MSE) value.The variograms were calculated in different directions to measure the anisotropy of spatial variability.In general, the variogram models include geometric and zonal anisotropies (Deutsch, Srinivasan, & Mo, 1996).The geometric anisotropy occurs when the range varies with the direction of the variogram for the constant sill.The zonal anisotropy occurs when both the range and sill vary with the direction of the variogram (Cheng et al., 2007).This study considered only the geometric anisotropy.

Ordinary Kriging
Ordinary kriging (OK) is the most fundamental geostatistical method for modelling spatial distributions of random variables, which is a linear weighted average interpolation technique.The OKestimator equation is expressed as (Isaaks & Srivastava, 1989): Where λj is a weighting factor of Z(u j ).
OKis unbiased for expected values of estimators and random variables, and it minimizes the variance of estimation errors (Isaaks & Srivastava, 1989).In addition, this kriging estimator provides a variance of estimation errors termed kriging variance, which quantifies the uncertainty of the estimate at each site (Cheng et al., 2007)

Indicator Kriging
Indicator kriging (IK) is a nonparametric geostatistical method for estimating the probability in which the attribute values are not greater than a specific threshold, zk, at a given location u (Goovaerts, 1997).In the IK, the spatial variable (Z(u)) is first transformed into an indicator variable with a binary response which is expressed as the equation below (Goovaerts, 1997): The expected value of I(u; z k ) is conditional on n surrounding data which is expressed as follows (Goovaerts, 1997): (Cheng et al., 2007).
Where z*(u i ) and z(u i ) are the estimated and measured values of random variables, respectively, at the i th site; N is the number of measured data; and σ k (u i )andσ K 2 (u i ) are the kriging standard deviation and variance, respectively, at the i th site (Cheng et al., 2007).

Descriptive Statistics
Table 1 shows the summary statistics of the investigated soil physico-chemical properties.According to (Carvalho, Silveira, & Vieira, 2002),when the values of skewness and kurtosis are close to 0 and 3, respectively for a specific parameter, the data are considered normal frequency distribution.Based on this criterion, silt content, clay content, and OM showed normal frequency distributions (Table 1).Figure 3 shows that pH have frequency distribution near to normal.Whereas, SAR, EC, SP, MWD, sand content and bulk density show a log-normal distribution, which is confirmed by the Kolmogorov-Smirnov'snormality test, with error probability of 1% .Table 1 shows that the lowest and the highest coefficient of variationare for pH (5%) and EC (78%), respectively.Dafonte, Guitián, Paz-Ferreiro, Siqueira, & Vázquez (2010) classified coefficient of variation as low (<10%), medium (10-20%), high (20-30%)and very high (>30%).Moasheri & Foroughifar (2013) identified that the low coefficient of variation for pH was caused by parent material composition in soils while the high coefficient of variation might be due to the land management factors such as fertilization and land-uses type.In most cases, variables with a high coefficient of variation (>50%) were of non-normal frequency distribution.The logarithmic transformation reduced the data coefficient of variation and skewness as well as increased normality (Table 1).According to Moasheri and Foroughifar et al. (2013), logarithmic transformation of skewed dataset is an effective approach in reducing the skewness and coefficient of variation.The spatial variability in soil properties is caused by the variations in depositing environment, difference in aggregation, or hydrological difference in different land forms.In addition, soil properties in agricultural lands can be affected by irrigation, fertilization, or drainage pattern (Moasheri & Foroughifar, 2013).These factors have effect on the data distribution pattern.

Spatial Autocorrelation
In this study, the experimental semivariogram variables were illustrated the spatial variability of physico-chemical properties in soil.The semivariogram analysis showed insignificant anisotropy; therefore, the isotropic semivariogram was used in the next stages.Figure 4 shows the best fitted semivariogram models of the investigated soil properties.The best models for the semivariograms was selected based on the lowest root sum square (RSS) and the highest correlation coefficient (R 2 ) (Robertson et al, 2008) (Table 2).The bestfittedsemivariogram models for each soil properties are shown in Table 2.The exponential model was fitted to the semivariograms of EC, pH, sand content , silt content, caly content, SP, MWD, bulk density and OM, and the spherical model was fitted to SAR.Siqueira, Vieira, & Ceddia (2008) reported that the spherical model is the most suitable semivariogram analyses for soil and plant attributes.Whereas, Liu, Shi, Jiang, Bae, & Huang (2009) and Júnior, Lana, & Guimarães (2007) reported that the exponential model is the most suitable for assessing spatial variability in soil chemical properties.Subramoniam, Bera, & Sharma (2011) found that the exponential model was among the best for this analysis because it explained maximum variability in the spatial dataset.In this study, the coefficient of determination for pH and EC was highest for the exponential model while it was lowest for Gaussian model.Mehrjardi, Jahromi, Mahmodi, & Heidari (2008) found the spherical model as the most appropriate spatial model for the EC and the SAR.Similarly, Miller, Singer, & Nielsen (1988) concluded that the spherical model was the most appropriate spatial model for EC.The spherical model is one of the most common geostatistical model fitted for soil properties (Miller et al., 1988).In this study, the nugget effect (C0) values for all semivariogram models were low (0.013-0.093), indicating relatively low measurement errors(Table 2).The values of the ranges (a) are varied between 600-2010 (Table 2).The mean error (ME) was lower for the EC, particle density, pH and OM, and RMSE was higher (7.74) for SP.
Figure 4 shows the selected varigram model specifications, namely nugget effect, threshold and their impact radius.The nugget effects are attributed to the measurement errors or spatial sources of variation at distances smaller than the sampling interval or both.In general, the measurement error occurs in the spatial data due to the errors in measuring devices.It is essential to gain proper understanding of the scales of spatial variation.Lag distance is a distance beyond which the samples do not affect each other or do not show enough dependence and the spatial points can be considered independently of each other.The lag distance values were varied in this study (Table 2).Such lag distance provides information in relation to the limit of permitted distance for sampling.The lag distance values for pH and SAR have strong spatial dependency compared to the remaining variables.
The lag distance of variograms ranged from 732 meters for calcium carbonate to 2046 meters for cation exchange capacity (Table 2).The pH showed high lag distance (450); whereas organic matter, electrical conductivity, and sodium absorption ratio are of low lag distance.The low lag distance in EC (about 220m) is probably linked to the low salinity in the study area.The low lag distance in organic matter and SAR can be attributed to the land management factors, including change of land-use, irrigation, fertilization, and plowing.

Spatial Characteristics of Soil Properties
The high values of SAR (>15) indicate greater risk of sodicity problem and increase likelihood of particle dispersion.The threshold between sodic and non-sodic soils is a SAR of 13.SAR is classified as very good (less than 5), good (5-10), rather hazardous (10-15) and dangerous (more than 15).In this study, almost 95% of the SAR was varied between very good and good, and less than 1% fell in the dangerous class (Fig. 5).The spatial distribution of EC showed that the 95% of salinity in the study area was below 2 dS/ m (Fig. 6).According to the study conducted on the quality of irrigation water in the area, the salinity was estimated at less than 2.0 dS/m.About 50% of areas had pH between 7.8 and 8.2 and about percent had between 8.3 and 8.7 (Fig. 7).The spatial distribution of the OM indicates the most favourable conditions for irrigation in the study area are owing to the manure from the live-stock farm (Fig. 8).About 90 percent of the soil in the study area contained organic matters more than 1.5%.According to the zoning map of silt percentage (Fig. 9), it can be concluded that the most of agricultural and industrial lands soils in Sharif Abad have 30-45% silt content, indicating the soil in the studied area is not heavy textured and hydraulic conductivity is from medium to high.Fig. 10 shows that the spatial distribution of clay in the studied area is stable as 93% of the area contains 15-30% of clay.In addition to its impact on soil texture, clay contents in soil play an important role in preservation of water and nutrients.Similar to this study, Sun et al (2003) observed spatial variation in soil properties, including pH, organic matter, available phosphorus, and available potassium, in China.The study also observed significant spatiotemporal variations in the phosphorus content in soil compared to pH levels.Wang et al. (2009) found that spatial variability in soil carbon content which is linked to the regional topography and soil types.
Figure 11 shows that the particle density is lower than that of the mineral soil across the investigated area, which is due to the high level of organic matters in the region.The soil textures in the region were in the group of Loam and Sandy Clay Loam.As such, the saturation moisture is in the medium range.Usually, saturation moisture less than 40 percent is related to sandy soil, between 40-60 is related to medium-texture soil and more than 60 is related to heavy texture.According to the spatial interpolation model map in Fig. 12, more than 90% of the area has saturation moisture of 40-55%, which is related to the soil texture.
Soil aggregation is influenced by several factors associated with the soil clay fraction such as exchangeable cations, clay content, clay type, and the amount of dispersible clay.Clay particles are involved in binding or cementing soil particles but when clay particles are flocculated and aggregated by calcium ions and organic matter, they may not be extensively involved in binding or cementing other particles.Beside clay content, the EC, SAR, and pH have influence in soil aggregation.The spatial distribution of MWD suggests high salinity and SAR caused that the soil aggregates be degraded (Fig. 13).Moreover, MWD is higher (>2) in cultivated and non-saline areas while the MWD is less (<1) in soils with high salinity and SAR.

Conclusion
In this study, 275 soil samples were collected at Qazvin province, Iran, to investigate spatial variation of six types of soil chemical properties, including pH, EC, CEC, percentage CCE, SAR, and OM.The overall results have been summarised below: 1) The frequency distribution for the collected phyco-chemical spatial data of SAR, EC, SP, MWD, Sand content and bulk density had log-normal while only pH had normal distribution.
2) Geostatistical interpolation identified a strong spatial variability (< 0.25) for pH and SAR data; whereas the remaining variables showed a medium spatial variability (> 0.5).
3) The best samivarigram models for all variables (except SAR) was spherical, followed by an exponential model.
4) The semivariograms showed the zones of higher variability which indirectly helped in identification of soil sample locations precisely.Therefore, the kriging outputs are very much useful for mapping spatial variability in large scale/cadastral level.
5) This study showed that the ordinary kriging interpolation model is very effective land management approach as it provides reliable probability distributions of soil parameters.
6) The developed spatial modelling technique is very effective in mapping soil physico-chemical parameters in agricultural land, which has application in improving site-specific fertilizer scheme and enhancing crop production.

Figure 1 .
Figure 1.Themap shows the geographic location of the study area in Sharif Abad city at Qazvin province, Iran,and position of sampling points across the study area

Figure 2 .
Figure 2. The components of a theoretical variogram include nugget effect (C 0 ), sill and range (a)

Figure 3 .
Figure 3.The histograms (bar) and density plots forpH and SAR

Figure 5 .
Figure 5.Spatial distribution map for SAR using ordinary kriging

Table 1 .
Descriptive statistics of soil samples analysesat 275 spatial points.

Table 2 .
The semivariogram parameters and the final exponential models