Development of a Bayesian Belief Network Model Framework for Analyzing Farmers’ Irrigation Behavior

Canal operators need information to manage water deliveries to irrigators, especially in on-demand irrigation supply systems. Short-term irrigation demand forecasts can provide potentially valuable information to a canal operator who must manage such a system, especially if the forecasts can be generated from readily available information about the bio-physical conditions of the irrigated area and the decision-making processes of irrigators. Additionally, Bayesian models of irrigation behavior can provide insight into the likely criteria that farmers use to make irrigation decisions. This paper develops a Bayesian belief network (BBN) to infer the irrigation decision-making behavior of farmers based on factor interaction and posterior probabilities. The model discussed here was built from a combination of data about the biotic, climatic, and edaphic conditions under which observed irrigation decisions were made. Of all the initial trials, the model built from data comprising conditions on the day the irrigation decision was made and the day before it was found to be the best, and it is presented and discussed here. The paper includes a case study using data collected from the Canal B region of the Sevier River, near Delta, Utah. Alfalfa, barley, and corn are the main crops in the Canal B area. The model was tested with a portion of the data to affirm its predictive capabilities. It was found that most of the farmers used consistent rules throughout all years and across different types of crops. Soil moisture stress was found to be the most likely significant predictive variable of the irrigation decision. Irrigation decisions appeared to be triggered by a farmer's perception of soil stress (or a surrogate thereof), or by a perception of combined factors such as information about a neighbor irrigating or an apparent preference to irrigate on a weekend. Soil stress resulted in irrigation probabilities (the chance that the farmer will irrigate) of 94.4% for alfalfa. Prediction accuracy of the timing of irrigations was observed to be 81.0% for alfalfa and 61.0% for barley and corn. The study shows that BBNs can be a prospective tool to analyze likely decisions about irrigation in an on-demand system with good accuracy.


Introduction
Irrigation is an integral part of agriculture. Crop water demand fluctuates throughout the growing season, with high demands occurring during warmer and windier conditions. This brings uncertainty to farmers' future irrigation decisions. A reliable ability to predict a farmer's future irrigation actions could be useful for better operation of irrigation canals in response to fluctuations in short-term water demand. A deterministic model of crop water requirements can prescribe when and how much a farmer should have irrigated on a certain day (Smith, 1992; Jones et al., 2003; Merot & Bergez, 2010; Igbadun, 2012), but it typically cannot shed light on the inherent reasons why a farmer decides to irrigate. A soil moisture balance model would suggest that irrigation occur as soon as there is stress (Allen et al., 1998) indicative of deterioration in the crop condition. Deterministic models (Smith, 1992; Jones et al., 2003; Merot & Bergez, 2010; Igbadun, 2012) also need data on estimated amounts of water delivered, conveyance system design, system efficiencies, etc., to make a reasonable model of irrigation practices. Many of these types of data are unavailable or, often, proprietary for any given irrigation company.
To anticipate future irrigation actions, an analysis of previous irrigation practices and identification of patterns in them is necessary. A wide range of data sources is available. These comprise scientific measurements as well as expert judgment about variables derived from prior experience. The problem also involves fields such as economics, hydrology, sociology/anthropology, and irrigation engineering. This means a model for analyzing irrigation decisions must combine categorical and continuous variables, which is not possible in conventional approaches such as soil moisture balance calculations. Studies have been conducted individually in all these fields with regard to farm operations (Smith, 1992; Jones et al., 2003; Merot & Bergez, 2010; Igbadun, 2012), but no study in the literature combines all these fields into a model for analyzing irrigation decisions.
Bayesian belief networks (BBNs) can be used to study problems that involve decision-making under uncertainty and to make inferences about the related behavior (Pearl, 1988; Varis & Kuikka, 1999; Cain, 2001). These models can make use of available data and provide information to infer the reasons that led to the decision being modeled (Varis & Kuikka, 1999). Bayesian models are characterized by their simplicity, ease of interpretation, and viability. Such methods are cost-effective since they can provide results with the available information about the problem. Bayesian models have been applied in ecology (Haas, 1991, 1992; Crome et al., 1996) and environmental management (Ellison, 1996; Wolfson et al., 1996).
Some studies in the literature focus on farmer decision behavior and are summarized in the following paragraphs. Becu et al. (2006) developed a multi-agent model to understand water sharing between two villages, one upstream and the other downstream. Farmer behavior in making decisions regarding planting crops, irrigation, harvesting, etc., was studied. Since different cropping patterns were identified in the region, agent farmers were divided into sub-classes. (Agents in a multi-agent model represent active constituents of an environment, such as a farmer. The behavior of agents is defined, they are placed in an environment through connections to other agents, and then a simulation is run.) An irrigation decision was made on the basis of an irrigation schedule for each type of crop. The agent in this case had to decide the amount of water to be supplied to each plot, which was computed as the biophysical requirement for water. Bontemps and Couture (2002) studied farmers' water consumption while being charged minimally for water use. The farmers did not bear the full cost of irrigation supplies. The study formulated a sequential decision model to analyze farmers' irrigation behavior. Le Bars et al. (2005) developed a discrete event simulator called MANGA using a multi-agent systems paradigm. Two types of agents were considered: (a) cognitive, the human element, representing farmers and the water supplier, and (b) reactive, which modeled crops, information suppliers, and climate. The objective was to simulate evolving farmer-agents over years, given a limited water resource. The model was useful for analyzing water use and its effects on yields at both individual and system-wide levels. It could also be used to verify various scenarios in a given problem without having to contend with them in the field.
Overall, these studies built representative farmers and created scenarios of how farmers might act. They did not study how target farmer groups actually behave, nor did they look at the variables that might be affecting farmer decision behavior.
Among models built to analyze crop irrigation decisions, some tools have gained prominence in past years and are reviewed here. Several decision tools have been developed to assist farmers with irrigation scheduling, such as CROPWAT (Smith, 1992), the Decision Support System for Agro-technology Transfer (DSSAT) (Jones et al., 2003), IRRIGATE (Merot & Bergez, 2010), and the Irrigation Scheduling Impact Assessment Model (ISIAMOD) by Igbadun (2012). These models have provisions to calculate crop water and irrigation requirements given soil, climate, and crop data. They also allow for preparation of irrigation schedules for various crop water management scenarios. These models can also be useful for the evaluation of farmers' irrigation practices and can be used to compare yields under rain-fed and irrigated conditions.
The literature review of the modeling tools currently available shows that none of them analyzes farmers' decisions. Instead, they simulate farmers' decisions and provide a platform for growers to test their decisions and evaluate the outcomes. They can assist farmers in decision-making but cannot explain why a farmer irrigates on a certain day.
The main focus of this study is to analyze the factors believed to affect farmers' irrigation decisions and to utilize the results to provide a mechanism for analyzing short-term irrigation decisions. The work reported here is a first attempt at studying farmer irrigation decision behavior for which information is, or can be made, available. The objective was to infer why farmers decide to irrigate on certain days as opposed to others. Is revenue maximization one of the goals for irrigation? Which measured variables in the soil-plant-water system best account for the decisions that are actually observed? We study irrigation decisions by using plant, weather, and soil conditions on the day the decision to irrigate was made and on the day before it. Representative variables have been used to construct a modeling framework for the problem. The learning capabilities of BBNs have also been exploited here. Since learning is data-intensive, we have used data from the years 2007-2010 for the case study area. The model was tested with a subset of the data and used to make inferences about future irrigation decisions.

Learning Bayesian Belief Networks
The problem involves classifying decisions into two mutually exclusive classes on any given day during the growing season, i.e., a decision to irrigate or a decision not to irrigate. Figure 1 shows a BBN with three nodes and illustrates the modeling of cause-effect types of relationships. BBNs represent a system as connections between variables (nodes) and define the relationships between variables with probabilities, denoting the magnitude of effect of one variable on another (Jensen, 2001). This makes it easy to visualize and interpret the relationships between variables. The network input parameters are prior probabilities, conditional probabilities, and the posterior probabilities (on outputs). The likelihood of an input variable being in a certain state is called the prior or unconditional probability. If a node has inputs from two or more other nodes, then the likelihood of the state of that variable depends on the state of the input nodes affecting it and is called the conditional probability. The posterior probability is the probability that a variable is in a certain state resulting from the combined effects of the input variables, conditional probabilities, and linkages.
The variables of a BBN are known as nodes. A BBN is based on Bayes' probability rule. It updates existing beliefs with new evidence and finds the marginal posterior probability for each node/variable. It can use a combination of the following at the same time: (a) continuous and categorical variables, (b) empirical variables and variables based on expert judgment, and (c) deterministic or stochastic relationships, or probabilities learned from data. BBNs can evaluate the outcome of an event by forward propagation and learning, and they can find the probabilities of factors contributing to an output of a natural system through backward propagation.
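To make forward and backward propagation concrete, the following is a minimal sketch of inference by enumeration on a three-node network with two parent nodes and one child node. The node names and all probability values are hypothetical and chosen only for illustration; they are not taken from the study or from Figure 1.

```python
from itertools import product

# Hypothetical priors for two parent nodes (illustrative numbers, not from the study).
p_stress = {"yes": 0.3, "no": 0.7}
p_weekend = {"yes": 2 / 7, "no": 5 / 7}

# Hypothetical CPT: P(Irrigate = "yes" | stress, weekend).
p_irrigate_yes = {
    ("yes", "yes"): 0.95,
    ("yes", "no"): 0.80,
    ("no", "yes"): 0.20,
    ("no", "no"): 0.05,
}


def joint(stress, weekend, irrigate):
    """Joint probability as the product of priors and the child's conditional."""
    p_yes = p_irrigate_yes[(stress, weekend)]
    p_child = p_yes if irrigate == "yes" else 1.0 - p_yes
    return p_stress[stress] * p_weekend[weekend] * p_child


def prob_irrigate(irrigate="yes"):
    """Forward propagation: marginal probability of the child state."""
    return sum(joint(s, w, irrigate) for s, w in product(p_stress, p_weekend))


def posterior_stress_given_irrigate(stress="yes", irrigate="yes"):
    """Backward propagation: Bayes' rule applied to an observed child state."""
    numerator = sum(joint(stress, w, irrigate) for w in p_weekend)
    return numerator / prob_irrigate(irrigate)
```

With these illustrative numbers, observing an irrigation raises the belief in soil stress above its prior, which is the kind of diagnostic reasoning the network is used for.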
Learning in network models dates back to work done by Chow and Liu (1968). It is used when little is known about the marginal or conditional probabilities of certain nodes or when there is no expert opinion on them, for example, in our case, the irrigation decision. By learning, either or both of the marginal and conditional probabilities of the nodes can be estimated, given the structure of the network. Or, if we have the observed variables in the system, the network structure (commonly known as the Directed Acyclic Graph, or DAG) itself can be learned (Neapolitan, 2003). Creation of the network structure can result in different structures, depending on the data selected by the user.
A framework with the joint probability distribution of $n$ discrete variables, $x_1, \ldots, x_n$, in a directed acyclic graph, $G$, associated with conditional probability tables (CPTs) is known as a Bayesian network (Pearl, 1988). Every node of the network is a variable. The CPT of a variable gives the probability of each state of the variable for all possible state combinations of its parent nodes. These relationships are quantifiable by using historical data, models, expert opinion, etc. The set of parents of $x_i$, represented as $\pi_i$, consists of those nodes which have an arrow pointing to $x_i$. The network is defined by a pair $B = (G, \Theta)$. Each node $x_i$ is independent of its non-descendants given its parents in the network, $G$. $\Theta$ denotes the set of parameters of the network, $(\theta_1, \ldots, \theta_n)$, where $\theta_i$ is the vector of parameters for the conditional distribution of variable $x_i$. According to Lauritzen (1995), Heckerman (1995), and Neapolitan (2003), to find the probability of an arbitrary event $X = (x_1, \ldots, x_n)$, we need to compute the following:
$$P(X) = \prod_{i=1}^{n} P(x_i \mid \pi_i, \theta_i) \tag{1}$$
If $x_i$ has no parents, its local probability distribution is unconditional. In the learning context, if a node is observed, then the node is called an evidence node.
In our case, since the structure is known and we do not have any missing data, learning accomplishes the estimation of the CPT parameters that maximize the log-likelihood of the training data set. The training set consists of $m$ independent cases. According to Lauritzen (1995), Heckerman (1995), and Neapolitan (2003), the log-likelihood of the training data set $\Omega = \{x_1, \ldots, x_m\}$, where $x_l = (x_{l1}, \ldots, x_{ln})^T$ and $\Theta$ is the parameter set, is given as a sum of terms, one for each node, as follows:
$$\log L(\Theta \mid \Omega) = \sum_{l=1}^{m} \sum_{i=1}^{n} \log P(x_{li} \mid \pi_{li}, \theta_i) \tag{2}$$
where $\pi_{li}$ is the state of the parents of $x_i$ in case $l$. According to Murphy (2001), each parameter vector is assigned a prior probability density function, and the training data are used to compute the posterior parameter distribution and the Bayes estimates. For more details on the learning capability of Bayesian belief networks, please refer to Heckerman (1995).
Some relevant water management literature applying learning BBNs was found. Bressan et al. (2009) applied two Bayesian network classifiers to model the risk of weed infestation in a corn crop. The first classifier found a categorical variable for weed-crop competitiveness. This inferred categorical variable, along with categorical variables from maps of weed seed production and weed cover, was then used as an input to the second Bayesian network classifier. The output from this network was the categorical variable describing the risk of infestation. The network was used to interpret classification rules of risk analysis. Farmani et al. (2009) combined BBNs with evolutionary multi-objective optimization to support optimal management of groundwater contamination for a well field outside the city of Copenhagen. The optimization algorithm was used to find the state variable values, which were then used as an input to the BBNs. The probabilities of all the nodes were then computed by belief propagation. After the probabilities were updated, the values from the objective function were propagated back to the optimization model and the process was repeated. Wang et al. (2009) used a Bayesian network for integrating and representing knowledge pertaining to farm irrigation in the Shepparton Irrigation Region of northern Victoria, Australia. The model considered biophysical components such as salinity, evapotranspiration, rain, soil type, water table depth, etc., as inputs. Management options, such as land use, groundwater pumping, and farm reuse, as well as irrigation parameters such as method, period, layout, management, etc., were also used as inputs. The outputs of interest were the management outcome measures in the form of farm production, resulting root-zone salinity, farm runoff, and recharge. The model was tested by local experts and stakeholders. The target group found that the model represented how they perceived the system. Chan et al. (2012) used BBNs to study fish-flow relationships in the Daly River, Australia. The study was undertaken to evaluate water extraction from the river for agricultural purposes. It was found that the extractions would have a significant impact on the fish population. In light of these past studies, BBNs appear to be a promising tool for the problem posed in this study.
In terms of platform, belief network modeling for this work was done using Netica-J, the Java version of the Netica API (Norsys, 2011), for batch operations and ease of learning and testing from case files. Netica assumes independent conditional probabilities and the Dirichlet function (uniform probabilities with 0 and 1 limits) for prior probabilities (Spiegelhalter et al., 1993; Castillo et al., 1997). For learning, Netica has provisions to use counting, gradient descent, and expectation maximization algorithms (for more details please refer to Korb and Nicholson (2004), Neapolitan (2003), and Russell and Norvig (2009)). For this problem, all three options gave similar results, but the last two algorithms took more time to solve the network. Hence, simple counting was used to learn the parameters of the networks. The BBN developed in this study takes into account those factors which, theoretically, can affect a farmer's irrigation decisions. In spite of including many factors, we may be missing some critical ones due to the lack of available data.
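As an illustration of parameter learning by counting with complete data, the sketch below estimates a CPT from a list of cases and evaluates the corresponding log-likelihood. It is written in plain Python rather than with the Netica API, and the pseudo-count of one per state (a uniform Dirichlet prior) is an assumption made for this example, not a statement of how Netica weights its prior. The case dictionaries and node names are hypothetical.

```python
import math
from collections import Counter


def learn_cpt(cases, child, parents, child_states, prior_count=1.0):
    """Estimate P(child | parents) by counting, smoothed by a uniform prior.

    `prior_count` is an assumed pseudo-count added to every child state.
    """
    counts = Counter()
    parent_totals = Counter()
    for case in cases:
        key = tuple(case[p] for p in parents)
        counts[(key, case[child])] += 1
        parent_totals[key] += 1
    cpt = {}
    for key, total in parent_totals.items():
        denom = total + prior_count * len(child_states)
        cpt[key] = {s: (counts[(key, s)] + prior_count) / denom for s in child_states}
    return cpt


def log_likelihood(cases, child, parents, cpt):
    """Sum of log P(child state | parent states) over all training cases."""
    return sum(
        math.log(cpt[tuple(case[p] for p in parents)][case[child]]) for case in cases
    )


# Hypothetical training cases: one dictionary of node states per day.
example_cases = [
    {"SoilStress": "yes", "Irrigate": "Yes"},
    {"SoilStress": "yes", "Irrigate": "Yes"},
    {"SoilStress": "yes", "Irrigate": "No"},
    {"SoilStress": "no", "Irrigate": "No"},
]
example_cpt = learn_cpt(example_cases, "Irrigate", ["SoilStress"], ["Yes", "No"])
```

Counting needs only one pass over the case file, which is consistent with the observation above that it was the fastest of the three learning options.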

Variable Selection, Nodes and Links of Bayesian Network
The variables were selected for the BBN to represent the information pertinent to on-farm irrigation decisions.
The structure of the model was based on the classical soil moisture balance model (Allen et al., 1998) and allied literature in irrigation scheduling. To discretize the continuous time series data, reasonable limits for weather variables were used. The model calibration eventually fixed the number of states for the various variables.
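Discretization of a continuous series into node states can be sketched as below. The cut points and state labels for air temperature are hypothetical placeholders, since the limits actually used are not reproduced in this section; the two-state rain rule follows the split the paper describes for node Rain (0 mm is 'No', anything greater is 'Yes').

```python
from bisect import bisect_right


def discretize(value, cut_points, labels):
    """Map a continuous value to a state label.

    `cut_points` must be sorted; a value equal to a cut point falls in the
    upper bin. `labels` needs exactly one more entry than `cut_points`.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("labels must have one more entry than cut_points")
    return labels[bisect_right(cut_points, value)]


def rain_state(rain_mm):
    """Two-state rain node: no rain versus any rain."""
    return "Yes" if rain_mm > 0 else "No"


# Hypothetical cut points (deg C) and labels, for illustration only.
AIR_TEMP_CUTS = [10.0, 25.0]
AIR_TEMP_LABELS = ["low", "medium", "high"]
```

For example, `discretize(18.3, AIR_TEMP_CUTS, AIR_TEMP_LABELS)` assigns the day to the middle state; changing the number of cut points changes the number of states, which is what calibration adjusted.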
With respect to the environment being modeled, the network was divided into various groups, such as weather variables, domains affected by weather (e.g., soils, crops), independent factors such as canal flows, and a farmer's decision to irrigate. If the farmer irrigated, then it meant that there was water available to him. FAO-56 (Allen et al., 1998) documents the classical daily soil water balance model in terms of depletion at the end of the day, which was used to define the causal relationships between the variables. The model components and the relationships between them are shown in Figure 2.
Since this model was built to identify the likely factors leading to farmers' decisions to irrigate, variables were selected such that they could be measured or, with justification, assumed for such things as real-time soil moisture content, weather data available from a local station to which farmers have access, market prices, crop and soil condition indicators, etc.
Mathematically, the soil moisture depletion in the root zone of depth r [mm] at the end of day $i$, $D_i$ [mm], is given as (Allen et al., 1998):
$$D_i = D_{i-1} - (P_i - RO_i) - I_i - CR_i + ET_{c,i} + DP_i \tag{3}$$
where $D_{i-1}$ is the depletion at the end of the previous day, $P_i$ is precipitation, $RO_i$ is runoff, $I_i$ is the net irrigation depth, $CR_i$ is capillary rise, $ET_{c,i}$ is crop evapotranspiration, and $DP_i$ is deep percolation, all in mm on day $i$.
In the following paragraphs, we describe the components presented in Figure 2 and the nodes representing the respective components using the soil moisture balance model.
1) Weather Inputs - Weather inputs used were daily average air temperature (node AirTemp), average relative humidity (node RH), average wind speed (node WindSpeed), precipitation (node Rain), and evapotranspiration (node ET).
2) Crop Data - Days after planting in terms of Julian days (node Jday) and the crop coefficient (node CropCoeff) describe the progression of crop growth stages. Apart from these, the product of the crop coefficient and ET gives the crop ET (node $ET_c$).
3) Components of Soil Moisture Balance - Besides some of the weather and crop data from above, the soil moisture balance model (refer to Equation 3) comprises other variables such as the percolation amount (node AmountPercolation), field capacity as the initial soil moisture content (node SMCinit), depletion amounts at the start and at the end of the day (nodes DepInit and DepEnd), and the soil stress coefficient (node SoilStressCoeff) for computing actual evapotranspiration (node $ET_a$).
4) Water Availability - Water availability is indicated by the total water diversions (node CanalFlow) in a day from the canal serving the area of interest.
5) Economic Returns - Yields (node Yield) are computed daily as affected by soil moisture stress. The revenue is subsequently computed at the node Revenue.
6) Crop Condition - Crop health is indicated by accumulated growing degree days (node GrowingDegDays) and accumulated crop evapotranspiration (node $CumET_c$). The crop water requirement in the soil moisture balance is computed for a linearly increasing rooting depth; hence, the node RootingDepth is also included.
7) Timing of Irrigation - Some farmers irrigate during the weekend, which can be a crucial factor for irrigation decisions. The node WeekEndORNOT informs the model whether the day is a weekend or a weekday.
8) Irrigation Decision - Lastly, the node 'Irrigate' is the decision to irrigate or not. If the decision is 'Yes', the irrigation amount (node IrrigationAmt) is computed based on the soil moisture balance to date.
The model, shown in Figure 3, had 31 nodes and 36 links. The immediate parents of the child node 'Irrigate' have two states. Other variables had three or more states to consider every possible condition. To simplify the architecture, the network description starts from the child node, 'Irrigate', which was a farmer's decision to irrigate. The node 'Irrigate' had two states, 'Yes' and 'No'. The contributing factors to this decision were the following irrigation needs from various components of the system:
I. Node 'SoilIrrigNeed' - Soil condition is one of the most important criteria for an irrigation decision. Farmers are very familiar with the texture and feel of dry and wet soils. The soil condition is also reflected in the crop condition. Farmers sometimes irrigate when they see some plants with yellow leaves and presume it is time to irrigate. However, irrigation principles state that this could also be because of waterlogging. This factor helped to determine whether the soil need was the primary cause of irrigation in every instance the farmer thought of irrigation. If it is probably the main cause, then it would practically end the search for other significant causal factors. The logic in the node is described below.
The classical FAO Penman-Monteith equation (Allen et al., 1998) uses relative humidity (node RH), wind speed (node WindSpeed), and air temperature (node AirTemp) with some other variables to calculate evapotranspiration (node ET). The other variables used in the calculation have not been used here since they have not been found to contribute to the irrigation decision directly. Crop ET (node $ET_c$) is obtained by multiplying ET and the crop coefficient, $K_c$ (node CropCoeff), followed by actual ET (node $ET_a$), which is the product of $ET_c$ and the soil stress coefficient, $K_s$ (node SoilStressCoeff), given as:
$$ET_a = K_s \, ET_c = K_s \, K_c \, ET \tag{4}$$
Total plant available water (TAW) is defined as the portion of water in the soil root zone (RD) which can be extracted by the plant. Field capacity (FC) is the upper limit of water held in the soil when the gravitational water has been drained from the soil profile. The wilting point (WP) is the lowest limit of available water which the plant can use:
$$TAW = (FC - WP) \, RD \tag{5}$$
Readily available water (RAW) is the amount of soil water the plant can extract from the soil profile without suffering any stress:
$$RAW = MAD \times TAW \tag{6}$$
where MAD is the management allowable depletion, which may differ from farmer to farmer and might also depend on the crop. TAW and RAW are hypothetical limits for daily soil moisture depletion. The soil stress coefficient, $K_s$ (node SoilStressCoeff), is 1 as long as the depletion does not exceed RAW. As soon as the depletion crosses the RAW limit, stress sets in and $K_s$ is computed by the following equation (Allen et al., 1998):
$$K_s = \frac{TAW - D}{TAW - RAW} \tag{7}$$
where $D$ is the root zone depletion [mm]. The deep percolation amount (node AmountPercolation) was estimated by calculating a constant 'rate' of loss of water from the soil after irrigation, up to three days after irrigation (the approximate time it takes to reach field capacity), and multiplying it by the total available water. TAW is used in this calculation since it is the amount of water held in the soil column. In simple words, it is a fraction of TAW (deduced from Allen et al., 1998). Irrigation amounts (node IrrigationAmt) (mm) were calculated (Allen et al., 1998) as the product of the difference between porosity and the soil moisture content on the day before irrigation, and the application depth (mm):
$$\text{IrrigationAmount} = (\text{Porosity} - SMC_{i-1}) \times \text{ApplicationDepth} \tag{9}$$
where IrrigationAmount is the irrigation amount (mm), and $SMC_{i-1}$ (node SMCinit) is the soil moisture content on the day before irrigation.
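The soil-stress and irrigation-depth calculations described above can be sketched as follows. This is an illustrative implementation of the FAO-56 relations, not the authors' code; the function names are hypothetical, soil water contents are assumed to be volumetric fractions, and the default application depth of 1000 mm reflects the constant 1 m depth assumed in the Data section.

```python
def total_available_water(field_capacity, wilting_point, root_depth_mm):
    """TAW [mm] from volumetric water contents and rooting depth [mm]."""
    return (field_capacity - wilting_point) * root_depth_mm


def readily_available_water(taw, mad):
    """RAW [mm] as the management allowable depletion fraction of TAW."""
    if not 0.0 < mad < 1.0:
        raise ValueError("mad is assumed to lie strictly between 0 and 1")
    return mad * taw


def soil_stress_coefficient(depletion, taw, raw):
    """K_s: 1 while depletion <= RAW, then a linear decline to 0 at TAW."""
    if depletion <= raw:
        return 1.0
    return max((taw - depletion) / (taw - raw), 0.0)


def actual_et(et_reference, crop_coeff, stress_coeff):
    """ET_a [mm/day] as the product K_s * K_c * ET."""
    return stress_coeff * crop_coeff * et_reference


def irrigation_amount(porosity, smc_before, application_depth_mm=1000.0):
    """Irrigation depth [mm] needed to fill the pore space of the applied depth."""
    return (porosity - smc_before) * application_depth_mm
```

As a usage note, a soil with field capacity 0.30, wilting point 0.15, and a 1000 mm root zone holds about 150 mm of TAW; with MAD = 0.5 the stress coefficient stays at 1 until depletion exceeds 75 mm. These soil values are illustrative only.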
Actual rain amounts were used at node Rain.The study area has very localized and scarce rain events, hence a rain amount of 0 mm was categorized as state 'No' and amounts greater than 0 mm were grouped as state 'Yes'.
The initial depletion (node DepInit), $D_{i-1}$, was zero, making the field capacity of every soil type the initial soil moisture content. The depletion at the end of the day (node DepEnd) is given by the soil moisture balance (refer to Equation 3).
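A minimal sketch of the daily depletion update is given below, assuming negligible runoff and capillary rise as stated in the Data section. Bounding the result between zero and TAW follows the FAO-56 convention; the function and argument names are hypothetical.

```python
def depletion_end_of_day(dep_init, rain, irrigation, et_actual, percolation, taw):
    """Root zone depletion [mm] at the end of the day.

    Water inputs (rain, irrigation) reduce depletion; losses (actual ET,
    deep percolation) increase it. Runoff and capillary rise are assumed
    negligible. The result is bounded to the interval [0, TAW].
    """
    dep_end = dep_init - rain - irrigation + et_actual + percolation
    return min(max(dep_end, 0.0), taw)


def simulate_depletion(daily_inputs, taw, dep_start=0.0):
    """Run the balance over a sequence of (rain, irrigation, et_actual, percolation)."""
    depletion = dep_start
    series = []
    for rain, irrigation, et_actual, percolation in daily_inputs:
        depletion = depletion_end_of_day(
            depletion, rain, irrigation, et_actual, percolation, taw
        )
        series.append(depletion)
    return series
```

Starting from `dep_start=0.0` mirrors the assumption that the season begins at field capacity.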
III. Node 'WkEndIrrigNeed' - This node was based on the observation that farmers may prefer to irrigate on a weekend because some might have an active job during the weekdays and restrict some farming activities to the weekend. A node for the Julian day (node Jday) and another for determining whether it is a weekend (node WeekEndORNOT) were the parents of this node.
IV. Node 'WaterSupplyIrrigNeed' - Some farmers might tend to irrigate when a neighbor irrigates. This node mimics that action of a farmer. It translates into whether the farmer chose to irrigate with the others (on a day of high flow) or took an independent decision (low flow) for irrigation. Canal flow (node CanalFlow) data were fed into this node.
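Deriving the two parent states of the weekend node from a calendar date can be sketched with the standard library. Treating Saturday and Sunday as the weekend and using the day of the year for Jday are assumptions made for this illustration.

```python
from datetime import date


def julian_day(day):
    """Day of the year (1 to 366) for a calendar date."""
    return day.timetuple().tm_yday


def weekend_state(day):
    """State for a weekend indicator node: Saturday or Sunday counts as weekend."""
    return "Weekend" if day.weekday() >= 5 else "Weekday"
```

For instance, `weekend_state(date(2009, 6, 6))` evaluates to the weekend state because that date fell on a Saturday.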
V. Node 'GrowStageIrrigNeed' - Accumulated degree days (node GrowingDegDays) have been a valuable tool to represent the vulnerability of a crop stage to pests (Miller et al., 2001). They can also provide an information surrogate for the growth stage reached. This factor is different for different crops. The air temperature (AirTemp) was summed over the complete growing season. The base temperature was taken as zero (0 °C) for all the crops.
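The accumulation of growing degree days can be sketched as a running sum of daily mean air temperature above the base temperature. Flooring daily contributions at zero is an assumption added for this example; with a base of 0 °C it only matters on days with sub-zero mean temperature.

```python
from itertools import accumulate


def growing_degree_days(daily_mean_temps_c, base_temp_c=0.0):
    """Cumulative growing degree days [deg C day] over the season.

    Each day contributes max(T_mean - base, 0); the flooring at zero is an
    assumption of this sketch.
    """
    daily = (max(t - base_temp_c, 0.0) for t in daily_mean_temps_c)
    return list(accumulate(daily))
```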
VI. Node 'EconIrrigNeed' - Crop ET (node $ET_c$) and actual ET (node $ET_a$) feed into the node Yield according to FAO-33 (Doorenbos & Kassam, 1979). Daily values of predicted market price were the inputs to the node MarketPrice. The product of market price and yield resulted in the values for the node Revenue. The actual yield, $Y_a$, weighted by the maximum expected yield, $Y_m$, is calculated using the following equation, where $K_y$ is the yield response factor:
$$\frac{Y_a}{Y_m} = 1 - K_y \left(1 - \frac{ET_a}{ET_c}\right) \tag{11}$$
This means that the farmer might be irrigating for higher revenues on a certain day because they are losing the quality of their crop.
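The yield and revenue nodes can be illustrated with the FAO-33 yield response relation. This is a sketch only: the function names, the maximum yield, and the price units are hypothetical, and the numbers in the usage note are not from the study.

```python
def relative_yield(et_actual, et_crop, ky):
    """Relative yield Y_a / Y_m from the FAO-33 yield response relation."""
    if et_crop <= 0.0:
        raise ValueError("et_crop must be positive")
    return 1.0 - ky * (1.0 - et_actual / et_crop)


def revenue(rel_yield, max_yield, market_price):
    """Revenue as market price times actual yield (Y_a = rel_yield * Y_m)."""
    return market_price * rel_yield * max_yield
```

For example, with $ET_a = 4$ mm, $ET_c = 5$ mm, and a yield response factor of 1.1, the relative yield is 0.78, so a 20% shortfall in evapotranspiration costs 22% of the maximum yield.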
VII. Node 'CropIrrigNeed' - Though the growing stages are reflected through the growing degree days, the node RootingDepth accounted for the increasing root depth of the plants, which was assumed to increase with time. This can be important since newly planted crops like alfalfa stop root growth after the development stage and before the first cutting, and have already stopped rooting further if they have been developing from previous years. This node denotes the plant/crop irrigation need due to increasing root depth and is more important during the earlier part of the season than the latter (Allen et al., 1998).

States of Variables
The number of states of the variables in the network have been presented in

Canal B, Lower Sevier River Basin, Utah
The study site selected covers 20 square miles near Delta, Utah, in the Lower Sevier River Basin. Snowmelt is the major contributor to soil moisture in the early part of the growing season, which is usually late spring. Irrigation is the biggest user of water in this basin. Surface irrigation is the dominant method in the region.
Anecdotal accounts given over the telephone by the water masters of the canal company, who are also farmers in the area, were compiled. They explained various reasons for their irrigation decisions, including observing a neighbor irrigating, the plant-soil condition, the amount of water remaining in their water right for the season, and the type of crop and its stage. They told us that the farmers order water but do not necessarily use it to irrigate as soon as they get it. They sometimes store it in the ditches themselves and use it when needed, or might also rent it out. We do not have any means to ascertain these claims, but these accounts helped us in modeling the problem better.

Data
Weather data for the study area were obtained from the website (CEMP, 2011). A daily soil moisture balance was determined to compute the daily moisture depletion. No specific data for the date of planting for the years 2007-2010 were available in the literature, so it was difficult to begin the growing period with a specific planting date. Wright (1982) describes the day of planting for crops grown at Kimberly, Idaho. These dates gave a very general idea of the day of planting, but since this was a field-by-field study, crop- and farmer-specific dates were needed to address the difference in the observed irrigation dates. The planting date was estimated as the day when the initial depletion was 0 (i.e., starting from field capacity) and the soil moisture balance model indicated an irrigation that coincided with the day of the first recorded/observed irrigation. This procedure addressed the lack of knowledge of initial conditions and resulted in such assumptions as no stress during the period of crop establishment and soil moisture not inhibiting initial plant growth. This also accounted for the fact that after snowmelt at the site, the soil would not be completely dry. It was assumed that the crops are planted or emerge (in the case of alfalfa) as soon as suitable temperatures are reached. The greatest challenge in establishing the water balance calculations to describe soil moisture through time is that the model indicates an irrigation need too frequently at the beginning of the season, which does not agree with how the farmers irrigate. The reason for this is the shallow root depth of crops just planted. The crop demand is also not very high, as represented by low crop ET in this part of the season owing to low temperatures. For practical purposes, the farmer does not apply water according to the root growth but considers an "application depth" for early season irrigation which is uniform for all crops. Hence, a constant application depth of 1 m was assumed. Also, deeper in the soil column, the effects of factors such as ET are less significant. This method also worked for annual crops such as barley and corn, since at field capacity the farm implements can enter the ground more easily for plowing and planting.
A soils map of the study area indicated three soils types: silty clay loam, silty clay, and loam.The major soil characteristics governing water movement and retention are porosity, field capacity, and wilting point.Usually, porosity defines the saturation limit of a specific type of soil, while field capacity and wilting point put limits on the available plant water, which is important in this case since we are considering crop growth as well as soil water extraction.Standard values for these parameters were considered from (Allen et al., 1998).
Representative crop phenology coefficients for alfalfa, barley, and corn were obtained from Wright (1982) and FAO-56 (Allen et al., 1998).The other consideration in the calculation of crop coefficients was that we were using it for crop reference ET, hence we had to multiply the values by a factor of 1.2 to consider field crops as opposed to grass reference ET.Literature values for yield response factor, K y were used (Allen et al., 1998).
Since there was no evidence of capillary rise or runoff, both were considered negligible. A daily time series of market prices was constructed by linearly interpolating the monthly values available on the USDA website (USDA-NASS, 2011).
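The construction of a daily price series from monthly values can be sketched as follows. This is only an illustration: it assumes each monthly value is anchored to a single date (for example, the first of the month), which the text does not specify, and the function name `interpolate_daily` is hypothetical.

```python
from datetime import date, timedelta


def interpolate_daily(monthly_prices):
    """Linearly interpolate (date, price) pairs, sorted by date, to daily values.

    Returns a dict mapping every date from the first to the last anchor
    date (inclusive) to an interpolated price.
    """
    daily = {}
    for (d0, p0), (d1, p1) in zip(monthly_prices, monthly_prices[1:]):
        span = (d1 - d0).days
        for k in range(span):
            # Straight line between the two neighboring monthly anchors.
            daily[d0 + timedelta(days=k)] = p0 + (p1 - p0) * k / span
    last_date, last_price = monthly_prices[-1]
    daily[last_date] = last_price
    return daily
```

With monthly values that increase through the season, every interpolated daily value is higher than that of the day before, so any indicator derived from a day-to-day price increase is triggered on every day.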

Calibration and Testing
Calibration is the process of tuning the model such that its behavior is close to that of the system being modeled. Instead of using parameters to calibrate the process, the model learns the process from the data. The model was trained using the data for the days on which an irrigation decision was made and the day before each of them. This was done because irrigations were infrequent in any given season. Separate models were built for each of the three crops to analyze the crop-specific irrigation rules. The conditional probability tables (CPT) were populated by learning. Probability distributions had to be found for all the nodes, including the Irrigate decision node. The intermediate nodes (between parent and child nodes) were used to reduce the number of variables going into the decision/child node. A sample of the representative data was run through the network to define the states, so as to account for all the possible scenarios.
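The selection of training cases (each irrigation day plus the day before it) can be sketched as below. The sketch assumes the records are consecutive daily observations in date order and that each record carries an 'Irrigate' field with the values 'Yes' or 'No'; the function name and record layout are hypothetical.

```python
def select_training_cases(daily_records):
    """Keep each irrigation day and the day immediately before it.

    daily_records: list of dicts in date order (assumed consecutive days),
    each with an 'Irrigate' key holding 'Yes' or 'No'.
    """
    keep = set()
    for i, record in enumerate(daily_records):
        if record["Irrigate"] == "Yes":
            keep.add(i)          # the day the irrigation was observed
            if i > 0:
                keep.add(i - 1)  # the preceding day
    # Return the selected records in their original order.
    return [daily_records[i] for i in sorted(keep)]
```

Restricting the cases in this way keeps the numbers of 'Yes' and 'No' days comparable, at the cost of discarding most of the no-irrigation days in the season.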
Data from all of the years were used for training and testing of the networks; Netica, however, does not allow for bootstrapping. The results of the analysis are presented in Table 2. The networks trained with 2009 data for alfalfa and tested with 2008 data gave the lowest testing accuracy of 81.0%. Training with the combined 2008-2010 data and testing with 2007 data for alfalfa also resulted in a testing accuracy of 81.0%. A confusion matrix presents the number of correct and incorrect classifications produced by a classification model and is a standard output for classification problems. The confusion matrix for the irrigation decision model is shown in Table 3.
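As a brief illustration of how a confusion matrix and the corresponding accuracy are obtained for a two-state decision such as Irrigate, consider the following sketch. The observed and predicted labels are invented for the example and are not data from the study.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels=("Yes", "No")):
    """Rows are observed states, columns are predicted states."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


def accuracy(matrix):
    """Share of cases on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical observations and model predictions for eight days.
actual = ["Yes", "Yes", "No", "No", "No", "Yes", "No", "No"]
predicted = ["Yes", "No", "No", "Yes", "No", "Yes", "Yes", "No"]
matrix = confusion_matrix(actual, predicted)
```

In this layout, `matrix[0][1]` counts irrigations that the model missed and `matrix[1][0]` counts days on which the model predicted an irrigation that was not observed.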
The cases correctly classified by the model appear on the diagonal of the matrix. The confusion matrix shows that only two irrigation events were missed in the testing phase. The error rate is high because some irrigations were predicted by the model when the farmer did not irrigate. The lack of data for barley and corn resulted in low accuracies. Another unexpected observation was that whenever 2009 data for alfalfa were used for testing, the testing error was lower than the training error. This may have happened because the testing cases were more representative of the posterior probabilities calculated in the network, and the farmers were irrigating more consistently with those rules in that year.
Note. Crop types are given as A - Alfalfa, B - Barley, and C - Corn.
For our problem, the learning results gave an insight into the irrigation decision-making process and the factors that likely affect it. Table 4 shows the possible reasons for decisions to irrigate across various combinations of crops and years. Soil stress was the leading probable rule (most often used, based on the number of events) for irrigation for most years and crops, as can be seen in Table 4.
Note. (1) denotes 'Beliefs to irrigate' in %, implying that the majority of farmers used the rule.

Model Sensitivity
If any variable was removed from the network, the model failed to perform well. Hence, all the variables were included during the various tests. None of the nodes were assigned initial probabilities based on expert judgment; hence, model sensitivity testing on initial probabilities was not conducted. The network started out with equal probabilities for the states of each variable, i.e., variables with two states had 50-50 probabilities and those with three states had 33.3-33.3-33.4 probabilities. Some variables contributed no variance to the rule, yet the network predictions worsened if they were removed. Examples of such variables were economic need (node EconIrrigNeed) and crop under heavy stress (node StressIrrigNeed). This happened because these factors might inherently have an effect without being the "only" factors considered by the farmers when deciding to irrigate. From a practical point of view, too, farmers consider several conditions before irrigating.
The sensitivity analysis specific to networks built in Netica is presented in Table 5.
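A sensitivity measure of the "variance of beliefs" type can be sketched as follows. The sketch assumes one commonly cited definition: the expected squared change in the beliefs of the query node Q, summed over its states, caused by a finding at another node F. Whether this matches the exact computation in the Netica version used in the study has not been verified, and the joint probabilities in the usage note are invented.

```python
def variance_of_beliefs(joint):
    """Expected squared change in beliefs of Q due to a finding at F.

    joint: dict mapping (q_state, f_state) -> joint probability P(q, f),
    with all entries summing to 1.
    Computes sum over q, f of P(q, f) * (P(q | f) - P(q)) ** 2.
    """
    p_q, p_f = {}, {}
    for (q, f), p in joint.items():
        p_q[q] = p_q.get(q, 0.0) + p  # marginal of the query node
        p_f[f] = p_f.get(f, 0.0) + p  # marginal of the findings node
    total = 0.0
    for (q, f), p in joint.items():
        if p_f[f] > 0.0:
            total += p * (p / p_f[f] - p_q[q]) ** 2
    return total
```

Under this definition a node that is independent of the query node scores zero. For a hypothetical joint distribution with P(Yes, Stressed) = 0.4, P(No, Stressed) = 0.1, P(Yes, Wet) = 0.1, and P(No, Wet) = 0.4, the measure is 0.09.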

Discussion
The results obtained from the Bayesian belief network for studying irrigation behavior provided insights into the irrigation decision process, though the reasons for irrigations of barley and corn were not well captured by the network. To understand the process completely, it is important to look at the conditional probability table (CPT) (Table 6). As explained, the CPT of the 'Irrigate' node is defined over seven parent nodes.
The factor combinations learned from the data determine the posteriors. If a combination is not found in the data, its probabilities remain unchanged. Returning to Table 6, the factor combinations that result in 'Irrigate' probabilities of 41.9% (EconIrrigNeed = Yes) and 54.5% (EconIrrigNeed and WkEndIrrigNeed = Yes) can be compared. The only difference between the two is that the latter also includes the weekend factor; its higher probability arose because many farmers were irrigating on a weekend. During training, the data did not reflect as many irrigations under the first set of conditions (EconIrrigNeed = Yes), which led to incorrect classification during testing. A counting algorithm goes through the data and groups similar cases together, but there was not a sufficient number of patterns to corroborate certain decisions. The error rate could also be high because the network encountered more of those factor combinations in the testing phase, while there were no observed irrigations due to those factors during the training phase. Also, we did not have as much data for barley and corn as for alfalfa. Another explanation could be that the factor combination was indicating irrigation (since the model had recorded such instances during training), but because the call time in Canal B ranges from 24 hours to 3 days according to the operating rules followed by the canal company, the farmer might have decided to irrigate but did not get water in time. This could only be verified with access to water order data from the canal company.
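The idea of learning a CPT by counting can be illustrated with a simplified sketch. It estimates the probabilities of the child node from relative frequencies for each parent-state combination seen in the cases and leaves unseen combinations at equal probabilities. This is an assumption-laden simplification: Netica's own counting procedure may weight prior experience differently, only two of the seven parent nodes are used for brevity, and the cases in the usage note are invented.

```python
from collections import defaultdict


def learn_cpt(cases, parents, child, child_states=("Yes", "No")):
    """Estimate P(child | parents) by counting cases per parent combination."""
    counts = defaultdict(lambda: {s: 0 for s in child_states})
    for case in cases:
        key = tuple(case[p] for p in parents)
        counts[key][case[child]] += 1
    cpt = {}
    for key, tally in counts.items():
        total = sum(tally.values())
        cpt[key] = {s: n / total for s, n in tally.items()}
    return cpt


def lookup(cpt, key, child_states=("Yes", "No")):
    """Return learned probabilities, or equal probabilities if never observed."""
    uniform = {s: 1.0 / len(child_states) for s in child_states}
    return cpt.get(key, uniform)
```

For example, with four hypothetical cases in which EconIrrigNeed and WkEndIrrigNeed are both 'Yes' and three of them are irrigations, the learned probability to irrigate for that combination is 0.75, while a combination that never occurs in the cases stays at 50-50. A combination seen only a few times yields a poorly supported estimate, which is consistent with the misclassifications discussed above.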
EconIrrigNeed and StressIrrigNeed were always 'Yes' for the irrigation events. As individual variables these might be insignificant, but in combination with other factors they could have been contributing to the process. If they were removed, the model performance was poor. Since daily values were not readily available, a daily time series of market prices was constructed by linear interpolation from the monthly values. The anticipated sales prices increased throughout the entire growing season, so the interpolated prices rose daily. Economic need was therefore always important in the model.
Due to laser leveling, the interval between two irrigations was sometimes more than 30 days. Hence, not all of the data from the growing season could be used, because doing so would result in an imbalanced data set with far fewer irrigation events than no-irrigation days.
Table 6. A portion of the CPT of the "Irrigate" node showing the learned probabilities to irrigate or not for the model built for alfalfa. The rest of the factor-level combinations stayed at equal probabilities (50-50) after learning.

Conclusions
Water managers, decision makers, and canal operators are always challenged by a lack of knowledge about the irrigation water demand that will develop over the short term. This can be partly addressed by accounting for farmers' irrigation decisions. The decisions and the subsequent water orders can eventually be summed at the command area level to obtain short-term estimates of water demand.

Bayesian Belief Network as a Tool
BBNs provide a tool to analyze farmers' irrigation decision behavior and to predict their probable future irrigation decisions. They are easy to build and provide various ways to interpret the results. The only requirement for constructing and using them is that some information be available about the relationships between variables. They can also learn the relationships from case data and then simulate future events based on the results of learning.
As with any learning algorithm, these networks have to be trained for each new geographic location. Clearly, irrigation decision making is a multivariate process. The more variables we have, the better the model performance we can expect. These models are data-intensive and require a large number of events to improve the prediction accuracy. An important limitation of such models is that they perform better for immediate decision-making (1-3 days before the decision may be made), while their usefulness for long-term forecasting may be limited. This will depend on the duration between irrigations. Information on delays in irrigation caused by, for instance, the harvesting of alfalfa could be useful if it informs the model how long the post-harvest process took and incorporates valid reasons for delaying irrigation other than stressing the crop.

Rules Used to Make Irrigation Decisions
Soil stress (or a surrogate thereof that is directly visible to farmers, such as yellow leaves in parts of the field) was found to be one of the most important factors apparently used by farmers in Delta to guide irrigation decisions. From the perspective of irrigation principles, soil condition is an important indicator for irrigation. Water in the soil profile is lost through evapotranspiration, which is the immediate reason to irrigate. The soil moisture balance calculates soil moisture for a specific rooting depth. Though we still need supporting evidence, it is unlikely that farmers would track root depth during the growing season. Hence, rooting depth is an unexpected factor to contribute to the decision. Weekend irrigation and irrigating when a neighbor irrigates are traditionally used triggers for irrigation decisions, and this study found data supporting this. Farmers often appeared to observe neighbors for their irrigation decisions. Most of the farmers in the Delta area have a full-time occupation during the week, which increases the likelihood that they will engage in agricultural activities such as irrigation on the weekend. By using these rules, the prediction accuracy of irrigation decisions was 81.0% for alfalfa and 61.0% for barley and corn.

Behavior
Irrigation decision-making is a complex process involving an irrigator's assessment of a combination of factors. This study simulates conditions that might have been similar to what the farmers saw when they made a decision to irrigate. Hence soil stress, rooting depth (crop needs), an active profession during the week, and a talk with a neighbor might be some of the possible reasons for irrigation decision behavior in Delta. It is also evident that farmers look at different factors for every irrigation. They clearly look for not one or two but many indicators for irrigation. This work shows that biotic, climatic, and edaphic conditions are sufficient as indicators for studying irrigation decisions.

Figure 1. Framework of a Bayesian Belief Network with two child and one parent nodes

where $D_{r,i-1}$ is the moisture depletion ($D$) at the end of the previous day [mm]; $P_i$ is the amount of rainfall on day $i$ [mm]; $I_i$ is the depth of irrigation on day $i$ [mm]; $ET_{a,i}$ is the actual crop evapotranspiration on day $i$ [mm]; and $DP_i$ is the deep percolation on day $i$ [mm].

Figure 2. Components represented by nodes of the Bayesian belief network

crop evapotranspiration (node ETc) constitute the crop information provided to the model.

Figure 3. Resulting network relationships after learning. The network starts with equal probabilities of the states for all the variables

With a deep-rooting crop, the need to irrigate frequently was eliminated, since deeper roots can utilize subsoil moisture. Hence the farmers might have considered this factor when irrigating barley crops in 2008-2009. For alfalfa in 2008 and 2010, barley in 2007 and 2009, and corn in 2010, farmers might have irrigated similarly to their neighbors. Irrigating on the weekend was found to be one of the dominating reasons for alfalfa in 2010, barley in 2007, 2009, and 2010, and corn in 2008.

The number of states selected for each variable is given in Table 1. All the nodes feeding into the 'Irrigate' node (CropIrrigNeed, EconIrrigNeed, GrowStageIrrigNeed, SoilIrrigNeed, StressIrrigNeed, WaterSupplyIrrigNeed, WkEndIrrigNeed), as well as the IrrigationAmt and Rain nodes, had two states, 'Yes' and 'No', indicating presence or absence of the factor. The node WeekEndORNOT separates weekdays from weekends. The nodes that reflected time in the growing season, viz. JDay and CropCoeff, had three states reflecting early, middle, and late season. The node SoilStressCoeff also had three states denoting the wetting and drying phases between two irrigations: Irrigated, Mid-stress (half-way through stress), and Stressed. The nodes representing weather or flow variables, such as AirTemp, CanalFlow, ETa, RH, and WindSpeed, had three states denoting high, medium, and low levels. Nodes such as AmountPercolation, DepEnd, DepInit, Revenue, and Yield had four states to accommodate different water holding capacities of soil types, or different crop yields according to the area irrigated by the farmer (some farmers had larger fields than others). ET and ETc both had five levels in order to have smaller bins to account for day-to-day variations. Similar to the depletion variables DepEnd and DepInit, SMCinit had five states to account for different starting values of soil moisture content across all soil type and crop type combinations. Finally, the nodes GrowingDegDays and RootingDepth had six states, and CumETc had seven states, to allow finer discretization during the growing season.
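Discretizing a continuous variable into a small number of named states can be sketched as follows. The text does not state how the bin boundaries were chosen, so equal-width bins between the observed minimum and maximum are assumed here purely for illustration; the function name and the temperature values in the usage note are hypothetical.

```python
def discretize(values, state_names):
    """Assign each value to one of len(state_names) equal-width bins.

    state_names must be ordered from the lowest bin to the highest bin.
    """
    low, high = min(values), max(values)
    n_states = len(state_names)
    width = (high - low) / n_states
    states = []
    for value in values:
        if width == 0:
            index = 0  # all values identical: everything falls in one bin
        else:
            # The maximum value would fall just past the last bin, so clamp it.
            index = min(int((value - low) / width), n_states - 1)
        states.append(state_names[index])
    return states
```

For instance, air temperatures of 10, 15, 20, 25, 30, 35, and 40 with the states ("Low", "Medium", "High") are assigned to Low, Low, Medium, Medium, High, High, and High. Variables needing finer resolution would simply be given a longer list of state names.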

Table 1. Number of states selected for various variables.
... were not found to be representative of the conditions at the site because localized showers are sometimes observed in the area during the irrigation season. Data calculated by Kimberly Penman Reference ET procedures are available on the foregoing website. The calculations were verified for accuracy. In 2007, the Utah Water Research Laboratory (UWRL), Utah State University (USU), established 44 stations with 88 sensors to record soil moisture content at depths of 1 and 2 ft on various farms near Delta, Utah, to study agricultural water use. The sensors are maintained by personnel at UWRL. Soil moisture content measured at these stations was used to determine the day of irrigation and the approximate amount of irrigation. These data were obtained from ODM-USU (2011). Hourly measurements are available on the website, so daily average values were estimated for the first day of the season to begin the day-to-day soil moisture balance calculations. SCADA (Supervisory Control and Data Acquisition) systems maintain past records of water discharge across the Sevier River Basin (Berger et al., 2002), and real-time conditions are accessible on the website (SRWUA, 2011). The site is sponsored by the Sevier River Water Users Association (WUA). Daily canal flows in cubic feet per second (cfs) in Canal B were obtained from the website (SRWUA-CanalB, 2011).
The station was established by the National Climatic Data Center (NCDC), NOAA, and has historical weather data since 1965. The station can be located on the NCDC-NOAA website (NOAA-NCDC, 2011) using the following metadata:

Table 2. Results of calibration and testing of Bayesian belief networks

Table 3. A sample confusion matrix for Irrigate showing number of events predicted correctly by the model

Table 4. Factors resulting in highest beliefs to irrigate for different years and crops.

Table 5 presents the variance of beliefs at the child node ('Irrigate') given the values at various nodes. It can be seen that air temperature values do not affect the decision to irrigate. The values of StressIrrigNeed and CumETc, by contrast, influenced the decision to irrigate for the alfalfa crop in 2007.

Table 5. Sensitivity of the variable of interest, 'Irrigate', to a finding at another node in the Bayesian belief network for the alfalfa crop in 2007