Satellite-Based Crop Monitoring and Yield Estimation—A Review

Sustaining food security requires better crop condition monitoring and yield estimation at both local and global scales. This review gives a background to satellite-based crop monitoring and crop yield estimation, including the use of crop models. Most recent advances in remote sensing techniques, aimed at complementing traditional crop harvest surveys, have focused on high-production and information-rich areas. However, research applying these techniques at local scales in dynamic landscapes remains limited in most Southern African countries. Models such as the CERES model of the Decision Support System for Agrotechnology Transfer (DSSAT) and the Agricultural Production Systems Simulator (APSIM) have been used to simulate maize biophysical parameters and yield variability in a changing climate. Despite these successes, there is still a need for yield prediction using simplified models that decision-makers can use to plan for food support and sales. The application of freely available satellite data, with a focus on maize as a staple crop for Southern Africa, highlights challenges such as heavy reliance on agro-meteorological and regional estimations of crop yield. It also raises questions about predicting across large growing belts without considering diverse cropping patterns. Nevertheless, remotely sensed data still offer promising opportunities for crop monitoring and yield estimation. For instance, employing multi-model configurations or multi-model ensembles is one of the major gaps needing consideration in crop modeling research. Another simpler but versatile opportunity is the use of crop-monitoring applications on smartphones by smallholder farmers to provide phenological data to decision-makers throughout a growing season.


Introduction
Crop condition monitoring and yield estimation must continuously produce timely and spatially dependable updates for decision support systems. However, exorbitant survey costs and the complexity of production systems often impede this. Meanwhile, millions are affected by severe food insecurity with the advent of projected extreme climatic events (Godfray et al., 2010; Hall et al., 2017; Allen et al., 2014; Wheeler & Von Braun, 2013; Knox et al., 2012; Leff et al., 2004). In this respect, crops such as rice, maize, and wheat, which constitute the world's major sources of energy, now demand special attention (UN SDGs, 2015). Correspondingly, monitoring the condition and production of these crops at global, regional and local scales is essential to ensure food security. This, in turn, has great potential to help achieve and sustain Sustainable Development Goal (SDG) number two, "Zero Hunger" (UN SDGs, 2015).
Generally, agriculture is the main source of livelihood and a key sector in economic development, especially in most developing nations. It is predominantly characterized by the cultivation of maize, rice, soybeans, wheat, tubers and cotton. As the population of sub-Saharan Africa is projected to double by 2050, vigilance is needed in managing water and land resources to meet rising food production demands (Van Ittersum, 2016). Water scarcity recorded in parts of Eastern and Southern Africa, following recurring droughts due to El Niño events, has confirmed the need for effective resource management (Msowoya et al., 2016). Erratic rainfall in most production areas is worsening hunger, especially in remote areas. These profound climatic impacts affect crop demand and supply, and therefore call for sensitive monitoring and prediction systems. As unfavorable climatic events continue to hamper agriculture, consistent crop monitoring and yield estimation will form an essential basis for management and increased global resilience. With this information, relevant stakeholders can make timely and more accurate decisions during disasters and surplus production. These efforts are indispensable if the world is to overcome the imminent challenge of feeding
Several other indices have been developed for monitoring vegetation and crop chlorophyll content and are presented in Table 2. These indices differ in the spectral bands they use to monitor crop phenology.
In a separate review on methods of estimating biomass and yield using low-resolution satellite data, Rembold et al. (2013) highlighted three main aspects: Several indices to estimate leaf chlorophyll content exist (Hunt et al., 2013). The choice of index is also driven by the type of imagery available for a given study area and period. For instance, Sentinel-2 has red-edge bands which Landsat 8 does not, yet Landsat 8 is likely to cover more of a particular area.
Crop residues after senescence can be used to quantify biomass through the characteristic lignin and cellulose reflectance at 2.0-2.2 μm.
Vegetation-index-derived LAI and fPAR are important in biomass and yield estimation. NDVI, SAVI and EVI derived from MODIS, SPOT-Vegetation and PROBA-V data, processed with artificial neural networks and other computational techniques, are among the most frequently used biophysical parameters in yield estimation studies.
Crop canopies provide vital indicators of biomass accumulation and stress responses through their spectral reflectance, usually in the red and infrared bands. This is because healthy plants absorb blue and red light and reflect green light in the visible spectrum, while also strongly reflecting infrared radiation. NDVI therefore reflects the photosynthetic activity of crops and indicates the biomass condition and stress of photosynthetically active crops (Liu, 2010). For estimating crop yields, Atzberger (2013) recommended that crop-specific masks be employed to exclude other crops within the observation area from the analysis. Generally, this entails developing updated country or county masks for specific crops. With advances in technology and the availability of satellite data, this should not be a hurdle in most cases. On the other hand, rigorous crop classification to identify the crops before analysis is an essential prerequisite alongside the crop masks.
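As an illustration of the red and infrared contrast described above, the following minimal sketch computes NDVI per pixel and averages it over a crop mask. The function names, the handling of zero-signal pixels, and all reflectance values are assumptions made for illustration and are not taken from the studies cited.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: pixels with no signal are reported as 0
    return (nir - red) / total


def masked_mean_ndvi(nir_band, red_band, crop_mask):
    """Mean NDVI over pixels flagged True in a crop-specific mask."""
    values = [
        ndvi(n, r)
        for n, r, is_crop in zip(nir_band, red_band, crop_mask)
        if is_crop
    ]
    # None signals that the mask selected no pixels at all
    return sum(values) / len(values) if values else None


# Hypothetical reflectances: a healthy canopy absorbs red and reflects infrared
healthy = ndvi(nir=0.50, red=0.08)
stressed = ndvi(nir=0.30, red=0.20)
print(healthy > stressed)
```

Masking before averaging keeps pixels of other crops or land covers from diluting the signal of the crop of interest.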

Crop Yield Estimations
Satellite-based crop yield predictions employ similar approaches centered on spectral signatures, and the estimated yields can be as reliable as actual yields. However, unexpected over- or under-estimates do occur, and these can sometimes be attributed to edaphic and climatic conditions prevailing after the prediction. In particular, a paucity of data on actual yields in some areas can amplify the discrepancy between estimated and actual yields. Nonetheless, studies have shown success in maize yield predictions under varying environments (Vergara-Díaz et al., 2016; Ban et al., 2016).

Machine Learning and Big Data in Crop Yield Estimations
Machine learning processes that capture information about a crop and deduce yields using various algorithms have proved to be a useful tool in predicting yields. However, these technologies need more scaling up on the continents if they are to contribute meaningfully to crop yield estimation. You et al. (2017) applied deep learning techniques to MODIS satellite data and then applied a Gaussian algorithm to obtain smoother results. Yields for the counties studied were accurately estimated thereafter. Machine learning's ability to handle simple and complex relations between variables makes it a powerful tool for processing big data (Biffis & Chavez, 2017). In related instances, other studies have employed satellite data with convolutional and/or artificial neural networks to monitor crop condition and estimate yields (Fieuzal et al., 2017; Ali et al., 2017; Saeed et al., 2017).
Scaling up such studies to regional areas can increase the challenges of processing and storing remote sensing data. Google Earth Engine (GEE) addresses this by allowing tasks involving several terabytes of data to be performed on a cloud platform. Google Earth Engine is a robust planetary-scale platform that allows researchers and other stakeholders to perform numerous tasks without worrying about storage space and preprocessing time. Interesting findings about corn yield variability in the Corn Belt of the USA were revealed through this platform. However, as with any new platform, there is still room for improvement, especially in areas such as the space provided per user, the user's inability to influence a query once it is processing in the background, and the scaling challenge, which limits the configuration of huge machines using the platform.

Satellite-Derived Data Assimilation in Crop Models
Crop growth models are known to differ in applicability, development, purpose and robustness, though they generally fall under three main types: remote sensing forecast models, crop simulation models, and statistical models, the last of which build on information from remote sensing forecasts and simulation models. Water is the main sub-model driving the differences in yield in these models. A shortcoming of most of these models is that they are difficult to scale up to regional level, as most were developed for, and work well under, field conditions. To address this, and to ensure that models offer sustainable technology transfer, models such as DayCent, GLAM, and PEGASUS have been developed to cater for large domains. However, these models are still not widely adopted for further research or for supporting decision-makers. DSSAT models are distinguished for being reasonably robust and easy to integrate across diverse domains.
As land use and climate change continue to affect the socio-economic status of people, it is timely to increase the incorporation of these factors in crop models. Furthermore, synergistic efforts are needed amongst developers in order to create more dynamic, robust and acceptable models. The BioMA framework of the European Union offers an interesting platform for decision-makers and scientists to alter model components, and even develop their own models to best suit their prevailing conditions. At a time when many models become obsolete because they cannot easily be changed, BioMA offers a good way to avoid this (Delincé, 2017).

Satellite Data, Yield Factors and Crop Monitoring Indices
In vegetation studies, the way challenges of pixel-based separation, crop identification, early-stage weed detection and cloud contamination are addressed differs according to the nature of the study, the satellite data used and the level of accuracy required. For instance, Ma et al. (2013) reported the mixed-pixel effect while assimilating satellite-derived LAI into a crop model. This effect can complicate the analysis and reduce the accuracy of results, especially in small study areas with heterogeneous vegetation. The World Food Studies (WOFOST) model was recalibrated using Fourier functions, which improved the prediction results. Algorithms and digital filters such as the Savitzky-Golay (SG) filter have likewise played important roles in smoothing data and reducing the noise common to satellite data. Where data are missing or cloud contaminated, fusion of satellite data has proved able to overcome this challenge. For instance, high-resolution data resampled to a much coarser resolution successfully estimated yields (Kumhálová & Matějková, 2017). This information is important and serves as a basis for future crop and yield monitoring, especially in data-scarce areas.
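To make the smoothing step concrete, the sketch below applies a Savitzky-Golay filter to a short vegetation index time series. It assumes a fixed 5-point window with a quadratic fit, for which the standard convolution weights are (-3, 12, 17, 12, -3)/35. The function name, the choice to leave the two edge points on each side unsmoothed, and the sample series are illustrative assumptions, not the configuration used in the studies cited.

```python
# Standard Savitzky-Golay weights for a 5-point window, quadratic/cubic fit
SG5_WEIGHTS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def sg_smooth(series):
    """Smooth a time series with a 5-point Savitzky-Golay filter.

    Assumption: the first and last two points are returned unchanged
    because the window does not fit there.
    """
    smoothed = list(series)
    for i in range(2, len(series) - 2):
        window = series[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window))
    return smoothed


# Hypothetical NDVI series with one cloud-contaminated dip at index 3
noisy = [0.60, 0.62, 0.64, 0.20, 0.68, 0.70, 0.72]
smooth = sg_smooth(noisy)
print([round(v, 3) for v in smooth])
```

Because the filter fits a local polynomial instead of taking a plain moving average, it tends to preserve the shape of seasonal peaks while damping isolated outliers.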
Because satellite data are not available throughout the year for all areas, data from other missions must also be employed. Using different satellite data introduces, to a small extent, discrepancies in the spatial and temporal variability of the observed areas. For instance, the coverage for most of Sentinel-2 is low in the lower latitudes. The high swath of the satellite sensors compounds this challenge. However, this is a trade-off from the manufacturer's perspective. Limitations of the empirical approaches, which have been the basis for simple crop yield estimation, slow the process of institutionalizing robust regional estimation. Consequently, new areas cannot easily be explored because such approaches are difficult to extrapolate. Efforts to minimize these constraints, using crop models such as those based on light-use and radiation-use efficiencies, have opened up further prospects (Lobell, 2013; Monteith & Moss, 1977; Monteith, 1978). However, as far as crop yield forecasting at regional and local level is concerned, there is room to improve through enhanced data collection. There is still a need to refine the spectral signatures of annual crops such as cereals and pastures to improve yield predictions. At the same time, crop masks must be more up to date and accurate so that they mask out unwanted areas while preserving the area of interest. Similarly, incorporating satellite-based biophysical parameters together with weather and climatic factors would better explain yield differences from a meteorological perspective.
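The light-use efficiency idea mentioned above can be sketched in a few lines: biomass is taken as proportional to the photosynthetically active radiation absorbed by the canopy over the season, and yield as a fixed fraction (harvest index) of that biomass. The function name and all parameter values below are hypothetical and chosen only for illustration. In practice, the efficiency and harvest index would have to be calibrated for the crop and region.

```python
def lue_yield(fpar_series, par_series, lue, harvest_index):
    """Simple light-use efficiency yield estimate.

    fpar_series   : fraction of PAR absorbed by the canopy per time step (0-1)
    par_series    : incident PAR per time step (MJ m-2)
    lue           : light-use efficiency (g biomass per MJ absorbed PAR)
    harvest_index : fraction of biomass that ends up as harvested yield
    """
    # Absorbed PAR accumulated over the season (MJ m-2)
    apar = sum(f * p for f, p in zip(fpar_series, par_series))
    biomass = lue * apar            # g m-2
    return harvest_index * biomass  # g m-2


# Hypothetical season of four time steps; satellite-derived fPAR rises then falls
fpar = [0.2, 0.6, 0.8, 0.4]
par = [80.0, 90.0, 95.0, 85.0]
estimate = lue_yield(fpar, par, lue=3.0, harvest_index=0.5)
print(round(estimate, 1))
```

In such a model the satellite contribution enters through the fPAR series, which is why vegetation-index-derived fPAR matters for yield estimation.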
Factoring in other yield-reducing factors such as wildfires, pests and diseases would also help improve forecasts and crop damage assessment by insurers. While some studies have isolated the spectral differences due to pest damage, more work on other crops remains (Abdel-Rahman et al., 2017). This will establish further relationships, making crop yield estimations more objective in relation to the underlying factors. Building on this, the holistic crop modeling system recently laid out by Tonnang et al. (2017) will improve yield estimation in several respects.
Early work on NDVI has remained a strong basis for monitoring crop condition. NDVI is said to have an almost linear relationship with the fraction of photosynthetically active radiation and with LAI. However, Vancutsem et al. (2013) reported that synthesizing MODIS NDVI time-series data posed harmonization challenges, which were overcome by employing more accurate crop masks. NDVI's tendency to saturate in dense canopies or at high biomass has been reported, and researchers are opting to combine it with indices that do not exhibit this characteristic, or to use other VIs such as EVI altogether.
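For reference, EVI can be computed alongside NDVI as in the sketch below. It uses the coefficients commonly applied to MODIS data (gain 2.5, aerosol terms 6 and 7.5, canopy background term 1). The function names and reflectance values are assumptions for illustration only.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced Vegetation Index with the coefficients commonly used for MODIS."""
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + soil)


# Hypothetical surface reflectances for a dense, healthy canopy
nir, red, blue = 0.50, 0.08, 0.04
print(round(ndvi(nir, red), 3), round(evi(nir, red, blue), 3))
```

The blue band and the background term in the denominator are what distinguish EVI from NDVI. They are intended to reduce atmospheric and soil effects and to keep the index responsive in dense canopies.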

Machine Learning and Crop Models
As for machine learning, ongoing advances in computer technology create opportunities for further exploration of satellite data that merit detailed investigation. Notably, incorporating ancillary data such as the local cropping calendar, and crop models calibrated and validated for the crop of interest, will enrich satellite-based inferences. Furthermore, there is a need to harmonize irrigated and rain-fed production when estimating phenological stages for the main energy crops. Clearly, remote sensing still has a greater role to play in the agronomic, socio-economic and health outcomes of the billions of people expected to survive under resource scarcity. As such, now is the best time to step up and close the loopholes in promising methods. One place to start, following the availability of high-resolution data, is with previous models that performed well with coarse-resolution data.
As in Morell et al. (2016), questions about predicting crop yield along an entire growing belt still arise. Unless efforts are made to scale up every option that has shown potential, such questions will linger for generations to come. Agro-meteorological and statistical methods should not be relied upon as the sole means of crop yield estimation. Instead, further expansion through local interaction, with farmers providing information by smartphone to monitor cropping systems, should be encouraged. Free applications that capture crop type, planting dates, management practices, and phenological parameters can be installed on smartphones for this purpose. This will help account for factors that contribute to yield estimation discrepancies, such as intercropping, sowing dates, varieties, and weed or disease infestation. Additionally, it will help decide whether to predict for an entire season or not, and for which crops (Li et al., 2015; Usaeed et al., 2017). Such aspects need to be agreed upon by experts.
Di Paola et al. (2016) argued that models do not explicitly include accuracies pertaining to their structure and functionality, and do not include alternative parameterizations. Additionally, incorporating climatic projections in models increases precision in planning for extreme events under different climatic forcings. Greenhouse gas concentration and temperature rise affect crop production because they affect photosynthesis and respiration. Therefore, employing multi-model configurations or multi-model ensembles is one of the major gaps needing consideration in crop modeling research.
Furthermore, researchers' steady provision of spatially reliable and consistent yield estimations will improve service delivery and strengthen decision-making among policy makers. Going the extra mile to understand other available satellite products and the drivers of spatial variability in crop responses will yield more holistic results. Synergistic multi-disciplinary efforts also need to be upheld at both regional and local levels to develop consistent methods of yield estimation. Simple models based on satellite data that stakeholders can use for planning purposes are required for maize in Southern Africa. Further research on crop yield estimation using high-resolution satellite imagery among smallholder farmers will improve decision-making and service delivery.
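A multi-model ensemble can be as simple as combining the yield predictions of several crop models for the same field and season, and reporting their spread as a rough indication of structural uncertainty. The sketch below assumes an unweighted mean. The function name, model labels and yield values are hypothetical and do not come from any of the studies cited.

```python
from statistics import mean, stdev


def ensemble_summary(predictions):
    """Summarize yield predictions (t/ha) from several crop models.

    predictions: dict mapping a model label to its predicted yield.
    Returns the unweighted ensemble mean, the spread (sample standard
    deviation) and the range across models.
    """
    values = list(predictions.values())
    return {
        "mean": mean(values),
        # stdev needs at least two values; a single model has no spread
        "spread": stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


# Hypothetical maize yield predictions from three differently structured models
season = {"model_a": 4.2, "model_b": 5.1, "model_c": 4.8}
summary = ensemble_summary(season)
print(round(summary["mean"], 2), round(summary["spread"], 2))
```

A large spread flags cases where the models disagree and a single-model estimate should be treated with caution. Weighting models by past performance is a possible refinement that is not shown here.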