Developing automated machine learning approach for fast and robust crop yield prediction using a fusion of remote sensing, soil, and weather dataset

Published Date: April 25, 2024
Type: Journal Article
Authors:
Ahmed M.S. Kheir, Ajit Govind, Vinay Nangia, Mina Devkota Wasti, Abdelrazek Elnashar, Mohie Omar, Til Feike

Estimating smallholder crop yields robustly and in a timely manner is crucial for improving agronomic practices,
determining yield gaps, and guiding investment and policymaking to ensure food security. However,
yields are poorly estimated for most smallholders due to a lack of technology and field-scale data,
particularly in Egypt. Automated machine learning (AutoML) can be used to automate the machine
learning workflow, including automatic training and optimization of multiple models within a user-specified
time frame, but it has received little attention so far. Here, we combined extensive field survey yield data
from across the wheat-cultivated area in Egypt with diverse remote sensing, soil, and weather datasets to
predict field-level wheat yield using 22 ML models in AutoML. The models showed robust accuracies
for yield predictions, with a Willmott degree of agreement above 0.80 (d > 0.80) and higher accuracy when
a super learner (stacked ensemble) was used (R² = 0.51, d = 0.82). The trained AutoML was deployed to
predict yield using remote sensing (RS) vegetation indices (VIs), demonstrating a good correlation
with actual yield (R² = 0.7). This matters because the approach is a low-cost tool that could be
used to explore early yield predictions. Since climate change has negative impacts on agricultural
production and food security with some uncertainties, AutoML was deployed to predict wheat yield
under recent climate scenarios from the Coupled Model Intercomparison Project Phase 6 (CMIP6).
These scenarios included a single downscaled General Circulation Model (GCM), CanESM5, and two
Shared Socioeconomic Pathways (SSPs), SSP2-4.5 and SSP5-8.5, for the mid-term period (2050).
The stacked ensemble model projected yield declines of 21% and 5% under SSP5-8.5 and SSP2-4.5,
respectively, by mid-century, with higher uncertainty under the highest-emission scenario (SSP5-8.5).
The developed approach could be used as a rapid, accurate, and low-cost method to predict yield
for stakeholder farms all over the world where ground data are scarce.
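
The abstract reports accuracy as Willmott's degree of agreement (d) and R². As a minimal sketch of how such scores could be computed from paired observed and predicted yields, the Python below uses the original Willmott (1981) form of d and takes R² to be the coefficient of determination. Both choices are assumptions, since the abstract does not state which variants the authors used. The function names and the yield values in the usage note are hypothetical and are not data from the study.

```python
from statistics import mean


def _check(observed, predicted):
    # Both series must pair up one-to-one.
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and of equal length")


def willmott_d(observed, predicted):
    """Willmott's index of agreement, original (1981) form.

    d = 1 - sum((P - O)^2) / sum((|P - Obar| + |O - Obar|)^2)
    Ranges from 0 (no agreement) to 1 (perfect agreement).
    """
    _check(observed, predicted)
    o_bar = mean(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential_error = sum(
        (abs(p - o_bar) + abs(o - o_bar)) ** 2 for o, p in zip(observed, predicted)
    )
    if potential_error == 0:
        raise ValueError("d is undefined when all values equal the observed mean")
    return 1.0 - squared_error / potential_error


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SSE / SST (assumed definition of R²)."""
    _check(observed, predicted)
    o_bar = mean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - o_bar) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("R² is undefined when the observed values are constant")
    return 1.0 - sse / sst
```

For example, with made-up field yields `observed = [6.1, 7.4, 5.8, 8.0]` and `predicted = [6.4, 7.0, 6.2, 7.5]` (in t/ha), calling `willmott_d(observed, predicted)` and `r_squared(observed, predicted)` returns the two scores for that set of fields. If R² is instead meant as the squared Pearson correlation, the two definitions generally give different values.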

Citation:
Ahmed M. S. Kheir, Ajit Govind, Vinay Nangia, Mina Devkota Wasti, Abdelrazek Elnashar, Mohie Omar, Til Feike. (25/4/2024). Developing automated machine learning approach for fast and robust crop yield prediction using a fusion of remote sensing, soil, and weather dataset. Environmental Research Communications, 6 (4).
Keywords:
climate change
wheat
remote sensing
machine learning
yield prediction