The Implementation of Machine Learning and Deep Learning Algorithms for Crop Yield Prediction in Agriculture



3. Data preprocessing
Figure 2 illustrates the data preprocessing process used for model learning. First, the dataset was downloaded from kaggle.com and imported into Python. Additional features were then added to create a new dataset, and the data were normalized before analysis. Finally, certain specification data were designated as model learning (training) data, while the base specification data were set aside to evaluate the performance of the generated predictive model.
Figure 2. The preprocessing process for crop yield prediction data.
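The steps described for Figure 2 can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the authors' code: the column names (rainfall, temperature, yield), the derived feature rain_x_temp, the use of min-max scaling, and the 80/20 split are all hypothetical, since the paper does not specify them.

```python
import random

def add_feature(rows):
    """Return copies of the rows with a hypothetical derived feature added."""
    return [dict(r, rain_x_temp=r["rainfall"] * r["temperature"]) for r in rows]

def min_max_normalize(rows, columns):
    """Scale each listed column to the range [0, 1] (min-max normalization)."""
    out = [dict(r) for r in rows]
    for col in columns:
        values = [r[col] for r in rows]
        lo, hi = min(values), max(values)
        span = hi - lo
        for r in out:
            # A constant column carries no information; map it to 0.0.
            r[col] = (r[col] - lo) / span if span else 0.0
    return out

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows reproducibly and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Toy records standing in for the Kaggle data (all values are made up).
raw = [
    {"rainfall": 400.0, "temperature": 18.0, "yield": 50000.0},
    {"rainfall": 520.0, "temperature": 20.0, "yield": 62000.0},
    {"rainfall": 610.0, "temperature": 22.0, "yield": 71000.0},
    {"rainfall": 450.0, "temperature": 19.0, "yield": 55000.0},
    {"rainfall": 700.0, "temperature": 21.0, "yield": 80000.0},
]

features = ["rainfall", "temperature", "rain_x_temp"]
prepared = min_max_normalize(add_feature(raw), features)
train_rows, test_rows = train_test_split(prepared, test_fraction=0.2, seed=0)
```

The order here (normalize, then split) follows the description of the figure. In practice the scaling statistics are often computed on the training portion only, so that no information from the held-out data reaches the model.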
4. Results and discussion
4.1. Evaluation metrics
This study uses a dataset collected over more than 20 years that includes rainfall, temperature, and the productivity of each year. Machine learning techniques, including multivariate regression (MR), multiple linear regression (MLR), and deep neural networks (DNN), were employed to construct the predictive models using Python with the Scikit-learn and Seaborn libraries. The predictive performance of the models was evaluated using the mean absolute error (MAE) and the root mean squared error (RMSE), and a separate dataset was used to test and verify the selected model. The test consisted of assessing the performance of each prediction model on this separate dataset and generating graphs that compare the predicted and actual crop yield values under changing conditions such as temperature and rainfall. MAE and RMSE are the metrics most commonly used for regression problems. In this section we compare the results of the four models, measured by MAE and RMSE, for the four algorithms mentioned in Section 3.
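\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_{act} - y_{pred} \right| \qquad (1) \]
\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_{act} - y_{pred} \right)^{2}}{n}} \qquad (2) \]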
where:
n is the number of data points,
y_{pred} is the predicted value of the dependent variable for the i-th data point,


Bulletin of TUIT: Management and Communication Technologies
Nodir Rahimov, Dilmurod Khasanov 
2023.Vol-2(4) 
y_{act} is the actual value of the dependent variable for the i-th data point.
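Equations (1) and (2) translate directly into code. The following is a minimal sketch in plain Python; the function names and the sample values are illustrative assumptions and are not taken from the paper's dataset.

```python
import math

def mean_absolute_error(y_act, y_pred):
    """Equation (1): mean of |y_act - y_pred| over the n data points."""
    n = len(y_act)
    return sum(abs(a - p) for a, p in zip(y_act, y_pred)) / n

def root_mean_squared_error(y_act, y_pred):
    """Equation (2): square root of the mean squared difference."""
    n = len(y_act)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_act, y_pred)) / n)

# Made-up values used only to exercise the two functions.
y_act = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 330.0, 400.0]

mae = mean_absolute_error(y_act, y_pred)       # (10 + 10 + 30 + 0) / 4 = 12.5
rmse = root_mean_squared_error(y_act, y_pred)  # sqrt((100 + 100 + 900 + 0) / 4) = sqrt(275)
```

Because squaring gives large errors more weight, RMSE is never smaller than MAE on the same predictions, which is a useful sanity check when reading reported results.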
4.2. Multivariate Regression Prediction Model Performance
The performance evaluation of the predictive model is presented in Table 2. When predicting crop yield, the MR model exhibits MAE values of 63365.8 and 64242.0 and RMSE values of 83256.2 and 84955.1 on the training and test datasets, respectively. The prediction results of the MR model for the test dataset are visualized in Figure 3 and Figure 4.
Table 2. The performance evaluation results of the MR prediction model.
Metric        MAE                  RMSE
Target        Train      Test      Train      Test
Prediction    63365.8    64242.0   83256.2    84955.1
Figure 3. High correlation among features.
Figure 4. True values (blue) and predictions (orange).
4.3. Multiple Linear Regression Prediction Model Performance
The performance evaluation of the predictive model is presented in Table 3. When predicting crop yield, the MLR model exhibits MAE values of 63879.3 and 64099.9 and RMSE values of 84145.8 and 84254.6 on the training and test datasets, respectively. The prediction results of the MLR model for the test dataset are visualized in Figure 5 and Figure 6.
Table 3. The performance evaluation results of the MLR prediction model.
Metric        MAE                  RMSE
Target        Train      Test      Train      Test
Prediction    63879.3    64099.9   84145.8    84254.6
Figure 5. The dynamics of crop by years.
Figure 6. True values and predictions.
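Section 4.3 reports only the error metrics of the MLR model, not how it was fitted. As an illustration of the technique, the sketch below fits a multiple linear regression by ordinary least squares using the normal equations. It assumes that MLR here means ordinary least squares on several predictors; the function names and the toy data are hypothetical.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    X is a list of feature rows; an intercept column is added internally.
    Returns the coefficients [intercept, b1, b2, ...].
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefs, X):
    """Apply the fitted coefficients to new feature rows."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in X]

# Toy data generated from y = 5 + 2*x1 + 3*x2 (made up, no noise).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [5 + 2 * a + 3 * b for a, b in X]

coefs = fit_linear_regression(X, y)
fitted = predict(coefs, X)
```

On noise-free data such as this, the recovered coefficients match the generating ones up to floating-point error. With Scikit-learn, which the paper names as its library, the LinearRegression estimator solves the same least-squares problem.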


4.4. Deep Neural Network Prediction Model Performance
The performance evaluation of the predictive model is presented in Table 4. When predicting crop yield, the DNN model exhibits MAE values of 63713.9 and 63747.2 and RMSE values of 83510.5 and 83493.9 on the training and test datasets, respectively.
Table 4. The performance evaluation results of the DNN prediction model.
Metric        MAE                  RMSE
Target        Train      Test      Train      Test
Prediction    63713.9    63747.2   83510.5    83493.9
The study found that multiple linear regression (MLR) outperforms the other evaluated algorithms in terms of dataset size, sorting, and key features. Although models based on deep neural network (DNN) and multivariate regression (MR) algorithms have been observed to be highly effective in certain circumstances, MLR was identified as the most suitable algorithm for predicting crop yield based on the research findings.
4.5. Gradient Boosting Regressor Tree Model Performance
The performance evaluation of the predictive model is presented in Table 5. When predicting crop yield, the GBRT model exhibits MAE values of 61378.7 and 61749.5 and RMSE values of 79139.3 and 79641.6 on the training and test datasets, respectively.
Table 5. The performance evaluation results of the GBRT prediction model.
Metric        MAE                  RMSE
Target        Train      Test      Train      Test
Prediction    61378.7    61749.5   79139.3    79641.6
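The paper does not describe the configuration of its GBRT model. To show the idea behind gradient boosting for regression, the sketch below fits one-split regression trees (stumps) to the residuals of the current prediction, using squared-error loss. The number of estimators, the learning rate, the helper names, and the toy data are all assumptions made for illustration.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, lm, rm)
    return best[1:] if best is not None else None

def stump_predict(stump, row):
    """Return the leaf value of the stump for one feature row."""
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value

def fit_gbrt(X, y, n_estimators=50, learning_rate=0.1):
    """Start from the mean of y, then add shrunken stumps fitted to residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def gbrt_predict(model, X):
    """Sum the base value and all shrunken stump outputs for each row."""
    base, lr, stumps = model
    return [base + lr * sum(stump_predict(s, row) for s in stumps) for row in X]

def mse(y_true, y_hat):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_hat)) / len(y_true)

# Toy rows of [rainfall, temperature] with made-up yields.
X = [[400.0, 18.0], [520.0, 20.0], [610.0, 22.0],
     [450.0, 19.0], [700.0, 21.0], [560.0, 23.0]]
y = [50.0, 62.0, 71.0, 55.0, 80.0, 66.0]

model = fit_gbrt(X, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, gbrt_predict(model, X))
```

Each round reduces the training error a little, so the boosted error ends below that of the constant mean predictor. In Scikit-learn the corresponding estimator is GradientBoostingRegressor, which uses deeper trees by default.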
Table 6. The SOTA comparison of models.
Model    Root mean squared error (./1000 ha)    Mean absolute error (./1000 ha)    Mean percentage error (%)
MR       84.10                                  61.8                               83
MLR      84.2                                   63.98                              80
DNN      83.5                                   62.7                               82
GBRT     70.39                                  59.5                               88
