Evaluation of the capability of data-mining-based models for predicting irrigated wheat yield in the country

Article type: Research article

Authors

1 Agricultural Engineering Research Department, Qazvin Agricultural and Natural Resources Research and Education Center, Agricultural Research, Education and Extension Organization, Qazvin, Iran

2 Agricultural Engineering Research Institute, Agricultural Research, Education and Extension Organization, Karaj, Iran

Abstract

Wheat and bread, as the main staple food of the country's people, are of particular importance. Wheat is not only an important commercial commodity worldwide; its strategic importance as a superior weapon in political and global relations also grows day by day. For this reason, analysis and prediction of the production status of this crop have always attracted attention. In this study, the performance of three models (artificial neural network, multivariate linear regression, and a tree model) was evaluated for predicting irrigated wheat yield in the major production regions of the country, based on field data recorded on 241 farms. The results showed that the coefficients of determination of the artificial neural network and multivariate linear regression models were 0.672 and 0.577, respectively, and that grouping the data by the tree method raised the coefficient of determination of the prediction model to 0.762. The output of the tree model showed that the major wheat production regions of the country can be divided into 4 independent groups in terms of the volume of water consumed. Finally, it can be concluded that the tree model, by applying purposeful grouping to the input data, can be used as a powerful tool for estimating irrigated wheat yield in the country's major wheat production hubs.


Article Title [English]

Evaluating the Capability of Data Mining Models in Predicting Irrigated Wheat Yield in Iran

Authors [English]

  • A. Uossef Gomrokchi 1
  • J. Baghani 2
  • F. Abbasi 2
1 Agricultural Engineering Research Department, Qazvin Agricultural and Natural Resources Research and Education Center, AREEO, Qazvin, Iran
2 Agricultural Engineering Research Institute, Agricultural Research, Education and Extension Organization, Karaj, Iran
Abstract [English]

Introduction: Artificial neural network modeling is one of the modeling methods that researchers in various fields have adopted in recent years. In addition to artificial neural network and regression models, data mining methods are now used to improve the output of prediction models and the analysis of field data. Tree models (decision trees), together with decision rules, are one such data mining method. A tree model represents a set of rules that lead to a category or a value. It is built by sequentially splitting the data into separate groups, with the aim of increasing the distance between groups at each split. Research shows that crop yield is a function of various plant, climatic, water management, and soil management conditions. Calculating crop yield and related indices therefore involves complex nonlinear relationships that are particularly difficult to model. Because determining the response of irrigated wheat to different inputs in different climates through field experiments is time-consuming, costly, and in some cases impossible, an efficient model that can predict yield and analyze its sensitivity to various parameters would be of great help in solving this problem. This study aimed to develop and evaluate the capability of three models (neural network, tree, and multivariate linear regression) for predicting wheat yield, based on the parameters affecting yield, in the major wheat production hubs of the country.
Materials and Methods: The data used in this study comprise the volume of water consumed, the yield of irrigated wheat, and the variables related to these two indicators in farmer-managed irrigated wheat fields (241 farms) in the provinces of Khuzestan, Fars, Golestan, Hamadan, Kermanshah, Khorasan Razavi, Ardabil, East Azerbaijan, West Azerbaijan, Semnan, southern Kerman, and Qazvin. The data were collected in a field study during the 2016-17 growing season. According to Ministry of Jihad for Agriculture statistics, these provinces have the largest irrigated wheat area in the country and account for about 70% of the country's cultivated area and production of this crop.
One of the most widely used supervised neural networks is the multilayer perceptron with the error back-propagation algorithm, which suits a wide range of applications such as pattern recognition, interpolation, prediction, and process modeling. In the present study, the neural network was developed in R using the neuralnet package. After normalization, the data were randomized so that the set of input-output pairs had no particular order. The share of data used for network training was then set: 70% of the data were used for training and the remaining 30% for testing the network. The hyperbolic tangent activation function was used during training and testing of the perceptron network to limit the range of each neuron's output, and training proceeded pattern by pattern. In addition to the neural network, a tree model was used to predict wheat yield. Tree modeling is one of the most powerful and common tools for classification and forecasting. Unlike the neural network, the tree model produces explicit rules. One advantage of the decision tree over the neural network is its resistance to noise in the input data. The tree model divides the data into sections by binary splits. Each partition can be split again by a further binary division, and a model is fitted to each subdivision. In this research, WEKA software was used to run the tree model. Note that after grouping, the prediction model is applied to the grouped data.
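The preprocessing steps described above (normalization, randomization, and a 70/30 training/test split) can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study; the [0, 1] scaling range, the random seed, the function names, and the sample values are all assumptions made for the example and are not taken from the paper.

```python
import random

def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] (the scaling range is an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def shuffle_and_split(records, train_fraction=0.7, seed=42):
    """Shuffle records reproducibly, then split them into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical sample: water consumption volume and grain yield for 10 farms.
water_volume = [4200.0, 5100.0, 5600.0, 6300.0, 6900.0,
                7400.0, 8000.0, 8800.0, 9500.0, 10100.0]
grain_yield = [3.1, 3.8, 4.4, 4.9, 5.3, 5.6, 5.8, 5.7, 5.5, 5.2]

# Normalize each variable, pair inputs with outputs, then randomize and split.
records = list(zip(min_max_normalize(water_volume),
                   min_max_normalize(grain_yield)))
train, test = shuffle_and_split(records)
print(len(train), len(test))
```

Shuffling before the split keeps the training and test sets from inheriting any ordering present in the field records, which is the purpose the text gives for the randomization step.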
Results and Discussion: In this study, the efficiency of three models (artificial neural network, multivariate linear regression, and a tree model) in predicting irrigated wheat yield in the major production areas of the country was evaluated based on field data recorded on 241 farms. The results showed that the coefficients of determination of the artificial neural network and multivariate linear regression models in predicting wheat yield were 0.672 and 0.577, respectively; grouping the data by the tree method increased the coefficient of determination to 0.762. The output of the tree model showed that the major wheat production areas in Iran can be divided into four independent groups in terms of water consumption. Finally, it can be concluded that the tree model, by applying purposeful grouping to the input data, can be used as a powerful tool for estimating irrigated wheat yield in the major wheat production areas of Iran.
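To illustrate why grouping the data before fitting can raise the coefficient of determination, the sketch below compares a single linear fit with separate linear fits on either side of one binary split. It is a simplified stand-in for a model tree, not the WEKA algorithm used in the study: it uses a single predictor, a single split, and hypothetical data constructed so that the relationship changes at one threshold. All function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def best_binary_split(xs, ys, min_size=3):
    """Threshold on x minimizing the summed squared error of per-side line fits."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_size, len(pairs) - min_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        sse = 0.0
        for part in (pairs[:i], pairs[i:]):
            px, py = zip(*part)
            a, b = fit_line(px, py)
            sse += sum((y - (a + b * x)) ** 2 for x, y in part)
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2)
    return best[1]

def grouped_predictions(xs, ys, threshold):
    """Fit one line per group (x <= threshold, x > threshold) and predict."""
    groups = {True: [], False: []}
    for x, y in zip(xs, ys):
        groups[x <= threshold].append((x, y))
    models = {key: fit_line(*zip(*part)) for key, part in groups.items()}
    return [models[x <= threshold][0] + models[x <= threshold][1] * x for x in xs]

# Hypothetical data: yield rises with water volume up to a point, then declines.
water = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
yields = [3.0, 4.0, 5.0, 6.0, 7.0, 9.0, 8.5, 8.0, 7.5, 7.0]

a, b = fit_line(water, yields)
r2_global = r_squared(yields, [a + b * x for x in water])

threshold = best_binary_split(water, yields)
r2_grouped = r_squared(yields, grouped_predictions(water, yields, threshold))
print(threshold, r2_global, r2_grouped)
```

On this constructed example the grouped fit scores higher than the single global fit because each group follows its own linear trend; real field data would show a smaller gain, and a full tree model would consider several predictors and more than one split.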
Conclusion: This study examined the need for data mining methods in analyzing field data and organizing large databases, and the usefulness of these methods, especially the decision tree, in estimating wheat yield, and compared them with other forecasting methods. Overall, the results show that purposeful separation of the input data can increase the output accuracy of forecasting models. However, no general rule can be given for selecting or rejecting a forecasting model in different regions. In some studies, neural networks have shown a high ability to predict the yield of different crops, but it is important to note that, given sufficient data and a correct understanding of the factors affecting the dependent variable, data mining methods can improve the accuracy of the models, including the neural network. More generally, given the estimation accuracy of the models studied, these techniques can be used to estimate other hard-to-measure characteristics of plants and soil.

Keywords [English]

  • Data mining
  • Grouping
  • Modeling
  • Water consumption volume