Rainfall-Runoff Modeling of the Khormazard and Bonab Hydrometric Stations Using Support Vector Machine and Random Forest Algorithms

Article type: Research article

Authors

1 Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran

2 Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran

Abstract

Simulating the rainfall-runoff process can play an important role in water resources management and in hydrological problems. In this study, the data mining models support vector machine (SVM) and random forest (RF) were used to model rainfall-runoff at two stations, Bonab and Khormazard, located on the Sufi Chai and Mahpari Chai rivers, respectively (Maragheh plain). Data from the meteorological and hydrometric stations of the region for the years 1355 to 1397 (Solar Hijri calendar) were obtained from the Regional Water Company and the Meteorological Organization of East Azerbaijan province. Because the runoff trend changed in 1374, the study period was divided into two periods, before and after that year. Rainfall and runoff with a one-month time lag were supplied to the models as inputs, and the observed monthly runoff was then compared with the estimated monthly runoff using error evaluation criteria. The results showed that in both periods the SVM model was more efficient than the RF model at Bonab station, whereas at Khormazard station the RF model performed better than the SVM model in both periods. On the test set, the correlation values for the first and second study periods were 0.85 and 0.84 at Bonab station and 0.79 and 0.75 at Khormazard station. According to the Mann-Kendall statistic and the time series for both stations, no clear trend in rainfall was observed over the period, but the discharge of the Sufi Chai river at Bonab station showed an increasing trend, especially after 1374, while the discharge of the Mahpari Chai river showed a clearly decreasing trend.



Article title [English]

Rainfall-Runoff Modeling of Khormazard and Bonab Hydrometric Stations Using Support Vector Machine and Random Forest Algorithms

Authors [English]

  • Z. Bigdeli 1
  • A. Majnooni-Heris 2
  • R. Delearhasannia 1
  • S. Karimi 2
1 Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
2 Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
Abstract [English]

Introduction
Water plays a crucial role in the sustainable development of any region. Iran consists primarily of arid and semi-arid regions, where most of its rivers are also found. Given the critical state of groundwater extraction and the growing importance of surface water, it is crucial to understand the future condition of water resources in the country's watersheds (Fathollahi et al., 2015). Intelligent models make it feasible to represent inherent relationships in data that conventional mathematical methods cannot resolve. Support vector machine (SVM) and random forest (RF) are two machine learning methods used to make repeatable and accurate predictions (Kisi & Parmar, 2016). A recent study by Zarei et al. (2022) evaluated flood risk using the SVM and RF data mining models (case study: Frizi watershed). Their results showed that both the SVM and random forest algorithms predicted flood risk with higher accuracy, in terms of both the training data and algorithm performance. The purpose of this study is to simulate the rainfall-runoff process at the hydrometric stations at the end of the Maragheh plain (Khormazard station on the Mahpari Chai river and Bonab station on the Sufi Chai river) in East Azerbaijan province using the support vector machine and random forest algorithms. The study covers a 43-year record, making it one of the few such studies in this area.
 
Materials and Methods
The Maragheh Sufi Chai basin lies east of Lake Urmia in East Azerbaijan province. It covers 611.89 square kilometers between longitudes 45°40′ and 46°25′ E and latitudes 37°15′ and 37°55′ N, and its average elevation is 1767 meters above sea level (Sharmod et al., 2015). Because the runoff trend changed substantially from 1994 onward, with no noticeable change in the precipitation trend, the available data were divided into two periods: 1976 to 1994 and 1995 to 2019.
To simulate rainfall-runoff, the average rainfall over the Maragheh plain was first calculated with the Thiessen polygon method. This series was then combined with the discharge recorded at Bonab and Khormazard stations at a one-month time lag, and the resulting inputs were supplied to two models, SVM (with a kernel function) and RF. For each model, 70% of the data was used for training and 30% for validation. Rainfall and the previous month's runoff from the training set served as the predictor variables, and runoff served as the target variable. The inputs are the monthly P value and the previously recorded Q value (Pt, Qt-1), and the output is the current runoff (Qt), where the subscript t denotes the time step. Two input combinations were constructed from the Q and P data (Table 3), and both were evaluated with the SVM and RF models to determine the optimal input combination.
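The input construction and the 70/30 split described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample values are hypothetical, and the split is assumed to be chronological, which the text does not state.

```python
def make_lagged_samples(rain, runoff):
    """Pair the inputs (P_t, Q_{t-1}) with the target Q_t for t = 1 .. n-1."""
    if len(rain) != len(runoff):
        raise ValueError("rain and runoff must have the same length")
    inputs = [(rain[t], runoff[t - 1]) for t in range(1, len(runoff))]
    targets = [runoff[t] for t in range(1, len(runoff))]
    return inputs, targets


def chronological_split(inputs, targets, train_percent=70):
    """Keep the first train_percent of samples for training, the rest for validation.

    Assumption: the split follows time order; the source only gives the 70/30 ratio.
    """
    cut = len(inputs) * train_percent // 100
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])


# Hypothetical monthly series (rainfall in mm, runoff in m3/s), not study data.
rain = [12.0, 30.5, 8.2, 0.0, 45.1, 22.3, 5.0, 17.8, 9.9, 31.0, 2.4]
runoff = [1.1, 2.4, 1.6, 0.7, 3.9, 2.8, 1.2, 1.9, 1.4, 2.9, 0.9]

inputs, targets = make_lagged_samples(rain, runoff)
(train_x, train_y), (val_x, val_y) = chronological_split(inputs, targets)
print(len(train_x), len(val_x))
```

The first month is dropped because it has no preceding runoff value, so eleven months yield ten samples.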
Calculating average rainfall through the Thiessen Polygons method
Thiessen polygons, which are Voronoi cells, assign to each rain gauge the surface area (Ai) that lies closest to it. These areas are used to weight the rainfall measured by each rain gauge (ri). The area-weighted rainfall is therefore:

$$\bar{r} = \frac{\sum_{i=1}^{n} A_i r_i}{\sum_{i=1}^{n} A_i} \qquad (1)$$

where n is the number of rain gauges.
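The area-weighted average described above can be illustrated with a short Python sketch. The function name, polygon areas and gauge readings are hypothetical values chosen for the example and are not taken from the study.

```python
def thiessen_mean_rainfall(areas, gauge_rain):
    """Area-weighted mean rainfall: sum(A_i * r_i) / sum(A_i)."""
    if not areas or len(areas) != len(gauge_rain):
        raise ValueError("need one rainfall value per polygon area")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total polygon area must be positive")
    return sum(a * r for a, r in zip(areas, gauge_rain)) / total_area


# Hypothetical polygon areas (km2) and monthly rainfall (mm) at three gauges.
areas = [200.0, 300.0, 100.0]
gauge_rain = [30.0, 20.0, 60.0]

mean_rain = thiessen_mean_rainfall(areas, gauge_rain)
print(mean_rain)
```

A gauge with a large polygon pulls the basin average toward its own reading, which is why the smallest polygon here contributes no more than the others despite recording the most rain.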




Random Forest Algorithm
Random forest is a modern tree-based method that combines a large number of classification and regression trees. It is one of the most widely used machine learning algorithms because it is simple and applies to both classification and regression tasks.
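To make the idea of combining many trees concrete, the following is a toy, from-scratch regression forest in Python. It is not the implementation used in the study: the tree depth, the number of trees, the size of the random feature subset and the synthetic data are all illustrative assumptions.

```python
import random
from statistics import mean


def best_split(rows, targets, feature_ids):
    """Return (sse, feature, threshold) for the lowest squared-error split, or None."""
    best = None
    for f in feature_ids:
        values = sorted({row[f] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            left_mean, right_mean = mean(left), mean(right)
            sse = sum((t - left_mean) ** 2 for t in left)
            sse += sum((t - right_mean) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, threshold)
    return best


def grow_tree(rows, targets, rng, depth=0, max_depth=3, min_samples=4):
    """Grow one regression tree; a leaf is the mean of the targets it holds."""
    if depth >= max_depth or len(rows) < min_samples:
        return mean(targets)
    n_features = len(rows[0])
    # Random feature subset at each split (subset size is an illustrative choice).
    subset = rng.sample(range(n_features), max(1, n_features // 2))
    split = best_split(rows, targets, subset)
    if split is None:  # every candidate feature is constant in this node
        return mean(targets)
    _, f, threshold = split
    left = [(r, t) for r, t in zip(rows, targets) if r[f] <= threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[f] > threshold]
    return (
        f,
        threshold,
        grow_tree([r for r, _ in left], [t for _, t in left], rng, depth + 1, max_depth, min_samples),
        grow_tree([r for r, _ in right], [t for _, t in right], rng, depth + 1, max_depth, min_samples),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx], [targets[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_tree(tree, row) for tree in forest)


# Synthetic data: the target rises with both inputs (stand-ins for P_t and Q_{t-1}).
rows = [(p, q) for p in range(5) for q in range(5)]
targets = [0.5 * p + 1.0 * q for p, q in rows]

forest = fit_forest(rows, targets, n_trees=25, seed=1)
low = predict_forest(forest, (0, 0))
high = predict_forest(forest, (4, 4))
print(low, high)
```

Because every leaf is an average of training targets, the forest cannot predict outside the range of runoff values it was trained on, a general property of tree ensembles.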
Support Vector Machine (SVM) Algorithm
Like other artificial intelligence methods, support vector machines are based on data mining algorithms. The most important uses of the support vector machine model are classification and regression.
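For the regression use mentioned above, the standard textbook form of the ε-insensitive support vector regression problem is shown below as a sketch. The symbols w, b, C, ε, ξ and the feature map φ are not defined in the source and are introduced here only for illustration; the kernel and parameter values used in the study are not stated in this abstract.

```latex
\min_{w,\,b,\,\xi,\,\xi^*} \;
  \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^* \right)
\quad \text{subject to} \quad
\begin{cases}
  y_i - w^\top \phi(x_i) - b \le \varepsilon + \xi_i \\
  w^\top \phi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\
  \xi_i,\ \xi_i^* \ge 0, \quad i = 1, \dots, n
\end{cases}
```

Here x_i would hold the inputs (Pt, Qt-1) and y_i the observed runoff Qt. Errors smaller than ε are ignored, and C controls how strongly larger errors are penalized.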
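Evaluation Criteria
To evaluate the models and compare their effectiveness, this research uses the root mean square error (RMSE), the correlation coefficient (r), the coefficient of determination (R2) and the Nash-Sutcliffe efficiency coefficient (NS). These criteria are defined as follows:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - S_i\right)^2} \qquad (2)$$

$$r = \frac{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)\left(S_i - \bar{S}\right)}{\sqrt{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2 \sum_{i=1}^{n}\left(S_i - \bar{S}\right)^2}} \qquad (3)$$

$$R^2 = \left[\frac{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)\left(S_i - \bar{S}\right)}{\sqrt{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2 \sum_{i=1}^{n}\left(S_i - \bar{S}\right)^2}}\right]^2 \qquad (4)$$

$$NS = 1 - \frac{\sum_{i=1}^{n}\left(O_i - S_i\right)^2}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2} \qquad (5)$$

where O_i and S_i are the observed and simulated runoff values, \bar{O} and \bar{S} are their means, and n is the number of data points.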
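The evaluation criteria named above can be computed as in the following Python sketch, which uses the standard definitions of RMSE, the Pearson correlation coefficient and the Nash-Sutcliffe efficiency. The function names and the observed and simulated values are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def pearson_r(obs, sim):
    """Pearson correlation coefficient; its square is R2."""
    o_mean = sum(obs) / len(obs)
    s_mean = sum(sim) / len(sim)
    cov = sum((o - o_mean) * (s - s_mean) for o, s in zip(obs, sim))
    var_o = sum((o - o_mean) ** 2 for o in obs)
    var_s = sum((s - s_mean) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    o_mean = sum(obs) / len(obs)
    err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1 - err / sum((o - o_mean) ** 2 for o in obs)


# Hypothetical monthly runoff values (m3/s), not study data.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.5, 2.0, 2.5, 4.5, 4.5]

r = pearson_r(observed, simulated)
print(rmse(observed, simulated), r, r ** 2, nash_sutcliffe(observed, simulated))
```

An NS value of 1 means a perfect fit, while a value of 0 means the model is no better than predicting the observed mean.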



 
 
 



 
Results and Discussion
Figure 6 displays the time series of rainfall and runoff for the two study periods, before and after 1994. For Bonab station, the Mann-Kendall statistic for precipitation was 0.044 and 0.028 in the first and second periods, respectively. For Khormazard station, the corresponding values were 0.030 and 0.028. None of these values is significant at the 95% level, so annual rainfall at the two stations showed no significant trend between 1976 and 2019. The variations observed during this period were within the normal range: the rainfall series fluctuated, with increases in some years and decreases in others.
The discharge series behave differently. The outflow at Bonab station (panels a and b) initially fluctuated and then passed through periods of both decrease and increase, but in recent years it has increased. The Mann-Kendall statistic for this station is 0.325 and 0.512 for the two study periods, respectively. Both values are significant at the 95% level, indicating a statistically significant increasing trend in discharge in both periods. The likely reason for this trend at Bonab station, in contrast to other stations on rivers entering Lake Urmia, is the lower agricultural and industrial water demand in the Sufi Chai basin. Identifying the root cause would require studies of groundwater and surface water sources and of water use in the region's agricultural and industrial sectors.
The trend at Khormazard station (panels c and d) is the opposite: discharge declined throughout. The Mann-Kendall statistic for discharge was -0.269 and -0.412 in the two study periods, respectively, and the decreasing trend is significant at the 95% level. The discharge volume at this hydrometric station has decreased drastically since 1976 (panel d). Apart from 2007, when discharge rose suddenly, the water inflow into Lake Urmia has remained at its lowest level throughout the years. Rainfall and runoff statistics (average, minimum, maximum) for Bonab and Khormazard stations in the first period (1976-1994) and the second period (1995-2019) are presented in Tables 4 and 5. In both tables, Bonab station has the highest average rainfall and runoff values in the total data column, while Khormazard station has the lowest.
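The Mann-Kendall trend test used above can be sketched in Python as follows. This is a minimal version without a correction for tied values, and the discharge series is hypothetical. The source does not state which form of the statistic it reports, so the sketch returns the S statistic, Kendall's tau and the normal-approximation Z score together.

```python
import math


def mann_kendall(series):
    """Return (S, tau, Z) for a series ordered in time (no tie correction)."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, tau, z


# Hypothetical annual mean discharge (m3/s) with a mostly declining pattern.
discharge = [3.1, 2.9, 2.7, 2.8, 2.2, 2.0, 1.9, 1.5]

s, tau, z = mann_kendall(discharge)
significant = abs(z) > 1.96  # two-sided test at the 95% level
print(s, tau, z, significant)
```

A negative S and tau indicate a decreasing series, and the trend is treated as significant at the 95% level when the absolute value of Z exceeds 1.96.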
As mentioned, part of the data was used to train the SVM and RF rainfall-runoff models and the remainder was used for validation. Tables 6 and 7 present the statistical indicators calculated for the training and validation stages of both models. These results show that in both study periods the SVM model outperformed the RF model at Bonab station. The SVM model demonstrated superior accuracy in simulating both flow rate and monthly rainfall. Conversely, at Khormazard station the RF model performed better than the SVM model in both periods. On the test set, the correlation values for the first and second study periods were 0.85 and 0.84 at Bonab station and 0.79 and 0.75 at Khormazard station.
Conclusion
The results indicate that in both periods the SVM model was more efficient than the RF model at Bonab station, whereas the RF model outperformed the SVM model at Khormazard station. On the test set, the correlation values for the first and second study periods were 0.85 and 0.84 for the SVM model at Bonab station and 0.79 and 0.75 for the RF model at Khormazard station. Other notable findings come from the analysis of the rainfall and runoff time series over 43 years. The graphs for both stations, together with the Mann-Kendall statistic for precipitation and flow, revealed no discernible trend in precipitation during the two study periods; precipitation in these areas simply fluctuated. However, the time series and test statistics for the discharge of the Sufi Chai and Mahpari Chai rivers at Bonab and Khormazard stations showed different results. At Bonab station the discharge fluctuated, with an increase in the second period. At Khormazard station the discharge trended downward in both study periods. The outflow of the Mahpari Chai river has decreased notably in recent years, as the decreasing trend in the Mann-Kendall statistic shows.

Keywords [English]

  • Maragheh Plain
  • Modeling
  • Rainfall-runoff
  • Random forest
  • Sufi Chai
  • Support Vector Machine

©2023 The author(s). This is an open access article distributed under Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source.

  1. Adnan, R.M., Yuan, X., Kisi, O., Adnan, F., & Mehmood, A. (2018). Stream flow forecasting of poorly gauged mountainous watershed by Least Square Support Vector Machine, Fuzzy Genetic Algorithm and M5 Model Tree using climatic data from Nearby Station. Water Resources Management, 32, 4469-4486. https://doi.org/10.1007/s11269-018-2033-2
  2. Aurenhammer, F., Klein, R., & Lee, D.T. (2013). Voronoi Diagrams and Delaunay Triangulations. World Scientific Publ. Co., Singapore p. 337.
  3. Band, S.S., Janizadeh, S., Chandra Pal, S., Saha, A., Chakrabortty, R., Melesse, A.M., & Mosavi, A. (2020). Flash flood susceptibility modeling using new approaches of hybrid and ensemble tree-based machine learning algorithms. Remote Sensing, 12(21), 3568. http://doi.org/10.3390/rs12213568
  4. Bashirian, F., Rahimi, D., Movahedi, S., & Zakerinejad, R. (2020). Water level instability analysis of Urmia Lake Basin in the northwest of Iran. Arabian Journal of Geosciences, 13(4), 1-14. https://doi.org/10.1007/s12517-020-5207-1
  5. Bigdeli, Z., Majnooni Heris, A., Delirhasannia, R., & Karimi, S. (2023). Rainfall-runoff modeling of Aji Chai basin using random forest and artificial neural network models. New Research Sustainable Water Engineering, 1(2), 27-42. https://doi.org/10.22103/nrswe.2023.20278.1013
  6. Bigdeli, Z., Majnooni-Heris, A., Delirhasannia, R., & Karimi, S. (2023). Application of support vector machine and boosted tree algorithm for rainfall-runoff modeling (Case study: Tabriz plain). Environment and Water Engineering, 9(4), 532-547. http://doi.org/10.22034/ewe.2023.366913.1816
  7. Botsis, D., Latinopoulos, P., & Diamantaras, K. (2011). Rainfall-runoff modeling using support vector regression and artificial neural networks. Journal Rhodes, Greece. https://doi.org/20.1001.1.20087942.1398.13.6.15.1
  8. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
  9. Catani, F., Lagomarsino, D., Segoni, S., & Tofani, V. (2013). Landslide susceptibility estimation by random forests technique: sensitivity and scaling issues. Natural Hazards Earth System Science, 13, 2815-2831. https://doi.org/10.5194/nhess-13-2815-2013
  10. Dastorani, M.T., Mahjoobi, J., Talebi, A., & Fakhar, F. (2018). Application of machine learning approaches in rainfall-runoff modeling (Case study: Zayandeh_Rood Basin in Iran). Civil Engineering Infrastructures Journal, 51(2), 293-310. https://doi.org/20.1001.1.23222093.2018.51.2.4.1
  11. Eskandari, A., Noori, R., Meeraji, H., & Kiaghaderi, A. (2011). Development of an appropriate model based on artificial neural network and support vector machine to predict the 5-day biochemical oxygen demand while, Ecology, 38, 71-82. https://doi.org/20.1001.1.10258620.1391.38.1.8.1
  12. Fathollahi, S., Mirshahi, D., & Abbasipour, B. (2015). Water and climate change: prediction of runoff from rainfall in the Aji Chai River basin using artificial neural network, the first national congress of irrigation and drainage in Iran. (In Persian)
  13. Hamidi-Razi, H., Mazaheri, M., Carvajalino-Fernández, M., & Vali-Samani, J. (2019). Investigating the restoration of Lake Urmia using a numerical modelling approach. Journal of Great Lakes Research, 45(1), 87-97. https://doi.org/10.1016/j.jglr.2018.10.002
  14. Hosseini-Moghari, S.M., Araghinejad, S., Tourian, M.J., Ebrahimi, K., & Döll, P. (2020). Quantifying the impacts of human water use and climate variations on recent drying of Lake Urmia basin: the value of different sets of spaceborne and in situ data for calibrating a global hydrological model. Hydrology & Earth System Sciences, 24(4). https://doi.org/10.5194/hess-24-1939-2020
  15. Hussain, D., & Khan, A.A. (2020). Machine learning techniques for monthly river flow forecasting of Hunza River, Pakistan. Earth Science Informatics, 13(3), 939-949. https://doi.org/10.1007/s12145-020-00450-z
  16. Javadzadeh, H., Ataie-Ashtiani, B., Hosseini, S.M., & Simmons, C.T. (2020). Spectral analysis of periodic behavior of Lake Urmia water level time series Interaction of lake-groundwater levels using cross correlation analysis: A case study of Lake Urmia Basin, Iran. Science of The Total Environment,
  17. Kendall, M.G. (1975). Rank correlation methods. fourth ed. Charles Griffin, London.
  18. Kisi, O., & Parmar, K.S. (2016). Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution. Journal of Hydrology, 534, 104–112. https://doi.org/10.1016/j.jhydrol.2015.12.014
  19. Lari, A., Pishvaee, M.S., & Khodabakhsh, P. (2019). A system dynamics approach for basin policy design: Urmia Lake case study. Kybernetes. https://doi.org/10.1108/K-04-2019-0226
  20. Lettenmaier, D.P., Wood, E.F., & Wallis, J.R. (1994). Hydro-climatological Trends in the Continental United States, 1948–88. Journal of Climate, 7, 586–607. https://doi.org/10.1175/1520-0442(1994)007<0586:HCTITC>2.0.CO;2
  21. Lorrai, M., & Sechi, M.G. (1995). Neural nets for modeling rainfall-runoff transformation. Water Resources Management, 9, 299-313. https://doi.org/10.1007/BF00872489
  22. Mann, H.B. (1945). Nonparametric tests against trend. Econometrica, 13, 245-259.
  23. Montaseri, M., Nourjou, A., Behmanesh, J., & Akbari, M. (2018). Investigation of meteorological drought in Southern basins of Urmia lake (Case study: Zarrineh rud and Simeneh rud). Iranian Journal of Ecohydrology, 5(1), 189-202. http://doi.org/10.22059/ije.2018.245903.781
  24. Najibzade, N., Qaderi, K., & Ahmadi, M.M. (2020). Rainfall-runoff modelling using support vector regression and artificial neural network models (case study: SafaRoud Dam Watershed). Iranian Journal of Irrigation & Drainage, 13(6), 1709-1720. https://doi: 20.1001.1.20087942.1398.13.6.15.1
  25. Nazeri Tahroudi, M., Ahmadi, F., & Khalili, K. (2018). Impact of 30 years changing of river flow on Urmia lake basin. AUT Journal of Civil Engineering, 2(1), 115-122. https://doi.org/10.22060/AJCE.2018.14520.5481
  26. Nicodemus, K.K. (2011). Letter to the Editor: On the stability and ranking of predictors from random forest variable importance measures. Briefings in Bioinformatics, 12, 369-373. https://doi.org/10.1093/bib/bbq011
  27. Pasquini, A.I., Lecomte, K.L., Piovano, E.L., & Depetris, P.J. (2006). Recent rainfall and runoff variability in central Argentina. Quaternary International, 158(1), 127-139. https://doi.org/10.1016/j.quaint.2006.05.021
  28. Patil, J.P., Sarangi, A., Singh, O.P., Singh, A.K., & Ahmad, T. (2008). Development of a GIS interface for estimation of runoff from watersheds. Water Resources Management, 22(9), 1221-1239. https://doi.org/10.1007/s11269-007-9222-8
  29. Phomcha, P., Wirojanagud, P., Vangpaisal, T., & Thaveevouthti, T. (2011). Suitability of SWAT model for simulating of monthly streamflow in Lam Sonthi watershed. The Journal of Industrial Technology, 7(2), 49-56.
  30. Poursalehi, F., KhasheiSiuki, A., & Hashemi, S.R. (2022). Investigating the performance of random forest algorithm in predicting water table fluctuations compared with two models of decision tree and artificial neural network (Case study: unconfined aquifer of Birjand). Iranian Journal of Ecohydrology, 8(4), 961-974. https://doi.org/10.22059/IJE.2022.327263.1526
  31. Rezazei, H., Jabbari, A., Behmanesh, J., & Hessari, B. (2017). Modelling the daily runoff of Nazloo Chai watershed at the west side of Urmia Lake. Journal of Water and Soil Conservation, 23(6), 123-141. https://doi.org/10.22069/JWFST.2017.9735.2401
  32. Sabouhi, R., & Soltani, S. (2008). Analysis of the climate trend in the major cities of Iran. Journal of Agricultural Science and Technology, 12(46). (In Persian)
  33. Salehi Bavil, S., Zeinalzadeh, K., & Hessari, B. (2017). The changes in the frequency of daily precipitation in Urmia Lake basin, Iran. Journal of Theoretical and Applied Climatology, 1-10. https://doi.org/10.1007/s00704-017-2177-7
  34. Serrano, A., Mateos, V.L., & Garcia, J.A. (1999). Trend analysis of monthly precipitation over the Iberian Peninsula for the period 1921-1995. Physics and Chemistry of the Earth, Part B: Hydrology, Oceans and Atmosphere, 24(1-2), 85-90. https://doi.org/10.1016/S1464-1909(98)00016-1
  35. Seyedian, S.M., Soleimani, M., & Kashani, M. (2014). Predicting streamflow using data-driven model and time series. Iranian Journal of Ecohydrology, 1(3), 167-179. (In Persian). https://doi.org/10.22059/IJE.2014.54219
  36. Shafeizadeh, M., Fathian, H., & Nikbakht Shahbazi, A. (2019). Continuous rainfall-runoff simulation by artificial neural networks based on efficient input variables selection using partial mutual information (PMI) algorithm. Iran-Water Resources Research, 15(2), 144-161. https://doi.org/20.1001.1.17352347.1398.15.2.12.1
  37. Sharafi, M., Samadian Fard, S., & Hashemi, S. (2021). Monthly rainfall forecasting using genetic programming and support vector machine. Iranian Journal of Rainwater Catchment Systems, 8(4), 63-71.
  38. Sharifi, A., Dinpashoh, Y., Fakheri-Fard, A., & Moghaddamnia, A. (2014). Optimum combination of variables for runoff simulation in Amameh watershed using Gamma test. Water and Soil Science, 23(4), 59-72. (In Persian)
  39. Sharmod, T., Hosseini, A., & Mohammadzade, H. (2017). Hydrogeochemical report of the study areas of Azarshahr, Shiramin, Ajab Shir and Maragheh. Geological Survey and Mineral Exploration of IRAN.
  40. Shekar, S., & Xiong, H. (2018). Encyclopedia of GIS. Springer Science & Business Media. New York, USA, 1370.
  41. Sohrabi Geshnigani, F., Mirabbasi Najafabadi, R., & Golabi, M.R. (2021). Rainfall-runoff modeling using HBV model and random forest algorithm in Bazoft watershed. Iranian Journal of Soil and Water Research, 52(5), 1395-1407. https://doi:20.1001.1.2008479.1400.52.5.18.2
  42. Turgay, P., & Ercan, K. (2005). Trend analysis in Turkish precipitation data. Hydrological Processes: An International Journal, 20(9), 2011-2026. https://doi.org/10.1002/hyp.5993
  43. Vaheddoost, B., & Aksoy, H. (2018). Interaction of groundwater with Lake Urmia in Iran. Hydrological Processes, 32(21), 3283-3295. https://doi.org/10.1002/hyp.13263
  44. Youssef, A.M., Pourghasemi, H.R., Pourtaghi, Z.S., & Al-Katheeri, M.M. (2015). Landslide susceptibility mapping using random forest, boosted regression tree, classification and regression tree, and general linear models and comparison of their performance at Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslides. https://doi.org/10.1007/s10346-015-0614-1
  45. Zarei, M., Zandi, R., & Naemitabar, M. (2022). Assessment of flood occurrence potential using data mining models of support vector machine, chaid and random forest (case study: Frizi watershed). Jwmr, 13(25), 133-144. http://jwmr.sanru.ac.ir/article-1-1140-en.html

 
