Investigating the Effectiveness of Data Preprocessing Methods in Improving the Performance of the Gene Expression Programming Model (Case Study: Ab Zal River)

Article type: Research article

Author

Department of Hydrology and Water Resources, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran

Abstract

This study examines how seasonal coefficients and the wavelet transform, a mathematical method for signal analysis and processing, affect the performance of the gene expression programming (GEP) model in forecasting the monthly flow of the Ab Zal River over the period 1351 to 1396 in the Solar Hijri calendar (1972 to 2017). To this end, the data were prepared and introduced to the GEP model in three modes: (a) flow data alone, accounting for memory up to four lags; (b) flow data plus a periodic term, in linear (α-GEP) and nonlinear (PT-GEP) forms; and (c) flow data decomposed into two sub-series, detail and approximation, using five different wavelet functions. For both the linear and nonlinear GEP models, the four-lag model gave the most accurate river flow forecasts, although the nonlinear GEP model performed slightly better on the evaluation indices used. In the next step, the periodic term was added to the model inputs. The PT-GEP model with the M4 pattern had the lowest error and the highest accuracy and reduced RMSE by eight percent. In the third step, the river flow data were decomposed using the wavelet functions and the W-GEP models were built. Overall, the W-GEP models performed very well, so they can be used as an effective method for medium-term river flow forecasting.

Article title [English]

Evaluation of the Efficiency of Data Preprocessing Methods on Improving the Performance of Gene Expression Programming Model (Case Study: Ab Zal River)

Author [English]

  • F. Ahmadi
Department of Hydrology and Water Resources, Shahid Chamran University of Ahvaz, Ahvaz, Iran
Abstract [English]

Introduction: Surface water has always been one of the main pillars of water projects. Modeling and forecasting river flow supports the management and use of water resources and helps mitigate natural disasters such as droughts and floods. Researchers have therefore tried to improve the accuracy of hydrological parameter estimates by applying new tools and combining them. This study examines the effect of seasonal coefficients and the wavelet transform, a mathematical method of signal analysis and processing, on the performance of the Gene Expression Programming (GEP) model.
Materials and Methods: The monthly flow of the Ab Zal River was predicted using records from the Pol Zal hydrometric station for the period 1972 to 2017. Input patterns were prepared in three modes: (a) flow data alone, accounting for memory up to four lags; (b) flow data plus a periodic term, in linear (α-GEP) and nonlinear (PT-GEP) forms; and (c) flow data decomposed into two sub-series, approximation and detail, using five wavelet functions: Haar, Daubechies 4 (db4), Symlet (sym), Meyer (mey), and Coiflet (coif). To assess the effect of the mathematical functions available to GEP, two function sets were considered: linear (the basic arithmetic operators: addition, subtraction, multiplication, and division) and nonlinear (adding quadratic functions, etc.). The wavelet transform is a powerful tool for decomposing and reconstructing a time series. A wavelet is an oscillating function that decays quickly to zero. Models were built on 80% of the recorded data (432 months) and validated on the remaining 20% (108 months). Model performance was evaluated with the root mean square error (RMSE), mean absolute error (MAE), and correlation coefficient (R).
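The input preparation and evaluation steps described above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the function names, the synthetic sinusoidal flow series, the chronological ordering of the split, and the persistence forecast (which only stands in for a fitted GEP model) are assumptions introduced here. Only the four-lag memory, the 80%/20% split of 540 months, and the three indices come from the text.

```python
import math


def make_lagged_patterns(flow, max_lag=4):
    """Build inputs [Q(t-1), ..., Q(t-max_lag)] and targets Q(t)."""
    inputs, targets = [], []
    for t in range(max_lag, len(flow)):
        inputs.append([flow[t - k] for k in range(1, max_lag + 1)])
        targets.append(flow[t])
    return inputs, targets


def split_series(series, train_fraction=0.8):
    """Split a series in time order (assumed: first part is for training)."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Synthetic 540-month series (hypothetical data, not the Pol Zal record).
flow = [10.0 + 5.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(540)]

train, test = split_series(flow)
inputs, targets = make_lagged_patterns(test, max_lag=4)

# Placeholder forecast: persistence, Q(t) ~ Q(t-1). This is NOT a GEP model.
predicted = [row[0] for row in inputs]

print("train months:", len(train), "validation months:", len(test))
print("RMSE:", rmse(targets, predicted))
print("MAE:", mae(targets, predicted))
print("R:", pearson_r(targets, predicted))
```

In practice the lagged patterns would be passed to a GEP implementation, and the same three functions would score its predictions on the validation months.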
Results and Discussion: For both the linear and nonlinear GEP models, the four-lag model gave the most accurate river flow predictions. The nonlinear GEP model performed slightly better, with RMSE = 4.093 m³/s, MAE = 2.782 m³/s, and R = 0.660. In the next step, the periodic term was added to the model inputs. The PT-GEP model with the M4 pattern had the lowest error and the highest accuracy, reducing RMSE by 8%. In the third step, the river flow data were decomposed into approximation and detail sub-series using the five wavelet functions. Based on the number of data points, the most appropriate decomposition level was taken to be three. The W-GEP models performed very well, reducing RMSE by 48.6%, 41.2%, and 31.1% relative to the L-GEP, NL-GEP, and PT-GEP models, respectively. Among all 35 models (linear and nonlinear, seasonal and non-seasonal, and wavelet hybrid), the W-GEP model with the Symlet wavelet at decomposition level one had the highest accuracy (R = 0.847) and the lowest error (RMSE = 2.898 m³/s and MAE = 1.745 m³/s).
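To illustrate what splitting a series into approximation and detail sub-series means, the sketch below applies a single-level Haar transform, the simplest of the five wavelets listed. It is a hand-written example under stated assumptions (even-length input, orthonormal scaling by 1/√2, hypothetical function names and sample values), not the decomposition code used in the study, and it does not reproduce the Symlet results.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_decompose(series):
    """One-level Haar transform: return (approximation, detail) sub-series.

    Assumption: the series has an even number of values.
    """
    if len(series) % 2 != 0:
        raise ValueError("series length must be even for this sketch")
    approximation, detail = [], []
    for i in range(0, len(series), 2):
        first, second = series[i], series[i + 1]
        approximation.append((first + second) * SCALE)  # smooth, low-frequency part
        detail.append((first - second) * SCALE)         # fluctuation, high-frequency part
    return approximation, detail


def haar_reconstruct(approximation, detail):
    """Invert haar_decompose and recover the original series."""
    series = []
    for a, d in zip(approximation, detail):
        series.append((a + d) * SCALE)
        series.append((a - d) * SCALE)
    return series


# Hypothetical monthly flow values in m3/s (illustrative only).
sample_flow = [4.0, 2.0, 5.0, 5.0, 9.0, 7.0, 3.0, 1.0]

approx, detail = haar_decompose(sample_flow)
rebuilt = haar_reconstruct(approx, detail)

print("approximation:", approx)
print("detail:", detail)
print("reconstruction error:", max(abs(x - y) for x, y in zip(sample_flow, rebuilt)))
```

In a wavelet hybrid setup of the kind described, the sub-series rather than the raw flow would serve as model inputs. Deeper decomposition levels repeat the same step on the approximation sub-series.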
Conclusion: The results indicate that data preprocessing (seasonal coefficients and wavelet functions) improved the performance of the GEP model overall. Combining wavelet functions with the GEP model increased modeling accuracy the most, so the W-GEP approach is recommended as the most suitable of the methods considered here for river flow forecasting.

Keywords [English]

  • Decomposition level
  • Gene Expression Programming
  • Hybrid model
  • Wavelet function