Introduction One reason hydrological phenomena, and time series in particular, are difficult to predict is the presence of features such as trend, noise, and high-frequency oscillations. These complex features, especially noise, can be detected or removed by preprocessing, and appropriate preprocessing makes these phenomena easier to estimate. Preprocessing is particularly effective in data-driven models such as artificial neural networks, gene expression programming, and support vector machines, because the quality of the data is important in these models. The present study treats denoising and data transformation as two different preprocessing approaches and uses them to try to improve the results of intelligent models. Two intelligent models, Artificial Neural Network (ANN) and Gene Expression Programming (GEP), are applied to estimate daily suspended sediment load. Wavelet transforms and logarithmic transformations are used for denoising and data transformation, respectively. Finally, the impact of preprocessing on the results of the intelligent models is evaluated.
Materials and Methods In this study, Gene Expression Programming and Artificial Neural Network are used as intelligent models for suspended sediment load estimation, and the effects of denoising and logarithmic transformation as data preprocessing approaches on the results are then evaluated and compared. Two logarithmic transforms are considered in this research: LN and LOG. Wavelet transformation is used to denoise the time series. To denoise with wavelet transforms, the time series is first decomposed at one level into an approximation part and a detail part, and the high-frequency (detail) part is then removed as noise. Because gene expression programming and artificial neural networks are able to analyze nonlinear systems, daily values of suspended sediment load of the Skunk River in the USA over a 5-year period are investigated and then estimated. Four years of data are used to train the models, and the remaining year is estimated by each model. Model accuracy is evaluated by three indices: mean absolute error (MAE), root mean squared error (RMSE), and the Nash-Sutcliffe coefficient (NS).
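The one-level denoising and the logarithmic transformation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses only the Haar (Db1) wavelet written directly in NumPy, assumes that LOG means the base-10 logarithm and that all load values are strictly positive, and pads odd-length series by repeating the last value. The function names are hypothetical.

```python
import numpy as np


def haar_denoise_level1(series):
    """One-level Haar (Db1) decomposition, drop the detail part, reconstruct."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    # Assumption of this sketch: pad an odd-length series with its last value.
    padded = np.append(x, x[-1]) if n % 2 else x
    even, odd = padded[0::2], padded[1::2]

    # Approximation (low-frequency) coefficients.
    approx = (even + odd) / np.sqrt(2.0)
    # The detail (high-frequency) coefficients would be (even - odd) / sqrt(2).
    # They are treated as noise here, i.e. set to zero before reconstruction.

    # Inverse Haar transform with zero detail: both samples of a pair
    # become approx / sqrt(2), which is the mean of the original pair.
    reconstructed = np.repeat(approx / np.sqrt(2.0), 2)
    return reconstructed[:n]


def log_transform(values, kind="ln"):
    """Apply the LN (natural log) or LOG (assumed base-10) transform."""
    x = np.asarray(values, dtype=float)
    if np.any(x <= 0):
        raise ValueError("logarithmic transforms need strictly positive values")
    return np.log(x) if kind == "ln" else np.log10(x)


def inverse_log_transform(values, kind="ln"):
    """Map model outputs back to the original scale."""
    y = np.asarray(values, dtype=float)
    return np.exp(y) if kind == "ln" else np.power(10.0, y)
```

With the detail coefficients set to zero, a one-level Haar reconstruction replaces each consecutive pair of values with its mean. Other Daubechies wavelets such as Db2 use longer filters and would typically come from a wavelet library rather than a hand-written transform.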
Results and Discussion To estimate suspended sediment load with the intelligent models, different input combinations were evaluated for model training. The best input combination for each intelligent model was then determined, and preprocessing was applied only to that combination. Two logarithmic transforms, LN and LOG, were considered for data transformation, and the Daubechies wavelet family was used for the wavelet transforms. The results indicate that denoising increases the Nash-Sutcliffe criterion by 0.15 in ANN and by 0.14 in GEP. Furthermore, the RMSE is reduced from 199.24 to 141.17 mg/L in ANN and from 234.84 to 193.89 mg/L in GEP. The effect of the logarithmic transformation on the ANN results is similar to that of denoising, whereas the logarithmic transformation has an adverse effect on GEP: after LN and LOG transformations as preprocessing in the GEP model, the Nash-Sutcliffe criterion falls from 0.57 to 0.31 and 0.21, respectively, and the RMSE rises from 234.84 to 298.41 mg/L and 318.72 mg/L, respectively. The results show that denoising by wavelet transform improves the accuracy of both intelligent models, while logarithmic data transformation improves only the artificial neural network. For the ANN model, the LN transform performs better than the LOG transform, although both improve the ANN results. Comparing different wavelets from the Daubechies family shows that Db2 is more effective for the ANN models, while Db1 (Haar) is better for the GEP models.
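The accuracy indices behind these figures (MAE, RMSE, and NS) have standard definitions, sketched below. The function names are illustrative assumptions, and the study's data are not reproduced here.

```python
import numpy as np


def mean_absolute_error(observed, simulated):
    """MAE: mean of the absolute differences."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return float(np.mean(np.abs(obs - sim)))


def root_mean_squared_error(observed, simulated):
    """RMSE: square root of the mean squared difference."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


def nash_sutcliffe(observed, simulated):
    """NS: 1 minus the ratio of residual variance to observed variance."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    residual = np.sum((obs - sim) ** 2)
    spread = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / spread)
```

NS equals 1 for a perfect fit, and a value of 0 means the model performs no better than predicting the mean of the observations. MAE and RMSE are in the units of the data, so lower values indicate better estimates.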
Conclusions: In the present study, two intelligent models, Gene Expression Programming and Artificial Neural Network, were used to estimate daily suspended sediment load in the Skunk River in the USA. Two preprocessing procedures, denoising and data transformation, were applied to improve the results of the intelligent models. Wavelet transforms were used for denoising and logarithmic transformations were used for data transformation. The results indicate that denoising by wavelet transforms improves the accuracy of both intelligent models, while logarithmic data transformation improves only the artificial neural network. For the GEP model, logarithmic transformation not only fails to improve the results but also reduces accuracy.