Ferdowsi University of Mashhad
Water and Soil, ISSN 2008-4757, 35(4), 2021-09-23, pp. 475-488, article 40367
DOI: 10.22067/jsw.2021.68665.1021
Language: FA
Estimation of Suspended Sediment Load Using Integrated Intelligent Methods with Considering Model Uncertainty
S.M. Saghebian, Department of Civil Engineering, Ahar Branch, Islamic Azad University - Ahar - Iran, ORCID 0000-0002-6699-0831
Journal Article, 2021-02-23
<strong>Introduction: </strong>Sediment transport and the accurate estimation of its rate are significant issues for river engineers and researchers. Various complex relationships have been proposed to predict the suspended sediment transport rate, such as equations based on velocity and critical shear stress. However, the complex nature of sediment transport and the lack of validated models make it difficult to model the suspended sediment concentration and the suspended sediment discharge carried by rivers. Although the existing models have led to promising results in sediment transport prediction, the importance of sediment transport and its impact on hydraulic structures make it necessary to use other methods with higher efficiency. In recent years, meta-model approaches have been applied to investigate complex hydraulic and hydrologic phenomena. Hybrid models involving signal decomposition have also been shown to improve the accuracy of time series prediction methods. Complementary Ensemble Empirical Mode Decomposition is one of the widely used signal decomposition methods for hydrological time series prediction. 
Decomposing a time series reduces the difficulty of forecasting and thereby improves forecasting accuracy.<br />In this study, because of the complexity of sediment and erosion phenomena and the number of parameters that affect the estimate, time series pre-processing methods were combined with two kernel-based approaches, support vector machine (SVM) and Gaussian process regression (GPR), to estimate the suspended sediment load of a natural river at two consecutive hydrometric stations. For this purpose, different models were defined based on hydraulic and sediment particle characteristics. Moreover, the capability of integrated pre-processing and post-processing methods was investigated in two states: single-station (intra-station) and between-stations. First, the Wavelet Transform (WT) was used for data pre-processing; then, the high-frequency sub-series were selected and re-decomposed using Empirical Mode Decomposition (EMD). Finally, the most effective sub-series were used as inputs for the kernel-based models. In addition, Monte Carlo uncertainty analysis was used to assess the reliability of the superior model. The results showed that the GPR model had an acceptable degree of uncertainty in modeling.<br /><strong>Materials and Methods: </strong>In this study, data from two stations on the Housatonic River were used. The distance between the stations is approximately 50 km. The first station is located near Great Barrington, Massachusetts, and the second station is in Connecticut. The basin areas for the stations are 282 and 634 square miles, respectively. The river flows from the first station to the second. SVM and GPR models are based on the assumption that adjacent observations should convey information about each other. Gaussian processes are a way of specifying a prior directly over function space. A Gaussian process is a natural generalization of the Gaussian distribution: where the distribution has a mean vector and a covariance matrix, the process has a mean function and a covariance function. 
Because prior knowledge about the data and functional dependencies is built into the model, no validation process is required for generalization, and GP regression models provide the predictive distribution corresponding to a test input. The Wavelet Transform (WT) uses a flexible window function (the mother wavelet) in signal processing. This window function can be changed over time according to the shape and compactness of the signal. WT decomposes the signal into two components: an approximation (large-scale, low-frequency) component and a detail (small-scale, high-frequency) component. Ensemble EMD (EEMD) was proposed to solve the mode-mixing problem of empirical mode decomposition (EMD); it defines the true intrinsic mode function (IMF) as the mean of an ensemble of trials. Each trial consists of the decomposition results of the signal plus white noise of finite amplitude. EMD can decompose any complex signal into a finite number of intrinsic mode functions and a residue, resulting in sub-series with simpler frequency components and stronger correlations that are easier to analyze and forecast. Another important feature of empirical mode decomposition is that it can be used to denoise noisy time series, which can improve the accuracy of model predictions. In the uncertainty analysis, two measures are used to test the robustness of the models and to analyze their uncertainty. The first is the percentage of the observed outputs that fall within the 95% prediction uncertainty (95PPU) band, and the second is the average distance between the upper (X<sub>U</sub>) and lower (X<sub>L</sub>) uncertainty bands. To obtain these, the model under consideration is run many times (1000 times in this study), and the empirical cumulative distribution of the model outputs is calculated. 
The lower and upper bands are taken as the 2.5% and 97.5% probabilities of the cumulative distribution, respectively.<br /><strong>Results and Discussion: </strong>To evaluate the performance of the tested models and determine the accuracy of the selected models, three performance criteria were used: the Correlation Coefficient (CC), the Determination Coefficient (DC), and the Root Mean Square Error (RMSE). The results indicated that the accuracy of the integrated models was higher than that of the single SVM and GPR models. The use of integrated methods decreased the error criteria by 20 to 25%. The uncertainty analysis showed that, in suspended sediment load modeling, the observed and predicted values were within the 95PPU band in most cases. Moreover, the d-factors for the training and test datasets were smaller than the standard deviation of the observed data. Therefore, based on the results, it could be inferred that suspended sediment modeling with the integrated WT-EEMD-GPR model led to an allowable degree of uncertainty.<br /><strong>Conclusion: </strong>Comparison of the accuracy of the developed models revealed that the integrated GPR and SVM models performed better than the single GPR and SVM models in predicting the suspended sediment discharge. The integrated approach decreased the error criteria by approximately 20 to 25%. According to the results, among the models developed from single-station data, the model with the input parameters Dw<sub>t</sub>, Dw<sub>t-1</sub>, and Ds<sub>t-1</sub> was superior; for the relationship between the stations, the model with the input parameters Ds<sub>t-2</sub>, Dw<sub>t-1</sub>, and Ds<sub>t-1</sub> was superior. Also, based on the uncertainty analysis, the integrated GPR model had an allowable degree of uncertainty in suspended sediment modeling. 
However, it should be noted that the methods used are data-sensitive. Therefore, further studies using data ranges beyond those of this study, together with field data, should be carried out to determine the merits of the models for estimating suspended sediment load under real flow conditions.
https://jsw.um.ac.ir/article_40367_9a9b30ba8cf43cef1e936b2a999b6832.pdf
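The evaluation criteria and the Monte Carlo uncertainty measures named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. The function names `performance_criteria` and `uncertainty_bands` are hypothetical. The Determination Coefficient is assumed to be 1 minus the ratio of the residual sum of squares to the total sum of squares. The d-factor is assumed to be the average band width divided by the standard deviation of the observations, a common definition that the abstract does not state explicitly. The synthetic data in the usage section are placeholders, not Housatonic River records.

```python
import numpy as np


def performance_criteria(obs, pred):
    """Return (CC, DC, RMSE) for observed and predicted series."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    # Pearson correlation coefficient between observations and predictions.
    cc = np.corrcoef(obs, pred)[0, 1]
    # Assumed definition of DC: 1 - SSE / SST.
    dc = 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    return cc, dc, rmse


def uncertainty_bands(obs, simulations, lower=2.5, upper=97.5):
    """Summarize Monte Carlo runs with a 95PPU band.

    simulations has shape (n_runs, n_samples): one row per model run.
    Returns (x_l, x_u, pct_inside, mean_width, d_factor).
    """
    obs = np.asarray(obs, dtype=float)
    sims = np.asarray(simulations, dtype=float)
    # Lower and upper bands: 2.5% and 97.5% points of the empirical
    # distribution of the runs, taken separately at each sample.
    x_l = np.percentile(sims, lower, axis=0)
    x_u = np.percentile(sims, upper, axis=0)
    # Percentage of observations bracketed by the band.
    pct_inside = 100.0 * np.mean((obs >= x_l) & (obs <= x_u))
    # Average distance between the upper and lower bands.
    mean_width = np.mean(x_u - x_l)
    # Assumed normalization: band width relative to the spread of the data.
    d_factor = mean_width / np.std(obs)
    return x_l, x_u, pct_inside, mean_width, d_factor


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for an observed sediment series (placeholder data).
    observed = rng.gamma(shape=2.0, scale=50.0, size=200)
    # 1000 perturbed "model runs", mirroring the run count in the abstract.
    runs = observed + rng.normal(0.0, 10.0, size=(1000, observed.size))
    cc, dc, rmse = performance_criteria(observed, runs.mean(axis=0))
    _, _, pct, width, d = uncertainty_bands(observed, runs)
    print(f"CC={cc:.3f} DC={dc:.3f} RMSE={rmse:.3f}")
    print(f"inside 95PPU={pct:.1f}% mean width={width:.2f} d-factor={d:.3f}")
```

A narrow band that still brackets most observations indicates low model uncertainty, which is the sense in which the abstract compares the d-factor with the standard deviation of the observed data.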