Azam Habibipoor; Ali Talebi; Ali Akbar Karimian; Farhad Dehghani; Mohammad Hosain Mokhtari
Abstract
Introduction: Salinity is one of the problems of arid and semi-arid soils. Identifying and classifying saline and alkaline soils is necessary for managing them correctly. Taking the nature of salinity data into account and selecting suitable methods to process the data before using an artificial neural network can result in better simulations. The aim of this study was to identify the optimal data processing method for improving the accuracy of surface soil salinity simulation and the efficiency of the decision tree algorithm.
Materials and Methods: The study area covered 88940.4 hectares of the Marvast plain in central Iran (54°5′ to 54°18′ E longitude and 30°10′ to 30°35′ N latitude). This region faces salinity problems in its soil and water resources. The effect of data processing on the accuracy of surface soil salinity simulation was assessed in the Marvast region using a decision tree algorithm. The algorithm was applied and the simulation was performed with three approaches: original data, logarithmic data and standardized data. Finally, five statistics (R, RMSE, %RMSE, MAE and Bias) were calculated to evaluate the performance of the simulation methods.
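The three data treatments and the five evaluation statistics can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so several choices here are assumptions: the natural logarithm for the logarithmic data, z-scores for the standardized data, %RMSE taken as RMSE divided by the mean observed value, and Bias taken as the mean of predicted minus observed. The function names are hypothetical.

```python
import math


def log_transform(values):
    """Natural-log transform (assumed base); requires strictly positive values."""
    return [math.log(v) for v in values]


def standardize(values):
    """Z-score standardization using the sample standard deviation (assumed)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def evaluate(observed, predicted):
    """Return R, RMSE, %RMSE, MAE and Bias for paired observations and predictions."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    bias = sum(errors) / n  # assumed sign convention: predicted minus observed

    # Pearson correlation coefficient between observed and predicted values.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    return {
        "R": r,
        "RMSE": rmse,
        "%RMSE": 100 * rmse / mean_o,  # assumed: RMSE relative to the observed mean
        "MAE": mae,
        "Bias": bias,
    }
```

Under these assumptions, the same `evaluate` call would be applied to the output of each of the three approaches so that their errors can be compared on equal terms.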
Results and Discussion: When the logarithmic data were used, the combination of band 7 and elevation was identified as the most appropriate condition. The resulting tree estimates soil salinity with five rules:
If elevation is less than 1519, the average surface soil salinity is 147.9 dS/m.
If elevation is between 1519 and 1569.9, the average surface soil salinity is 43.6 dS/m.
If elevation is between 1569.9 and 1609.8, the average surface soil salinity is 17.5 dS/m.
If elevation is greater than or equal to 1609.8 and the pixel value of band 7 (ETM+ sensor) at the selected point is less than 0.295, the average surface soil salinity is 4.7 dS/m.
If elevation is greater than or equal to 1609.8 and the pixel value of band 7 (ETM+ sensor) at the selected point is greater than or equal to 0.295, the average surface soil salinity is 1.4 dS/m.
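The five rules can be written as a small lookup function. This is a sketch of the reported rules only, not the authors' software. It assumes that each "between" interval includes its lower bound and excludes its upper bound, and the function and parameter names are hypothetical.

```python
def estimate_salinity(elevation, band7):
    """Return the average surface soil salinity (dS/m) given by the matching rule.

    elevation: elevation of the point, in the units used by the study
    band7: pixel value of ETM+ band 7 at the point
    Interval boundaries are assumed to be lower-inclusive.
    """
    if elevation < 1519:
        return 147.9
    if elevation < 1569.9:
        return 43.6
    if elevation < 1609.8:
        return 17.5
    # From here on, elevation >= 1609.8 and band 7 decides the leaf.
    if band7 < 0.295:
        return 4.7
    return 1.4
```

Band 7 only matters in the highest elevation class; below 1609.8 the estimate depends on elevation alone.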
With the logarithmic data, the decision tree algorithm used two of the 46 independent variables introduced into the model. R, RMSE, %RMSE, MAE and Bias for this method were 0.76, 0.49, 38.57, 0.37 and -0.14, respectively. Using logarithmic data was recognized as the best method because of its lower calculated error and smaller input requirement. Using EasyFit software, the distribution of the salinity data was found to be Log-Pearson 3. Thus, the use of logarithmic data improved model performance. Our findings agree with those of Afkhami et al. (2015), who increased the accuracy of suspended sediment simulation with artificial intelligence methods (artificial neural networks and ANFIS) by using logarithmic data.
Conclusions: Because the factors that affect soil salinity simulation vary between regions, applying a single method and indicator to estimate soil salinity in different regions may not be possible. A semi-intelligent algorithm that limits user intervention and selects the effective parameters for simulation would increase simulation accuracy. Furthermore, considering the nature of salinity data and selecting suitable processing methods before using the decision tree algorithm can effectively improve model performance. The current study was conducted to select an appropriate approach for enhancing the simulation accuracy of surface soil salinity. The results demonstrate that the performance of the decision tree algorithm, as one of the artificial intelligence models, can be affected by the input data. In this study, Log-Pearson 3 was identified as the distribution of the salinity data. Moreover, although the correlation coefficients were significant for all three simulation methods, the error was lower when logarithmic data were used. Since the probability distribution of salinity data in the study area was logarithmic (Log-Pearson 3), the reduction in error can be attributed to this distribution.
N. Seyyednezhad Golkhatm; H. Rezaee Pazhand
Abstract
Introduction: Extreme events such as last frosts are damaging phenomena that affect various branches of engineering, such as agriculture. Analyzing these events and predicting their probabilities can reduce damage to agriculture, horticulture and other sectors. Furthermore, this phenomenon may be related to other thermal indices. This article analyzes the last frost dates of all synoptic stations in Khorasan Razavi province. Frequency analysis was carried out with eight distributions. The relationship between last frost dates and thermal indices was then studied. The best relationship was between minimum temperature and the return periods of last frost dates.
Materials and Methods: This article analyzes the last frost dates (origin: 23 September) of all synoptic stations in Khorasan Razavi province. First, the data of each station were screened. Basic properties such as homogeneity, randomness, stationarity, independence and outliers must be tested. Eight distributions (Normal, Gumbel type 1, 2-parameter Gamma, 2- and 3-parameter Log-normal, Generalized Pareto, Generalized Extreme Value and Pearson type 3) were fitted to the data, and their parameters were estimated with seven methods: several types of moments (five methods), maximum likelihood and maximum entropy. The Kolmogorov-Smirnov goodness-of-fit test was used to compare the fits and select the best distribution. Return periods of last frost dates are a major application of frequency analysis. There may be a relationship between return periods and thermal indices such as minimum, maximum and mean temperature. This relationship can be fitted with regression methods.
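One pass through this workflow can be sketched in Python for a single distribution. The sketch below counts days from the 23 September origin, fits a Gumbel type 1 distribution by the ordinary method of moments, computes the Kolmogorov-Smirnov statistic, and derives the value for a given return period. The abstract does not say which of the five moment methods applies or how return periods were defined, so the ordinary method of moments and the relation T = 1 / (1 - F) are assumptions, and all function names are hypothetical.

```python
import math
from datetime import date

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def days_since_origin(frost_date, origin_month=9, origin_day=23):
    """Days from the most recent 23 September to the last frost date."""
    origin = date(frost_date.year, origin_month, origin_day)
    if origin > frost_date:
        origin = date(frost_date.year - 1, origin_month, origin_day)
    return (frost_date - origin).days


def fit_gumbel_moments(sample):
    """Gumbel type 1 location and scale from the sample mean and standard deviation."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    scale = sd * math.sqrt(6) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale


def gumbel_cdf(x, loc, scale):
    """Cumulative distribution function of the Gumbel type 1 distribution."""
    return math.exp(-math.exp(-(x - loc) / scale))


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and a fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def quantile_for_return_period(t_years, loc, scale):
    """Day number exceeded on average once every t_years (assumes T = 1 / (1 - F))."""
    p = 1 - 1 / t_years
    return loc - scale * math.log(-math.log(p))
```

Repeating the fit for each candidate distribution and estimation method, and keeping the combination with the smallest Kolmogorov-Smirnov statistic, would be one way to carry out the comparison described above.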
Results and Discussion: The aim of this article is the statistical analysis of the probabilities and return periods of last frost dates for all synoptic stations in Khorasan Razavi province, and of the relationship between annual temperature indicators and this phenomenon. The origin date for this phenomenon is 23 September. First, the data were screened. Basic hypothesis tests were then applied: the runs test (randomness), the Mann-Whitney test (homogeneity and jump), the Wald-Wolfowitz test (independence and stationarity), the Grubbs and Beck test (outlier detection) and the three-sigma method (outliers). The results were: 1- Golmakan, Kashmar and Torbatejam had low outliers which, given their skewness, do not cause any problem in the data analysis. 2- The independence of the data at all stations was accepted at the 10% level. 3- The Gonabad data were not homogeneous and were removed. Eight probability distributions (Normal, Gumbel type 1, 2-parameter Gamma, 2- and 3-parameter Log-normal, Generalized Pareto, Generalized Extreme Value and Pearson type 3) were applied. The skewness coefficients for all stations were greater than 0.1, so the Normal distribution was rejected. Seven estimation methods (five different methods of moments, maximum likelihood and maximum entropy) were used, and the Kolmogorov-Smirnov (KS) goodness-of-fit test was applied. For some stations the KS values of several estimation methods were close together. The best-fitting distributions were GPA (4 times), PT3 (4 times), LN2 (4 times) and GA2 (3 times). The results obtained were: 1- The shortest last frost duration belonged to the Sarakhs station, but the stations with the longest values differed among return periods. 3- The within-station ranges were 32 to 50 days across all return periods, with a mean of 41, a standard deviation of 9.3 and a coefficient of variation of 5.9%, which represents the damping of the phenomenon within each station. 4- Pearson type 3, which has been recommended by some researchers, cannot be generalized. 5- The most frequent estimation method was MOM (8 cases). The relationship between the last frost days and other meteorological factors such as minimum, average and maximum temperature was also investigated. The linear relationship between last frost days and the average annual minimum temperature gave the best fit.
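The linear relationship mentioned here can be illustrated with an ordinary least-squares fit. This is a minimal sketch and not the authors' procedure: the abstract reports no coefficients, so none are shown, and the function and variable names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept.

    x: predictor values, e.g. average annual minimum temperature per station
    y: response values, e.g. last frost day number for a given return period
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x_value, slope, intercept):
    """Evaluate the fitted line at a new predictor value."""
    return slope * x_value + intercept
```

Fitting one such line per return period, with station temperatures as the predictor, would give a simple way to estimate last frost days where only temperature records are available.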
Conclusion: This article analyzed the last frost dates of all synoptic stations in Khorasan Razavi province. Data screening and basic tests were applied, and the data were accepted as random samples. Eight distributions were fitted to the data with seven estimation methods. The best-fitting distributions across the stations were mainly GPA, PT3 and LN2. The most frequent estimation method was MOM. The relationship between last frost periods and minimum temperature gave the best linear model, so the return period can also be predicted from this temperature.