Determining Features Influencing Some Soil Physical Quality Indicators and their Predictions Using Decision Tree and Multiple Linear Regression Models

Document Type : Research Article

Authors

1 Jiroft University, Jiroft

2 Vali-Asr University of Rafsanjan, Rafsanjan

Abstract

Introduction: Soil quality is defined as the capacity of a soil to function within different land uses and ecosystem boundaries, sustain biological productivity, maintain environmental quality, and promote plant, animal, and human health. Soil quality cannot be measured directly but can be evaluated on the basis of several parameters; the parameters to be used depend on the scale and goals of the research. Soil quality indicators (SQIs) are measurable soil characteristics that affect the capacity of the soil for crop production or environmental performance. They are used to evaluate the effects of different management practices and land use types on soil quality, and they can be derived from easily measured soil physicochemical properties. Air capacity (AC), relative field capacity (RFC), and plant available water capacity (PAWC) are the most important indicators. Selecting appropriate input parameters is the first and most important step in predicting SQIs. Feature selection can be defined as the identification and selection of a subset of useful features from the primary data collected. One feature selection method uses the Pearson coefficient, which shows the correlation between each input variable and the target variable. When the coefficient is close to one, there is a strong relationship between the input and the target variable. Features with a correlation coefficient greater than or equal to 0.9 are considered important, and those with a smaller coefficient are considered unimportant. The decision tree algorithm is one of the prediction approaches in the statistics and data mining literature. This algorithm can select the property with the highest separation capability, and both working with it and interpreting its results are straightforward. The aims of this study were to select the best set of input properties influencing SQIs using the Pearson correlation coefficient and then to model the effect of these input properties using decision tree and multiple linear regression models.
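The correlation-based selection step described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the function names, the toy values, and the use of the absolute value of the coefficient (so that strong negative correlations are also retained) are assumptions; only the 0.9 threshold comes from the text.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def select_features(features, target, threshold=0.9):
    """Keep features whose |r| with the target meets the threshold.

    Taking the absolute value is an assumption; the text only says
    that coefficients close to one indicate a strong relationship.
    """
    selected = {}
    for name, values in features.items():
        r = pearson_r(values, target)
        if abs(r) >= threshold:
            selected[name] = r
    return selected


# Hypothetical toy data (not from the study).
toy_target = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_features = {
    "feature_a": [2.0, 4.0, 6.0, 8.0, 10.5],  # strongly positive
    "feature_b": [5.0, 4.0, 3.0, 2.0, 1.0],   # perfectly negative
    "feature_c": [3.0, 1.0, 4.0, 1.0, 5.0],   # weak
}
print(select_features(toy_features, toy_target))
```

With a fixed threshold, the number of retained features depends entirely on the data, so the selected set can differ for each indicator.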
Materials and Methods: In this study, the Pearson correlation coefficient was used to select the soil properties that influence SQIs, and the indicators were then modeled and predicted by the decision tree algorithm using the selected input properties. For this purpose, 104 soil samples were collected from the soil surface (0-15 cm depth) under four land uses, namely a garden with 20-year-old walnut trees, pasture, agricultural land, and mountain almond land, in a semi-arid area of Iran (Rabor region, 29° 27′ N to 38° 54′ N and 56° 45′ E to 57° 16′ E). A multiple linear regression (MLR) model was constructed as the benchmark for comparing performance. A sensitivity analysis of the decision tree model with respect to its input variables was performed using the StatSoft method. The predictive capabilities of the models were evaluated by the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2) between measured and predicted SQI values.
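The three evaluation criteria named above can be computed as in the sketch below. The function names and sample values are illustrative assumptions. The abstract does not say which form of R2 was used; this sketch assumes the 1 - SS_res/SS_tot form, which can differ from the squared Pearson correlation between measured and predicted values.

```python
from math import sqrt


def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def rmse(measured, predicted):
    """Root mean square error."""
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured))


def r_squared(measured, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values (not from the study).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(mae(measured, predicted), rmse(measured, predicted), r_squared(measured, predicted))
```

MAE and RMSE are in the units of the indicator, while R2 is unitless, so the error measures are best compared between models for the same indicator rather than across indicators.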
Results and Discussion: The following soil properties were selected as important input parameters: porosity, bulk density, and clay and sand content for air capacity; porosity and sand, clay, and silt content for relative field capacity; and bulk density, electrical conductivity, porosity, and sand, clay, and silt content for plant available water. The R2 values of the decision tree model for AC, RFC, and PAWC were 0.95, 0.84, and 0.85, respectively, while the R2 values of the multiple linear regression model were 0.63, 0.62, and 0.61, respectively. According to the evaluation indices, the conventional regression model predicted SQIs poorly, so conventional regression techniques (i.e., multiple linear regression) may not be reliable for predicting SQIs. The sensitivity analysis of the decision tree model showed that porosity and bulk density had the greatest influence on air capacity, porosity on relative field capacity, and bulk density on plant available water.
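The abstract does not describe how the sensitivity analysis was computed. One common error-ratio scheme is sketched below as an assumption: each input is replaced in turn by its mean (as if it were unavailable), and the resulting error is divided by the baseline error, so that larger ratios indicate more influential inputs. The model, feature names, and data are hypothetical stand-ins, not the study's decision tree or measurements.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def sensitivity_ratios(predict, rows, y_true, names):
    """Error ratio per input: RMSE with the input fixed at its mean / baseline RMSE.

    Assumes the baseline RMSE is nonzero.
    """
    baseline = rmse(y_true, [predict(r) for r in rows])
    ratios = {}
    for j, name in enumerate(names):
        col_mean = sum(r[j] for r in rows) / len(rows)
        # Replace column j by its mean, leaving the other inputs unchanged.
        perturbed = [list(r[:j]) + [col_mean] + list(r[j + 1:]) for r in rows]
        ratios[name] = rmse(y_true, [predict(r) for r in perturbed]) / baseline
    return ratios


# Hypothetical stand-in model and data (not the study's decision tree).
def toy_model(row):
    return 2.0 * row[0] + 0.1 * row[1]


toy_rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 1.0], [5.0, 2.0]]
toy_noise = [0.1, -0.1, 0.1, -0.1, 0.0]
toy_y = [toy_model(r) + e for r, e in zip(toy_rows, toy_noise)]

print(sensitivity_ratios(toy_model, toy_rows, toy_y, ["feature_a", "feature_b"]))
```

In this toy case the first input dominates the prediction, so fixing it at its mean inflates the error far more than fixing the second input.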
Conclusion: This study provides a basis for predicting soil physical quality indicators and for identifying the important parameters that affect these indicators in agricultural soils, grasslands, and forests in semi-arid regions, and the approach can be generalized to other areas. Further studies are needed to assess the effects of the selected input variables under different conditions.
