Selecting Features Affecting Some Soil Physical Quality Indicators and Predicting Them with Decision Tree and Multiple Linear Regression

Article type: Research article

Authors

1 University of Jiroft

2 Vali-e-Asr University of Rafsanjan

Abstract

The effect of management practices and land use on soil quality is assessed through soil physical quality indicators, which are estimated from physicochemical properties. To predict these indicators, selecting the properties that influence them is essential. The feature subset selection problem consists of identifying and selecting a useful subset of features from the initial data set. In this study, the Pearson correlation coefficient was used to select the properties influencing soil physical quality indicators, including air capacity (AC), relative field capacity (RFC) and plant available water (PAWC), and these indicators were then predicted with a regression decision tree algorithm and multiple linear regression. For this purpose, samples were taken at 104 points from four land uses (orchard, forest, rangeland and cropland) in Rabor County, Kerman Province, and parameters including texture, porosity, bulk and particle density, hydraulic conductivity, acidity, calcium carbonate, air capacity, relative field capacity and available water were measured. The results showed that porosity, bulk density, clay and sand for air capacity; porosity, sand, clay and silt for relative field capacity; and bulk density, clay, electrical conductivity, porosity, sand and silt for plant available water were selected as important input parameters. The R2 values obtained with the decision tree model for air capacity, relative field capacity and available water were 0.95, 0.84 and 0.85, respectively, whereas with the multiple linear regression model they were 0.63, 0.62 and 0.61, respectively. Porosity and bulk density for air capacity, porosity for relative field capacity, and bulk density for plant available water were identified as the most influential parameters. This research provides a basis for predicting these three physical or hydraulic properties, and for identifying the parameters important to them, in agricultural soils of a semi-arid region, and it can be generalized to other regions.

Keywords


Article title [English]

Determining Features Influencing Some Soil Physical Quality Indicators and their Predictions Using Decision Tree and Multiple Linear Regression Models

Authors [English]

  • Hossin Shekofte 1
  • Maryam Doustaky 2
  • Aezam Maseodi 2
1 Jiroft University, Jiroft
2 Vali-Asr University of Rafsanjan, Rafsanjan
Abstract [English]

Introduction: Soil quality is defined as the capacity of a soil to function within different land uses and ecosystem boundaries, sustain biological productivity, maintain environmental quality and promote plant, animal, and human health. Soil quality cannot be measured directly but can be evaluated on the basis of several parameters; the type of parameter to be used depends on the research scale and goals. Soil quality indicators (SQIs) are used to evaluate the effect of different management practices and land uses on soil quality and can be estimated from easily measured soil physicochemical properties. They are measurable characteristics of the soil that affect its capacity for crop production or environmental performance. Air capacity (AC), relative field capacity (RFC) and plant available water (PAWC) are the most important indicators. Selecting appropriate input parameters is the first and most important step in predicting SQIs. Feature selection can be defined as the identification and selection of a subset of useful features from the primary data collected. One method for choosing features is the Pearson coefficient, which shows the correlation between each input variable and the target variable. When the coefficient is close to one, there is a strong relationship between the input and the target variable. Features with a correlation coefficient greater than or equal to 0.9 are considered important, and those below that value are considered unimportant. The decision tree algorithm is one of the prediction approaches in the statistics and data mining literature. This algorithm can select the property with the highest separation capability. Working with this algorithm and interpreting its results is straightforward. The aims of this study were to select the best set of input properties influencing SQIs using the Pearson correlation coefficient and then to model the effect of the input properties with a decision tree and multiple linear regression.
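The correlation-based selection step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `pearson_r` and `select_features`, the use of the absolute value of the coefficient, and the toy numbers are all assumptions made for the example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def select_features(features, target, threshold):
    """Keep features whose |r| with the target meets the threshold.

    Using the absolute value is an assumption of this sketch, so that
    strong negative correlations are also retained.
    """
    selected = {}
    for name, values in features.items():
        r = pearson_r(values, target)
        if abs(r) >= threshold:
            selected[name] = r
    return selected

# Hypothetical toy values, not data from the study.
features = {
    "porosity": [0.40, 0.45, 0.50, 0.55, 0.60],
    "ph": [7.9, 7.6, 8.0, 7.7, 7.8],
}
air_capacity = [0.10, 0.13, 0.15, 0.19, 0.21]

selected = select_features(features, air_capacity, threshold=0.9)
print(selected)
```

With these toy values only `porosity` passes the 0.9 threshold, because it rises almost linearly with the target while `ph` does not.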
Materials and Methods: In this study, the Pearson correlation coefficient was used to select the soil properties influencing SQIs, and these indices were modeled and predicted by the decision tree algorithm with the selected input properties. For this purpose, 104 soil samples were collected from the soil surface (0-15 cm depth) of four land uses, namely a garden with 20-year-old walnut trees, pasture, agriculture and a mountain almond forest, in a semi-arid area in Iran (Rabor region, 29° 27′ N to 38° 54′ N and 56° 45′ E to 57° 16′ E). A multiple linear regression (MLR) model was constructed as the benchmark for comparing performance. Sensitivity analysis of the decision tree model with respect to the input variables was performed using the StatSoft method. The predictive capabilities of the proposed models were evaluated by the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2) between measured and predicted SQI values.
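The three evaluation criteria named above can be computed as in the sketch below. The abstract does not state which formula was used for R2; the version here (one minus the ratio of residual to total sum of squares) is a common definition and is an assumption, as are the function names and the toy values.

```python
import math

def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)

def rmse(measured, predicted):
    """Root mean square error."""
    squared = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))

def r_squared(measured, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and predicted indicator values, not data from the study.
measured = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]

print(round(mae(measured, predicted), 4))
print(round(rmse(measured, predicted), 4))
print(round(r_squared(measured, predicted), 4))
```

Lower MAE and RMSE and a higher R2 indicate closer agreement between measured and predicted values.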
Results and Discussion: The following soil properties were selected as important input parameters: porosity, bulk density, and clay and sand content for air capacity; porosity and sand, clay and silt content for relative field capacity; and bulk density, electrical conductivity, porosity, and sand, clay and silt content for plant available water. In addition, the R2 values of the decision tree model for air capacity, relative field capacity and plant available water were 0.95, 0.84 and 0.85, respectively, while the R2 values of multiple linear regression for AC, RFC and PAWC were 0.63, 0.62 and 0.61, respectively. According to the evaluation indices, the conventional regression model predicted SQIs poorly. Therefore, conventional regression techniques (i.e., multiple linear regression) may not be reliable for predicting SQIs. The sensitivity analysis of the decision tree model showed that porosity and bulk density for air capacity, porosity for relative field capacity, and bulk density for plant available water had the greatest influence.
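One possible reason a regression tree can outperform a linear model is that it splits the data at thresholds rather than fitting a single straight line. The sketch below finds the best single split on one variable by minimizing the summed squared error and compares it with a one-variable least-squares line. It is a toy illustration only: the function names, the step-like data and the single-split tree are assumptions, and it does not reproduce the models fitted in the study.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, total_sse) of the best binary split on x."""
    pairs = sorted(zip(x, y))
    best_threshold, best_total = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [yv for _, yv in pairs[:i]]
        right = [yv for _, yv in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best_total:
            best_threshold, best_total = threshold, total
    return best_threshold, best_total

def linear_fit_sse(x, y):
    """Residual sum of squares of a one-variable least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))

# Hypothetical step-like relationship, not data from the study.
bulk_density = [1.1, 1.2, 1.3, 1.4, 1.5, 1.6]
available_water = [0.20, 0.21, 0.20, 0.10, 0.11, 0.10]

threshold, tree_sse = best_split(bulk_density, available_water)
linear_sse = linear_fit_sse(bulk_density, available_water)
print(threshold, tree_sse, linear_sse)
```

On this toy data the split falls between 1.3 and 1.4 and leaves a smaller squared error than the straight line, which is the kind of non-linear pattern a tree can capture.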
Conclusion: This research provided a basis for predicting soil physical quality indicators and identifying the important parameters affecting these indicators in agricultural soils, grassland and forests in semi-arid regions, which can be generalized to other areas. Further studies are needed to assess the effects of the selected input variables under different conditions.

Keywords [English]

  • Air capacity
  • Available water
  • Modeling
  • Relative field capacity
  • Soil management

References

1- Allison L. E., and Moodie C. D. 1965. Carbonate. P. 1379-1396. In: C. A. Black et al. (Ed.), Methods of soil analysis. Part II, Am. Soc. Agron., Madison, WI.
2- Allmaras R.R., Fritz V.A., Pfleger F.L., and Copeland S.M. 2002. Impaired internal drainage and Aphanomyces euteiches root rot of pea caused by soil compaction in a fine-textured soil. Soil and Tillage Research. 1740:1-12.
3- Anlauf R., and Rehrmann P. 2012. Effect of compaction on soil hydraulic parameters of vegetative landfill covers. Geomaterials 2, 29–36.
4- Asghari Jafarabadi M., Soltani A., and Mohamadi S. M. 2013. Series Statistics; Correlation and regression. Iranian Journal of Diabetes and Lipid. 12(6): 479-506.
5- Blake G. R. 1965. Bulk density. Methods of Soil Analysis. Part 1 Society of Agronomy, Madison, Wisconsin, USA.
6- Blake G. R., and Hartge K. H. 1986. Particle density, In: Klute, A. (Ed.), Methods of Soil Analysis, Part 1, No. 9 (2nd ed.), Agronomy Monograph. American Society of Agronomy, Madison, 377-381.
7- Bouyoucos G. J. 1951. A recalibration of hydrometer method for making mechanical analysis of soil. Agronomy Journal, 43: 434-438.
8- Breiman L., Friedman J., Olshen R., and Stone C. 1984. Classification and Regression Trees. Chapman & Hall/CRC Press, Boca Raton, FL. Development of a decision tree modeling approach. Geoderma. 139:277-287.
9- Dastourani M. T., Habibipoor A., Ekhtesasi M. R., Talebi A., and Mahjoobi J. 2013. Evaluation of the Decision Tree Model in Precipitation Prediction (Case study: Yazd Synoptic Station). Iran-Water Resources Research. 8(3): 14-27. (In Persian with English abstract).
10- Dexter A.R. 2004. Soil physical quality. Part I: Theory, effects of soil texture, density, and organic matter, and effects on root growth. Geoderma. 120:201-214.
11- Drury, C.F., T.Q. Zhang, and B.D. Kay. 2003. The non-limiting and least limiting water range for soil nitrogen mineralization. Soil Science Society of America Journal. 67:1388-1404.
12- Emami H. 2012. Investigation the stability of agricultural soils in field Karaj. Journal of Soil Research (Soil and Water Sciences), 26:245-254.
13- Hughes G. F. 1968. On the Mean Accuracy of Statistical Pattern Recognizers. IEEE Transactions on Information Theory, 14, pp. 55-63.
14- Lal R. 1994. Soil Methods and guidelines for Sustainable use of soil and water resources in the tropics. Soil Management Support System, USDA-NRCS. Washington, DC.
15- Mahmodabadi M., and Mazaheri M. 2012. Effect of some soil physical and chemical properties on permeability in field conditions. Journal of Engineering water and soil, 8, pp. 14-26.
16- Moebius B.N., Van Es H.M., Schindelbeck R.R., Idowu O.J., Clune D.J., and Thies J.E. 2007. Evaluation of laboratory-measured soil physical properties and indicators of soil physical quality. Soil Science. 172: 895-912.
17- Mueller L., Kay B.D., Deen B., Hu C., Zhang Y., Wolff M., Eulenstein F., and Schindler U. 2009. Visual assessment of soil structure: Part II. Implications of tillage, rotation and traffic on sites in Canada, China and Germany. Soil and Tillage Research 103, 188-196.
18- Nabiollahi K., Haidari A., and Taghizadeh-Mehrjerdi R. 2014. Digital Mapping of Soil Texture Using Regression Tree and Artificial Neural Network in Bijar, Kurdistan. Journal of Water and Soil. 28(5):1025-1036. (In Persian with English abstract).
19- Norouzi M. 2009. Prediction of rainfed wheat yield using artificial neural network in Ardal district of Chaharmahal and Bakhtiari province. M.Sc. Thesis, College of Agriculture, Isfahan University of Technology, Isfahan, Iran, 112 p. (In Persian).
20- Omidvar K., Shafie SH., Taghizade Z., and Alipoor M. 2014. Evaluation of the Decision Tree Model in Precipitation Prediction Kermanshah Synoptic Station. Journal of operational research of geographical science. 14(34): 89-110. (In Persian).
21- Pahlavan Rad M., Khormali F., Tomanian N., Kiani F., and Komaki B. 2015. Predict soil classes using decision tree and logistic regression multivariate random on Golestan province. 14th Iranian Soil Science Congress, Vali-e-Asr University of Rafsanjan. 168-172.
22- Reynolds W.D., Bowman B.T., Drury C.F., Tan C.S., and Lu X. 2002. Indicators of good soil physical quality: density and storage parameters. Geoderma. 110:131-146.
23- Reynolds W., Drury C., Yang X., and Tan C. 2008. Optimal soil physical quality inferred through structural regression and parameter interactions. Geoderma 146, 466-474.
24- Richards L. A. 1954. Diagnosis and Improvement of Saline-Alkali Soils. U.S.D.A. Hand book, 60. Washington, D.C., U.S.A.
25- Shekofte H., Ramezani F., and Shirani H. 2017. Optimal feature selection for predicting soil CEC: Comparing the hybrid of ant colony organization algorithm and adaptive network-based fuzzy system with multiple linear regression. Geoderma 298: 27-34.
26- Shirani H., Habibi M., Besalatpour A.A., and Esfandiarpour I. 2015. Determining the features influencing physical quality of calcareous soils in a semiarid region of Iran using a hybrid PSO-DT algorithm. 259-260 (2015): 1-11.
27- Silva A.P., and Kay B.D. 1996. The sensitivity of shoot growth of corn to the least limiting water range of soils. Plant and Soil. 184: 323-329.
28- Singh M.J., and Khera K.L. 2009. Physical indicators of soil quality in relation to soil erodibility under different land uses. Arid Land Research and Management, 23:152-167.
29- Sobhani J., Najimi M., Pourkhorshidi A.R., and Parhizkar T. 2010. Prediction of the compressive strength of no-slump concrete: A comparative study of regression, neural network and ANFIS models. Journal of Construction and Building Materials, 24: 709-718.
30- StatSoft Inc. 2004. Electronic Statistics Textbook (Tulsa, OK). http://www.statsoft.com/textbook/stathome.html.
31- Taghizadeh-Mehrjardi R., Minasny B., Sarmadian F., and Malone P.B. 2013. Digital mapping of soil salinity in Ardakan region, central Iran. Geoderma, 213: 15-28.
32- Tan M., Tsang I. W., and Wang L. 2014. Towards ultrahigh dimensional feature selection for big data. Journal of Machine Learning Research, 15(1): 1371-1429.
33- Topp G.C., Reynolds W.D., Cook F.J., Kirby J.M., and Carter M.R. 1997. Physical attributes of soil quality. In: Gregorich, E.G., Carter, M.R. (Eds.), Soil Quality for Crop Production and Ecosystem Health. Developments in Soil Science. 25: 21-58.
34- Walkley A., and Black I. A. 1934. An examination of the Degtjareff method for determining organic matter and a proposed modification of chromic acid titration method. Soil Science, 37: 29-38.
35- White R. 2006. Principles and practice of soil science 4th ed. Blackwell Publishing.