Article type: Research article
Authors
1 Ilam University
2 Isfahan University of Technology, College of Agriculture
3 Tehran University
Abstract
Preparing soil maps of adequate accuracy is a powerful tool for achieving sustainable land use in agriculture and natural resources. The present study was carried out in part of the Vargar lands of Abdanan county, Ilam province, to digitally map soil classes using random forest and fuzzy logic models. In the study area, the locations of 44 soil profiles were determined, and the profiles were dug, described and sampled from all genetic horizons. After the necessary physicochemical analyses, the soils were classified. The ALOS PALSAR digital elevation model and SAGA GIS software were used to derive geomorphometric covariates. Three variable selection approaches (the Boruta algorithm, variance inflation factor and mean decrease accuracy) together with two data mining models (random forest and fuzzy logic) were used to model soil-landscape relationships. The results showed that mean decrease accuracy, as the most suitable variable selection approach, selected six of the 35 geomorphometric covariates. The random forest-mean decrease accuracy modeling approach had the highest accuracy at the subgroup level, with an overall accuracy of 84 percent and a kappa index of 57 percent. The results of the fuzzy approach indicated that the kappa index and overall accuracy of this method were similar to those of the other three scenarios, and that accuracy differed only slightly at the soil family level. In general, using different variable selection approaches can increase the accuracy of digital soil maps. In addition, increasing the number of field observations and using other environmental variables that influence soil formation can be employed to predict soil classes with low accuracy.
Keywords
Article title [English]
Efficiency of Different Feature Selection Methods in Digital Mapping of Subgroup and Soil Family Classes with Data Mining Algorithms
Authors [English]
- S. Nazari 1
- M. Rostaminia 1
- Shamsollah Ayoubi 2
- A. Rahmani 3
- S.R. Mousavi 3
1 Ilam University
2 Isfahan University of Technology, College of Agriculture
3 Tehran University
Abstract [English]
Background and objectives: Highly accurate soil maps are a powerful tool for achieving sustainable land use in agriculture and natural resources. The present study was conducted in the Vargar lands of Abdanan county, Ilam province, to digitally map soil classes at two taxonomic levels, subgroup and family, using random forest (RF) and fuzzy logic models.
Materials and methods: The study area covers 1027 hectares, with a mean annual precipitation of 628.6 mm and a mean annual temperature of 22.6 °C. Three major physiographic units were observed: hill land, piedmont plain and alluvial plain. The soil moisture and temperature regimes are ustic and hyperthermic, respectively, as calculated with the Newhall model in JNSM version 6.1. The locations of 44 soil profiles were determined with a random sampling pattern following standard soil survey procedures; the profiles were then dug and described, and samples from all genetic horizons were transferred to the laboratory. All soil profiles were classified down to the family level according to Soil Taxonomy (2014). Geomorphometric covariates, as representatives of the soil forming factors, were derived from a digital elevation model (ALOS PALSAR satellite, 2011) with 12.5 m resolution in SAGA GIS version 7.4. Three feature selection approaches (Boruta, variance inflation factor (VIF) and mean decrease accuracy (MDA)) were combined with two data mining algorithms (random forest (RF) and fuzzy logic) to model soil-landscape relationships, using the “randomForest” and “caret” packages in R 3.5.1 and SoLIM Solutions version 2015. A sample-based project was used to predict soil classes in the fuzzy logic modeling process. The observed profiles were split into two data sets, 80 percent (n=36) for calibration and 20 percent (n=8) for validation, based on the bootstrap sampling of the random forest algorithm. Internal validation of the random forest algorithm was based on the out-of-bag error percentage (OOB%). The best model was identified using overall accuracy (OA) and the kappa index; in addition, user accuracy (UA) and producer accuracy (PA) were calculated for each individual class.
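The accuracy measures named in the methods (OA, kappa, UA and PA) can all be derived from paired observed and predicted classes. The following is a minimal Python sketch of those standard definitions, not the authors' R workflow; the function name `accuracy_metrics` and the eight placeholder labels are assumptions made for illustration only.

```python
from collections import Counter


def accuracy_metrics(observed, predicted):
    """Return overall accuracy, Cohen's kappa, and per-class UA and PA."""
    n = len(observed)
    classes = sorted(set(observed) | set(predicted))
    pairs = Counter(zip(observed, predicted))  # confusion matrix cells
    obs_totals = Counter(observed)             # reference totals per class
    pred_totals = Counter(predicted)           # mapped totals per class

    # Overall accuracy: share of profiles on the confusion matrix diagonal.
    overall = sum(pairs[(c, c)] for c in classes) / n
    # Agreement expected by chance, computed from the marginal totals.
    expected = sum(obs_totals[c] * pred_totals[c] for c in classes) / (n * n)
    kappa = (overall - expected) / (1 - expected) if expected < 1 else float("nan")

    # Producer accuracy: correct / observed total; user accuracy: correct / predicted total.
    producer = {c: pairs[(c, c)] / obs_totals[c] for c in classes if obs_totals[c]}
    user = {c: pairs[(c, c)] / pred_totals[c] for c in classes if pred_totals[c]}
    return overall, kappa, user, producer


# Hypothetical validation set of eight profiles (labels are placeholders, not study data).
observed = ["A", "A", "A", "B", "B", "C", "C", "C"]
predicted = ["A", "A", "B", "B", "B", "C", "C", "A"]
overall, kappa, user, producer = accuracy_metrics(observed, predicted)
```

For this invented input the overall accuracy is 0.75 and kappa is 27/43 (about 0.63), which shows how kappa discounts the agreement that the class totals alone would produce by chance.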
Results: The results showed that, out of 40 geomorphometric covariates, MDA selected six as the best environmental covariates: terrain classification index for lowlands, annual insolation, topographic position index, upslope curvature, real surface area and terrain surface convexity. The RF-MDA method, with an overall accuracy of 84% and a kappa index of 0.56, performed best at the subgroup level compared with the other methods (RF-VIF, RF-BO and Fuzzy-MDA), whose overall accuracies were 58, 55 and 50% and whose kappa indices were 0.3, 0.67 and 0.18, respectively. The out-of-bag errors (OOB%) for RF-MDA, RF-VIF and RF-Boruta were 72.42%, 67.86% and 82.76% at the subgroup level and 93.10%, 93.10% and 86.21% at the family level, respectively. At the family level, there was little difference in accuracy among the methods, which performed similarly in modeling the soil classes. The kappa index and overall accuracy of the fuzzy approach were similar to those of the other three scenarios at the soil family level, with only a slight difference in accuracy, whereas at the subgroup level the kappa and overall accuracy of the fuzzy method were lower than those of the other scenarios. In contrast to RF modeling, the fuzzy approach presented continuous spatial variability by generating a fuzzy map for each soil class in the landscape. These results indicate that the random forest method is superior to the fuzzy method in mapping soil family and subgroup classes. Based on the MDA sensitivity analysis index, the same three geomorphometric covariates, terrain surface convexity (convexity), terrain classification index for lowlands (TCI_Low) and real surface area (Surface_Ar), had the highest importance for predicting soil classes at both taxonomic levels.
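The idea behind the mean decrease accuracy (MDA) ranking reported in the results can be illustrated with a permutation test: shuffle one covariate, re-predict, and record how much accuracy falls. The Python sketch below is an assumption-laden illustration and not the “randomForest” implementation, which computes the decrease per tree on out-of-bag samples. Here a fixed threshold rule (`toy_model`) stands in for a fitted model, and the covariate values are invented; only the two subgroup names come from the abstract.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows for which the model predicts the observed label."""
    hits = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return hits / len(rows)


def mean_decrease_accuracy(model, rows, labels, n_repeats=20, seed=0):
    """Average drop in accuracy after shuffling each covariate in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)  # break the link between covariate j and the class
            permuted = [row[:j] + [value] + row[j + 1:]
                        for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical data: two covariates per site; only the first one drives the class.
rows = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.4, 2.0], [0.45, 3.0],
        [0.6, 5.0], [0.7, 1.0], [0.8, 4.0], [0.9, 2.0], [0.95, 3.0]]
labels = ["Typic Haplustepts"] * 5 + ["Typic Calciustolls"] * 5


def toy_model(row):
    """Stand-in for a fitted classifier: a threshold on the first covariate."""
    return "Typic Calciustolls" if row[0] > 0.5 else "Typic Haplustepts"


importances = mean_decrease_accuracy(toy_model, rows, labels)
```

Because the stand-in model ignores the second covariate, shuffling it leaves accuracy unchanged and its importance is exactly zero, while shuffling the first covariate lowers accuracy. Covariates with the largest drops are the ones a selection step of this kind would retain.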
In terms of area on the final predicted soil maps, Fine-silty, carbonatic, hyperthermic Typic Haplustepts (family map) and Typic Calciustolls (subgroup map) covered the largest shares, 32.70% and 48.90%, respectively, while Fine-silty, carbonatic, hyperthermic Typic Calciustolls (family map) and Typic Haplustepts (subgroup map) covered the smallest, 0.18% and 1.85%, respectively.
Conclusion: In general, using different variable selection approaches where soil classes have a relatively imbalanced abundance can increase the accuracy of digital soil mapping. Increasing the number of field observations and using other environmental variables that affect soil formation can also help improve the prediction of soil classes that were mapped with low accuracy.
Keywords [English]
- Soil mapping
- Random forest
- Fuzzy logic
- Environmental covariates