Mohammad Ali Mahmoodi; Sohaila Momeni; Masoud Davari
Abstract
Introduction: Land use and land cover (LULC) information is a crucial data component for a range of applications, including global change studies, urban planning, agricultural crop characterization, and forest ecosystem classification. The derivation of such information increasingly relies on remote sensing technology because of its ability to acquire valuable spatiotemporal information on LULC. One of the major approaches to deriving LULC information from remotely sensed images is classification. Numerous image classification algorithms exist; among the most popular are the maximum likelihood classifier (MLC), artificial neural network (ANN) classifiers, and decision tree (DT) classifiers. A conventional parametric method such as MLC is based on statistical theory and assumes a multivariate normal distribution for each class. For data that are not normally distributed (which is common with LULC data), parametric classifiers may fail because they cannot resolve interclass confusion. This inability is their major limitation. Nonparametric classifiers such as ANNs and DTs, which make no assumptions about the class distributions of the data, can overcome this limitation. Support vector machines (SVMs), a nonparametric classifier that has recently been used in numerous image processing applications, represent a group of theoretically superior machine learning algorithms. An SVM employs optimization algorithms to locate the optimal boundaries between classes. SVMs have been found to be competitive with the best available classification methods, including ANN and DT classifiers. The classification accuracy of SVMs depends on the choice of classification strategy and kernel function.
The objective of this study was to investigate how sensitive the identification of LULC information from Landsat Enhanced Thematic Mapper (ETM) remote sensing data is to the SVM architecture, including classification strategy and kernel type, in the Gavshan dam watershed in the west of Iran.
Materials and Methods: SVMs were used to classify orthocorrected Landsat ETM images from May 2016. Image pre-processing, such as atmospheric correction, was conducted before use. Three classification strategies (one versus one, one versus all, and ordinal) and three types of kernels (linear, polynomial, and radial basis function) were used for the SVM classification. A total of 18 different models were developed and implemented for the sensitivity analysis of SVM architecture. A two-layer feed-forward perceptron network classifier with sigmoid hidden neurons and softmax output neurons was also used for comparison. The network was trained using the scaled conjugate gradient backpropagation algorithm. A total of 1320 ground control points were collected to train, validate, and test the SVM and ANN models. Ground truth locations on each image were identified from GPS coordinates in order to extract spectral reflectance data for seven bands (Bands 1-7) of the Landsat ETM images. The LULC class of each point was identified using land survey or Google Earth images. The identified LULC classes were agriculture, buffer forests, orchard, range brush, range grasses, urban areas, roads, and water.
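The three kernel types and the one versus one strategy named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `gamma` and `coef0` parameters, and their default values are assumptions made for illustration, and only the kernel families, the polynomial degree of 3, and the eight class names come from the abstract.

```python
import math
from itertools import combinations

# LULC classes as listed in the abstract.
LULC_CLASSES = [
    "agriculture", "buffer forests", "orchard", "range brush",
    "range grasses", "urban areas", "roads", "water",
]

def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two band-reflectance vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel (gamma * <x, y> + coef0) ** degree.
    gamma and coef0 are assumed defaults, not values from the study."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel exp(-gamma * ||x - y||^2); gamma is assumed."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

def one_vs_one_pairs(classes):
    """One versus one trains one binary classifier per unordered class pair."""
    return list(combinations(classes, 2))

def one_vs_one_vote(pairwise_winners):
    """Combine the binary decisions by majority vote; each element of
    pairwise_winners is the class chosen by one pairwise classifier."""
    votes = {}
    for winner in pairwise_winners:
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)

if __name__ == "__main__":
    n_pairs = len(one_vs_one_pairs(LULC_CLASSES))
    print("one versus one binary classifiers:", n_pairs)
    print("one versus all binary classifiers:", len(LULC_CLASSES))
```

With eight classes, one versus one needs k(k-1)/2 = 28 binary classifiers while one versus all needs 8, which is one reason the choice of strategy can affect both cost and accuracy.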
Results and Discussion: The results suggest that the choice of classification strategy and kernel type plays an important role in SVM classification accuracy. Statistical evaluation of the SVM models against the ground control points showed that the one versus one classification strategy was more accurate than the other two strategies for every kernel function type, and that the polynomial kernel function was more accurate than the other two kernels for every classification strategy. The SVM model with a polynomial (n=3) kernel and the one versus one classification strategy outperformed all other SVM models and gave the highest overall classification accuracy of 78.5 and Kappa coefficient of 68.5. McNemar's test showed a significant improvement of the best SVM model over the ANN model (P<0.001). In addition, the user accuracy and producer accuracy achieved by the best SVM model were higher than those of the ANN model for all LULC classes. In both approaches, the water and agriculture categories had high accuracy, while roads had low accuracy. The resulting LULC map indicated that most of the study area (52.8%) was assigned to agriculture. The range brush and range grasses categories cover 12.5% and 26.8% of the watershed, respectively. Only about 2.7% of the watershed is covered with trees.
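The accuracy measures reported above (overall accuracy, Kappa coefficient, producer and user accuracy, and McNemar's test) can be computed from a confusion matrix as in the following Python sketch. The 3-class matrix and the disagreement counts are hypothetical numbers chosen for illustration, not data from the study, and the continuity-corrected form of McNemar's statistic is an assumption, since the abstract does not say which variant was used.

```python
import math

def overall_accuracy(cm):
    """Fraction of correctly classified points (diagonal sum / total)."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def cohen_kappa(cm):
    """Kappa = (p_o - p_e) / (1 - p_e), where p_e is chance agreement."""
    n = sum(sum(row) for row in cm)
    k = len(cm)
    p_o = sum(cm[i][i] for i in range(k)) / n
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)

def producer_user_accuracy(cm):
    """Rows are reference classes, columns are predicted classes (assumed
    layout). Producer accuracy divides by row totals, user accuracy by
    column totals."""
    k = len(cm)
    producer = [cm[i][i] / sum(cm[i]) for i in range(k)]
    user = [cm[j][j] / sum(cm[i][j] for i in range(k)) for j in range(k)]
    return producer, user

def mcnemar_test(b, c):
    """Continuity-corrected McNemar test. b = points only classifier A got
    right, c = points only classifier B got right. Returns (chi2, p-value)
    using the chi-square distribution with one degree of freedom."""
    if b + c == 0:
        return 0.0, 1.0
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

if __name__ == "__main__":
    # Hypothetical 3-class confusion matrix, for illustration only.
    cm = [[50, 5, 5],
          [4, 30, 6],
          [6, 4, 40]]
    print("overall accuracy:", overall_accuracy(cm))
    print("kappa:", round(cohen_kappa(cm), 4))
    print("McNemar (b=30, c=10):", mcnemar_test(30, 10))
```

Kappa is lower than overall accuracy because it discounts the agreement expected by chance, which is why the two figures are usually reported together.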
Conclusions: This study suggests that the SVM approach based on Landsat ETM bands may provide reliable and accurate LULC information, even better than the best ANN approaches. However, the choice of classification strategy and kernel type plays an important role in SVM classification accuracy. The best model, with a polynomial kernel and the one versus one classification strategy, outperformed all other SVM models and the ANN model and gave the highest classification accuracy.