Methods for integrating global land cover datasets: a case study in Western Europe
Current global land cover (GLC) maps have an overall accuracy of around 70%, ranging from 67% to 81% (Mora et al. 2014; See et al. 2014). Map producers and users see a need to improve the quality of GLC maps, because map errors add to the uncertainty of GLC applications (Herold et al. 2011; Mora et al. 2014; Tsendbazar et al. 2015; Verburg et al. 2011). Improved GLC maps can be obtained by integrating different land cover (LC) datasets (Herold et al. 2008). Before LC maps can be integrated, the LC data need to be harmonized to the same thematic legend and spatial extent. LC products have limitations due to product inconsistency (Tuanmu and Jetz 2014). The use of differing methodologies in LC mapping and integration, classification schemes, algorithms and data sources raises inconsistency issues in GLC mapping (Mora et al. 2014). Inconsistencies between GLC datasets form an obstacle for map integration. Integration aims to assign each location to the most accurate LC class, but it depends on the LC information in the LC maps used for integration. There are different integration methodologies. Voting assigns a pixel to the LC class that occurs in the majority of the LC input datasets at the pixel’s location. Voting is a widely accepted approach in data integration (Ge et al. 2014; Goovaerts 1999; Iwao et al. 2011; Jung et al. 2006; Kinoshita et al. 2014; Tuanmu and Jetz 2014). This research uses normal voting, weighted voting and probability voting to integrate the FROM-GLC hierarchy (2013), Globcover 2009, LC-CCI (2010) and MODIS5 (2010) LC maps. Normal voting is a new method that is purely map driven and uses a two-step approach: (1) where the LC input maps agreed on an LC class, pixels were assigned to that class by simple majority voting; (2) where the LC input maps disagreed on an LC class and formed a tie, pixels were assigned to an LC class based on LC class preferences calculated from step 1.
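The two-step normal voting described above can be illustrated with a minimal Python sketch. It is not the implementation used in this research: the abstract does not specify how the LC class preferences are calculated, so the sketch assumes that a class's preference is simply the number of pixels it won in step 1. The function name `normal_voting`, the class labels and the toy maps are hypothetical.

```python
from collections import Counter

def normal_voting(maps):
    """Integrate harmonized LC maps given as equally long lists of class labels."""
    n_pixels = len(maps[0])
    result = [None] * n_pixels
    ties = {}  # pixel index -> classes that share the highest vote count

    # Step 1: assign pixels where one class receives more votes than any other.
    for i in range(n_pixels):
        votes = Counter(m[i] for m in maps)
        top = max(votes.values())
        winners = [c for c, n in votes.items() if n == top]
        if len(winners) == 1:
            result[i] = winners[0]
        else:
            ties[i] = winners

    # ASSUMPTION: class preference = number of pixels a class won in step 1.
    preference = Counter(c for c in result if c is not None)

    # Step 2: resolve ties with the preferences from step 1.
    # Sorting first makes equal preferences fall back to alphabetical order.
    for i, candidates in ties.items():
        result[i] = max(sorted(candidates), key=lambda c: preference[c])
    return result

# Four toy input maps covering five pixels (hypothetical labels).
maps = [
    ["forest", "forest", "crop", "crop",   "grass"],
    ["forest", "forest", "crop", "crop",   "grass"],
    ["forest", "crop",   "crop", "forest", "crop"],
    ["crop",   "grass",  "crop", "forest", "crop"],
]
print(normal_voting(maps))  # ['forest', 'forest', 'crop', 'forest', 'crop']
```

In this toy example the last two pixels are ties. The first goes to forest, which won two pixels in step 1 against one for cropland, and the second goes to cropland, because grassland won no pixel in step 1.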
In weighted voting, each pixel is assigned to the LC class that accumulates the highest weight at that pixel’s location, where the weights are derived from user’s accuracy. Probability voting instead accounts for the probability of each LC class being the correct class. Weights and probabilities were derived from the published confusion matrices of FROM-GLC (Yu et al. 2014) and of Globcover 2009, LC-CCI (2010) and MODIS5 (2010) (Tsendbazar et al. 2016). The integration methods were assessed on their overall and class-specific accuracy in an external validation, by cross-tabulating each assessed LC map against the reference dataset in a confusion matrix (Strahler et al. 2006; Foody 2005). The integration methods were evaluated on their improvement compared to each other and to the LC input maps. In addition to the external validation, this research calculated the information entropy of the integration methods. Entropy is an internal measure of uncertainty and represents the amount of information necessary to acquire certainty (Shannon and Weaver 1949). Based on the information entropy, probability voting was identified as the best integration method. A difference plot between the integration methods confirmed that normal voting and weighted voting achieved similar results. Normal voting, weighted voting and probability voting improved the overall agreement with the reference dataset compared to the LC input maps, reaching 70.85%, 71.72% and 71.40%, respectively. The improvement in class-specific accuracies varied, as LC-CCI (2010) often held the highest agreement metrics for LC classes. This can be explained by the fact that voting methods favor classes that have a high probability or a high weight in the integration; common classes are therefore over-mapped, and rare LC classes may have been under-mapped. Among the voting methods, probability voting held the highest agreement metrics for LC classes.
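As a rough illustration of weighted voting and of information entropy as an internal uncertainty measure, the sketch below works on a single pixel. It is a simplified reading of the description above, not the procedure used in this research: the user's accuracy values are invented, the function names `weighted_vote` and `entropy` are hypothetical, and the entropy is computed over the normalized accumulated weights, which is an assumption rather than the probability model derived from the published confusion matrices.

```python
import math
from collections import defaultdict

def weighted_vote(labels, user_accuracy):
    """labels: class reported by each input map at one pixel.
    user_accuracy: one dict per map, class -> weight (hypothetical values)."""
    scores = defaultdict(float)
    for label, weights in zip(labels, user_accuracy):
        # Each map votes for its own class with that class's user's accuracy.
        scores[label] += weights[label]
    # Sorting first makes equal scores fall back to alphabetical order.
    winner = max(sorted(scores), key=lambda c: scores[c])
    return winner, dict(scores)

def entropy(scores):
    """Shannon entropy in bits of the normalized class scores (ASSUMPTION:
    normalized weights stand in for class probabilities)."""
    total = sum(scores.values())
    probs = [s / total for s in scores.values() if s > 0]
    return -sum(p * math.log2(p) for p in probs)

# Invented user's accuracies for four input maps and two classes.
user_accuracy = [
    {"forest": 0.9, "crop": 0.6},
    {"forest": 0.8, "crop": 0.7},
    {"forest": 0.7, "crop": 0.9},
    {"forest": 0.6, "crop": 0.8},
]
labels = ["forest", "crop", "crop", "forest"]  # a 2-2 tie under simple voting

winner, scores = weighted_vote(labels, user_accuracy)
print(winner)           # crop
print(entropy(scores))  # close to 1 bit: the two classes are almost equally likely
```

An entropy of 0 bits would mean that all accumulated weight falls on one class, while two equally weighted classes give exactly 1 bit, so lower values indicate a more certain label.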