Random Forest for lake ice detection using Sentinel-1 SAR data
Summary
Climate change is one of the greatest challenges humanity is presently facing. Lake ice is an important indicator for assessing climate change and supports a range of social and economic activities. Therefore, accurate
monitoring at high temporal resolution is important. Machine learning in remote sensing is a widely
used research method in climate change detection. However, for lake ice monitoring, optical
remote sensing becomes infeasible due to cloud contamination. Sentinel-1 SAR is therefore proposed as a
data source to overcome this problem, as it can acquire images during both day and night and
can penetrate clouds. A few lake and sea ice studies have already been carried out using Sentinel-1 SAR
data. Nevertheless, most of these studies were performed locally, and their models were not tested for generalizability. This research aimed to test the generalizability of a random forest model by training on four different
study areas using Sentinel-1 SAR data. In addition, the influence of external features (e.g. texture extracted using a
Grey Level Co-occurrence Matrix (GLCM), wind, and spatial features) was studied. The validity of the model was assessed
using three different approaches to test local performance, overall performance, and generalizability. The
model achieved an average local accuracy of 84%. In contrast, the generalized approach
showed an average accuracy loss of 15%, with accuracies ranging from 58% to 76%. To examine the influence
of external features, different feature combinations were defined and used as input for the models. The
wind features improved the local models, which indicates that these features capture characteristics specific
to each study area and are therefore not generalizable. Texture and spatial features did not contribute to the model
at all, which can be explained by the small size of the lakes in the study areas and the noise in the images.
From this it can be concluded that the random forest model is not readily applicable to other lakes and that
the backscatter features of the Sentinel-1 SAR data are the only consistently generalizable features that were
established during this research. Recommendations for further studies are to include many more study areas
in the model and to determine up to what local extent a model remains accurate and reliable.
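
The generalizability test described above can be illustrated with a leave-one-study-area-out loop: each area is held out in turn, the model is trained on the remaining areas, and accuracy is reported for the unseen area. The following is a minimal sketch only, not the implementation used in this research. The function names, the `fit_predict` callable, and the backscatter values are hypothetical, and a simple threshold rule stands in for the random forest so that the example runs without external libraries.

```python
from collections import defaultdict


def leave_one_area_out(features, labels, areas, fit_predict):
    """Hold out each study area in turn, train on the others,
    and return the accuracy on each held-out area."""
    by_area = defaultdict(list)
    for idx, area in enumerate(areas):
        by_area[area].append(idx)
    if len(by_area) < 2:
        raise ValueError("need at least two study areas")

    scores = {}
    for held_out, test_idx in by_area.items():
        train_idx = [i for i, a in enumerate(areas) if a != held_out]
        # fit_predict is a hypothetical callable:
        # (train features, train labels, test features) -> predicted labels
        predicted = fit_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(predicted, test_idx))
        scores[held_out] = correct / len(test_idx)
    return scores


def threshold_baseline(train_x, train_y, test_x):
    """Toy stand-in for a classifier (NOT a random forest): threshold the
    first feature midway between the ice and water class means."""
    ice = [x[0] for x, y in zip(train_x, train_y) if y == 1]
    water = [x[0] for x, y in zip(train_x, train_y) if y == 0]
    ice_mean = sum(ice) / len(ice)
    water_mean = sum(water) / len(water)
    cut = (ice_mean + water_mean) / 2
    ice_is_brighter = ice_mean > water_mean
    return [int((x[0] > cut) == ice_is_brighter) for x in test_x]


# Made-up backscatter values in dB (1 = ice, 0 = open water), three areas.
features = [[-12.0], [-20.0], [-11.0], [-21.0], [-13.0], [-19.0]]
labels = [1, 0, 1, 0, 1, 0]
areas = ["A", "A", "B", "B", "C", "C"]

scores = leave_one_area_out(features, labels, areas, threshold_baseline)
for area, accuracy in sorted(scores.items()):
    print(f"held-out area {area}: accuracy {accuracy:.2f}")
```

In practice, `fit_predict` could wrap a random forest (for example scikit-learn's `RandomForestClassifier`, with `LeaveOneGroupOut` providing the splits). Comparing the per-area scores with the accuracy of models trained and tested within a single area gives the kind of local versus generalized comparison reported above.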