Improving image recognition for species identification by modeling ecological context

Ubbink, Wouter

View/Open

Master_Thesis_100%.pdf (9.778Mb)

Publication date

2024

Summary

Fine-grained image recognition can identify species from images at the (sub)species level. Two key challenges for improving the accuracy of species identification models are geographical bias and class imbalance: some species and some areas are overrepresented in the training data. Providing a model with contextual information such as location coordinates, date, environmental variables and neighboring species may help to overcome these problems by enabling context-aware predictions. We combined 22 million images of 31 thousand species with information on location and date of observation, habitat variables and neighboring observations to train a new context-aware model. We employed a transformer architecture that enriches the image representation created by a convolutional neural network, using information from 800 nearby species. Transforming image representations using neighboring observations is a novel approach to modeling ecological context. This model was compared with a baseline image-only model and ablation models, using existing and new metrics that measure how well a model deals with data biases. The new context-aware model showed a significant performance improvement on all metrics. Overall accuracy improved by 1.5 percentage points, reducing the error rate by 9.5 percent. Enriching the image representations using a transformer architecture improved the model for most taxonomic groups. Species with few observation records profited more strongly from including contextual ecological information than species with many observations. Rare species that are only present locally could be correctly identified because the model had access to contextual information about the local ecology. Areas with few data points profited more from the new model than areas with a lot of data. The local accuracy in different areas became more equally distributed.
In summary, the new model was better able to deal with geographical bias and class imbalance in the data. Image recognition for species identification thus profits from including contextual ecological information in the model, either as direct input or as a means to transform image representations.
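
The two headline figures in the summary (an absolute accuracy gain of 1.5 points and a relative error reduction of 9.5 percent) are linked by simple arithmetic. The sketch below only illustrates that relation. The function name is hypothetical, and the baseline and new accuracies it prints are implied by the rounded figures above, not values reported in the summary, so they should be read as approximate.

```python
def implied_baseline_error(abs_gain_pp: float, rel_error_reduction: float) -> float:
    """Return the baseline error rate (in percent) implied by an absolute
    accuracy gain (in percentage points) and the relative error reduction
    (as a fraction) that this gain represents.

    Hypothetical helper for illustration only; not taken from the thesis.
    """
    if not 0 < rel_error_reduction <= 1:
        raise ValueError("rel_error_reduction must be in (0, 1]")
    # relative reduction = absolute gain / baseline error
    return abs_gain_pp / rel_error_reduction


# Figures quoted in the summary (both are rounded).
abs_gain_pp = 1.5       # accuracy gain in percentage points
rel_reduction = 0.095   # 9.5 percent fewer errors

baseline_error = implied_baseline_error(abs_gain_pp, rel_reduction)
new_error = baseline_error - abs_gain_pp
baseline_accuracy = 100.0 - baseline_error
new_accuracy = 100.0 - new_error

print(f"implied baseline error:    {baseline_error:.1f}%")
print(f"implied baseline accuracy: {baseline_accuracy:.1f}%")
print(f"implied new accuracy:      {new_accuracy:.1f}%")
```

Under these assumptions the baseline image-only model would have an error rate of roughly 16 percent. Because both inputs are rounded, the implied accuracies can be off by a few tenths of a point.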

URI

https://studenttheses.uu.nl/handle/20.500.12932/46439

Collections

Theses