Domain adaptation in image classification of zooplankton
Summary
Automatic classification of zooplankton has been the focus of many studies because expert labeling is costly, time-consuming, and prone to error. Accurate classification is essential because these organisms are highly sensitive indicators of ecological change in aquatic environments. However, machine learning models must account for differences across datasets, such as images acquired with different cameras, before they can make use of the labels in one or more source domains. This study evaluates the potential of combining datasets in both supervised and unsupervised settings. Two publicly available models were implemented, WMSSDA-$\beta$ and CAN; the promising WMSSDA-$\beta$ required significant modifications. The supervised WMSSDA-$\beta$ method uses multiple labeled datasets to improve classification accuracy on a predefined labeled target dataset. It uses adversarial and statistical discrepancy techniques to weight the influence of each source domain in the alignment process. In contrast, the unsupervised CAN approach uses a single labeled dataset to improve classification performance on an unlabeled target dataset through contrastive learning and clustering. WMSSDA-$\beta$ outperformed the benchmark models, demonstrating its ability to leverage knowledge from different datasets despite distribution shifts. Furthermore, building a single WMSSDA-$\beta$ model to classify images in all datasets showed promising results.
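The idea of weighting source domains by a statistical discrepancy can be illustrated with a minimal sketch. This is not the WMSSDA-$\beta$ implementation: it assumes a simple linear discrepancy (the squared distance between mean feature vectors) and a softmax over negative discrepancies, and all function names, the temperature parameter, and the toy features are hypothetical.

```python
import math

def mean_vector(features):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]

def linear_mmd(source, target):
    """Assumed discrepancy: squared Euclidean distance between feature means."""
    mu_s, mu_t = mean_vector(source), mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

def source_weights(sources, target, temperature=1.0):
    """Softmax over negative discrepancies: closer sources get larger weights."""
    scores = [-linear_mmd(s, target) / temperature for s in sources]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 2-D features standing in for image embeddings (illustrative values only).
target = [[0.0, 0.0], [1.0, 1.0]]
near_source = [[0.1, 0.0], [1.1, 1.0]]   # e.g. a similar imaging setup
far_source = [[3.0, 3.0], [4.0, 4.0]]    # e.g. a very different imaging setup

weights = source_weights([near_source, far_source], target)
print(weights)  # the first (closer) source receives the larger weight
```

In a multi-source setting, weights of this kind could scale each source domain's contribution to an alignment loss, so that sources whose feature distribution is closer to the target influence the model more.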