Active learning for multi-target regression problems, with an application to meta-modeling of transportation simulations
Summary
Active learning for global surrogate modeling is a relatively recent research topic with high relevance for industry. Global surrogates are typically used to explore the search space in optimization problems where evaluating the original objective function is computationally expensive. Active learning methods reduce the number of training samples needed by constructing a sequential design strategy, often relying on model-specific or statistical heuristics. However, the existing literature appears to focus mostly on predicting single-output functions. In this thesis, we investigate the learning performance of different active learning strategies for multivariate response modeling. The selection strategy used in each round of the active learning procedure to construct the batch of points to be labeled by the oracle is critical for overall learning performance. We propose multiple selection strategies for batch construction and evaluate them on four concrete multi-target regression problems. As a case study, we assess the effectiveness of these batch selection strategies in practice by meta-modeling a complex simulation of car traffic in the city of Utrecht.
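To make the batch-mode procedure described above concrete, the following is a minimal, illustrative sketch of a pool-based active learning loop for a two-target regression problem. It is not one of the selection strategies proposed in this thesis. It assumes a toy one-dimensional oracle, a bootstrap committee of 1-nearest-neighbor regressors as the surrogate, and a plain top-k selection rule that ranks pool points by committee disagreement summed over the targets. All function names and parameters are hypothetical.

```python
import math
import random
from statistics import pvariance


def oracle(x):
    """Hypothetical expensive simulator with two outputs (toy stand-in)."""
    return (math.sin(3.0 * x), x * x)


def nearest_neighbor_predict(train, x):
    """Predict all targets at x by copying the labels of the closest training point."""
    _, y = min(train, key=lambda pair: abs(pair[0] - x))
    return y


def disagreement(committee, x):
    """Sum over targets of the variance of the committee's predictions at x."""
    preds = [nearest_neighbor_predict(member, x) for member in committee]
    n_targets = len(preds[0])
    return sum(pvariance([p[t] for p in preds]) for t in range(n_targets))


def select_batch(labeled, pool, batch_size, n_members, rng):
    """Pick the batch_size pool points with the highest committee disagreement."""
    # Each committee member is a bootstrap resample of the labeled set.
    committee = [rng.choices(labeled, k=len(labeled)) for _ in range(n_members)]
    ranked = sorted(pool, key=lambda x: disagreement(committee, x), reverse=True)
    return ranked[:batch_size]


def active_learning(n_rounds=5, batch_size=3, n_members=5, seed=0):
    """Run the loop: select a batch, query the oracle, grow the labeled set."""
    rng = random.Random(seed)
    pool = [i / 50.0 for i in range(51)]  # candidate inputs in [0, 1]
    initial = rng.sample(pool, 4)  # small random initial design
    labeled = [(x, oracle(x)) for x in initial]
    pool = [x for x in pool if x not in initial]
    for _ in range(n_rounds):
        batch = select_batch(labeled, pool, batch_size, n_members, rng)
        labeled.extend((x, oracle(x)) for x in batch)  # oracle labels the batch
        pool = [x for x in pool if x not in batch]
    return labeled, pool
```

Calling `active_learning()` returns the labeled set and the remaining pool after five rounds. Plain top-k selection ignores redundancy within a batch, so neighboring points with similar disagreement may be queried together. How a batch is constructed is therefore a design choice in its own right.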