Automatically creating predictive features from datasets
Rabobank is one of the largest banks in the Netherlands and a global leader in food and agriculture financing. Rabobank is developing a model for Early Warning Systems (EWS), which are meant to predict whether a client of Rabobank will default within the next 12 months. The EWS model is a Decision Tree classifier trained on time series data. Training a Decision Tree on time series data requires feature engineering, which turns the series into predictive features the tree can use. Because clients of Rabobank are involved, these features need to be interpretable. We propose an algorithm based on Genetic Programming (GP) that automates the feature engineering process for time series data. We study the algorithm on various types of data, including data provided by Rabobank, and ask whether it can create features that are both interpretable and predictive. We conclude that the proposed algorithm can serve as an aid to the data scientist performing feature engineering: the new features are predictive and, in most cases, interpretable.
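To make the idea of GP-style feature engineering concrete, the following is a minimal sketch and not the algorithm proposed in the thesis. Everything in it is an assumption made for illustration: the primitive aggregations, the window sizes, the shallow expression shape, the mutation-only search (no crossover), the decision-stump fitness, and the synthetic data. It evolves small, readable expressions over the tail of a time series and scores each one by how well a single threshold separates the two classes.

```python
import random
import statistics

# Hypothetical primitives: aggregations over the last `w` points of a series.
AGGS = {
    "mean": statistics.fmean,
    "min": min,
    "max": max,
    "last": lambda xs: xs[-1],
}
OPS = {"sub": lambda a, b: a - b, "add": lambda a, b: a + b}
WINDOWS = (3, 6, 12)


def random_leaf(rng):
    return (rng.choice(sorted(AGGS)), rng.choice(WINDOWS))


def random_feature(rng):
    # A feature is a leaf or one binary operation over two leaves,
    # kept shallow so that it stays readable.
    if rng.random() < 0.5:
        return random_leaf(rng)
    return (rng.choice(sorted(OPS)), random_leaf(rng), random_leaf(rng))


def evaluate(feature, series):
    if len(feature) == 2:
        agg, w = feature
        return AGGS[agg](series[-w:])
    op, left, right = feature
    return OPS[op](evaluate(left, series), evaluate(right, series))


def describe(feature):
    # Human-readable form of a feature, e.g. "sub(mean(last 3), max(last 12))".
    if len(feature) == 2:
        return f"{feature[0]}(last {feature[1]})"
    return f"{feature[0]}({describe(feature[1])}, {describe(feature[2])})"


def stump_accuracy(values, labels):
    # Fitness: accuracy of the best single-threshold split (a depth-1 tree).
    best = 0.0
    for t in sorted(set(values)):
        hits = sum((v <= t) == bool(y) for v, y in zip(values, labels))
        best = max(best, hits / len(labels), 1 - hits / len(labels))
    return best


def mutate(feature, rng):
    # Either draw a fresh feature or replace one child of a binary node.
    if len(feature) == 2 or rng.random() < 0.3:
        return random_feature(rng)
    op, left, right = feature
    if rng.random() < 0.5:
        return (op, random_leaf(rng), right)
    return (op, left, random_leaf(rng))


def evolve(dataset, labels, generations=20, pop_size=20, seed=0):
    rng = random.Random(seed)

    def fitness(f):
        return stump_accuracy([evaluate(f, s) for s in dataset], labels)

    population = [random_feature(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half, refill with mutated copies of survivors.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        population = survivors + [mutate(rng.choice(survivors), rng) for _ in survivors]
    best = max(population, key=fitness)
    return best, fitness(best)


def make_toy_data(n=60, seed=1):
    # Synthetic series: label 1 ("default") trends downward, label 0 stays flat.
    rng = random.Random(seed)
    dataset, labels = [], []
    for i in range(n):
        label = i % 2
        drift = -5.0 if label else 0.0
        dataset.append([100 + drift * t + rng.gauss(0, 1) for t in range(12)])
        labels.append(label)
    return dataset, labels


dataset, labels = make_toy_data()
best, acc = evolve(dataset, labels)
print(describe(best), acc)
```

The printed expression is the kind of output a data scientist could inspect directly. In practice the evolved features would be passed to a full Decision Tree rather than scored with a single threshold, and the set of primitives would be chosen to match what is meaningful for the data at hand.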