Machine learning for predicting leads in content marketing
Summary
Commissioned by Jaarbeurs, I created a model to predict the number of leads that specific content would generate in an online content marketing setting. I will describe how I addressed this problem and which methodology I used (chapter 1). I will give an extensive overview of the data model I created and of how I used imputation, feature engineering and feature selection to get the most out of the data (chapter 2). In chapter 3, I will elaborate on the theoretical background of linear regression, logistic regression and survival analysis.
In chapter 4, the experimental setup and the results of the models that use only content data are discussed. A classification model is constructed to predict whether a user will download certain content. This model is extended with features that describe the match between the user and the content (chapter 5). Survival analysis is used to make time-dependent predictions. The newsletter data is added using time-dependent covariates (chapter 6).
In chapter 7, the results are discussed and a conclusion is drawn.