Estimating the Effectiveness of Educational Interventions Prior to Implementation: Using Textual Embeddings of Intervention Descriptions
Summary
Improving teaching and learning contexts requires a better understanding of the effectiveness of educational interventions. Traditional evaluation methods, such as retrospective and prospective meta-analyses, are often costly, time-consuming, or reliant on existing data. This highlights the need for methods that can estimate the effectiveness of an intervention before it is implemented.
This study investigates the use of text embeddings of intervention descriptions to predict the effectiveness of an intervention before implementation. When embeddings are the only predictor variables, the Random Forest model achieves the highest predictive accuracy in settings where interventions within one study are treated as statistically independent. When moderator variables are included as predictors, Gradient Boosting performs best. The results show that moderators alone provide stronger predictive power than either embeddings alone or combinations of moderators and embeddings. Embeddings and moderators currently capture distinct, non-overlapping information, with embeddings (at least in their current form) likely introducing noisy or irrelevant features. Although text embeddings hold strong potential, they are not yet sufficiently reliable to support decision-making by school boards or policy-makers, and they require further improvement before they can be applied effectively in educational contexts.
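The summary notes that one evaluation setting treats interventions within the same study as statistically independent. As an illustration only, the following minimal sketch shows one common way to avoid that assumption when evaluating a predictive model: splitting the data by study, so that interventions from one study never appear in both the training and the test set. The function name `group_k_fold` and the example study labels are hypothetical and are not taken from the study itself.

```python
from collections import defaultdict


def group_k_fold(study_ids, n_splits=3):
    """Yield (train_idx, test_idx) pairs in which no study appears on both sides."""
    # Collect the row indices that belong to each study.
    rows_by_study = defaultdict(list)
    for row, study in enumerate(study_ids):
        rows_by_study[study].append(row)
    if n_splits < 2 or n_splits > len(rows_by_study):
        raise ValueError("n_splits must be between 2 and the number of studies")

    # Assign whole studies to folds, largest study first, always filling
    # the currently smallest fold to keep fold sizes roughly balanced.
    folds = [[] for _ in range(n_splits)]
    ordered = sorted(rows_by_study, key=lambda s: (-len(rows_by_study[s]), str(s)))
    for study in ordered:
        smallest = min(folds, key=len)
        smallest.extend(rows_by_study[study])

    # Each fold serves once as the test set; the remaining folds form the training set.
    for k in range(n_splits):
        test_idx = sorted(folds[k])
        train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train_idx, test_idx


# Hypothetical example: six interventions reported by three studies.
study_ids = ["s1", "s1", "s1", "s2", "s2", "s3"]
splits = list(group_k_fold(study_ids, n_splits=3))
```

Each pair in `splits` can then be used to fit a model on the training rows and score it on the held-out studies. Comparing such scores with those from an ordinary row-level split gives a rough sense of how much the independence assumption affects the reported accuracy.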