Variational inference in Stan
Summary
This thesis presents a novel, robust, and user-friendly variational inference (VI) algorithm implemented in Stan. As is customary, the algorithm uses the classical mean-field Gaussian variational family and the reparameterization trick to derive a gradient estimator. The novelty lies in the specific optimization routine and convergence criterion. The optimization operates in two phases: a warm-up phase, in which quick progress is made and an appropriate step size is found, followed by a main optimization phase, which gradually decreases that step size as the iterates approach stationarity. The process is considered converged when the 2-Wasserstein distance between the Gaussian approximations obtained at two epochs of the main optimization phase falls below a threshold. This threshold, set by the end user, offers an intuitive accuracy parameter for convergence, in contrast with other VI implementations.
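For mean-field Gaussians the 2-Wasserstein distance has a closed form, W2(q1, q2)^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2, which makes the convergence check cheap to evaluate. The following is a minimal NumPy sketch of such a check; the function names and the epoch-to-epoch usage are illustrative, not the thesis implementation.

    import numpy as np

    def w2_meanfield_gaussian(mu1, sigma1, mu2, sigma2):
        # Closed-form 2-Wasserstein distance between two diagonal Gaussians:
        # W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2.
        return np.sqrt(np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2))

    # Hypothetical epoch-to-epoch convergence check with a user-set threshold:
    def has_converged(mu_prev, sigma_prev, mu_curr, sigma_curr, threshold):
        return w2_meanfield_gaussian(mu_prev, sigma_prev, mu_curr, sigma_curr) < threshold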
Experimental results demonstrate the algorithm's robustness and effectiveness across various models from PosteriorDB, with accuracy and runtime largely insensitive to hyperparameter variations. Including the model priors in the initialization phase significantly speeds up optimization, although implementing this in Stan is currently infeasible. The k-hat diagnostic was evaluated as a means of further enhancing the algorithm's reliability, but it proved ineffective for detecting untrustworthy posterior approximations.
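For context, the k-hat diagnostic fits a generalized Pareto distribution to the tail of the importance ratios p(z)/q(z), with k-hat > 0.7 conventionally flagging an untrustworthy approximation. Below is a toy sketch using ArviZ's psislw; the target and approximation densities are made up for illustration and are not from the experiments above.

    import numpy as np
    import arviz as az  # psislw returns smoothed log weights and the Pareto shape k-hat

    rng = np.random.default_rng(0)

    # Toy setup (illustrative only): target p = N(0, 2^2), approximation q = N(0, 1).
    # q is narrower than p, so the importance ratios p/q are heavy-tailed.
    z = rng.normal(0.0, 1.0, size=4000)            # draws from q
    log_p = -0.5 * (z / 2.0) ** 2 - np.log(2.0)    # log N(0, 4), up to a constant
    log_q = -0.5 * z ** 2                          # log N(0, 1), up to a constant

    _, k_hat = az.psislw(log_p - log_q)            # fit a GPD to the log-ratio tail
    print(f"k-hat = {float(k_hat):.2f}")           # k-hat > 0.7 flags an unreliable fit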