Predicting train journeys from smart card data: a real-world application of the sequence prediction problem
This study aims to predict a traveler's next train journey from smart card data. After the raw data are preprocessed into features describing journeys, the problem is framed as an instance of sequence prediction. Domain modelling issues such as the choice of alphabet, the representation of time, and the definition of a sequence are discussed. A base alphabet is constructed, and closed frequent pattern mining is proposed as a method of extending it algorithmically. The resulting data encodings are tested against a range of established sequence prediction algorithms. Results show that the All-Kth-Order-Markov algorithm outperforms the other algorithms by a margin. With regard to pattern encoding, the results are somewhat inconclusive.
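
To make the best-performing approach concrete, the following is a minimal sketch of an All-Kth-Order Markov predictor. It is not the implementation used in the study. The class name `AllKthOrderMarkov`, the `max_order` parameter, and the origin-destination symbols such as `"A-B"` are assumptions made for illustration only, since the abstract does not specify the alphabet. The idea is to count which symbol follows every context of length 0 up to K, then predict from the longest context that was seen in training, backing off to shorter contexts otherwise.

```python
from collections import Counter, defaultdict


class AllKthOrderMarkov:
    """Illustrative All-Kth-Order Markov predictor (hypothetical names)."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        # counts[context][symbol] = number of times symbol followed context
        self.counts = defaultdict(Counter)

    def fit(self, sequences):
        for seq in sequences:
            for i in range(len(seq)):
                # Record seq[i] under every context of length 0..max_order
                for k in range(self.max_order + 1):
                    if i - k < 0:
                        break
                    context = tuple(seq[i - k:i])
                    self.counts[context][seq[i]] += 1
        return self

    def predict(self, history):
        # Try the longest usable context first, then back off to shorter ones
        longest = min(self.max_order, len(history))
        for k in range(longest, -1, -1):
            context = tuple(history[len(history) - k:])
            if context in self.counts:
                return self.counts[context].most_common(1)[0][0]
        return None  # model has not been trained on any data


# Hypothetical journey sequences; each symbol is an origin-destination pair
journeys = [
    ["A-B", "B-A", "A-B", "B-A"],
    ["A-B", "B-A", "A-C", "C-A"],
]
model = AllKthOrderMarkov(max_order=2).fit(journeys)
print(model.predict(["B-A", "A-C"]))
```

In this sketch an unseen context falls back to the order-0 model, which simply returns the most frequent symbol overall. How ties and unseen symbols are handled is a design choice that the abstract does not describe.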