Improving sequence analysis for the social sciences: a new and more useful method to determine similarity between sociological sequences
Summary
Sequence analysis has become an increasingly popular tool for finding patterns in sociological sequences. It compares sequences pairwise for similarity, after which similar sequences are clustered into distinct groups. Analysing how and why certain groups differ from other groups yields important insights.
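As a rough illustration of this generic pipeline (pairwise comparison followed by clustering), and not of the method proposed in this paper, the following minimal sketch uses a plain unit-cost edit distance and single-linkage clustering. The function names `edit_distance` and `cluster`, and the state codes E (education), W (work) and U (unemployment), are assumptions made for this example only.

```python
from itertools import combinations


def edit_distance(a, b):
    """Unit-cost edit (Levenshtein) distance between two state sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,              # delete x
                curr[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),   # substitute x by y (free if equal)
            ))
        prev = curr
    return prev[-1]


def cluster(sequences, k):
    """Single-linkage agglomerative clustering; returns k groups of indices."""
    # Pairwise dissimilarities between all sequences.
    dist = {
        frozenset((i, j)): edit_distance(sequences[i], sequences[j])
        for i, j in combinations(range(len(sequences)), 2)
    }
    groups = [{i} for i in range(len(sequences))]
    while len(groups) > k:
        # Merge the two groups whose closest members are nearest to each other.
        g1, g2 = min(
            combinations(groups, 2),
            key=lambda pair: min(
                dist[frozenset((i, j))] for i in pair[0] for j in pair[1]
            ),
        )
        groups.remove(g1)
        groups.remove(g2)
        groups.append(g1 | g2)
    return sorted(sorted(g) for g in groups)


# Hypothetical toy careers, one character per time unit.
careers = ["EEEWWW", "EEWWWW", "UUUUWW", "UUUWWW"]
print(cluster(careers, 2))
```

In this toy example the two education-to-work careers end up in one group and the two unemployment-to-work careers in the other. A unit-cost edit distance treats all substitutions alike, regardless of which states are involved or when they occur.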
This paper proposes a more useful method to calculate sociologically valid similarity values between sequences.
It is shown that the proposed method not only yields sociologically expected results, but also outperforms existing algorithms with regard to the valuation of order and the support for time-dependent substitution matrices. Moreover, it supports both single-channel and multi-channel sequences, as well as sequences of unequal length. Finally, tests on existing data sets show that the new method produces sociologically expected results for real-life data, and that the algorithm is confident in doing so.