Exploring TCR TiRP Scores for Treg Identification and Population Analysis
Summary
This paper examines the application of the TiRP score, a likelihood score for Treg cells, and its
underlying features for predicting T cell phenotypes (Treg or Tconv) and T cell population
dynamics. These features were investigated with a random forest classifier, hierarchical
clustering, and the Bray-Curtis statistic. Non-overlapping TCR sequences from a reference dataset were
used to train and validate the random forest model, but the predictions were not
significantly better than random, indicating that the TiRP score and its underlying features
are insufficient to act as a definitive classifier. Hierarchical clustering was then
used to look for Tregness patterns in the Emerson dataset across donor features
such as age, gender, CMV status, and race, but no clear patterns emerged. Additionally,
Bray-Curtis (BC) scores between the Emerson dataset and the reference datasets were
calculated. Donor repertoires were equally and highly dissimilar to both the Treg and Tconv
reference populations, indicating that the BC scores were uninformative about the nature of
the donor T cell repertoire. In addition, the BC scores did not change significantly with age or gender.
Overall, the TiRP score proved inadequate for studying population dynamics and for building
classifier models, owing to overlapping TCRs, low diversity, and the inherent noise in the scoring.
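
To make the Bray-Curtis comparison concrete, the following is a minimal sketch of the standard Bray-Curtis dissimilarity between two repertoires. It assumes each repertoire is represented as a mapping from CDR3 amino acid sequence to clonotype count; that representation, the function name, and the toy sequences and counts are illustrative assumptions, not the data or code used in this work. A value of 0 means identical repertoires and 1 means no shared clonotypes; the corresponding similarity is 1 minus this value.

```python
from collections import Counter


def bray_curtis_dissimilarity(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two count mappings.

    Returns 1 - 2*C / (S_a + S_b), where C is the sum of the smaller count
    for every key present in both mappings and S_a, S_b are the total counts.
    """
    total = sum(counts_a.values()) + sum(counts_b.values())
    if total == 0:
        raise ValueError("both repertoires are empty")
    shared = sum(min(n, counts_b[key]) for key, n in counts_a.items() if key in counts_b)
    return 1.0 - 2.0 * shared / total


# Hypothetical toy repertoires (CDR3 sequence -> clonotype count); not real data.
donor = Counter({"CASSLGQAYEQYF": 3, "CASSPTGGYEQYF": 1, "CASSQDRGNTEAFF": 2})
treg_ref = Counter({"CASSLGQAYEQYF": 1, "CASSLAGGTDTQYF": 4})
tconv_ref = Counter({"CASSPTGGYEQYF": 2, "CASSIRSSYEQYF": 3})

d_treg = bray_curtis_dissimilarity(donor, treg_ref)
d_tconv = bray_curtis_dissimilarity(donor, tconv_ref)
print(f"donor vs Treg reference:  {d_treg:.3f}")
print(f"donor vs Tconv reference: {d_tconv:.3f}")
```

In this toy case the donor shares only a single count with each reference, so both dissimilarities are 9/11 (about 0.818). This mirrors the situation described above: when a donor repertoire is equally far from both reference populations, the score cannot indicate whether the repertoire is more Treg-like or Tconv-like.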