Comparative analysis and integration of HIV drug resistance mutation testing and prediction tools
Summary
Antiretroviral therapy (ART) exerts a selective pressure that favors drug-resistant
variants in patients infected with human immunodeficiency virus type 1 (HIV-1).
Interpreting patient HIV-1 sequencing data allows personalization of treatment,
avoiding clinical complications associated with drug resistance. Rules-based
methods are the standard for drug resistance interpretation, although machine
learning-based methods have also been developed as an alternative. Moreover,
minority variants are relevant for the assessment of drug resistance, which makes
next-generation sequencing (NGS) techniques the most appropriate technologies
for this application, given their ability to detect low-frequency variants. Only
a limited number of pipelines are available for drug resistance interpretation
from NGS data, and these leave room for several improvements.
In this project, we compared the performance of seven
different drug resistance interpretation methods on 22 ART drug datasets. We
showed that combining the rules-based HIVDB method, linear regression, and
random forest in an ensemble approach achieved the best performance among
all methods. Then, we demonstrated the robustness of this ensemble
method to incomplete sequence coverage for protease inhibitor (PI), nucleoside
reverse transcriptase inhibitor (NRTI) and non-nucleoside reverse transcriptase
inhibitor (NNRTI) drugs. Finally, we integrated these results into a standalone
software tool to be incorporated into V-pipe, a pipeline capable of analyzing
genetically variable viral NGS samples. The incorporation into V-pipe paves
the way for building an NGS-based pipeline for clinical and research purposes
that outperforms the current state of the art.
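
To make the ensemble idea concrete, the following is a minimal sketch, not the implementation used in this project. It assumes that each of the three methods returns a resistance score scaled to [0, 1] and that the ensemble is a plain average of those scores; the summary does not specify the combination rule. All function names, penalties, weights, and toy trees are hypothetical stand-ins for the actual HIVDB rules and the trained models.

```python
from statistics import mean
from typing import Callable, Sequence, Set

# A predictor maps a set of observed mutations to a score in [0, 1].
Predictor = Callable[[Set[str]], float]

# Hypothetical penalties for illustration only; NOT real HIVDB scores.
TOY_PENALTIES = {"M184V": 60.0, "K65R": 30.0}

# Hypothetical regression weights; a real model would be fit on phenotype data.
TOY_WEIGHTS = {"M184V": 0.5, "K65R": 0.25}
TOY_INTERCEPT = 0.1

# Hypothetical stand-in for a random forest: each "tree" votes 0.0 or 1.0.
TOY_TREES = [
    lambda muts: 1.0 if "M184V" in muts else 0.0,
    lambda muts: 1.0 if "K65R" in muts else 0.0,
    lambda muts: 1.0 if len(muts) >= 2 else 0.0,
    lambda muts: 1.0 if muts else 0.0,
]


def rules_based_score(mutations: Set[str]) -> float:
    """Rules-style score: sum mutation penalties, cap at 100, scale to [0, 1]."""
    total = sum(TOY_PENALTIES.get(m, 0.0) for m in mutations)
    return min(total, 100.0) / 100.0


def linear_score(mutations: Set[str]) -> float:
    """Linear model on mutation indicators, clipped to [0, 1]."""
    raw = TOY_INTERCEPT + sum(TOY_WEIGHTS.get(m, 0.0) for m in mutations)
    return max(0.0, min(1.0, raw))


def forest_score(mutations: Set[str]) -> float:
    """Forest-style score: fraction of toy trees that vote 'resistant'."""
    return mean(tree(mutations) for tree in TOY_TREES)


def ensemble_score(mutations: Set[str], predictors: Sequence[Predictor]) -> float:
    """Assumed combination rule: unweighted mean of the individual scores."""
    return mean(p(mutations) for p in predictors)


PREDICTORS = [rules_based_score, linear_score, forest_score]

if __name__ == "__main__":
    print(ensemble_score({"M184V"}, PREDICTORS))
```

Averaging is only one possible choice; weighted averaging or stacking would fit the same interface by replacing `ensemble_score`.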