Internal validity of a text mining algorithm to identify Adverse Drug Reactions in free-text entries from electronic health records of geriatrics and orthopedics patients
Summary
Background: Structured registration of adverse drug reactions (ADRs) in the electronic health record
(EHR) is vital to preventing their recurrence, but in practice ADRs are often recorded as free text
only. A text mining tool could help identify these ADRs.
Aim: To determine the internal validity of a previously developed ADR-identifying text mining
algorithm at the geriatrics and orthopedics departments of Catharina Hospital Eindhoven.
Methods: One year of EHR data from 15 orthopedics patients and 6 geriatrics patients were manually
reviewed for ADRs, creating a gold standard. MedDRA and SNOMED-CT terminologies were used to
identify symptoms, and the Dutch G-standard database was used to identify offending medications.
The same data were processed by the algorithm, and its output was compared to the gold standard.
Results: A total of 100 unique ADRs were identified in the gold standard, 20 of which were
potentially serious. The algorithm found 14 of these ADRs (true positives) and missed the remaining
86 (false negatives). It also returned 49 false positives. Overall, the algorithm reached a
positive predictive value (PPV) of 22%, a sensitivity of 14% and an F-measure of 0.17. At the
geriatrics department the PPV was 28%, as opposed to 15% at the orthopedics department. For serious
ADRs, the algorithm reached an overall sensitivity of 20%.
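The overall figures follow directly from the reported counts of true positives, false positives and false negatives. The short Python sketch below recomputes them as a check on the arithmetic. It is not part of the evaluated algorithm, and the function name `evaluate` is purely illustrative.

```python
def evaluate(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (PPV, sensitivity, F-measure) from raw counts.

    tp: ADRs found by both the algorithm and the gold standard
    fp: algorithm hits that are not in the gold standard
    fn: gold-standard ADRs the algorithm missed
    """
    # Guard against empty denominators so the function never divides by zero.
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # The F-measure is the harmonic mean of PPV and sensitivity.
    if ppv + sensitivity == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)
    return ppv, sensitivity, f_measure


# Overall counts as reported in the Results.
ppv, sens, f1 = evaluate(tp=14, fp=49, fn=86)
print(f"PPV {ppv:.0%}, sensitivity {sens:.0%}, F-measure {f1:.2f}")
```

With the overall counts (14 true positives, 49 false positives, 86 false negatives) this gives a PPV of 14/63, a sensitivity of 14/100 and an F-measure of 28/163, which round to the reported 22%, 14% and 0.17.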
Conclusion: In this preliminary analysis, the algorithm did not meet our goals. The study must be
completed before valid conclusions can be drawn. Further research is needed to improve the
algorithm and to evaluate its performance in other settings.