Background: Electronic healthcare record (EHR) databases are increasingly used to conduct drug safety studies because they are more efficient than primary data capture. A single EHR database can still lack statistical power when the exposure and/or outcome of interest is too rare to yield a sufficient sample size. One way to address this is to conduct a multi-database study (MDBS). MDBS pose several challenges, including how to manage heterogeneity between the included data sources. It is therefore necessary to develop data source-specific methodologies for conducting MDBS. One such strategy is the component algorithm strategy, in which the choice of a particular event-finding algorithm generally depends on the characteristics of the data source. By creating different event-finding algorithms, we aimed to demonstrate the impact and usefulness of the component algorithm strategy in a case study investigating the risk of gastrointestinal (GI) bleeding and stroke in users of New Oral Anti-Coagulants (NOACs) versus Vitamin K antagonists (VKAs).
Methods: The component algorithm strategy was developed using data from two European healthcare databases: CPRD Gold (UK) and PHARMO (The Netherlands). GI bleeding and stroke algorithms were created for the available data domains (diagnoses, signs/symptoms), healthcare settings and concept sets (signs/symptoms for GI bleeding; non-traumatic, traumatic and unspecified for stroke). Algorithm- and database-specific incidence rates for the study outcomes were estimated for the study period 2010 to 2019. Cox regression, adjusted for confounders, was used to estimate the risk of GI bleeding or stroke associated with NOAC versus VKA use.
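As an illustration only, the idea of combining components into more specific or more sensitive event-finding algorithms can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the study's implementation: the `Record` fields, the domain, setting and concept set labels, the toy records and the person-time figure are all hypothetical, and the sketch counts patients with any matching record without applying an incident-event (first occurrence) rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One hypothetical coded record; field names are assumptions."""
    patient_id: int
    domain: str       # e.g. "diagnosis" or "sign_symptom"
    setting: str      # e.g. "hospital" or "primary_care"
    concept_set: str  # e.g. "gi_bleeding" or "stroke"


def component(domain, setting, concept_set):
    """Build a predicate matching one domain/setting/concept-set combination."""
    def matches(record):
        key = (record.domain, record.setting, record.concept_set)
        return key == (domain, setting, concept_set)
    return matches


def find_cases(records, components):
    """Patients with at least one record matching ANY component (logical OR)."""
    return {r.patient_id for r in records if any(c(r) for c in components)}


def incidence_rate(n_cases, person_years, per=1000):
    """Crude incidence rate per `per` person-years."""
    return per * n_cases / person_years


# Toy data, invented for illustration.
records = [
    Record(1, "diagnosis", "hospital", "gi_bleeding"),
    Record(2, "diagnosis", "primary_care", "gi_bleeding"),
    Record(3, "sign_symptom", "primary_care", "gi_bleeding"),
    Record(3, "sign_symptom", "primary_care", "gi_bleeding"),  # repeat visit
    Record(4, "diagnosis", "hospital", "stroke"),
]

# A specific algorithm uses a single narrow component.
specific = [component("diagnosis", "hospital", "gi_bleeding")]

# A sensitive algorithm ORs in further components.
sensitive = specific + [
    component("diagnosis", "primary_care", "gi_bleeding"),
    component("sign_symptom", "primary_care", "gi_bleeding"),
]

specific_cases = find_cases(records, specific)
sensitive_cases = find_cases(records, sensitive)

person_years = 2000  # hypothetical follow-up time
print(len(specific_cases), incidence_rate(len(specific_cases), person_years))
print(len(sensitive_cases), incidence_rate(len(sensitive_cases), person_years))
```

Because components are combined with a logical OR, adding a component can only keep or enlarge the set of retrieved cases, so under these assumptions the sensitive algorithm never yields a lower crude rate than the specific one for the same person-time.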
Results/Conclusion: Implementing the component algorithm strategy had a visible impact on the number of cases retrieved by each algorithm, with sensitive algorithms detecting more cases than more specific algorithms. When calculated for each event definition, the risk of GI bleeding and stroke was higher in NOAC users than in VKA users in the CPRD Gold data source. In the PHARMO database, the risk of GI bleeding or stroke was increased in VKA users for some event definitions, whereas VKAs showed a protective effect for others.
Plain Language Summary:
Although randomised controlled trials (RCTs) are the gold standard for assessing the safety and effectiveness of a new pharmaceutical product, they exclude many groups of people, so the safety and effectiveness of drugs also need to be assessed in these excluded populations. One solution is to use electronic health records (EHR) to study how drugs are used and how well they work in the real world. However, EHR databases pose problems when the drug or medical condition of interest is rare and there are not enough people to obtain a sufficient sample size. To overcome this, multiple data sources can be used in a single study, an approach that has its own strengths and limitations. One limitation is the differences between the data sources, known as database heterogeneity, which can arise from differences in the way data are collected in each data source. It is therefore necessary to develop strategies that are tailor-made for each data source. One such strategy, known as the component algorithm strategy, makes use of the specific data domains present in a data source to identify events of interest. By creating such algorithms, we aim to overcome the issue of database heterogeneity while retrieving outcomes of interest.