Predicting the effect of human, heart-related regulatory SNPs in their native and orthologous mouse genomic contexts
Summary
Congenital heart disease (CHD) is the most common developmental malformation in
newborns. Recent genetic studies have drawn attention to the role of non-coding regions in
CHD. To study non-coding variants, researchers have relied on mouse and human models. However,
with over a billion known single nucleotide polymorphisms (SNPs) and the lack of scalable
assays, experimental validation remains largely unfeasible.
To address these challenges, we trained ChromBPNet base-resolution models using single-cell
ATAC-seq data from human and mouse fetal cardiac tissue. We retrieved SNPs from
human GWAS, obtained mouse orthologues, and used our trained models to predict variant
effects. Using these predictions, we assessed cross-species concordance in variant
effect. This analysis showed that predicted variant effects were cell-type- and trait-dependent,
and highly correlated between species in early developmental cell types.
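
As an illustration of the comparison described above, the following is a minimal sketch of how a predicted variant effect and its cross-species concordance could be computed for one cell type. It assumes the effect is scored as a log2 fold change between model-predicted accessibility counts for the alternate and reference alleles, and that concordance is summarised by a Pearson correlation and a sign-agreement rate; the summary does not state which metrics were used. The function names, the pseudocount, and the count values are hypothetical and are not taken from the study.

```python
import numpy as np


def log2_fold_change(ref_counts, alt_counts, pseudocount=1.0):
    """Log2 ratio of predicted counts, alternate allele over reference allele.

    The pseudocount is an assumption made here to avoid division by zero.
    """
    ref = np.asarray(ref_counts, dtype=float)
    alt = np.asarray(alt_counts, dtype=float)
    return np.log2((alt + pseudocount) / (ref + pseudocount))


def pearson(x, y):
    """Pearson correlation between two equally long effect vectors."""
    return float(np.corrcoef(x, y)[0, 1])


# Made-up predicted counts for four SNPs in a single cell type.
# Each position is the same SNP scored in its human context and in the
# orthologous mouse context.
human_ref = np.array([100.0, 50.0, 80.0, 20.0])
human_alt = np.array([200.0, 25.0, 90.0, 40.0])
mouse_ref = np.array([90.0, 60.0, 70.0, 30.0])
mouse_alt = np.array([150.0, 40.0, 60.0, 45.0])

human_effect = log2_fold_change(human_ref, human_alt)
mouse_effect = log2_fold_change(mouse_ref, mouse_alt)

# Cross-species concordance for this cell type.
concordance = pearson(human_effect, mouse_effect)

# Fraction of SNPs whose predicted effect has the same direction in both species.
sign_agreement = float(np.mean(np.sign(human_effect) == np.sign(mouse_effect)))

print(f"Pearson r = {concordance:.3f}, sign agreement = {sign_agreement:.2f}")
```

Repeating this calculation per cell type and per GWAS trait would give the kind of cell-type- and trait-resolved concordance table that the analysis refers to.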
