RP19 - Uncertainty-aware, haplotype-based genomic variant effect prediction
We have developed a new statistical approach for determining HLA haplotypes that quantifies the associated uncertainty and, in particular, accounts for the heterogeneity of tumor samples. In the second cohort, we consider haplotypes again, but this time at the level of their impact on the proteins derived from genes.
Predicting the effects of genomic variants is a crucial step in the analysis of genomic data, particularly in precision oncology and in precision medicine in general. So far, this has been done by considering each variant in isolation: predicting its effect on a regulatory element or a protein (e.g. early termination of the protein or, less severely, a changed amino acid), estimating the severity of this effect, and annotating the variant with public knowledge (such as allele frequencies, pathogenicity, and disease associations). Various tools accomplish this task, for example VEP (McLaren et al., 2016) and SnpEff (Cingolani et al., 2012).
However, considering each variant in isolation can yield incomplete or, in the extreme case, even wrong information. For example, a deletion near the beginning of a protein-coding sequence that shifts the reading frame invalidates any downstream predictions of amino acid changes. Conversely, if this deletion is followed by an insertion that restores the reading frame, the impact on the protein might be less severe, or at least entirely different. Another example is the combination of multiple amino-acid-changing variants that only together produce an effect severe enough to impair a certain binding domain of a protein. Finally, a severe variant inside a promoter or enhancer only plays a role for the targeted genes on the same haplotype or subclone, which needs to be taken into account when interpreting the interplay with potential additional variants in the targeted gene.
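The frameshift example can be made concrete with a small Python sketch. It is a toy illustration only, not the project's method: the reference sequence, the two variants, and the helper names `translate` and `apply_variants` are all hypothetical, variants are given as 0-based (position, ref, alt) tuples on the coding sequence, and uncertainty and subclonality are ignored entirely.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence until the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_variants(cds, variants):
    """Apply (pos, ref, alt) variants, all assumed to lie on one haplotype.

    Positions are 0-based on the reference; variants are applied from
    right to left so that earlier positions stay valid.
    """
    for pos, ref, alt in sorted(variants, reverse=True):
        assert cds[pos:pos + len(ref)] == ref, "ref allele mismatch"
        cds = cds[:pos] + alt + cds[pos + len(ref):]
    return cds


# Hypothetical toy reference and variants (assumptions for illustration).
REFERENCE = "ATGGCTGAACTGAAAGGTTGGTAA"
DELETION = (3, "GC", "G")    # removes one base, shifts the reading frame
INSERTION = (6, "G", "GC")   # adds one base shortly downstream

print(translate(REFERENCE))                                         # MAELKGW
print(translate(apply_variants(REFERENCE, [DELETION])))             # MVN
print(translate(apply_variants(REFERENCE, [INSERTION])))            # MAATERLV
print(translate(apply_variants(REFERENCE, [DELETION, INSERTION])))  # MVQLKGW
```

Evaluated alone, the deletion appears to truncate the protein after three residues, and the insertion alone appears to scramble everything downstream. Evaluated together on the same haplotype, the two variants change only two amino acids and leave the rest of the protein intact.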
Overall, we expect our new approach to bring about a paradigm shift in variant interpretation that enables faster and better-informed decisions in precision medicine.
William McLaren et al. “The Ensembl Variant Effect Predictor”. In: Genome Biology 17.1 (June 6, 2016), p. 122. doi: 10.1186/s13059-016-0974-4.
Pablo Cingolani et al. “A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff”. In: Fly 6.2 (Apr. 1, 2012), pp. 80–92. doi: 10.4161/fly.19695.
Felix Mölder et al. “Sustainable data analysis with Snakemake”. In: F1000Research 10 (2021), p. 33. doi: 10.12688/f1000research.29032.1.
Johannes Köster et al. “Varlociraptor: enhancing sensitivity and controlling false discovery rate in somatic indel discovery”. In: Genome Biology 21.1 (Apr. 28, 2020), p. 98. doi: 10.1186/s13059-020-01993-6.