Can differences in genomic sequence context distinguish between germline and somatic structural variants in tumor samples without paired normals?

Original title: Genomic sequence context differs between germline and somatic structural variants allowing for their differentiation in tumor samples without paired normals

Authors: Wolu Chukwu, Siyun Lee, Alexander Crane, Shu Zhang, Ipsa Mittra, Marcin Imielinski, Rameen Beroukhim, Frank Dubois, Simona Dalin

In this article, researchers describe a method for telling apart germline (inherited) and somatic (tumor-acquired) structural variants (SVs) in tumor samples. Currently, there is no reliable way to make this distinction when no matched normal sample is available for comparison. To address this problem, the researchers analyzed a range of features of germline and somatic SVs from a large cohort of patients in The Cancer Genome Atlas (TCGA) and identified 21 features that differed significantly between the two classes. They then trained a machine-learning classifier called a support vector machine (SVM) on a large set of these SVs to distinguish germline from somatic events computationally. The classifier performed with high accuracy both on a held-out test set from the same dataset and on a separate non-TCGA cohort, which suggests that it generalizes across datasets. The method could improve the accuracy of somatic SV analyses in cancer research when paired normal samples are unavailable. The authors declare no competing interests.
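
To make the classification step more concrete, here is a minimal sketch of the general idea: fit an SVM on labeled feature vectors, then score it on held-out data. Everything in it is an assumption made for illustration. The data are synthetic, the three feature names are placeholders rather than the 21 features from the paper, and the model is a simple linear SVM trained by stochastic sub-gradient descent on the hinge loss in plain Python. It is not the authors' pipeline, kernel, or hyperparameters.

```python
import random

# Placeholder feature names (hypothetical, NOT the paper's 21 features).
FEATURES = ["log_span", "homology_length", "repeat_distance"]


def make_synthetic_svs(n_per_class, seed):
    """Toy data: label +1 stands for 'somatic', -1 for 'germline'."""
    rng = random.Random(seed)
    data = []
    for label in (+1, -1):
        for _ in range(n_per_class):
            # Each standardized feature is shifted by the class label.
            x = [rng.gauss(2.0 * label, 1.0) for _ in FEATURES]
            data.append((x, label))
    rng.shuffle(data)
    return data


def decision(w, b, x):
    """Signed score of x relative to the separating hyperplane."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(data, lam=0.01, eta=0.01, epochs=30, seed=0):
    """Linear SVM: stochastic sub-gradient descent on the
    L2-regularized hinge loss with a fixed learning rate."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            violated = y * decision(w, b, x) < 1  # inside the margin
            # Regularization shrinks w on every step.
            w = [wj * (1 - eta * lam) for wj in w]
            if violated:
                # Hinge-loss sub-gradient step toward the correct side.
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    return 1 if decision(w, b, x) >= 0 else -1


def accuracy(w, b, data):
    return sum(1 for x, y in data if predict(w, b, x) == y) / len(data)


data = make_synthetic_svs(n_per_class=300, seed=42)
split = len(data) * 7 // 10          # 70% train, 30% held-out test
train, test = data[:split], data[split:]

w, b = train_linear_svm(train)
test_accuracy = accuracy(w, b, test)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

In practice one would more likely reach for an established SVM implementation and also evaluate on an independent cohort, as the study did with its non-TCGA data; the hand-rolled trainer here only keeps the example free of external dependencies.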

Original article: https://www.biorxiv.org/content/10.1101/2023.10.09.561462v2