HyunChul Jung¹,², Thomas Bleazard³, Jongkeun Lee¹ & Dongwan Hong¹
¹Cancer Genomics Branch, Division of Convergence Technology, National Cancer Center, Gyeonggi-do, Korea. ²Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, California, USA. ³College of Natural Sciences, Seoul National University Graduate School, Seoul, Korea.
Correspondence to: Dongwan Hong
Whole-genome and exome sequencing enables the discovery of druggable cancer-driver genes, most commonly through the analysis of somatic point mutations. A key step in detecting somatic point mutations is to compare sequencing results against public single-nucleotide polymorphism (SNP) databases in order to remove previously described variants that occur naturally in the human population. However, public SNP databases also include somatic mutations that may be cancer related, and indiscriminate filtering can remove these entries even though they may be relevant to disease biology. Here, we carry out a systematic investigation of somatic mutations in SNP databases and illustrate the importance of appropriate filtering. Finally, we propose an improved filtering workflow for the detection of cancer-related mutations.
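The difference between indiscriminate and selective filtering can be illustrated with a minimal Python sketch. It is not the workflow proposed in this paper; it only shows the general idea under stated assumptions. The `Variant` type, the function names, and the example coordinates are all hypothetical, and we assume the SNP database has already been reduced to two sets: every known variant, and the subset annotated as somatic.

```python
from typing import Iterable, List, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal representation of a point mutation."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_indiscriminate(candidates: Iterable[Variant],
                          known_snps: Set[Variant]) -> List[Variant]:
    """Remove every candidate that matches any database entry."""
    return [v for v in candidates if v not in known_snps]


def filter_selective(candidates: Iterable[Variant],
                     known_snps: Set[Variant],
                     somatic_flagged: Set[Variant]) -> List[Variant]:
    """Remove database matches, but keep entries annotated as somatic.

    Assumption: `somatic_flagged` is the subset of `known_snps` that the
    database marks as somatic; how that annotation is obtained is outside
    the scope of this sketch.
    """
    return [
        v for v in candidates
        if v not in known_snps or v in somatic_flagged
    ]


# Made-up example data: three candidate calls from a tumour sample.
common_snp = Variant("chr1", 1000, "A", "G")      # in database, germline
somatic_entry = Variant("chr7", 2000, "T", "A")   # in database, flagged somatic
novel_call = Variant("chr17", 3000, "C", "T")     # not in database

candidates = [common_snp, somatic_entry, novel_call]
known_snps = {common_snp, somatic_entry}
somatic_flagged = {somatic_entry}

naive_result = filter_indiscriminate(candidates, known_snps)
selective_result = filter_selective(candidates, known_snps, somatic_flagged)
```

In this toy example, the indiscriminate filter keeps only the novel call, whereas the selective filter also retains the database entry flagged as somatic, which is the class of variant that may be lost when all database matches are discarded.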