Huiran Yeom1, Yonghee Lee1, Taehoon Ryu2, Jinsung Noh1, Amos Chungwon Lee3, Han-Byoel Lee4, Eunji Kang5, Seo Woo Song1 & Sunghoon Kwon1,2,3,6,*
1 Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Republic of Korea. 2 Department of Molecular and Genetical Engineering, Celemics Inc., 371-17, Gasan-dong, Geumcheon-gu, 08506 Seoul, Republic of Korea. 3 Interdisciplinary Program for Bioengineering, Seoul National University, 08826 Seoul, Republic of Korea. 4 Department of Surgery, Seoul National University College of Medicine, Seoul National University Hospital Biomedical Research Institute, 03080 Seoul, Republic of Korea. 5 Cancer Research Institute, Seoul National University, 03080 Seoul, Republic of Korea. 6 Bio-MAX institute, Seoul National University, 08826 Seoul, Republic of Korea.
These authors contributed equally: Huiran Yeom, Yonghee Lee.
*Correspondence and requests for materials should be addressed to S.K.
The advent of next-generation sequencing (NGS) has accelerated biomedical research by enabling the high-throughput analysis of DNA sequences at low cost. However, NGS is limited in detecting rare-frequency variants (< 1%) because of its high sequencing error rates (> 0.1–1%). NGS errors can be filtered out using molecular barcodes, by comparing read replicates that share the same barcode. These barcoding methods, however, require redundant reads of non-target sequences, resulting in high sequencing cost. Here, we present a cost-effective, barcode-free NGS error validation method. By physically extracting and individually amplifying the DNA clones of erroneous reads, we distinguish true variants with frequencies > 0.003% from systematic NGS errors and selectively validate NGS errors after sequencing. We achieve a PCR-induced error rate of 2.5 × 10⁻⁶ per base per doubling event, using 10 times fewer sequencing reads than previous studies.
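For readers unfamiliar with the barcode-based filtering that this work contrasts itself against, the following is a minimal sketch of consensus calling within barcode families. It is not the authors' method, nor any published pipeline. The function name `barcode_consensus`, the thresholds `min_family_size` and `min_agreement`, and the toy reads are all hypothetical, and the sketch assumes that reads sharing a barcode are already aligned and of equal length.

```python
from collections import Counter, defaultdict


def barcode_consensus(reads, min_family_size=3, min_agreement=0.6):
    """Collapse reads that share a barcode into one consensus sequence.

    reads: iterable of (barcode, sequence) pairs; sequences within a
    barcode family are assumed to be aligned and of equal length.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            # Too few replicates to tell a sequencing error from a variant.
            continue
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            # Keep the majority base only if enough replicates agree;
            # otherwise mask the position as ambiguous.
            bases.append(base if count / len(column) >= min_agreement else "N")
        consensus[barcode] = "".join(bases)
    return consensus


# Toy input (hypothetical): one family with a single-read error,
# one family carrying a consistent variant, one singleton family.
toy_reads = [
    ("AAC", "ACGT"), ("AAC", "ACGT"), ("AAC", "ACTT"),
    ("GGT", "ACCT"), ("GGT", "ACCT"), ("GGT", "ACCT"),
    ("TTT", "ACGT"),
]
print(barcode_consensus(toy_reads))
# {'AAC': 'ACGT', 'GGT': 'ACCT'}
```

In the toy input, the isolated `T` in family `AAC` is outvoted and treated as an error, while the `C` shared by every read in family `GGT` survives as a candidate variant. Seven reads yield only two consensus sequences, which illustrates the read redundancy that makes barcoding costly.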