Hui Kwon Kim1,2,3,4,10, Goosang Yu1,2,10, Jinman Park1,2, Seonwoo Min5, Sungtae Lee1, Sungroh Yoon5,6,7 and Hyongbum Henry Kim1,2,3,4,8,9,*
1Department of Pharmacology, Yonsei University College of Medicine, Seoul, Republic of Korea. 2Brain Korea 21 Plus Project for Medical Sciences, Yonsei University College of Medicine, Seoul, Republic of Korea. 3Center for Nanomedicine, Institute for Basic Science (IBS), Seoul, Republic of Korea. 4Graduate Program of Nano Biomedical Engineering (NanoBME), Advanced Science Institute, Yonsei University, Seoul, Republic of Korea. 5Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea. 6Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea. 7Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea. 8Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Republic of Korea. 9Graduate Program of NanoScience and Technology, Yonsei University, Seoul, Republic of Korea. 10These authors contributed equally: Hui Kwon Kim, Goosang Yu.
*Correspondence to Hyongbum Henry Kim.
Prime editing enables the introduction of virtually any small genetic change without requiring donor DNA or double-strand breaks. However, evaluating prime editing efficiency requires time-consuming experiments, and the factors that affect efficiency have not been extensively investigated. In this study, we performed a high-throughput evaluation of prime editor 2 (PE2) activities in human cells using 54,836 pairs of prime editing guide RNAs (pegRNAs) and their target sequences. The resulting data sets allowed us to identify factors affecting PE2 efficiency and to develop three computational models to predict pegRNA efficiency. For a given target sequence, the computational models predict the efficiencies of pegRNAs with different lengths of primer binding sites and reverse transcriptase templates for edits of various types and positions. When we tested the accuracy of the predictions using test data sets that were not used for training, we found Spearman’s correlations between 0.47 and 0.81. Our computational models and information about factors affecting PE2 efficiency will facilitate the practical application of prime editing.
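As an illustration of the evaluation metric named above, Spearman's correlation between predicted and measured efficiencies can be computed as the Pearson correlation of their ranks. The following is a minimal sketch, not the pipeline used in this study: the function names `average_ranks` and `spearman` and the five example values are hypothetical and chosen only to show the calculation.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only (not data from this study):
# measured editing efficiencies (%) and model scores for five pegRNAs.
measured = [2.0, 15.5, 7.1, 30.2, 0.4]
predicted = [5.0, 12.0, 9.0, 20.0, 6.0]
rho = spearman(measured, predicted)  # approximately 0.9 for these values
```

Because the metric depends only on ranks, it rewards a model for ordering pegRNAs correctly even when its predicted values are not calibrated to the measured percentages.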