Ina Bang 1†, Sang-Mok Lee 1†, Seojoung Park 1†, Joon Young Park 1, Linh Khanh Nong 1, Ye Gao 2, Bernhard O Palsson 2,3,4, Donghyuk Kim 1
1School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, Republic of Korea.
2Department of Bioengineering, University of California San Diego, La Jolla CA 92093, USA.
3Department of Pediatrics, University of California San Diego, La Jolla CA 92093, USA.
4Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark.
†Ina Bang, Sang-Mok Lee and Seojoung Park contributed equally to this work.
Corresponding author: Donghyuk Kim
Abstract
Identifying the binding sites of DNA-binding proteins is key to elucidating transcriptional regulation in organisms. ChIP-exo enables researchers to delineate genome-wide binding landscapes of DNA-binding proteins at near single base-pair resolution. However, the peak-calling step hinders the application of ChIP-exo because published algorithms tend to generate both false-positive and false-negative predictions. Here, we report the development of DEOCSU (DEep-learning Optimized ChIP-exo peak calling SUite), a novel machine learning-based ChIP-exo peak-calling suite. DEOCSU incorporates a deep convolutional neural network model that was trained on curated ChIP-exo peak data to distinguish visualized data of bona fide peaks from false ones. Performance validation of the trained deep-learning model indicated accuracy, precision and recall of over 95%. Applying the new suite to both in-house and publicly available ChIP-exo datasets obtained from bacteria, eukaryotes and archaea yielded accurate predictions of peaks containing canonical motifs, highlighting the versatility and efficiency of DEOCSU. Furthermore, DEOCSU can be executed on a cloud computing platform or in a local environment. With visualization software included in the suite, adjustable options such as the peak probability threshold, and iterative updating of the pre-trained model, DEOCSU can be optimized for users' specific needs.
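As an illustration of the adjustable peak probability threshold and the validation metrics mentioned in the abstract, the following is a minimal sketch, not DEOCSU's actual code or interface. The names `CandidatePeak`, `call_peaks` and `evaluate`, the default threshold of 0.5, and the example scores are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePeak:
    """Hypothetical record for one candidate peak (not a DEOCSU type)."""
    position: int
    probability: float  # assumed classifier score in [0, 1]


def call_peaks(candidates, threshold=0.5):
    """Return one boolean per candidate: True if its score meets the threshold."""
    return [c.probability >= threshold for c in candidates]


def evaluate(predicted, actual):
    """Compute (accuracy, precision, recall) from predicted and true labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up scores and curated labels, for illustration only.
candidates = [
    CandidatePeak(position=1200, probability=0.98),
    CandidatePeak(position=5400, probability=0.40),
    CandidatePeak(position=9100, probability=0.75),
    CandidatePeak(position=15000, probability=0.10),
]
curated_labels = [True, False, True, False]

for threshold in (0.5, 0.9):
    metrics = evaluate(call_peaks(candidates, threshold), curated_labels)
    print(threshold, metrics)
```

Raising the threshold in this toy example keeps only the highest-scoring candidate, which preserves precision but lowers recall; this is the kind of trade-off a user-adjustable threshold exposes.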