SVM-based Protein Name Recognition using Edit-Distance Features Boosted by Virtual
The 2nd Annual Conference of the Korean Society for Bioinformatics  |  2003.10.31
In this paper, we address two of the main difficulties in named entity recognition in the biomedical domain: the large number of spelling variants and the lack of annotated corpora for training. To handle spelling variants, we propose using edit distance as a feature for an SVM. To address the lack of annotated data, we propose using virtual examples to expand the annotated corpus automatically, which extends the corpus quickly, efficiently, and easily. The experimental results show that introducing edit-distance features yields some improvement in protein name recognition performance, and that the model trained on the corpus expanded with virtual examples outperforms the model trained on the original corpus. With the proposed methods, we achieve an F-measure of 75.80 (71.89% precision, 80.15% recall) for protein name recognition on the GENIA corpus (ver. 3.0).
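The abstract does not say how the edit-distance feature is computed, so the following is only a minimal sketch of one plausible form: the smallest length-normalized Levenshtein distance between a candidate token and the entries of a protein-name lexicon. The function names, the normalization, the lowercasing, and the three-entry `LEXICON` are all assumptions made for illustration and are not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit-cost insert, delete, and substitute."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def min_distance_feature(token: str, lexicon: list[str]) -> float:
    """Smallest length-normalized distance from token to any lexicon entry.

    Hypothetical feature definition: 0.0 means an exact (case-insensitive)
    match. Raises ValueError if the lexicon is empty.
    """
    t = token.lower()
    return min(
        edit_distance(t, entry.lower()) / max(len(t), len(entry), 1)
        for entry in lexicon
    )


# Hypothetical lexicon, for illustration only.
LEXICON = ["NF-kappa B", "interleukin-2", "p53"]

if __name__ == "__main__":
    # A spelling variant that differs from a lexicon entry by one space.
    print(min_distance_feature("NF-kappaB", LEXICON))
```

A value like this could be appended to each token's feature vector before SVM training, so that a spelling variant close to a known name receives a feature value similar to that of the name itself.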
Text Mining and Biostatistics