There are several databases, but try searching NCBI first.
NCBI SARS-CoV-2 datahub:
https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/virus?SeqType_s=Genome&VirusLineage_ss=SARS-CoV-2,%20taxid:2697049
If you want the reference sequence, select the third tab, RefSeq Genome, and click assembly GCF_009858895.2; a list of protein accessions will appear.
The third one in that list appears to be the spike protein. Clicking it takes you to:
Protein accession page:
https://www.ncbi.nlm.nih.gov/protein/YP_009724390.1
Gene accession page:
https://www.ncbi.nlm.nih.gov/gene/43740568
Since you were looking for the nucleotide sequence: on the Gene accession page above, about halfway down, in the "Genomic regions, transcripts, and products" section, click "Go to nucleotide: FASTA":
https://www.ncbi.nlm.nih.gov/nuccore/NC_045512.2?report=fasta&from=21563&to=25384
Positions 21563..25384 of the reference genome (NC_045512.2) are the nucleotide sequence of the spike gene (S).
It is harder to find than you would expect ;;
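If you would rather script this than click through the pages, the same region can be requested by coordinates. The sketch below only builds the request URL and checks the region length; it makes no network call. It assumes NCBI's E-utilities `efetch` endpoint with the `seq_start` / `seq_stop` parameters, which is not mentioned in the answer above, and the function names are made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: NCBI E-utilities efetch endpoint (not part of the original answer).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def spike_region_url(accession="NC_045512.2", start=21563, stop=25384):
    """Build an efetch URL for a 1-based, inclusive region of a nucleotide record."""
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
        "seq_start": start,
        "seq_stop": stop,
    }
    return f"{EFETCH}?{urlencode(params)}"


def region_length(start=21563, stop=25384):
    """Length of an inclusive coordinate range."""
    return stop - start + 1


print(spike_region_url())
print(region_length())  # 3822, i.e. 1274 codons including the stop codon
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should return the same FASTA as the nuccore link above.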
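- This is the surface glycoprotein sequence from the analysis dated 2020-01-05.
- It seems hard to find unless you work in the field, so I am posting the sequence here.
- Nucleotide sequence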
1 ATGTTTGTTT TTCTTGTTTT ATTGCCACTA GTCTCTAGTC AGTGTGTTAA
51 TCTTACAACC AGAACTCAAT TACCCCCTGC ATACACTAAT TCTTTCACAC
101 GTGGTGTTTA TTACCCTGAC AAAGTTTTCA GATCCTCAGT TTTACATTCA
151 ACTCAGGACT TGTTCTTACC TTTCTTTTCC AATGTTACTT GGTTCCATGC
201 TATACATGTC TCTGGGACCA ATGGTACTAA GAGGTTTGAT AACCCTGTCC
251 TACCATTTAA TGATGGTGTT TATTTTGCTT CCACTGAGAA GTCTAACATA
301 ATAAGAGGCT GGATTTTTGG TACTACTTTA GATTCGAAGA CCCAGTCCCT
351 ACTTATTGTT AATAACGCTA CTAATGTTGT TATTAAAGTC TGTGAATTTC
401 AATTTTGTAA TGATCCATTT TTGGGTGTTT ATTACCACAA AAACAACAAA
451 AGTTGGATGG AAAGTGAGTT CAGAGTTTAT TCTAGTGCGA ATAATTGCAC
501 TTTTGAATAT GTCTCTCAGC CTTTTCTTAT GGACCTTGAA GGAAAACAGG
551 GTAATTTCAA AAATCTTAGG GAATTTGTGT TTAAGAATAT TGATGGTTAT
601 TTTAAAATAT ATTCTAAGCA CACGCCTATT AATTTAGTGC GTGATCTCCC
651 TCAGGGTTTT TCGGCTTTAG AACCATTGGT AGATTTGCCA ATAGGTATTA
701 ACATCACTAG GTTTCAAACT TTACTTGCTT TACATAGAAG TTATTTGACT
751 CCTGGTGATT CTTCTTCAGG TTGGACAGCT GGTGCTGCAG CTTATTATGT
801 GGGTTATCTT CAACCTAGGA CTTTTCTATT AAAATATAAT GAAAATGGAA
851 CCATTACAGA TGCTGTAGAC TGTGCACTTG ACCCTCTCTC AGAAACAAAG
901 TGTACGTTGA AATCCTTCAC TGTAGAAAAA GGAATCTATC AAACTTCTAA
951 CTTTAGAGTC CAACCAACAG AATCTATTGT TAGATTTCCT AATATTACAA
1001 ACTTGTGCCC TTTTGGTGAA GTTTTTAACG CCACCAGATT TGCATCTGTT
1051 TATGCTTGGA ACAGGAAGAG AATCAGCAAC TGTGTTGCTG ATTATTCTGT
1101 CCTATATAAT TCCGCATCAT TTTCCACTTT TAAGTGTTAT GGAGTGTCTC
1151 CTACTAAATT AAATGATCTC TGCTTTACTA ATGTCTATGC AGATTCATTT
1201 GTAATTAGAG GTGATGAAGT CAGACAAATC GCTCCAGGGC AAACTGGAAA
1251 GATTGCTGAT TATAATTATA AATTACCAGA TGATTTTACA GGCTGCGTTA
1301 TAGCTTGGAA TTCTAACAAT CTTGATTCTA AGGTTGGTGG TAATTATAAT
1351 TACCTGTATA GATTGTTTAG GAAGTCTAAT CTCAAACCTT TTGAGAGAGA
1401 TATTTCAACT GAAATCTATC AGGCCGGTAG CACACCTTGT AATGGTGTTG
1451 AAGGTTTTAA TTGTTACTTT CCTTTACAAT CATATGGTTT CCAACCCACT
1501 AATGGTGTTG GTTACCAACC ATACAGAGTA GTAGTACTTT CTTTTGAACT
1551 TCTACATGCA CCAGCAACTG TTTGTGGACC TAAAAAGTCT ACTAATTTGG
1601 TTAAAAACAA ATGTGTCAAT TTCAACTTCA ATGGTTTAAC AGGCACAGGT
1651 GTTCTTACTG AGTCTAACAA AAAGTTTCTG CCTTTCCAAC AATTTGGCAG
1701 AGACATTGCT GACACTACTG ATGCTGTCCG TGATCCACAG ACACTTGAGA
1751 TTCTTGACAT TACACCATGT TCTTTTGGTG GTGTCAGTGT TATAACACCA
1801 GGAACAAATA CTTCTAACCA GGTTGCTGTT CTTTATCAGG ATGTTAACTG
1851 CACAGAAGTC CCTGTTGCTA TTCATGCAGA TCAACTTACT CCTACTTGGC
1901 GTGTTTATTC TACAGGTTCT AATGTTTTTC AAACACGTGC AGGCTGTTTA
1951 ATAGGGGCTG AACATGTCAA CAACTCATAT GAGTGTGACA TACCCATTGG
2001 TGCAGGTATA TGCGCTAGTT ATCAGACTCA GACTAATTCT CCTCGGCGGG
2051 CACGTAGTGT AGCTAGTCAA TCCATCATTG CCTACACTAT GTCACTTGGT
2101 GCAGAAAATT CAGTTGCTTA CTCTAATAAC TCTATTGCCA TACCCACAAA
2151 TTTTACTATT AGTGTTACCA CAGAAATTCT ACCAGTGTCT ATGACCAAGA
2201 CATCAGTAGA TTGTACAATG TACATTTGTG GTGATTCAAC TGAATGCAGC
2251 AATCTTTTGT TGCAATATGG CAGTTTTTGT ACACAATTAA ACCGTGCTTT
2301 AACTGGAATA GCTGTTGAAC AAGACAAAAA CACCCAAGAA GTTTTTGCAC
2351 AAGTCAAACA AATTTACAAA ACACCACCAA TTAAAGATTT TGGTGGTTTT
2401 AATTTTTCAC AAATATTACC AGATCCATCA AAACCAAGCA AGAGGTCATT
2451 TATTGAAGAT CTACTTTTCA ACAAAGTGAC ACTTGCAGAT GCTGGCTTCA
2501 TCAAACAATA TGGTGATTGC CTTGGTGATA TTGCTGCTAG AGACCTCATT
2551 TGTGCACAAA AGTTTAACGG CCTTACTGTT TTGCCACCTT TGCTCACAGA
2601 TGAAATGATT GCTCAATACA CTTCTGCACT GTTAGCGGGT ACAATCACTT
2651 CTGGTTGGAC CTTTGGTGCA GGTGCTGCAT TACAAATACC ATTTGCTATG
2701 CAAATGGCTT ATAGGTTTAA TGGTATTGGA GTTACACAGA ATGTTCTCTA
2751 TGAGAACCAA AAATTGATTG CCAACCAATT TAATAGTGCT ATTGGCAAAA
2801 TTCAAGACTC ACTTTCTTCC ACAGCAAGTG CACTTGGAAA ACTTCAAGAT
2851 GTGGTCAACC AAAATGCACA AGCTTTAAAC ACGCTTGTTA AACAACTTAG
2901 CTCCAATTTT GGTGCAATTT CAAGTGTTTT AAATGATATC CTTTCACGTC
2951 TTGACAAAGT TGAGGCTGAA GTGCAAATTG ATAGGTTGAT CACAGGCAGA
3001 CTTCAAAGTT TGCAGACATA TGTGACTCAA CAATTAATTA GAGCTGCAGA
3051 AATCAGAGCT TCTGCTAATC TTGCTGCTAC TAAAATGTCA GAGTGTGTAC
3101 TTGGACAATC AAAAAGAGTT GATTTTTGTG GAAAGGGCTA TCATCTTATG
3151 TCCTTCCCTC AGTCAGCACC TCATGGTGTA GTCTTCTTGC ATGTGACTTA
3201 TGTCCCTGCA CAAGAAAAGA ACTTCACAAC TGCTCCTGCC ATTTGTCATG
3251 ATGGAAAAGC ACACTTTCCT CGTGAAGGTG TCTTTGTTTC AAATGGCACA
3301 CACTGGTTTG TAACACAAAG GAATTTTTAT GAACCACAAA TCATTACTAC
3351 AGACAACACA TTTGTGTCTG GTAACTGTGA TGTTGTAATA GGAATTGTCA
3401 ACAACACAGT TTATGATCCT TTGCAACCTG AATTAGACTC ATTCAAGGAG
3451 GAGTTAGATA AATATTTTAA GAATCATACA TCACCAGATG TTGATTTAGG
3501 TGACATCTCT GGCATTAATG CTTCAGTTGT AAACATTCAA AAAGAAATTG
3551 ACCGCCTCAA TGAGGTTGCC AAGAATTTAA ATGAATCTCT CATCGATCTC
3601 CAAGAACTTG GAAAGTATGA GCAGTATATA AAATGGCCAT GGTACATTTG
3651 GCTAGGTTTT ATAGCTGGCT TGATTGCCAT AGTAATGGTG ACAATTATGC
3701 TTTGCTGTAT GACCAGTTGC TGTAGTTGTC TCAAGGGCTG TTGTTCTTGT
3751 GGATCCTGCT GCAAATTTGA TGAAGACGAC TCTGAGCCAG TGCTCAAAGG
3801 AGTCAAATTA CATTACACAT AA
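The block above is in GenBank-style layout, with a position number at the start of each line and a space after every 10 bases, so it cannot be pasted straight into most tools. A minimal sketch for turning it into one continuous string, assuming the text contains only position numbers and A/C/G/T groups (the function name is hypothetical):

```python
def clean_numbered_sequence(text):
    """Drop position numbers and whitespace from a GenBank-style sequence block."""
    groups = []
    for line in text.splitlines():
        for token in line.split():
            if token.isdigit():
                continue  # leading position number such as "51"
            groups.append(token.upper())
    seq = "".join(groups)
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


# First two lines of the posted block, used as a small sample.
sample = """1 ATGTTTGTTT TTCTTGTTTT ATTGCCACTA GTCTCTAGTC AGTGTGTTAA
51 TCTTACAACC AGAACTCAAT TACCCCCTGC ATACACTAAT TCTTTCACAC"""

seq = clean_numbered_sequence(sample)
print(len(seq))  # 100
print(seq[:12])  # ATGTTTGTTTTT
```

Run on the whole block, the result should be 3822 bases long, matching 21563..25384 in the reference genome.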
- Protein sequence
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHS 50
TQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNI 100
IRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNK 150
SWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGY 200
FKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLT 250
PGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETK 300
CTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASV 350
YAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSF 400
VIRGDEVRQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYN 450
YLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPT 500
NGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTG 550
VLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITP 600
GTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCL 650
IGAEHVNNSYECDIPIGAGICASYQTQTNSPRRARSVASQSIIAYTMSLG 700
AENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECS 750
NLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGF 800
NFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLI 850
CAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAM 900
QMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQD 950
VVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGR 1000
LQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLM 1050
SFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGT 1100
HWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKE 1150
ELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDL 1200
QELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSC 1250
GSCCKFDEDDSEPVLKGVKLHYT*
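To check that the two posted sequences agree, the nucleotide sequence can be translated and compared with the protein. This is a rough sketch using the standard genetic code (NCBI translation table 1), built by hand so it needs no extra libraries; the function name is hypothetical, and the input is assumed to be a continuous A/C/G/T string starting in frame.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (bases in T, C, A, G order).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string; stop codons become '*'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 48 bases and last 12 bases of the posted spike gene sequence.
print(translate("ATGTTTGTTTTTCTTGTTTTATTGCCACTAGTCTCTAGTCAGTGTGTT"))  # MFVFLVLLPLVSSQCV
print(translate("CATTACACATAA"))  # HYT*
```

Both outputs match the start (MFVFLVLLPLVSSQCV) and end (HYT*) of the protein sequence above. Translating the full 3822 bases should give 1273 residues followed by the stop.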