ID R1A_SARS2 Reviewed; 4405 AA. AC P0DTC1; DT 22-APR-2020, integrated into UniProtKB/Swiss-Prot. DT 22-APR-2020, sequence version 1. DT 22-APR-2020, entry version 1. DE RecName: Full=Replicase polyprotein 1a; DE Short=pp1a; DE AltName: Full=ORF1a polyprotein; DE Contains: DE RecName: Full=Non-structural protein 1; DE Short=nsp1; DE AltName: Full=Leader protein; DE Contains: DE RecName: Full=Non-structural protein 2; DE Short=nsp2; DE AltName: Full=p65 homolog; DE Contains: DE RecName: Full=Non-structural protein 3; DE Short=nsp3; DE EC=3.4.19.12; DE EC=3.4.22.69; DE AltName: Full=PL2-PRO; DE AltName: Full=Papain-like proteinase; DE Short=PL-PRO; DE AltName: Full=SARS coronavirus main proteinase; DE Contains: DE RecName: Full=Non-structural protein 4; DE Short=nsp4; DE Contains: DE RecName: Full=3C-like proteinase; DE Short=3CL-PRO; DE Short=3CLp; DE EC=3.4.22.-; DE AltName: Full=nsp5; DE Contains: DE RecName: Full=Non-structural protein 6; DE Short=nsp6; DE Contains: DE RecName: Full=Non-structural protein 7; DE Short=nsp7; DE Contains: DE RecName: Full=Non-structural protein 8; DE Short=nsp8; DE Contains: DE RecName: Full=Non-structural protein 9; DE Short=nsp9; DE Contains: DE RecName: Full=Non-structural protein 10; DE Short=nsp10; DE AltName: Full=Growth factor-like peptide; DE Short=GFL; DE Contains: DE RecName: Full=Non-structural protein 11; DE Short=nsp11; OS Severe acute respiratory syndrome coronavirus 2 (2019-nCoV) (SARS-CoV-2). OC Viruses; Riboviria; Nidovirales; Cornidovirineae; Coronaviridae; OC Orthocoronavirinae; Betacoronavirus; unclassified Betacoronavirus. OX NCBI_TaxID=2697049; RN [1] RP NUCLEOTIDE SEQUENCE [GENOMIC RNA]. 
RX PubMed=32015508; DOI=10.1038/s41586-020-2008-3; RA Wu F., Zhao S., Yu B., Chen Y.-M., Wang W., Song Z.-G., Hu Y., Tao Z.-W., RA Tian J.-H., Pei Y.-Y., Yuan M.-L., Zhang Y.-L., Dai F.-H., Liu Y., RA Wang Q.-M., Zheng J.-J., Xu L., Holmes E.C., Zhang Y.-Z.; RT "A new coronavirus associated with human respiratory disease in China."; RL Nature 579:265-269(2020). CC -!- FUNCTION: [Replicase polyprotein 1a]: Multifunctional protein involved CC in the transcription and replication of viral RNAs. Contains the CC proteinases responsible for the cleavages of the polyprotein. CC {ECO:0000250|UniProtKB:P0C6X7}. CC -!- FUNCTION: [Non-structural protein 1]: Inhibits host translation by CC interacting with the 40S ribosomal subunit. The nsp1-40S ribosome CC complex further induces an endonucleolytic cleavage near the 5'UTR of CC host mRNAs, targeting them for degradation. Viral mRNAs are not CC susceptible to nsp1-mediated endonucleolytic RNA cleavage thanks to the CC presence of a 5'-end leader sequence and are therefore protected from CC degradation. By suppressing host gene expression, nsp1 facilitates CC efficient viral gene expression in infected cells and evasion from host CC immune response. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- FUNCTION: [Non-structural protein 2]: May play a role in the modulation CC of host cell survival signaling pathway by interacting with host PHB CC and PHB2. Indeed, these two proteins play a role in maintaining the CC functional integrity of the mitochondria and protecting cells from CC various stresses. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- FUNCTION: [Non-structural protein 3]: Responsible for the cleavages CC located at the N-terminus of the replicase polyprotein. In addition, CC PL-PRO possesses a deubiquitinating/deISGylating activity and processes CC both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular CC substrates. 
Participates together with nsp4 in the assembly of virally- CC induced cytoplasmic double-membrane vesicles necessary for viral CC replication. Antagonizes innate immune induction of type I interferon CC by blocking the phosphorylation, dimerization and subsequent nuclear CC translocation of host IRF3. Also prevents host NF-kappa-B signaling. CC {ECO:0000250|UniProtKB:P0C6X7}. CC -!- FUNCTION: [Non-structural protein 4]: Participates in the assembly of CC virally-induced cytoplasmic double-membrane vesicles necessary for CC viral replication. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- FUNCTION: [3C-like proteinase]: Cleaves the C-terminus of replicase CC polyprotein at 11 sites. Recognizes substrates containing the core CC sequence [ILMVF]-Q-|-[SGACN]. Also able to bind an ADP-ribose-1''- CC phosphate (ADRP). {ECO:0000250|UniProtKB:P0C6X7}. CC -!- FUNCTION: [Non-structural protein 6]: Plays a role in the initial CC induction of autophagosomes from host endoplasmic reticulum. Later, CC limits the expansion of these phagosomes that are no longer able to CC deliver viral components to lysosomes. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- FUNCTION: [Non-structural protein 7]: Forms a hexadecamer with nsp8 (8 CC subunits of each) that may participate in viral replication by acting CC as a primase. Alternatively, may synthesize substantially longer CC products than oligonucleotide primers. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- FUNCTION: [Non-structural protein 8]: Forms a hexadecamer with nsp7 (8 CC subunits of each) that may participate in viral replication by acting CC as a primase. Alternatively, may synthesize substantially longer CC products than oligonucleotide primers. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- FUNCTION: [Non-structural protein 9]: May participate in viral CC replication by acting as a ssRNA-binding protein. CC {ECO:0000250|UniProtKB:P0C6X7}. 
CC -!- FUNCTION: [Non-structural protein 10]: Plays a pivotal role in viral CC transcription by stimulating both nsp14 3'-5' exoribonuclease and nsp16 CC 2'-O-methyltransferase activities. Therefore plays an essential role in CC viral mRNAs cap methylation. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- CATALYTIC ACTIVITY: CC Reaction=Thiol-dependent hydrolysis of ester, thioester, amide, peptide CC and isopeptide bonds formed by the C-terminal Gly of ubiquitin (a 76- CC residue protein attached to proteins as an intracellular targeting CC signal).; EC=3.4.19.12; Evidence={ECO:0000250|UniProtKB:P0C6U8}; CC -!- CATALYTIC ACTIVITY: CC Reaction=TSAVLQ-|-SGFRK-NH(2) and SGVTFQ-|-GKFKK the two peptides CC corresponding to the two self-cleavage sites of the SARS 3C-like CC proteinase are the two most reactive peptide substrates. The enzyme CC exhibits a strong preference for substrates containing Gln at P1 CC position and Leu at P2 position.; EC=3.4.22.69; CC Evidence={ECO:0000250|UniProtKB:P0C6U8}; CC -!- SUBUNIT: [Non-structural protein 2]: Interacts with host PHB and PHB2. CC {ECO:0000250|UniProtKB:P0C6X7}. CC -!- SUBUNIT: [3C-like proteinase]: 3CL-PRO exists as monomer and homodimer. CC Only the homodimer shows catalytic activity. CC {ECO:0000250|UniProtKB:P0C6X7}. CC -!- SUBUNIT: [Non-structural protein 4]: Interacts with PL-PRO and nsp6. CC {ECO:0000250|UniProtKB:P0C6X7}. CC -!- SUBUNIT: [Non-structural protein 7]: Eight copies of nsp7 and eight CC copies of nsp8 assemble to form a heterohexadecamer dsRNA-encircling CC ring structure. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- SUBUNIT: [Non-structural protein 8]: Eight copies of nsp7 and eight CC copies of nsp8 assemble to form a heterohexadecamer dsRNA-encircling CC ring structure. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- SUBUNIT: [Non-structural protein 9]: Is a dimer. CC {ECO:0000250|UniProtKB:P0C6X7}. 
CC -!- SUBUNIT: [Non-structural protein 10]: Forms a dodecamer and interacts CC with nsp14 and nsp16; these interactions enhance nsp14 and nsp16 CC enzymatic activities. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- SUBCELLULAR LOCATION: [Non-structural protein 3]: Host membrane CC {ECO:0000250|UniProtKB:P0C6X7}; Multi-pass membrane protein CC {ECO:0000250|UniProtKB:P0C6X7}. Host cytoplasm CC {ECO:0000250|UniProtKB:P0C6X7}. CC -!- SUBCELLULAR LOCATION: [Non-structural protein 4]: Host membrane CC {ECO:0000250|UniProtKB:P0C6X7}; Multi-pass membrane protein CC {ECO:0000250|UniProtKB:P0C6X7}. Host cytoplasm CC {ECO:0000250|UniProtKB:P0C6X7}. Note=Localizes in virally-induced CC cytoplasmic double-membrane vesicles. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- SUBCELLULAR LOCATION: [Non-structural protein 6]: Host membrane CC {ECO:0000250|UniProtKB:P0C6X7}; Multi-pass membrane protein CC {ECO:0000250|UniProtKB:P0C6X7}. CC -!- SUBCELLULAR LOCATION: [Non-structural protein 7]: Host cytoplasm, host CC perinuclear region {ECO:0000250|UniProtKB:P0C6X9}. Note=nsp7, nsp8, CC nsp9 and nsp10 are localized in cytoplasmic foci, largely perinuclear. CC Late in infection, they merge into confluent complexes. CC {ECO:0000250|UniProtKB:P0C6X9}. CC -!- SUBCELLULAR LOCATION: [Non-structural protein 8]: Host cytoplasm, host CC perinuclear region {ECO:0000250|UniProtKB:P0C6X9}. Note=nsp7, nsp8, CC nsp9 and nsp10 are localized in cytoplasmic foci, largely perinuclear. CC Late in infection, they merge into confluent complexes. CC {ECO:0000250|UniProtKB:P0C6X9}. CC -!- SUBCELLULAR LOCATION: [Non-structural protein 9]: Host cytoplasm, host CC perinuclear region {ECO:0000250|UniProtKB:P0C6X9}. Note=nsp7, nsp8, CC nsp9 and nsp10 are localized in cytoplasmic foci, largely perinuclear. CC Late in infection, they merge into confluent complexes. CC {ECO:0000250|UniProtKB:P0C6X9}. CC -!- SUBCELLULAR LOCATION: [Non-structural protein 10]: Host cytoplasm, host CC perinuclear region {ECO:0000250|UniProtKB:P0C6X9}. 
Note=nsp7, nsp8, CC nsp9 and nsp10 are localized in cytoplasmic foci, largely perinuclear. CC Late in infection, they merge into confluent complexes. CC {ECO:0000250|UniProtKB:P0C6X9}. CC -!- ALTERNATIVE PRODUCTS: CC Event=Ribosomal frameshifting; Named isoforms=2; CC Comment=Normal translation results in Replicase polyprotein 1a. CC Ribosomal frameshifting at the end of this protein occurs at low CC frequency and produces Replicase polyprotein 1ab.; CC Name=Replicase polyprotein 1a; CC IsoId=P0DTC1-1; Sequence=Displayed; CC Name=Replicase polyprotein 1ab; CC IsoId=P0DTD1-1; Sequence=External; CC -!- DOMAIN: The hydrophobic domains (HD) could mediate the membrane CC association of the replication complex and thereby alter the CC architecture of the host cell membrane. {ECO:0000250|UniProtKB:P0C6U8}. CC -!- PTM: Specific enzymatic cleavages in vivo by its own proteases yield CC mature proteins. 3CL-PRO and PL-PRO proteinases are autocatalytically CC processed. {ECO:0000250|UniProtKB:P0C6X7}. CC -!- MISCELLANEOUS: [Replicase polyprotein 1a]: Produced by conventional CC translation. {ECO:0000250|UniProtKB:P0C6U8}. CC -!- SIMILARITY: Belongs to the coronaviruses polyprotein 1ab family. CC {ECO:0000305}. CC --------------------------------------------------------------------------- CC Copyrighted by the UniProt Consortium, see https://www.uniprot.org/terms CC Distributed under the Creative Commons Attribution (CC BY 4.0) License CC --------------------------------------------------------------------------- DR EMBL; MN908947; QHD43415.1; ALT_FRAME; Genomic_RNA. DR Proteomes; UP000464024; Genome. DR PROSITE; PS00867; CPSASE_2; 1. DR PROSITE; PS51442; M_PRO; 1. DR PROSITE; PS51154; MACRO; 1. DR PROSITE; PS51124; PEPTIDASE_C16; 1. 
PE 3: Inferred from homology; KW Activation of host autophagy by virus; Decay of host mRNAs by virus; KW Endonuclease; Eukaryotic host gene expression shutoff by virus; KW Eukaryotic host translation shutoff by virus; Host cytoplasm; KW Host gene expression shutoff by virus; Host membrane; KW Host mRNA suppression by virus; Host-virus interaction; Hydrolase; KW Inhibition of host innate immune response by virus; KW Inhibition of host interferon signaling pathway by virus; KW Inhibition of host IRF3 by virus; Inhibition of host ISG15 by virus; KW Inhibition of host RLR pathway by virus; Membrane; Metal-binding; KW Modulation of host ubiquitin pathway by viral deubiquitinase; KW Modulation of host ubiquitin pathway by virus; Nuclease; Protease; KW Reference proteome; Repeat; Ribosomal frameshifting; RNA-binding; KW Thiol protease; Transmembrane; Transmembrane helix; KW Ubl conjugation pathway; Viral immunoevasion; Zinc; Zinc-finger. FT CHAIN 1..4405 FT /note="Replicase polyprotein 1a" FT /evidence="ECO:0000250|UniProtKB:P0C6U8" FT /id="PRO_0000449634" FT CHAIN 1..180 FT /note="Non-structural protein 1" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449635" FT CHAIN 181..818 FT /note="Non-structural protein 2" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449636" FT CHAIN 819..2763 FT /note="Non-structural protein 3" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449637" FT CHAIN 2764..3263 FT /note="Non-structural protein 4" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449638" FT CHAIN 3264..3569 FT /note="3C-like proteinase" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449639" FT CHAIN 3570..3859 FT /note="Non-structural protein 6" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449640" FT CHAIN 3860..3942 FT /note="Non-structural protein 7" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449641" FT CHAIN 3943..4140 FT /note="Non-structural protein 8" FT 
/evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449642" FT CHAIN 4141..4253 FT /note="Non-structural protein 9" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449643" FT CHAIN 4254..4392 FT /note="Non-structural protein 10" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449644" FT CHAIN 4393..4405 FT /note="Non-structural protein 11" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT /id="PRO_0000449645" FT TRANSMEM 2226..2246 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 2318..2338 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 2339..2359 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 2361..2381 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 2776..2796 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3045..3065 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3077..3097 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3100..3120 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3128..3148 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3165..3185 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3587..3607 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3609..3629 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3635..3655 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3674..3694 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3730..3750 FT /note="Helical" FT /evidence="ECO:0000255" FT TRANSMEM 3779..3799 FT /note="Helical" FT /evidence="ECO:0000255" FT DOMAIN 1025..1194 FT /note="Macro" FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00490" FT DOMAIN 1634..1898 FT /note="Peptidase C16" FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00444" FT DOMAIN 3264..3569 FT /note="Peptidase C30" FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00772" FT ZN_FING 1752..1789 FT /note="C4-type" FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00444" FT ACT_SITE 1674 FT /note="For PL1-PRO activity" FT 
/evidence="ECO:0000255|PROSITE-ProRule:PRU00444" FT ACT_SITE 1835 FT /note="For PL2-PRO activity" FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00444" FT ACT_SITE 3304 FT /note="For 3CL-PRO activity" FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00772" FT ACT_SITE 3408 FT /note="For 3CL-PRO activity" FT /evidence="ECO:0000255|PROSITE-ProRule:PRU00772" FT SITE 180..181 FT /note="Cleavage; by PL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT SITE 818..819 FT /note="Cleavage; by PL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT SITE 2763..2764 FT /note="Cleavage; by PL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT SITE 3263..3264 FT /note="Cleavage; by 3CL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT SITE 3569..3570 FT /note="Cleavage; by 3CL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT SITE 3859..3860 FT /note="Cleavage; by 3CL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT SITE 3942..3943 FT /note="Cleavage; by 3CL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT SITE 4140..4141 FT /note="Cleavage; by 3CL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT SITE 4253..4254 FT /note="Cleavage; by 3CL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" FT SITE 4392..4393 FT /note="Cleavage; by 3CL-PRO" FT /evidence="ECO:0000250|UniProtKB:P0C6V3" SQ SEQUENCE 4405 AA; 489989 MW; 7F8A21148A7A7E2A CRC64; MESLVPGFNE KTHVQLSLPV LQVRDVLVRG FGDSVEEVLS EARQHLKDGT CGLVEVEKGV LPQLEQPYVF IKRSDARTAP HGHVMVELVA ELEGIQYGRS GETLGVLVPH VGEIPVAYRK VLLRKNGNKG AGGHSYGADL KSFDLGDELG TDPYEDFQEN WNTKHSSGVT RELMRELNGG AYTRYVDNNF CGPDGYPLEC IKDLLARAGK ASCTLSEQLD FIDTKRGVYC CREHEHEIAW YTERSEKSYE LQTPFEIKLA KKFDTFNGEC PNFVFPLNSI IKTIQPRVEK KKLDGFMGRI RSVYPVASPN ECNQMCLSTL MKCDHCGETS WQTGDFVKAT CEFCGTENLT KEGATTCGYL PQNAVVKIYC PACHNSEVGP EHSLAEYHNE SGLKTILRKG GRTIAFGGCV FSYVGCHNKC AYWVPRASAN IGCNHTGVVG EGSEGLNDNL LEILQKEKVN INIVGDFKLN EEIAIILASF SASTSAFVET VKGLDYKAFK QIVESCGNFK VTKGKAKKGA WNIGEQKSIL SPLYAFASEA ARVVRSIFSR TLETAQNSVR VLQKAAITIL DGISQYSLRL IDAMMFTSDL 
ATNNLVVMAY ITGGVVQLTS QWLTNIFGTV YEKLKPVLDW LEEKFKEGVE FLRDGWEIVK FISTCACEIV GGQIVTCAKE IKESVQTFFK LVNKFLALCA DSIIIGGAKL KALNLGETFV THSKGLYRKC VKSREETGLL MPLKAPKEII FLEGETLPTE VLTEEVVLKT GDLQPLEQPT SEAVEAPLVG TPVCINGLML LEIKDTEKYC ALAPNMMVTN NTFTLKGGAP TKVTFGDDTV IEVQGYKSVN ITFELDERID KVLNEKCSAY TVELGTEVNE FACVVADAVI KTLQPVSELL TPLGIDLDEW SMATYYLFDE SGEFKLASHM YCSFYPPDED EEEGDCEEEE FEPSTQYEYG TEDDYQGKPL EFGATSAALQ PEEEQEEDWL DDDSQQTVGQ QDGSEDNQTT TIQTIVEVQP QLEMELTPVV QTIEVNSFSG YLKLTDNVYI KNADIVEEAK KVKPTVVVNA ANVYLKHGGG VAGALNKATN NAMQVESDDY IATNGPLKVG GSCVLSGHNL AKHCLHVVGP NVNKGEDIQL LKSAYENFNQ HEVLLAPLLS AGIFGADPIH SLRVCVDTVR TNVYLAVFDK NLYDKLVSSF LEMKSEKQVE QKIAEIPKEE VKPFITESKP SVEQRKQDDK KIKACVEEVT TTLEETKFLT ENLLLYIDIN GNLHPDSATL VSDIDITFLK KDAPYIVGDV VQEGVLTAVV IPTKKAGGTT EMLAKALRKV PTDNYITTYP GQGLNGYTVE EAKTVLKKCK SAFYILPSII SNEKQEILGT VSWNLREMLA HAEETRKLMP VCVETKAIVS TIQRKYKGIK IQEGVVDYGA RFYFYTSKTT VASLINTLND LNETLVTMPL GYVTHGLNLE EAARYMRSLK VPATVSVSSP DAVTAYNGYL TSSSKTPEEH FIETISLAGS YKDWSYSGQS TQLGIEFLKR GDKSVYYTSN PTTFHLDGEV ITFDNLKTLL SLREVRTIKV FTTVDNINLH TQVVDMSMTY GQQFGPTYLD GADVTKIKPH NSHEGKTFYV LPNDDTLRVE AFEYYHTTDP SFLGRYMSAL NHTKKWKYPQ VNGLTSIKWA DNNCYLATAL LTLQQIELKF NPPALQDAYY RARAGEAANF CALILAYCNK TVGELGDVRE TMSYLFQHAN LDSCKRVLNV VCKTCGQQQT TLKGVEAVMY MGTLSYEQFK KGVQIPCTCG KQATKYLVQQ ESPFVMMSAP PAQYELKHGT FTCASEYTGN YQCGHYKHIT SKETLYCIDG ALLTKSSEYK GPITDVFYKE NSYTTTIKPV TYKLDGVVCT EIDPKLDNYY KKDNSYFTEQ PIDLVPNQPY PNASFDNFKF VCDNIKFADD LNQLTGYKKP ASRELKVTFF PDLNGDVVAI DYKHYTPSFK KGAKLLHKPI VWHVNNATNK ATYKPNTWCI RCLWSTKPVE TSNSFDVLKS EDAQGMDNLA CEDLKPVSEE VVENPTIQKD VLECNVKTTE VVGDIILKPA NNSLKITEEV GHTDLMAAYV DNSSLTIKKP NELSRVLGLK TLATHGLAAV NSVPWDTIAN YAKPFLNKVV STTTNIVTRC LNRVCTNYMP YFFTLLLQLC TFTRSTNSRI KASMPTTIAK NTVKSVGKFC LEASFNYLKS PNFSKLINII IWFLLLSVCL GSLIYSTAAL GVLMSNLGMP SYCTGYREGY LNSTNVTIAT YCTGSIPCSV CLSGLDSLDT YPSLETIQIT ISSFKWDLTA FGLVAEWFLA YILFTRFFYV LGLAAIMQLF FSYFAVHFIS NSWLMWLIIN LVQMAPISAM VRMYIFFASF YYVWKSYVHV 
VDGCNSSTCM MCYKRNRATR VECTTIVNGV RRSFYVYANG GKGFCKLHNW NCVNCDTFCA GSTFISDEVA RDLSLQFKRP INPTDQSSYI VDSVTVKNGS IHLYFDKAGQ KTYERHSLSH FVNLDNLRAN NTKGSLPINV IVFDGKSKCE ESSAKSASVY YSQLMCQPIL LLDQALVSDV GDSAEVAVKM FDAYVNTFSS TFNVPMEKLK TLVATAEAEL AKNVSLDNVL STFISAARQG FVDSDVETKD VVECLKLSHQ SDIEVTGDSC NNYMLTYNKV ENMTPRDLGA CIDCSARHIN AQVAKSHNIA LIWNVKDFMS LSEQLRKQIR SAAKKNNLPF KLTCATTRQV VNVVTTKIAL KGGKIVNNWL KQLIKVTLVF LFVAAIFYLI TPVHVMSKHT DFSSEIIGYK AIDGGVTRDI ASTDTCFANK HADFDTWFSQ RGGSYTNDKA CPLIAAVITR EVGFVVPGLP GTILRTTNGD FLHFLPRVFS AVGNICYTPS KLIEYTDFAT SACVLAAECT IFKDASGKPV PYCYDTNVLE GSVAYESLRP DTRYVLMDGS IIQFPNTYLE GSVRVVTTFD SEYCRHGTCE RSEAGVCVST SGRWVLNNDY YRSLPGVFCG VDAVNLLTNM FTPLIQPIGA LDISASIVAG GIVAIVVTCL AYYFMRFRRA FGEYSHVVAF NTLLFLMSFT VLCLTPVYSF LPGVYSVIYL YLTFYLTNDV SFLAHIQWMV MFTPLVPFWI TIAYIICIST KHFYWFFSNY LKRRVVFNGV SFSTFEEAAL CTFLLNKEMY LKLRSDVLLP LTQYNRYLAL YNKYKYFSGA MDTTSYREAA CCHLAKALND FSNSGSDVLY QPPQTSITSA VLQSGFRKMA FPSGKVEGCM VQVTCGTTTL NGLWLDDVVY CPRHVICTSE DMLNPNYEDL LIRKSNHNFL VQAGNVQLRV IGHSMQNCVL KLKVDTANPK TPKYKFVRIQ PGQTFSVLAC YNGSPSGVYQ CAMRPNFTIK GSFLNGSCGS VGFNIDYDCV SFCYMHHMEL PTGVHAGTDL EGNFYGPFVD RQTAQAAGTD TTITVNVLAW LYAAVINGDR WFLNRFTTTL NDFNLVAMKY NYEPLTQDHV DILGPLSAQT GIAVLDMCAS LKELLQNGMN GRTILGSALL EDEFTPFDVV RQCSGVTFQS AVKRTIKGTH HWLLLTILTS LLVLVQSTQW SLFFFLYENA FLPFAMGIIA MSAFAMMFVK HKHAFLCLFL LPSLATVAYF NMVYMPASWV MRIMTWLDMV DTSLSGFKLK DCVMYASAVV LLILMTARTV YDDGARRVWT LMNVLTLVYK VYYGNALDQA ISMWALIISV TSNYSGVVTT VMFLARGIVF MCVEYCPIFF ITGNTLQCIM LVYCFLGYFC TCYFGLFCLL NRYFRLTLGV YDYLVSTQEF RYMNSQGLLP PKNSIDAFKL NIKLLGVGGK PCIKVATVQS KMSDVKCTSV VLLSVLQQLR VESSSKLWAQ CVQLHNDILL AKDTTEAFEK MVSLLSVLLS MQGAVDINKL CEEMLDNRAT LQAIASEFSS LPSYAAFATA QEAYEQAVAN GDSEVVLKKL KKSLNVAKSE FDRDAAMQRK LEKMADQAMT QMYKQARSED KRAKVTSAMQ TMLFTMLRKL DNDALNNIIN NARDGCVPLN IIPLTTAAKL MVVIPDYNTY KNTCDGTTFT YASALWEIQQ VVDADSKIVQ LSEISMDNSP NLAWPLIVTA LRANSAVKLQ NNELSPVALR QMSCAAGTTQ TACTDDNALA YYNTTKGGRF VLALLSDLQD LKWARFPKSD GTGTIYTELE 
PPCRFVTDTP KGPKVKYLYF IKGLNNLNRG MVLGSLAATV RLQAGNATEV PANSTVLSFC AFAVDAAKAY KDYLASGGQP ITNCVKMLCT HTGTGQAITV TPEANMDQES FGGASCCLYC RCHIDHPNPK GFCDLKGKYV QIPTTCANDP VGFTLKNTVC TVCGMWKGYG CSCDQLREPM LQSADAQSFL NGFAV //