
3D structural models and function annotation for all proteins encoded by the genome of SARS-CoV-2
Table I: Structure prediction and structure-based function annotation of SARS-CoV-2 genome.
Protein sequence translated from SARS-CoV-2 genome | C-I-TASSER structure model and estimated accuracy | Protein name and function (based on UniProt curation of SARS-CoV-2 proteome) | Solved experimental structure
1>QHD43415_1 (L=180)
MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLS
EARQHLKDGTCGLVEVEKGVLPQLEQPYVFIKRSDARTAP
HGHVMVELVAELEGIQYGRSGETLGVLVPHVGEIPVAYRK
VLLRKNGNKGAGGHSYGADLKSFDLGDELGTDPYEDFQEN
WNTKHSSGVTRELMRELNGG

Estimated TM-score=0.55
Host translation inhibitor nsp1.
Inhibits host translation by interacting with the 40S ribosomal subunit. The nsp1-40S ribosome complex further induces an endonucleolytic cleavage near the 5'UTR of host mRNAs, targeting them for degradation. Viral mRNAs are not susceptible to nsp1-mediated endonucleolytic RNA cleavage thanks to the presence of a 5'-end leader sequence and are therefore protected from degradation. By suppressing host gene expression, nsp1 facilitates efficient viral gene expression in infected cells and evasion from host immune response.
[GO term predictions] [Ligand binding site predictions] NA(range:NA);
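Each row pairs a header of the form `N>ID (L=length)` with the sequence wrapped at 40 residues per line. The following is a minimal sketch of how such a row could be reassembled and checked against its declared length. The function name `parse_record` and the regular expression are illustrative assumptions, not part of the original pipeline.

```python
import re

def parse_record(header: str, seq_lines: list[str]) -> tuple[str, int, str]:
    """Return (identifier, declared_length, sequence) for one table row.

    Hypothetical helper: assumes headers look like '1>QHD43415_1 (L=180)',
    optionally with a space after '>'.
    """
    m = re.match(r"\s*\d+>\s*(\S+)\s*\(L=(\d+)\)", header)
    if m is None:
        raise ValueError(f"unrecognized header: {header!r}")
    # Join the wrapped lines into one continuous amino-acid string.
    seq = "".join(line.strip() for line in seq_lines)
    return m.group(1), int(m.group(2)), seq

header = "1>QHD43415_1 (L=180)"
seq_lines = [
    "MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLS",
    "EARQHLKDGTCGLVEVEKGVLPQLEQPYVFIKRSDARTAP",
    "HGHVMVELVAELEGIQYGRSGETLGVLVPHVGEIPVAYRK",
    "VLLRKNGNKGAGGHSYGADLKSFDLGDELGTDPYEDFQEN",
    "WNTKHSSGVTRELMRELNGG",
]
name, declared, seq = parse_record(header, seq_lines)
# True when the reassembled sequence has exactly the declared number of residues.
length_matches = len(seq) == declared
```

A mismatch between `len(seq)` and the declared `L` would most likely indicate a line lost or duplicated during extraction.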
2>QHD43415_2 (L=638)
AYTRYVDNNFCGPDGYPLECIKDLLARAGKASCTLSEQLD
FIDTKRGVYCCREHEHEIAWYTERSEKSYELQTPFEIKLA
KKFDTFNGECPNFVFPLNSIIKTIQPRVEKKKLDGFMGRI
RSVYPVASPNECNQMCLSTLMKCDHCGETSWQTGDFVKAT
CEFCGTENLTKEGATTCGYLPQNAVVKIYCPACHNSEVGP
EHSLAEYHNESGLKTILRKGGRTIAFGGCVFSYVGCHNKC
AYWVPRASANIGCNHTGVVGEGSEGLNDNLLEILQKEKVN
INIVGDFKLNEEIAIILASFSASTSAFVETVKGLDYKAFK
QIVESCGNFKVTKGKAKKGAWNIGEQKSILSPLYAFASEA
ARVVRSIFSRTLETAQNSVRVLQKAAITILDGISQYSLRL
IDAMMFTSDLATNNLVVMAYITGGVVQLTSQWLTNIFGTV
YEKLKPVLDWLEEKFKEGVEFLRDGWEIVKFISTCACEIV
GGQIVTCAKEIKESVQTFFKLVNKFLALCADSIIIGGAKL
KALNLGETFVTHSKGLYRKCVKSREETGLLMPLKAPKEII
FLEGETLPTEVLTEEVVLKTGDLQPLEQPTSEAVEAPLVG
TPVCINGLMLLEIKDTEKYCALAPNMMVTNNTFTLKGG

Estimated TM-score=0.40
Non-structural protein 2 (nsp2).
May play a role in the modulation of host cell survival signaling pathway by interacting with host PHB and PHB2. Indeed, these two proteins play a role in maintaining the functional integrity of the mitochondria and protecting cells from various stresses.
[GO term predictions] [Ligand binding site predictions] NA(range:NA);
3>QHD43415_3 (L=1945)
APTKVTFGDDTVIEVQGYKSVNITFELDERIDKVLNEKCS
AYTVELGTEVNEFACVVADAVIKTLQPVSELLTPLGIDLD
EWSMATYYLFDESGEFKLASHMYCSFYPPDEDEEEGDCEE
EEFEPSTQYEYGTEDDYQGKPLEFGATSAALQPEEEQEED
WLDDDSQQTVGQQDGSEDNQTTTIQTIVEVQPQLEMELTP
VVQTIEVNSFSGYLKLTDNVYIKNADIVEEAKKVKPTVVV
NAANVYLKHGGGVAGALNKATNNAMQVESDDYIATNGPLK
VGGSCVLSGHNLAKHCLHVVGPNVNKGEDIQLLKSAYENF
NQHEVLLAPLLSAGIFGADPIHSLRVCVDTVRTNVYLAVF
DKNLYDKLVSSFLEMKSEKQVEQKIAEIPKEEVKPFITES
KPSVEQRKQDDKKIKACVEEVTTTLEETKFLTENLLLYID
INGNLHPDSATLVSDIDITFLKKDAPYIVGDVVQEGVLTA
VVIPTKKAGGTTEMLAKALRKVPTDNYITTYPGQGLNGYT
VEEAKTVLKKCKSAFYILPSIISNEKQEILGTVSWNLREM
LAHAEETRKLMPVCVETKAIVSTIQRKYKGIKIQEGVVDY
GARFYFYTSKTTVASLINTLNDLNETLVTMPLGYVTHGLN
LEEAARYMRSLKVPATVSVSSPDAVTAYNGYLTSSSKTPE
EHFIETISLAGSYKDWSYSGQSTQLGIEFLKRGDKSVYYT
SNPTTFHLDGEVITFDNLKTLLSLREVRTIKVFTTVDNIN
LHTQVVDMSMTYGQQFGPTYLDGADVTKIKPHNSHEGKTF
YVLPNDDTLRVEAFEYYHTTDPSFLGRYMSALNHTKKWKY
PQVNGLTSIKWADNNCYLATALLTLQQIELKFNPPALQDA
YYRARAGEAANFCALILAYCNKTVGELGDVRETMSYLFQH
ANLDSCKRVLNVVCKTCGQQQTTLKGVEAVMYMGTLSYEQ
FKKGVQIPCTCGKQATKYLVQQESPFVMMSAPPAQYELKH
GTFTCASEYTGNYQCGHYKHITSKETLYCIDGALLTKSSE
YKGPITDVFYKENSYTTTIKPVTYKLDGVVCTEIDPKLDN
YYKKDNSYFTEQPIDLVPNQPYPNASFDNFKFVCDNIKFA
DDLNQLTGYKKPASRELKVTFFPDLNGDVVAIDYKHYTPS
FKKGAKLLHKPIVWHVNNATNKATYKPNTWCIRCLWSTKP
VETSNSFDVLKSEDAQGMDNLACEDLKPVSEEVVENPTIQ
KDVLECNVKTTEVVGDIILKPANNSLKITEEVGHTDLMAA
YVDNSSLTIKKPNELSRVLGLKTLATHGLAAVNSVPWDTI
ANYAKPFLNKVVSTTTNIVTRCLNRVCTNYMPYFFTLLLQ
LCTFTRSTNSRIKASMPTTIAKNTVKSVGKFCLEASFNYL
KSPNFSKLINIIIWFLLLSVCLGSLIYSTAALGVLMSNLG
MPSYCTGYREGYLNSTNVTIATYCTGSIPCSVCLSGLDSL
DTYPSLETIQITISSFKWDLTAFGLVAEWFLAYILFTRFF
YVLGLAAIMQLFFSYFAVHFISNSWLMWLIINLVQMAPIS
AMVRMYIFFASFYYVWKSYVHVVDGCNSSTCMMCYKRNRA
TRVECTTIVNGVRRSFYVYANGGKGFCKLHNWNCVNCDTF
CAGSTFISDEVARDLSLQFKRPINPTDQSSYIVDSVTVKN
GSIHLYFDKAGQKTYERHSLSHFVNLDNLRANNTKGSLPI
NVIVFDGKSKCEESSAKSASVYYSQLMCQPILLLDQALVS
DVGDSAEVAVKMFDAYVNTFSSTFNVPMEKLKTLVATAEA
ELAKNVSLDNVLSTFISAARQGFVDSDVETKDVVECLKLS
HQSDIEVTGDSCNNYMLTYNKVENMTPRDLGACIDCSARH
INAQVAKSHNIALIWNVKDFMSLSEQLRKQIRSAAKKNNL
PFKLTCATTRQVVNVVTTKIALKGG

Estimated TM-score=0.58
Papain-like proteinase.
Responsible for the cleavages located at the N-terminus of the replicase polyprotein. In addition, PL-PRO possesses a deubiquitinating/deISGylating activity and processes both 'Lys-48'- and 'Lys-63'-linked polyubiquitin chains from cellular substrates. Participates together with nsp4 in the assembly of virally-induced cytoplasmic double-membrane vesicles necessary for viral replication. Antagonizes innate immune induction of type I interferon by blocking the phosphorylation, dimerization and subsequent nuclear translocation of host IRF3. Also prevents host NF-kappa-B signaling.
[GO term predictions] [Ligand binding site predictions] 6W6Y(range:207-379);
6W9C(range:748-1060);
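The solved-structure column lists PDB identifiers with residue ranges. The sketch below shows one way to parse those entries and estimate how much of a protein they cover. It assumes the ranges are 1-based, inclusive residue positions within the protein sequence of the same row. The function names are hypothetical.

```python
import re

def parse_structures(text: str) -> list[tuple[str, int, int]]:
    """Extract (pdb_id, start, end) triples from entries like '6W6Y(range:207-379);'.

    Placeholder entries such as 'NA(range:NA);' do not match and are skipped.
    """
    pattern = re.compile(r"(\w{4})\(range:(\d+)-(\d+)\)")
    return [(pdb, int(a), int(b)) for pdb, a, b in pattern.findall(text)]

def covered_fraction(entries: list[tuple[str, int, int]], length: int) -> float:
    """Fraction of residues covered by at least one listed structure.

    Assumption: ranges are 1-based and inclusive.
    """
    covered: set[int] = set()
    for _, start, end in entries:
        covered.update(range(start, end + 1))
    return len(covered) / length

# Papain-like proteinase row: two structures against a 1945-residue sequence.
entries = parse_structures("6W6Y(range:207-379); 6W9C(range:748-1060);")
fraction = covered_fraction(entries, 1945)
```

Under the stated assumption, the two ranges span 173 and 313 residues, so about a quarter of the 1945-residue sequence has an experimental structure.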
4>QHD43415_4 (L=500)
KIVNNWLKQLIKVTLVFLFVAAIFYLITPVHVMSKHTDFS
SEIIGYKAIDGGVTRDIASTDTCFANKHADFDTWFSQRGG
SYTNDKACPLIAAVITREVGFVVPGLPGTILRTTNGDFLH
FLPRVFSAVGNICYTPSKLIEYTDFATSACVLAAECTIFK
DASGKPVPYCYDTNVLEGSVAYESLRPDTRYVLMDGSIIQ
FPNTYLEGSVRVVTTFDSEYCRHGTCERSEAGVCVSTSGR
WVLNNDYYRSLPGVFCGVDAVNLLTNMFTPLIQPIGALDI
SASIVAGGIVAIVVTCLAYYFMRFRRAFGEYSHVVAFNTL
LFLMSFTVLCLTPVYSFLPGVYSVIYLYLTFYLTNDVSFL
AHIQWMVMFTPLVPFWITIAYIICISTKHFYWFFSNYLKR
RVVFNGVSFSTFEEAALCTFLLNKEMYLKLRSDVLLPLTQ
YNRYLALYNKYKYFSGAMDTTSYREAACCHLAKALNDFSN
SGSDVLYQPPQTSITSAVLQ

Estimated TM-score=0.53
Non-structural protein 4 (nsp4).
Participates in the assembly of virally-induced cytoplasmic double-membrane vesicles necessary for viral replication.
[GO term predictions] [Ligand binding site predictions] NA(range:NA);
5>QHD43415_5 (L=306)
SGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVYCPR
HVICTSEDMLNPNYEDLLIRKSNHNFLVQAGNVQLRVIGH
SMQNCVLKLKVDTANPKTPKYKFVRIQPGQTFSVLACYNG
SPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCVSFC
YMHHMELPTGVHAGTDLEGNFYGPFVDRQTAQAAGTDTTI
TVNVLAWLYAAVINGDRWFLNRFTTTLNDFNLVAMKYNYE
PLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMNGRT
ILGSALLEDEFTPFDVVRQCSGVTFQ

Estimated TM-score=0.96
Proteinase 3CL-PRO.
Cleaves the C-terminus of replicase polyprotein at 11 sites. Recognizes substrates containing the core sequence [ILMVF]-Q-|-[SGACN]. Also able to bind an ADP-ribose-1''-phosphate (ADRP).
[GO term predictions] [Ligand binding site predictions] 6LU7(range:1-306);
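The recognition pattern [ILMVF]-Q-|-[SGACN] can be expressed as a regular expression, with the scissile bond falling immediately after the Q. The sketch below scans a sequence for candidate sites. It is illustrated on the junction between the last 20 residues of the nsp4 row and the first 12 residues of the 3CL-PRO row in this table. The function name is hypothetical, and a motif match is only a candidate: the core pattern alone does not establish that cleavage occurs.

```python
import re

# [ILMVF] at P2, Q at P1, and a lookahead for [SGACN] at P1'.
# The lookahead leaves P1' unconsumed so the match ends at the scissile bond.
MOTIF = re.compile(r"[ILMVF]Q(?=[SGACN])")

def candidate_cleavage_sites(seq: str) -> list[int]:
    """Return, for each motif match, the number of residues N-terminal to the cut."""
    return [m.end() for m in MOTIF.finditer(seq)]

nsp4_c_terminus = "SGSDVLYQPPQTSITSAVLQ"  # last line of the nsp4 row
nsp5_n_terminus = "SGFRKMAFPSGK"          # start of the 3CL-PRO row
sites = candidate_cleavage_sites(nsp4_c_terminus + nsp5_n_terminus)
```

Here the only candidate falls after residue 20 of the concatenated fragment, which is the boundary between the two table rows (…AVLQ | SGFR…).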
6>QHD43415_6 (L=290)
SAVKRTIKGTHHWLLLTILTSLLVLVQSTQWSLFFFLYEN
AFLPFAMGIIAMSAFAMMFVKHKHAFLCLFLLPSLATVAY
FNMVYMPASWVMRIMTWLDMVDTSLSGFKLKDCVMYASAV
VLLILMTARTVYDDGARRVWTLMNVLTLVYKVYYGNALDQ
AISMWALIISVTSNYSGVVTTVMFLARGIVFMCVEYCPIF
FITGNTLQCIMLVYCFLGYFCTCYFGLFCLLNRYFRLTLG
VYDYLVSTQEFRYMNSQGLLPPKNSIDAFKLNIKLLGVGG
KPCIKVATVQ

Estimated TM-score=0.37
Non-structural protein 6 (nsp6).
Plays a role in the initial induction of autophagosomes from the host endoplasmic reticulum. Later, limits the expansion of these phagosomes, which are no longer able to deliver viral components to lysosomes.
[GO term predictions] [Ligand binding site predictions] NA(range:NA);
7>QHD43415_7 (L=83)
SKMSDVKCTSVVLLSVLQQLRVESSSKLWAQCVQLHNDIL
LAKDTTEAFEKMVSLLSVLLSMQGAVDINKLCEEMLDNRA
TLQ

Estimated TM-score=0.63
Non-structural protein 7 (nsp7).
Forms a hexadecamer with nsp8 (8 subunits of each) that may participate in viral replication by acting as a primase. Alternatively, may synthesize substantially longer products than oligonucleotide primers.
[GO term predictions] [Ligand binding site predictions] 6M71(range:1-83);
8>QHD43415_8 (L=198)
AIASEFSSLPSYAAFATAQEAYEQAVANGDSEVVLKKLKK
SLNVAKSEFDRDAAMQRKLEKMADQAMTQMYKQARSEDKR
AKVTSAMQTMLFTMLRKLDNDALNNIINNARDGCVPLNII
PLTTAAKLMVVIPDYNTYKNTCDGTTFTYASALWEIQQVV
DADSKIVQLSEISMDNSPNLAWPLIVTALRANSAVKLQ

Estimated TM-score=0.88
Non-structural protein 8 (nsp8).
Forms a hexadecamer with nsp7 (8 subunits of each) that may participate in viral replication by acting as a primase. Alternatively, may synthesize substantially longer products than oligonucleotide primers.
[GO term predictions] [Ligand binding site predictions] 6M71(range:84-132);
9>QHD43415_9 (L=113)
NNELSPVALRQMSCAAGTTQTACTDDNALAYYNTTKGGRF
VLALLSDLQDLKWARFPKSDGTGTIYTELEPPCRFVTDTP
KGPKVKYLYFIKGLNNLNRGMVLGSLAATVRLQ

Estimated TM-score=0.93
Non-structural protein 9 (nsp9).
May participate in viral replication by acting as a ssRNA-binding protein.
[GO term predictions] [Ligand binding site predictions] 6W4B(range:1-113);
10>QHD43415_10 (L=139)
AGNATEVPANSTVLSFCAFAVDAAKAYKDYLASGGQPITN
CVKMLCTHTGTGQAITVTPEANMDQESFGGASCCLYCRCH
IDHPNPKGFCDLKGKYVQIPTTCANDPVGFTLKNTVCTVC
GMWKGYGCSCDQLREPMLQ

Estimated TM-score=0.90
Non-structural protein 10 (nsp10).
Plays a pivotal role in viral transcription by stimulating both nsp14 3'-5' exoribonuclease and nsp16 2'-O-methyltransferase activities. Therefore plays an essential role in viral mRNA cap methylation.
[GO term predictions] [Ligand binding site predictions] 6W75(range:1-139);
11>QHD43415_11 (L=932)
SADAQSFLNRVCGVSAARLTPCGTGTSTDVVYRAFDIYND
KVAGFAKFLKTNCCRFQEKDEDDNLIDSYFVVKRHTFSNY
QHEETIYNLLKDCPAVAKHDFFKFRIDGDMVPHISRQRLT
KYTMADLVYALRHFDEGNCDTLKEILVTYNCCDDDYFNKK
DWYDFVENPDILRVYANLGERVRQALLKTVQFCDAMRNAG
IVGVLTLDNQDLNGNWYDFGDFIQTTPGSGVPVVDSYYSL
LMPILTLTRALTAESHVDTDLTKPYIKWDLLKYDFTEERL
KLFDRYFKYWDQTYHPNCVNCLDDRCILHCANFNVLFSTV
FPPTSFGPLVRKIFVDGVPFVVSTGYHFRELGVVHNQDVN
LHSSRLSFKELLVYAADPAMHAASGNLLLDKRTTCFSVAA
LTNNVAFQTVKPGNFNKDFYDFAVSKGFFKEGSSVELKHF
FFAQDGNAAISDYDYYRYNLPTMCDIRQLLFVVEVVDKYF
DCYDGGCINANQVIVNNLDKSAGFPFNKWGKARLYYDSMS
YEDQDALFAYTKRNVIPTITQMNLKYAISAKNRARTVAGV
SICSTMTNRQFHQKLLKSIAATRGATVVIGTSKFYGGWHN
MLKTVYSDVENPHLMGWDYPKCDRAMPNMLRIMASLVLAR
KHTTCCSLSHRFYRLANECAQVLSEMVMCGGSLYVKPGGT
SSGDATTAYANSVFNICQAVTANVNALLSTDGNKIADKYV
RNLQHRLYECLYRNRDVDTDFVNEFYAYLRKHFSMMILSD