and a lead inhibitor with possible antiviral properties
The family Coronaviridae comprises virulent pathogens of zoonotic origin. Severe acute respiratory syndrome coronaviruses (SARS-CoVs) and Middle East respiratory syndrome coronaviruses (MERS-CoVs) of this family have emerged before, and SARS-CoV-2 has now emerged globally. The characterization of spike glycoproteins, polyproteins and other viral proteins is important for antiviral drug development. Homology modelling of these proteins with known templates offers the opportunity to discover ligand-binding sites and explore the possible antiviral properties of these protein–ligand complexes. In this study, we performed a complete bioinformatic analysis, sequence alignment, comparison of multiple sequences and modelling of the SARS-CoV-2 whole-genome sequences, the spike protein and the polyproteins for homology with known proteins. We also analysed binding sites in these models for possible binding with ligands that exhibit antiviral properties. Our results indicated that the polyprotein sequence of the isolate SARS-CoV-2_HKU-SZ-001_2020 showed 98.94 percent identity to SARS-coronavirus NSP12 bound to NSP7 and NSP8 co-factors. The results also indicated that a part of the viral genome (residues 3268–3573 in Frame 2 with 306 amino acids) of the SARS-CoV-2 isolate Wuhan-Hu-1 (GenBank Accession Number MN908947.3), when modelled with template 2a5i of the PDB database, showed 96 percent identity to a 3C-like peptidase of SARS-CoVs, which can bind an aza-peptide epoxide (APE) known for the irreversible inhibition of the SARS-CoV main peptidase. A docking profile with 9 different conformations of the ligand with the protein model using Autodock Vina showed an affinity of −7.1 kcal mol−1. This region was conserved in 831 genomes of SARS-CoV-2.
The part of the genome (residues 1568–1882 in Frame 2 with 315 amino acids), when modelled with template 3e9s of the PDB database, showed 82 percent identity to a papain-like protease/deubiquitinase, which is inhibited when complexed with the ligand GRL0617, blocking SARS-CoV replication. A docking profile with 9 different conformations of the ligand with the protein model using Autodock Vina showed an affinity of −7.9 kcal mol−1. This region was conserved in 831 genomes of SARS-CoV-2. It is possible that these ligands can be used as antivirals against SARS-CoV-2.
The physico-chemical properties and primary structure parameters of the seven polyproteins (RdRp region) of the SARS-CoV-2 isolates are given in Table 1. RdRp forms an important part of the viral genome; in RNA viruses, its function is to catalyze the synthesis of the RNA strand complementary to a given RNA template.
The isolate SI200040-SP orf1ab polyprotein and the isolate SI200121-SP orf1ab polyprotein had 2 reading frames, whereas the rest of the isolates had 3 reading frames. The presence of multiple reading frames suggests the possibility of overlapping genes, as seen in many viral, prokaryotic and mitochondrial genomes. This could affect the way the proteins are formed. The number of amino acid residues was the same in all the polyproteins except that of one isolate, SI200040-SP, which had one amino acid more than the others. The extinction coefficients of the two isolates, SI200040-SP orf1ab polyprotein and SI200121-SP orf1ab polyprotein, were much higher than those of the rest of the polyproteins. The extinction coefficient is important when studying protein–protein and protein–ligand interactions. The instability index of these two isolates was also higher than that of the others, indicating that these two polyproteins are predicted to be unstable. The regulation of gene expression by polyprotein processing is known in viruses and is observed in many viruses that are human pathogens.
Like many other viruses, the isolates here may use a replication strategy that involves the translation of a large polyprotein with subsequent cleavage by viral proteases. The two isolates SI200040-SP orf1ab polyprotein and SI200121-SP orf1ab polyprotein also showed shorter half-lives than the other isolates, indicating that they are susceptible to enzymatic degradation.
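As an illustration of how a sequence-derived extinction coefficient of the kind discussed above can be estimated, the following is a minimal sketch. It assumes the widely used composition-based estimate at 280 nm (5500 per Trp, 1490 per Tyr and 125 per cystine, in M−1 cm−1); the source does not state which tool or formula produced Table 1, and the toy sequence is hypothetical, not one of the polyproteins.

```python
def extinction_coefficient_280(seq: str, all_cys_paired: bool = False) -> int:
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumed per-residue contributions: Trp 5500, Tyr 1490, cystine 125.
    Cystines are counted only when all cysteines are assumed to be paired.
    """
    seq = seq.upper()
    n_trp = seq.count("W")
    n_tyr = seq.count("Y")
    n_cystine = seq.count("C") // 2 if all_cys_paired else 0
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine


# Hypothetical toy peptide: 2 Trp, 2 Tyr, 2 Cys.
toy = "MWYYCCAW"
print(extinction_coefficient_280(toy))                       # reduced cysteines
print(extinction_coefficient_280(toy, all_cys_paired=True))  # one cystine
```

Because the estimate depends only on the Trp, Tyr and Cys counts, a markedly higher value for two isolates would point to a difference in the content of these residues in the analysed sequences.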
The tertiary structure analysis of the isolate SARS-CoV-2_HKU-SZ-001_2020 ORF1ab polyprotein is given in Table 2.
It was observed that the polyprotein showed 98.94 percent identity to the PDB structure 6nur.1.A, which is a hetero-1-2-1-mer. The polyprotein is an RNA-directed RNA polymerase. The template corresponds to the SARS-coronavirus NSP12 bound to NSP7 and NSP8 co-factors. In SARS, it is a nonstructural protein, with NSP12 being the RNA-dependent RNA polymerase and the co-factors NSP7 and NSP8 forming hexadecameric complexes that also act as a processivity clamp for RNA polymerase and primase. As in SARS-CoVs, this structure in SARS-CoV-2 may be involved in the machinery of core RNA synthesis and can be a template for exploring antiviral properties.
The phylogenetic tree of the seven polyproteins
It is seen that the glycoproteins are similar in all the isolates. Multiple alignment of the polyproteins of SARS-CoV-2 is shown in Fig. S1 (ESI).
Based on the polyprotein's function in SARS-CoV and its identity to that of SARS-CoV-2, it is possible that it has the same functions in SARS-CoV-2: an RNA polymerase that performs de novo initiation and primer extension, with possible exonuclease activities. Because the activity itself is primer dependent, it can be useful for understanding the mechanism of SARS-CoV-2 replication and can be used as an antiviral target.
The two parts of the main protein from the whole genome of SARS-CoV-2 aligned with two SARS CoV proteins and the ligand binding sites were similar; the alignment positions, number of amino acids and ligands and the interacting residues are given in Table 3.
The polyprotein also has an identity of 19.74 percent to an ABC-type uncharacterized transport system periplasmic component-like protein; this protein is known to be a substrate-binding protein and possible binding can be explored here.
The homology model developed from residues 254 to 13 480 in Frame 2, with 4409 amino acids, from the complete genome sequence of the SARS-CoV-2 isolate Wuhan-Hu-1 (GenBank Accession Number MN908947.3), which is a linear ss-RNA of 29 903 bp, showed interesting template alignments. In all, the model aligned with 50 templates from the PDB database, most of them being replicase polyprotein 1ab, which is a SARS-CoV papain-like protease. The maximum similarity of 97.3 percent was with the template structure of an Nsp9 protein from SARS-coronavirus, indicating that this novel coronavirus has a high degree of similarity with the SARS-coronavirus; this can be used for gaining insights into vaccine development. Nsp9 is an RNA-binding protein and has an oligosaccharide/oligonucleotide fold-like fold; this protein can have an important function in the replication machinery of the virus and can be important when designing antivirals for this virus.
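The coordinates quoted above can be checked with simple arithmetic. The sketch below assumes that 254 and 13 480 are 1-based, inclusive nucleotide positions and that "Frame 2" means the second forward reading frame; under those assumptions the span holds exactly 4409 codons. The translation helper uses the standard genetic code and is only illustrative; it is not the tool used in the study.

```python
from itertools import product

# Standard genetic code in the usual T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reading_frame(start_nt: int) -> int:
    """Forward reading frame (1, 2 or 3) of a 1-based nucleotide start position."""
    return (start_nt - 1) % 3 + 1


def codon_count(start_nt: int, end_nt: int) -> int:
    """Number of whole codons in a 1-based, inclusive nucleotide span."""
    span = end_nt - start_nt + 1
    if span <= 0 or span % 3:
        raise ValueError("span must be a positive multiple of three")
    return span // 3


def translate(nt: str) -> str:
    """Translate a nucleotide string codon by codon; unknown codons become 'X'."""
    nt = nt.upper().replace("U", "T")
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


print(reading_frame(254))        # 2
print(codon_count(254, 13480))   # 4409
print(translate("AUGGCUUAA"))    # MA*
```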
Two models were developed, one of SARS-CoV-2 3CLpro protein from residues 3268–3573 in Frame 2 with 306 amino acids and the other of SARS-CoV-2 PLPro protein from the part of the genome residues 1568–1882 in Frame 2 with 315 amino acids of the SARS-CoV-2 isolate Wuhan-Hu-1 (GenBank Accession Number MN908947.3). The models exhibited similarity to the 3C-like proteinase and a papain-like protease/deubiquitinase proteins, which are known antiviral drug targets. These 3CLpro and PLpro models constitute starting points for anti-SARS-CoV-2 drug design as the corresponding SARS proteins are validated drug targets.
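The two residue ranges can be cut out of the 4409-residue translation with 1-based, inclusive slicing. The sketch below is a minimal illustration; the helper name `region` and the placeholder sequence are hypothetical, and only the coordinates come from the text.

```python
def region(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


# Coordinates as given in the text (residue positions in the 4409-aa translation).
REGIONS = {"PLpro": (1568, 1882), "3CLpro": (3268, 3573)}

# Placeholder of the right length; a real run would use the translated sequence.
placeholder = "A" * 4409

lengths = {name: len(region(placeholder, s, e)) for name, (s, e) in REGIONS.items()}
print(lengths)  # {'PLpro': 315, '3CLpro': 306}
```

The inclusive convention matters here: 3573 − 3268 + 1 gives the 306 residues quoted for the 3CLpro region, and 1882 − 1568 + 1 gives the 315 residues quoted for PLpro.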
Ligand binding with these proteins and their action on viral replication and inactivation can be useful in stopping the viral replication. The homology models of the 4409 amino acid residues of the SARS-CoV-2 isolate Wuhan-Hu-1 with the ligand association with templates 2a5i and 3e9s are shown in Fig. 2 and 3 respectively.
Fig. 2 Homology model of SARS-CoV-2 3CLpro derived from the SARS PDB template 2a5i with aza-peptide epoxide (APE) ligand binding.
Fig. 3 Homology model of SARS-CoV-2 PLpro derived from the SARS PDB template 3e9s with GRL0617 ligand binding.
The statistics of the structural comparison with PDB templates are given in Table 4; the proteins from SARS-CoV-2 are significantly close to the proteins of SARS-CoVs, and the amino acid alignments in the binding region of both viruses are the same.
The alignment of the 306 residues from 3268–3573 aa of the novel SARS-CoV-2 with the template 2a5i is shown in Fig. 4, and the alignment of the 315 residues from 1568–1882 aa of the novel SARS-CoV-2 with the template 3e9s is shown in Fig. 5.
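The percent identity figures reported for such alignments can be reproduced from a pairwise alignment in a few lines. The sketch below assumes identity is counted over aligned columns that contain no gap in either sequence; alignment tools differ in the denominator they use, so this is one convention and not necessarily the one behind the reported values. The toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment: 7 gap-free columns, 6 identical.
model_row = "ACDE-FGH"
template_row = "ACDQWFGH"
print(round(percent_identity(model_row, template_row), 2))
```

Under this convention, and assuming a gap-free alignment, 96 percent identity over a 306-residue region would correspond to roughly 12 differing positions.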
A PSI-BLAST with the 306 amino acid residues, 3268–3573, in Frame 2 from the SARS-CoV-2 isolate Wuhan-Hu-1 (GenBank Accession Number MN908947.3) was conducted to ascertain the conservation of these amino acids in 831 genome sequences of SARS-CoV-2, and a complete match was found in these genomes of the virus. The fact that the region is conserved in all these SARS-CoV-2 sequences further emphasizes that the aza-peptide epoxide, through this interaction with the protein, can be used as an antiviral in SARS-CoV-2. Similarly, a PSI-BLAST with the 315 amino acid residues, 1568–1882, in Frame 2 from the SARS-CoV-2 isolate Wuhan-Hu-1 (GenBank Accession Number MN908947.3) was conducted to ascertain the conservation of these amino acids in 831 genome sequences of SARS-CoV-2, and a complete match was again found in these genomes of the virus. The fact that the region is conserved in all these SARS-CoV-2 sequences further emphasizes that the ligand GRL0617, through this interaction with the protein, can be used as an antiviral in SARS-CoV-2.
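A "complete match" across genomes can be illustrated with a much simpler check than PSI-BLAST: an exact substring search of the query region in each translated sequence. The sketch below is that simplification, not the procedure used in the study; the function names, isolate names and toy sequences are all hypothetical.

```python
def conserved_fraction(query: str, proteomes: dict[str, str]) -> float:
    """Fraction of sequences that contain the query region exactly."""
    if not proteomes:
        raise ValueError("no sequences supplied")
    hits = sum(query in seq for seq in proteomes.values())
    return hits / len(proteomes)


def non_matching(query: str, proteomes: dict[str, str]) -> list[str]:
    """Names of sequences in which the query is not found exactly."""
    return sorted(name for name, seq in proteomes.items() if query not in seq)


# Hypothetical toy data standing in for translated genomes.
toy_proteomes = {
    "isolate_A": "MMSGFRKMA",
    "isolate_B": "TTSGFRKQ",
    "isolate_C": "MMSGFRQMA",  # one substitution inside the region
}
print(conserved_fraction("SGFRK", toy_proteomes))
print(non_matching("SGFRK", toy_proteomes))
```

A region that is fully conserved, as reported for the 831 genomes, would give a fraction of 1.0 and an empty list of non-matching sequences.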
The important templates that aligned with these 4409 amino acid residues of the whole genome of the SARS-CoV-2 isolate Wuhan-Hu-1 were 2a5i of the PDB database, which is a crystal structure of the SARS coronavirus main peptidase inhibited by an aza-peptide epoxide in the space group C2 (ref. 39), and 3e9s of the PDB database, which is a papain-like protease/deubiquitinase in complex with the ligand GRL0617, an inhibitor of SARS virus replication (ref. 38). The model with template 2a5i of the PDB database shows that an aza-peptide epoxide (APE; kinact/Ki = 1900 (±400) M−1 s−1), which is a known anti-SARS agent, can be used to develop an irreversible inhibitor against this molecular target. The substrate-binding properties and structural and chemical complementarity of this aza-peptide epoxide could be explored for an anti-SARS-CoV-2 agent. The structure of APE, which is ethyl (2S)-4-[(3-amino-3-oxo-propyl)-[[(2S)-2-[[(2S)-4-methyl-2-phenylmethoxycarbonylamino-pentanoyl]amino]-3-phenyl-propanoyl]amino]amino]-2-hydroxy-4-oxo-butanoate, with the covalent bond formed with the catalytic cysteine and the opened epoxide groups producing hydroxyl groups, is shown in Fig. 6.
Fig. 6 Structure of the aza-peptide epoxide (APE) with covalent bonds formed between the catalytic cysteine residue and opened epoxide groups producing hydroxyl groups.
The model with template 3e9s of the PDB database shows that the coronavirus PLpro can complex with the ligand GRL0617, known to be a potent inhibitor of viral replication in SARS.
The genome of the SARS-CoV-2 isolate Wuhan-Hu-1 (MN908947.3) encodes a 4409 aa long protein along with the other glycoproteins and polyproteins. The homology modelling of this protein showed sequence and structural alignment with two SARS proteases with structural accession numbers 3e9s and 2a5i at positions 1568–1882 and 3268–3573 respectively. The results suggest that the inhibition of virus replication by the TTT ligand (GRL0617) and an aza-peptide epoxide occurs via binding with PLpro and 3CLpro respectively. The structural similarity to these templates is 83% and 96% respectively. The multiple sequence alignment shows complete conservation of the sequence, suggesting a high degree of homology. The comparison of the hydrophobic interactions, hydrogen bonds and salt bridges of the constructed model of the novel coronavirus protein from positions 3268–3573 aa with those of the template 2a5i with the ligand AZP (the aza-peptide epoxide) is given in Table S2 (ESI†). On comparison, it was observed that the binding properties are the same except for the presence of a water bridge in the template 2a5i.
The comparison of the hydrophobic interactions, hydrogen bonds and π-stacking of the constructed model of the novel coronavirus protein from positions 1568–1882 aa, bound to the small-molecule noncovalent lead inhibitor, with those of the template 3e9s is given in Table S3 (ESI†). On comparison, it was observed that the binding properties are the same except for an additional π-stacking at Tyr in the template 3e9s. This shows that there is a high possibility of these antiviral compounds binding to the regions of the novel coronavirus protein that are homologous to the SARS protein.
The comparison of the hydrophobic interaction for the binding of the ligand AZP between the SARS-CoV-2 protein and the template 2a5i of SARS-CoVs is shown in Fig. 7 and the comparison of the same between the SARS-CoV-2 protein and the template 3e9s of SARS-CoVs is shown in Fig. 8. It was observed that the interaction is the same in both proteins with the same amino acids participating in the interaction, indicating that there is a possibility that these ligands with antiviral properties can bind to the new virus.
Fig. 7 Comparison of the hydrophobic interaction of the binding of the ligand AZP between the SARS-CoV-2 protein and the template 2a5i of SARS CoVs.
Fig. 8 Comparison of the hydrophobic interaction of the binding of the ligand between the SARS-CoV-2 protein and the template 3e9s of SARS CoVs
Fig. 9 (a) Interaction profile of GRL0617 with amino acid residues of the homology model of SARS-CoV-2 papain-like protease. (b) Interaction profile of GRL0617 with amino acid residues of the template 3e9s.
Both show eight interacting amino acids, a few of which exhibit multiple interactions. The complex with PLpro shows very high affinity, i.e. −10.2 kcal mol−1, as compared to the complex with the model, which shows lower affinity, i.e. −7.9 kcal mol−1. The comparison of conserved amino acids shows Asp1643 in the homology model and Asp165 in the template, both of which show H bonds, at distances of 2.60 and 2.07 respectively. Additionally, Asp165 shows a pi–sigma bond at a distance of 3.53 and a pi–anion interaction at a distance of 4.39, accounting for the stronger affinity in PLpro as against the homology model. In the case of Pro1644 in the homology model, there was an alkyl bond at a distance of 4.70, whereas the template shows a pi–alkyl bond at a distance of 5.04, the pi–alkyl bond being stronger than the alkyl bond. Similarly, Pro1632 in the homology model shows a pi–alkyl bond at a distance of 5.06 and the PLpro shows 2 pi–alkyl bonds at distances of 4.31 and 4.72; the two pi–alkyl bonds at a close distance account for the stronger affinity of the template. Ala1635 in the homology model and Leu163 in the PLpro are both hydrophobic amino acids and show alkyl bonds at distances of 3.80 and 4.25 respectively. Ala1635 additionally exhibits a pi–alkyl bond at a distance of 4.24. Thr1642 in the homology model and Gln270 in the PLpro both exhibit H bonds. However, Gln270 exhibits 2 H bonds at distances of 2.83 and 2.74 via its –NH group, whereas Thr1642 exhibits 1 H bond at a distance of 2.62 via its –OH and 1 pi–sigma bond at a distance of 3.87. Phe1636 in the homology model and Tyr269 in the template are both aromatic amino acids, and both show pi–pi interactions. Phe1636 exhibits pi–pi stacking at a distance of 5.60, whereas Tyr269 exhibits 3 pi–pi T-shaped bonds at distances of 5.06, 5.25 and 5.44. Tyr269 also exhibits an additional H bond at a distance of 3.07.
The comparison of the interactions of AZP with the amino acid residues of the homology model and of 3CLpro (the template 2a5i) is shown in Fig. 10a and b respectively.
Fig. 10 (a) Interaction profile of the ligand AZP with the amino acid residues of the homology model of SARS-CoV-2 3C-like protease. (b) Interaction profile of the ligand AZP with the amino acid residues of the template 2a5i.
Both show five interacting amino acids, and the conserved amino acids are Gln3456 in the homology model and Gln110 in 3CLpro, which show H bonds at distances of 2.40 and 2.39 respectively. The similarities present are Thr3292 in the homology model and Ser158 in 3CLpro, which show H bonds at distances of 2.78 and 2.71 respectively; both have an –OH group that participates in the H bond. Tyr3321 in the homology model and Lys102 in 3CLpro show H bonds at distances of 2.73 and 2.97 respectively, which reflects the electronegative group involved, i.e. participation of –OH in the former and –NH in the latter. Cys3412 in the homology model and Val297 in 3CLpro both show pi–alkyl bonds, at distances of 4.82 and 5.30 respectively. Asp3454 in the homology model and Phe294 in 3CLpro exhibit an H bond at a distance of 1.97; the latter exhibits pi–pi stacking at a distance of 4.51. This is responsible for the slightly higher affinity of AZP to 3CLpro than of AZP to the model, the former having an affinity of −7.4 kcal mol−1 and the latter −7.1 kcal mol−1.
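To put the difference between −7.4 and −7.1 kcal mol−1 in perspective, a docking score can be read, very approximately, as a binding free energy and converted to a dissociation constant through ΔG = RT ln Kd. The sketch below makes that assumption explicitly, together with an assumed temperature of 298.15 K; docking scores are empirical estimates, and an equilibrium Kd is only a loose description of a covalent, irreversible inhibitor, so the numbers are order-of-magnitude illustrations only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def kd_from_dg(dg_kcal: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant (M) implied by a binding free energy, dG = RT ln Kd."""
    return math.exp(dg_kcal / (R_KCAL * temperature_k))


def fold_difference(dg_stronger: float, dg_weaker: float,
                    temperature_k: float = 298.15) -> float:
    """How many times larger the Kd of the weaker binder is."""
    return kd_from_dg(dg_weaker, temperature_k) / kd_from_dg(dg_stronger, temperature_k)


# Scores quoted in the text (kcal/mol).
print(f"{kd_from_dg(-7.1):.2e} M")            # model with AZP
print(f"{fold_difference(-7.4, -7.1):.2f}x")  # 3CLpro template vs. model
print(f"{fold_difference(-10.2, -7.9):.0f}x") # PLpro template vs. model
```

Under these assumptions, the 0.3 kcal mol−1 gap for 3CLpro corresponds to a difference of less than twofold in Kd, whereas the 2.3 kcal mol−1 gap reported for PLpro corresponds to a difference of roughly 50-fold.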
It is also interesting to note that, although the alignment studies showed 82% and 96% identity for Model 1 (obtained using the template 3E9S) and Model 2 (obtained using the template 2A5I) to PLpro and 3CLpro respectively, the binding-cavity interactions were very similar for Model 2 even though few of the interacting amino acid residues were conserved. For Model 1, the binding cavity showed some similarity in terms of the cavity milieu, but the strength of the interactions differed because of multiple additional stabilizing interactions in the template.
Because we had no prior knowledge of the target pocket, we docked these ligands to the whole surface of each protein, which allowed us to see the difference in the protein–ligand interactions in both models. The docking involved several runs and energy calculations to arrive at a favorable protein–ligand complex. The interaction profile of the ligand AZP with amino acid residues of the homology model of SARS-CoV-2 3C-like protease showed an affinity of −7.1 kcal mol−1, and the interaction profile of GRL0617 with amino acid residues of the homology model of SARS-CoV-2 papain-like protease showed an affinity of −7.9 kcal mol−1.
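Docking against the whole protein surface is commonly set up by enclosing every atom in the search box. The sketch below shows one way to derive such a box from PDB coordinates and write it with AutoDock Vina's box keys (center_x/y/z, size_x/y/z) and num_modes. The padding of 5 Å, the helper names and the two-atom toy structure are assumptions for illustration; the source does not report the box or settings actually used.

```python
def blind_docking_box(pdb_text: str, padding: float = 5.0):
    """Return (center, size) of a box enclosing all atoms plus padding on each side."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Fixed-column PDB format: x, y, z occupy columns 31-38, 39-46, 47-54.
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError("no atom records found")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


def vina_box_config(center, size, num_modes: int = 9) -> str:
    """Format the box as AutoDock Vina configuration lines."""
    keys = ("center_x", "center_y", "center_z", "size_x", "size_y", "size_z")
    lines = [f"{k} = {v:.3f}" for k, v in zip(keys, (*center, *size))]
    lines.append(f"num_modes = {num_modes}")
    return "\n".join(lines)


def toy_atom(serial: int, x: float, y: float, z: float) -> str:
    """Hypothetical helper: write a minimal fixed-column ATOM record."""
    return f"ATOM  {serial:5d}  CA  ALA A{serial:4d}    {x:8.3f}{y:8.3f}{z:8.3f}"


toy_pdb = "\n".join([toy_atom(1, 0.0, 0.0, 0.0), toy_atom(2, 10.0, 20.0, 30.0)])
center, size = blind_docking_box(toy_pdb)
print(vina_box_config(center, size))
```

A whole-surface box is much larger than a pocket-centred one, so the search is less thorough per unit volume; this is one reason blind-docking scores are best compared between runs set up in the same way.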
The similarity in the amino acids involved in the hydrophobic interactions that are short-range interactions and have an important role in the affinities of the ligands and receptors shows that the proteins of SARS-CoV-2 may bind with the same affinity as seen in SARS-CoVs, and this also shows a similar action of the ligand as seen in SARS-CoVs, indicating that these ligands could possibly be used as antivirals in SARS-CoV-2.
Targeting this part of the genome of SARS-CoV-2 with antiviral compounds that have been shown to bind in the similar region of the SARS virus can have implications for the development of an effective antiviral compound against SARS-CoV-2. SARS-CoV-2 shows homology with the SARS coronaviral proteases, papain-like protease (PLpro) and 3C-like protease (3CLpro). PLpro processes the viral polyprotein and also strips ubiquitin and the ubiquitin-like interferon (IFN)-stimulated gene 15 (ISG15) from host proteins to facilitate coronavirus replication and help in evading the immune response of the host. These inhibitors can also play a role in disrupting signalling cascades in infected cells, protecting the uninfected cells.
The chemical GRL0617 is 5-amino-2-methyl-N-[(1R)-1-(1-naphthalenyl)ethyl]benzamide and is known to inhibit the papain-like protease enzyme present in SARS-CoVs. This protease is a potential target for antiviral compounds. We found that SARS-CoV-2 PLpro has homology with SARS-CoV PLpro, which complexes with the ligand GRL0617, and that the binding sites for this ligand in the SARS-CoV-2 protease are very similar. This compound inhibits the enzyme that is required for the cleavage of the viral protein from SARS-CoVs. The enzyme also cleaves ubiquitin and has structural homology with deubiquitinases (DUBs) of the ubiquitin-specific protease family; GRL0617 binds in the S4 and S3 enzyme subsites that accommodate the C-terminal tail of ubiquitin. Our results indicate that an aza-peptide epoxide, an irreversible protease inhibitor, and GRL0617, a viral replication inhibitor, can possibly be used to develop antivirals against the novel SARS-CoV-2.
Reference & Source information: https://pubs.rsc.org/