Understanding humoral responses to SARS-CoV-2 is critical for improving diagnostics, therapeutics, and vaccines. Deep serological profiling of 232 COVID-19 patients and 190 pre-COVID-19 era controls using VirScan revealed over 800 epitopes in the SARS-CoV-2 proteome, including 10 epitopes likely recognized by neutralizing antibodies. Pre-existing antibodies in controls recognized SARS-CoV-2 ORF1, while only COVID-19 patients primarily recognized spike and nucleoprotein. A machine learning model trained on VirScan data predicted SARS-CoV-2 exposure history with 99% sensitivity and 98% specificity; a rapid Luminex-based diagnostic was developed from the most discriminatory SARS-CoV-2 peptides. Individuals with more severe COVID-19 exhibited stronger and broader SARS-CoV-2 responses, weaker antibody responses to prior infections, and higher incidence of CMV and HSV-1, possibly influenced by demographic covariates. Among hospitalized patients, males make greater SARS-CoV-2 antibody responses than females.
In this study we have provided an in-depth serological description of antibody responses to SARS-CoV-2, using VirScan to analyze sera from COVID-19 patients and pre-COVID-19 era controls. We mapped the landscape of linear epitopes in the SARS-CoV-2 proteome, characterized their specificity or cross-reactivity, and investigated serological and viral exposure history correlates of COVID-19 severity.

Identification of SARS-CoV-2 epitopes recognized by COVID-19 patients

VirScan detected robust antibody responses to SARS-CoV-2 in COVID-19 patients. These were primarily directed against the S and N proteins, with significant cross-reactivity to SARS-CoV and milder cross-reactivity with the distantly related MERS-CoV and seasonal HCoVs. Cross-reactive responses to SARS-CoV-2 ORF1 were frequently detected in pre-COVID-19 era controls, suggesting that these result from antibodies induced by other pathogens.
At the population level, most SARS-CoV-2 epitopes were recognized by both IgA and IgG antibodies. We found individuals often exhibited a “checkerboard” pattern, utilizing either IgG or IgA antibodies against a given epitope. This suggests that a given IgM clone often evolves into either an IgG or an IgA antibody, potentially influenced by local signals, and that, within an individual, there may often be a largely monoclonal response to a given epitope.
Examination of the humoral response to SARS-CoV-2 at the epitope level using the triple-alanine scanning mutagenesis library revealed 145 epitopes in S, 116 in N, and 562 across the remainder of the SARS-CoV-2 proteome (table S10). Most S epitopes were located on the surface of the protein or within unstructured regions that often abut, but seldom overlap, glycosylation sites (fig. S11). These epitopes ranged from private to highly public, with one public epitope cluster being recognized by 79% of COVID-19 patients. Triple-alanine scanning mutagenesis showed highly conserved antibody footprints for some epitope clusters and diverse antibody footprints for others, indicating varying levels of conservation at the antibody-epitope interface among individuals (fig. S8). Peptides containing public epitopes could be used to isolate and clone antibodies from B-cells bearing antigen-specific BCRs. If these antibodies are found to lack protective effects or have deleterious effects, these regions could be mutated in future vaccines to divert the immunological response to other regions of S that might have more protective effects. Epitopes also varied in cross-reactivity, which can be explained by the presence or absence of sequence conservation between seasonal HCoVs and SARS-CoV-2 at these regions. Antibodies against several conserved epitopes in HCoVs seemed to be anamnestically boosted in COVID-19 patients. Altogether these data help explain why many serological assays for SARS-CoV-2 produce false positives, and should be taken as a cautionary note for those trying to develop such assays.
Development of SARS-CoV-2 signature peptides for detecting seroconversion by Luminex

Using machine learning models trained on VirScan data, we developed a classifier that predicts SARS-CoV-2 exposure history with 99% sensitivity and 98% specificity. We identified peptides frequently and specifically recognized by COVID-19 patients and used these to create a Luminex assay that predicted SARS-CoV-2 exposure with 90% sensitivity and 95% specificity. Remarkably, the Luminex assay only required three peptides to obtain performance comparable to full antigen ELISAs and could be further optimized in the future. This highlights the utility of VirScan-based serological profiling in the development of rapid and efficient diagnostic assays based on public epitopes.
Correlates of severity in COVID-19 patients

An important goal is to uncover serological correlates of COVID-19 severity. To this end, we compared cohorts of COVID-19 patients who had (H) or had not (NH) required hospitalization. Using both VirScan and the COVID-19 Luminex assay, we noticed a striking and somewhat counterintuitive increase in recognition of peptides derived from the SARS-CoV-2 S and N proteins among the H group, with more extensive epitope spreading. Whether this is a cause or a consequence of severe disease is not clear. Individuals whose innate and adaptive immune responses are not able to quell the infection early may experience a higher viral antigen load and a prolonged period of antibody evolution and epitope spreading. Consequently, these patients might develop stronger and broader antibody responses to SARS-CoV-2 and could be more likely to have hyperinflammatory reactions such as cytokine storms that increase the probability of hospitalization. We noticed that hospitalized males had stronger antibody responses to SARS-CoV-2 than hospitalized females. This may indicate that males in this group are less able to control the virus soon after infection and is consistent with reported differences in disease outcomes for males and females (23, 24).

VirScan allowed us to examine viral exposure history, and this revealed two striking correlations. First, the seroprevalence of CMV and HSV-1 was much greater in the H group than in the NH group. The demographic differences in our relatively small cohort of H versus NH COVID-19 patients make it impossible for us to determine with certainty whether CMV or HSV-1 infection impacts disease outcome or is simply associated with other covariates such as age, race and socioeconomic status. While CMV prevalence does slightly increase with age after 40, its prevalence also differs greatly among ethnic and socioeconomic groups (31, 32).
CMV is a chronic herpesvirus that is known to have a profound impact on the immune system: it can skew the naïve T-cell repertoire (33), decrease T and B cell function (34), and is associated with higher systemic levels of inflammatory mediators (35) and increased mortality of people over 65 years of age (36). The effects of CMV on the immune system could potentially impact COVID-19 outcomes.
The second striking correlation we observed was a significant decrease in the levels of antibodies targeting ubiquitous viruses such as Rhinoviruses, Enteroviruses, and Influenza viruses, in COVID-19 H patients compared with NH patients. When we examined only the CMV+ or HSV-1+ individuals in the two groups, we found that the strength of the antibody response to CMV and HSV-1 peptides was also reduced in the H group. We examined the effects of age on viral antibody levels in a pre-COVID-19 era cohort and found a diminution with age in the antibody response against viral peptides differentially recognized between the H and NH groups, consistent with previous studies on the effects of aging on the immune system (25). This inferred reduced immunity during aging could impact the severity of COVID-19 outcomes. In correlative analyses such as these, it is difficult to draw strong conclusions about causality given the demographic differences in the NH versus H groups. The NH group is younger and has a higher percentage of Caucasians and females (average age 42, 66% female) compared to the H group (average age 58, 42% female) (fig. S2), consistent with well-documented demographic skews in severely-affected COVID-19 patients (23, 24). However, even if age and other demographic factors are covariates, CMV seropositivity and age-related reduction in antibody titers against viral antigens as described here could still impact the severity of infection. To test these hypotheses, a much larger cohort of COVID-19 patients with severe and mild disease that could be matched for age, race and sex is required. Such future studies have the potential to enhance our understanding of the biological mechanisms underlying variable outcomes of COVID-19.
Deep serological profiling can provide a window into the breadth of viral responses, how they differ in patients with diverse outcomes, and how past infections may influence present responses to viral infections. Understanding the epitope landscape of SARS-CoV-2, particularly within S, provides a stepping stone to the isolation and functional dissection of both neutralizing antibodies and antibodies that might exacerbate patient outcomes through ADE and could inform the production of improved diagnostics and vaccines for SARS-CoV-2.
Materials and methods

Sources of serum used in this study

Cohort 1

Plasma samples were from volunteers recruited at Brigham and Women’s Hospital who had recovered from a confirmed case of COVID-19. All volunteers had a PCR-confirmed diagnosis of COVID-19 prior to being admitted to the study. Volunteers were invited to donate specimens after recovering from their illness and were required to be symptom free for a minimum of 7 days. Participants provided verbal and/or written informed consent and provided blood specimens for analysis. Clinical data including date of initial symptom onset, symptom type, date of diagnosis, date of symptom cessation, and severity of symptoms were recorded for all participants, as were results of COVID-19 molecular testing. Participation in these studies was voluntary and the study protocols have been approved by the respective Institutional Review Boards.
Cohort 2

Serum samples from patients with PCR-confirmed COVID-19 cases while admitted to the hospital and from patients who were actively enrolled into a prospective study of COVID-19 infection were provided by collaborators from the University of Washington. Residual clinical blood specimens were used. Clinical data, including symptom duration and comorbidities, were extracted from medical records and from participant-completed questionnaires. All study procedures have been approved by the University of Washington Institutional Review Board.
Cohort 3

Plasma samples were provided by collaborators from the Ragon Institute of MGH, MIT and Harvard and Massachusetts General Hospital from study participants in three settings: 1) PCR-confirmed COVID-19 cases while admitted to the hospital; 2) PCR-confirmed SARS-CoV-2 infected cases seen in an ambulatory setting; 3) PCR-confirmed COVID-19 cases in their convalescent stage. All study participants provided verbal and/or written informed consent. Basic data on days since symptom onset were recorded for all participants, as were results of COVID-19 molecular testing. Participation in these studies was voluntary and the study protocols have been approved by the Partners Institutional Review Board.
Cohort 4

Patients were enrolled in the Emergency Department (ED) at Massachusetts General Hospital in Boston from 3/15/2020 to 4/15/2020, during the peak of the COVID-19 surge, with an institutional IRB-approved waiver of informed consent. These included patients 18 years or older with a clinical concern for COVID-19 upon ED arrival, and with acute respiratory distress with at least one of the following: 1) tachypnea ≥ 22 breaths per minute, 2) oxygen saturation ≤ 92% on room air, 3) a requirement for supplemental oxygen, or 4) positive-pressure ventilation. A blood sample was obtained in a 10 mL EDTA tube concurrent with the initial clinical blood draw in the ED. Day 3 and Day 7 blood draws were obtained if the patient was still hospitalized at those times. Clinical course was followed to 28 days post-enrollment, or until hospital discharge if that occurred after 28 days. Enrolled subjects who were SARS-CoV-2 positive were categorized into four outcome groups: 1) requiring mechanical ventilation with subsequent death, 2) requiring mechanical ventilation and recovered, 3) requiring hospitalization on supplemental oxygen but not requiring mechanical ventilation, and 4) discharged from the ED and not subsequently readmitted with supplemental oxygen. Those who were SARS-CoV-2 negative were categorized as Controls. Demographic, past medical and clinical data were collected and summarized for each outcome group, using medians with interquartile ranges and proportions with 95% confidence intervals, where appropriate.
Cohorts 5 and 6

Longitudinal Hopkins Cohort: Remnant serum specimens were collected longitudinally from PCR-confirmed COVID-19 patients seen at Johns Hopkins Hospital. Samples were de-identified prior to analysis, with linked information on time since symptom onset. Specimens were obtained and utilized in accordance with an approved IRB protocol.
Cohort 9

Plasma samples were collected from consented participants of the Partners Biobank program at BWH during the period from July to August 2016 from 37 female and 51 male individuals with ages ranging from 18 to 85 years old. Plasma was harvested after a 10 min 1200 × g Ficoll density centrifugation from blood that was diluted 1:1 in phosphate buffered saline. Samples were frozen at −30°C in 1 mL aliquots. All samples were collected with Partners Institutional Review Board (IRB) approval.

Blood sample collection methods

For Cohorts 1-3: Blood samples were collected into EDTA (ethylenediaminetetraacetic acid) tubes and spun for 15 min at 2600 rpm according to standard protocol. Plasma was aliquoted into 1.5 mL cryovials and stored at −80°C until analyzed. Only de-identified plasma aliquots including metadata (e.g., days since symptom onset, severity of illness, hospitalization, ICU status, survival) were shared for this study. When appropriate for non-convalescent samples, plasma/serum was also heat inactivated at 56°C for 60 min, and stored at ≤20°C until analyzed.
For Cohort 4: Blood samples were collected in EDTA tubes, and processed no more than 3 hours post blood draw in a Biosafety Level 2+ laboratory on site. Whole blood was diluted with room temperature RPMI medium in a 1:2 ratio to facilitate cell separation for other analyses using the SepMate PBMC isolation tubes (STEMCELL) containing 16 mL of Ficoll (GE Healthcare). Diluted whole blood was centrifuged at 1200 rcf for 20 min at 20°C. After centrifugation, plasma (5 mL) was pipetted into 15 mL conical tubes and placed on ice during PBMC separation procedures. Plasma was then centrifuged at 1000 rcf for 5 min at 4°C, pipetted in 1.5 mL aliquots into 3 cryovials (4.5 mL total), and stored at −80°C. For the current study, samples (200 μL) were first randomly allocated onto a 96-well plate based on disease outcome grouping.

Design and cloning of the SARS-CoV-2 tiling and triple-alanine scanning library

Multiple VirScan libraries were constructed as described below. We created ~200 nt oligos encoding peptide sequences 56 amino acids in length, tiled with 28-amino acid overlap through the proteomes of all coronaviruses known to infect humans, including HCoV-NL63, HCoV-229E, HCoV-OC43, HCoV-HKU1, SARS-CoV, MERS-CoV and SARS-CoV-2, as well as three closely related bat viruses: BatCoV-Rp3, BatCoV-HKU3 and BatCoV-279. For SARS-CoV-2 we included a number of coding variants available in early sequencing of the viruses. For SARS-CoV-2 we additionally made a 20-amino acid peptide library tiling every 5 amino acids, as well as triple-alanine mutant sequences scanning through all 56-mer peptides. Non-alanine amino acids were mutated to alanine, and alanines were mutated to glycine. Each peptide in all three libraries was encoded in two distinct ways such that there were duplicate peptides that could be distinguished by DNA sequencing.
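The tiling and triple-alanine scanning design described above can be illustrated with a minimal Python sketch. It is not the authors' library-design code: the function names are hypothetical, the handling of the final tile at the C-terminus is an assumption (the text does not say how protein ends were treated), and the one-residue scanning step is inferred from the later statement that three mutant peptides cover each position.

```python
def tile_peptides(protein, length=56, step=28):
    """Tile a protein into fixed-length peptides starting every `step` residues.

    Assumption: a final tile anchored at the C-terminus is added so that the
    end of the protein is always covered.
    """
    if len(protein) <= length:
        return [protein]
    starts = list(range(0, len(protein) - length + 1, step))
    if starts[-1] != len(protein) - length:
        starts.append(len(protein) - length)
    return [protein[s:s + length] for s in starts]


def triple_alanine_mutants(peptide):
    """Return one mutant per 3-residue window, scanning one residue at a time.

    Within the window, non-alanine residues become alanine and alanines
    become glycine, as described in the text.
    """
    mutants = []
    for i in range(len(peptide) - 2):
        window = "".join("G" if aa == "A" else "A" for aa in peptide[i:i + 3])
        mutants.append(peptide[:i] + window + peptide[i + 3:])
    return mutants
```

With these defaults, `tile_peptides(seq)` gives 56-mers overlapping by 28 residues, and `tile_peptides(seq, length=20, step=5)` gives the 20-mer tiling; a 56-mer yields 54 triple-alanine mutants under the one-residue scanning assumption.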
We reverse-translated the peptide sequences into DNA sequences that were codon-optimized for expression in Escherichia coli, that lacked restriction sites used in downstream cloning steps (EcoRI and XhoI), and that were unique in the 50 nt at the 5′ end to allow for unambiguous mapping of the sequencing results. Then we added adapter sequences to the 5′ and 3′ ends to form the final oligonucleotide sequences (table S1): these adapter sequences facilitated downstream PCR and cloning steps. Different adapters were added to each sub-library so that they could be amplified separately. The resulting sequences were synthesized on a releasable DNA microarray (Agilent). We PCR-amplified the DNA oligo library with the primers shown below, digested the product with EcoRI and XhoI, and cloned it into the EcoRI/SalI site of the T7FNS2 vector (5). We packaged the resultant library into T7 bacteriophage using the T7 Select Packaging Kit (EMD Millipore) and amplified the library according to the manufacturer’s protocol.

Adapters and primers used for the different libraries:

CoV 56-mer Library
5′ Adapter: 5′-GAATTCGGAGCGGT-3′
3′ Adapter: 5′-CACTGCACTCGAGA-3′
Forward Primer: 5′-AATGATACGGCGTGAATTCGGAGCGGT-3′
Reverse Primer: 5′-CAAGCAGAAGACGTCTCGAGTGCAGTG-3′

SARS-CoV-2 Triple-alanine scanning library
5′ Adapter: 5′-GAATTCCGCTGCGT-3′
3′ Adapter: 5′-CAGGGAAGAGCTCG-3′
Forward Primer: 5′-AATGATACGGCGGGAATTCCGCTGCGT-3′
Reverse Primer: 5′-CAAGCAGAAGACTCGAGCTCTTCCCTG-3′

SARS-CoV-2 20mer Library
5′ Adapter: 5′-GAATTCCGCTGCGT-3′
3′ Adapter: 5′-GTACTATACCTACGGAAGGCTCG-3′
Forward Primer: 5′-AATGATACGGCGGGAATTCCGCTGCGT-3′
Reverse Primer: 5′-TATCTCGCATAGCGCATATACTCGAGCCTTCCGTAGGTATAGTAC-3′

Phage immunoprecipitation and sequencing

We performed phage IP and sequencing as described previously or with slight modifications (5–8).
For the IgA and IgG chain isotype-specific immunoprecipitations, we substituted magnetic protein A and protein G Dynabeads (Invitrogen) with 6 μg Mouse Anti-Human IgG Fc-BIOT (Southern Biotech) or 4 μg Goat Anti-Human IgA-BIOT (Southern Biotech) antibodies. We added these antibodies to the phage and serum mixture and incubated the reactions overnight at 4°C. Next, we added 25 μL or 20 μL of Pierce Streptavidin Magnetic Beads (Thermo-Fisher) to the IgG or IgA reactions, respectively, and incubated the reactions for 4 hours at room temperature, then continued with the washing steps and the remainder of the protocol, as previously described (9).

Machine learning classifiers

Gradient boosting classifier models for the VirScan data were generated using the XGBoost algorithm (version 1.0.2). Classifier models were trained to discriminate either COVID-19+ from COVID-19− patients (n = 232 and n = 190, respectively) or severe from mild disease (n = 101 hospitalized patients and n = 131 non-hospitalized patients). Two models were generated in each case, one using the Z-scores for each VirScan peptide from the IgG immunoprecipitation as input features, and the other using the Z-scores for each VirScan peptide from the IgA immunoprecipitation as input features. Additionally, a third, logistic regression classifier was trained on the output probabilities from the IgG and IgA models to generate a combined prediction. The performance of each of the three models was assessed using a 20-fold cross-validation procedure, whereby predictions for each 5% of the data points were generated from a model trained on the remaining 95%. The SHAP package was used to identify the top discriminatory peptide features from each of the XGBoost models. The logistic regression models for the Luminex data were generated using the scikit-learn Python package.
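The cross-validation scheme used for the VirScan classifiers, in which each 5% of samples is predicted by a model trained on the other 95%, and the resulting sensitivity and specificity figures can be sketched in plain Python. This is an illustrative sketch only: a toy one-feature threshold model stands in for XGBoost, the shuffled fold assignment is an assumption, and all function names are hypothetical.

```python
import random


def kfold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def out_of_fold_predictions(features, labels, k, fit, predict):
    """Predict every sample with a model that never saw it during training."""
    predictions = [None] * len(labels)
    for held_out in kfold_indices(len(labels), k):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        for i in held_out:
            predictions[i] = predict(model, features[i])
    return predictions


def sensitivity_specificity(labels, predictions):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy stand-in model (not XGBoost): threshold halfway between the class means.
def fit_threshold(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


# Synthetic, perfectly separable data used only to exercise the functions.
toy_features = [0.0] * 20 + [10.0] * 20
toy_labels = [0] * 20 + [1] * 20
toy_predictions = out_of_fold_predictions(
    toy_features, toy_labels, 20, fit_threshold, predict_threshold
)
```

Because every prediction comes from a model that did not see the sample, the sensitivity and specificity computed from `toy_predictions` estimate performance on unseen data. The toy model assumes both classes are present in every training split.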
The raw MFI values were preprocessed using the RobustScaler function, then a logistic regression model was trained using the three most discriminatory SARS-CoV-2 peptides. The model performance was quantified by 10-fold cross-validation.

High-resolution epitope identification and clustering

For each position in the 56-mer, the relative enrichment for each amino acid was calculated as the mean fold-change of the three mutant peptides containing an alanine mutation at that location relative to the median fold-change of all alanine mutants for the 56-mer. Overlapping 56-mers were combined by taking the minimum value at each shared position to account for the possibility that an epitope is interrupted in one of the tiles by the peptide junction. To map the boundaries of antibody footprints from the triple-alanine scanning data for each sample, we used the hmmlearn Python package to develop a three-state HMM assuming a Gaussian distribution of relative-enrichment emissions for each state. Mapped antibody footprints smaller than 5 amino acids in length were removed from the subsequent analysis. Next, we performed a two-step hierarchical clustering procedure to identify the number of unique epitopes. First, for each protein all antibody footprints identified across the 169 COVID-19+ patient samples were clustered based on the start and stop locations predicted by the HMM classifier to generate epitope clusters. Next, to identify unique epitopes, we performed an additional step of hierarchical clustering on the samples with epitopes within each epitope cluster based on the relative-enrichment values of the triple-alanine mutants spanning the epitope (fig. S8).

Similarity-score calculation

Pairwise alignments were generated for the S proteins of SARS-CoV-2 and each of the four common HCoVs. Similarity scores were calculated separately for a 21-amino acid window centered at each position of the SARS-CoV-2 S protein.
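Returning to the relative-enrichment calculation in the high-resolution epitope identification section above, the per-position score and the merging of overlapping tiles can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes one mutant per 3-residue window with the list index equal to the window start, interprets "relative to" as a ratio, and uses hypothetical function names.

```python
from statistics import mean, median


def relative_enrichment(mutant_fold_changes, peptide_length=56):
    """Per-position relative enrichment from triple-alanine scanning data.

    mutant_fold_changes[i] is the fold-change of the mutant whose three
    substituted residues occupy positions i, i+1 and i+2 (assumed layout).
    """
    if len(mutant_fold_changes) != peptide_length - 2:
        raise ValueError("expected one mutant per 3-residue window")
    baseline = median(mutant_fold_changes)
    profile = []
    for pos in range(peptide_length):
        # Mutants whose window covers this position (fewer than three at the ends).
        first = max(0, pos - 2)
        last = min(pos, len(mutant_fold_changes) - 1)
        covering = mutant_fold_changes[first:last + 1]
        profile.append(mean(covering) / baseline)
    return profile


def merge_overlapping(profile_a, profile_b, overlap=28):
    """Join two adjacent tiles, keeping the minimum at each shared position.

    Assumes overlap >= 1.
    """
    shared = [min(a, b) for a, b in zip(profile_a[-overlap:], profile_b[:overlap])]
    return profile_a[:-overlap] + shared + profile_b[overlap:]
```

Positions where the profile drops well below 1 are those where alanine substitution reduces antibody binding, which is the signal the three-state HMM then segments into footprints. Taking the minimum over shared positions keeps such a drop even when the neighboring tile cuts the epitope at its junction.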
The mean similarity score between SARS-CoV-2 and the corresponding sequence of the other HCoV was calculated for each window using the BLOSUM62 substitution matrix with gap-opening and gap-extension penalties of −10 and −1, respectively. The maximum similarity score was calculated as the maximum value among the pairwise similarity scores between SARS-CoV-2 and each of the four common HCoVs for the sliding window centered at each position.

Luminex multiplex peptide epitope serology assays

Multiplexed SARS-CoV-2 peptide epitope assays were built using the peptides listed in table S9. Peptides were synthesized by the Ragon/MGH Peptide Core Facility with a propargylglycine (Pra, X) moiety at the amino terminus to facilitate crosslinking to Luminex beads using a “click” chemistry strategy as described (18). In brief, Luminex beads were first functionalized with amine-PEG4-azide and then reacted with the peptides to generate 20 different Luminex beads with attached peptides. Luminex bead-based serology assays were performed in 96-well U-bottom polypropylene plates using PBS + 0.1% bovine serum albumin as the assay buffer. Bead washes were done using PBS + 0.05% Triton X-100 by incubation for 1 min on a strong magnetic plate (Millipore-Sigma, Burlington, MA). All assay incubation times were 20 min. In the first step, beads were incubated with 20 μL of plasma samples. Samples used for the classifier were diluted 1:100; samples used to compare disease severity were diluted 1:300. After a wash step, bound IgA or IgG was detected by adding 40 μL of biotin-labeled anti-IgA or anti-IgG antibodies at 0.1 μg/mL (Southern Biotechnology, Birmingham, AL). Next, 40 μL of phycoerythrin (PE)-labeled streptavidin (0.2 μg/mL) (Biolegend, San Diego, CA) was added, and assay plates were analyzed on a Luminex FLEXMAP 3D instrument (Luminex Corporation, Austin, Texas) to generate median fluorescence intensity (MFI) values to quantify peptide-specific IgA or IgG levels.
ELISA serology assays

ELISAs were performed separately using the SARS-CoV-2 N protein, S protein, or the S receptor-binding domain (RBD). 96-well plates were coated with antigen overnight. The plates were then blocked in PBS + 3% BSA. After washing with PBS + 0.05% Tween-20, the plasma samples were diluted 1:100, added to the plates and incubated overnight at 4°C. Following incubation, the plates were washed three times with PBS + 0.05% Tween-20. The bound IgG was detected by adding anti-Human IgG-alkaline phosphatase (Southern Biotech, Birmingham, AL) and incubating for 90 min at room temperature. The plates were washed an additional three times, after which p-nitrophenyl phosphate solution (1.6 mg/mL in 0.1 M glycine, 1 mM ZnCl2, 1 mM MgCl2, pH 10.4) was added to each well and allowed to develop for 2 hours. Bound IgG was quantified by measuring the OD405, and the reported values were calculated as the fold change over the pre-COVID-19 controls.
Reference & Source Information: https://science.sciencemag.org/