Phylogenetic analysis of ACE2s and key amino acids for
SARS-CoV-2 utilization
To evaluate the phylogenetic relationship of the 20 ACE2s assessed
above, we built a phylogenetic tree based on the aa sequences of all the
ACE2s (Figure 3A). In the tree, ACE2s that could or could not be used by
SARS-CoV-BJ01 or SARS-CoV-2 pseudoviruses did not cluster into separate
branches, indicating that whole-sequence analysis alone could not
reveal the key factors underlying receptor utilization by the
viruses.
In our previous study, we predicted 9 key aa sites on human ACE2
potentially critical for the receptor utilization, including T20, K31,
Y41, K68, Y83, K353, D355, R357 and M383, based on the SARS-CoV-2
utilization of human, bat, civet, swine and mouse ACE2s (Qiu, Zhao et
al. 2020). Here we tested 15 more species to validate the role of the 9
sites. As shown in Figure 3A, Y/H41, D355, R357 and M383 were
conserved in all ACE2s, and K68 was also present in the mouse and chicken
ACE2s, which could not be used by SARS-CoV-2, indicating that these sites
do not determine receptor utilization by SARS-CoV-2. Q42 was
conserved in mouse ACE2 but was substituted by E42 in chicken ACE2,
suggesting a more complicated role for this site. In contrast, T20, K31
and Y83 in usable ACE2s were distinct from the corresponding aa in
unusable ACE2s, indicating their critical role in determining SARS-CoV-2
utilization.
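The site-by-site comparison above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: it includes only the three species and three positions for which residues are stated in the text (human Q42 is inferred from the statement that Q42 is conserved in mouse ACE2), and the names RESIDUES, USABLE and discriminating_sites are hypothetical. A position is flagged when no residue found in a usable ACE2 also occurs in an unusable ACE2.

```python
# Residues at three ACE2 positions as stated in the text (human Q42 is inferred).
# Only human, mouse and chicken are included; this is not the full 20-species set.
RESIDUES = {
    "human":   {20: "T", 42: "Q", 68: "K"},
    "mouse":   {20: "L", 42: "Q", 68: "K"},
    "chicken": {20: "V", 42: "E", 68: "K"},
}

# Whether the ACE2 could be used by SARS-CoV-2, as described in the text.
USABLE = {"human": True, "mouse": False, "chicken": False}


def discriminating_sites(residues, usable):
    """Return positions whose residues in usable ACE2s never occur in unusable ACE2s."""
    positions = sorted({pos for per_species in residues.values() for pos in per_species})
    flagged = []
    for pos in positions:
        in_usable = {res[pos] for sp, res in residues.items() if usable[sp]}
        in_unusable = {res[pos] for sp, res in residues.items() if not usable[sp]}
        # A shared residue means the site alone cannot explain the difference.
        if in_usable.isdisjoint(in_unusable):
            flagged.append(pos)
    return flagged


print(discriminating_sites(RESIDUES, USABLE))  # [20]
```

With this reduced data set only position 20 is flagged, because K68 occurs in both groups and Q42 is shared by human and mouse ACE2. Adding more species or positions only requires extending the two dictionaries.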
Among the three key sites, substitutions of K31 and Y83 have been
reported to abolish or strongly inhibit SARS-CoV binding and the
mechanism has been well documented (Li, Zhang et al. 2005, Wan, Shang et
al. 2020). Remarkably, our study revealed T20 of ACE2 as a potential key
aa residue for SARS-CoV-2 utilization. T20 has not been reported to
affect SARS-CoV utilization but mouse ACE2 with T20L and chicken ACE2
with T20V could not be used by SARS-CoV-2. T20 is located at the N
terminus of most mature mammalian ACE2s. According to the structure of
SARS-CoV-2 receptor-binding domain complexed with human ACE2, the
N-terminal T20 of ACE2 is close to S477 and T478 of SARS-CoV-2 RBM
(Figure 3B) (Shang, Ye et al. 2020). Both threonine and serine
contain hydroxyl groups that allow them to form hydrogen bonds with
each other. Thus, T20 of human ACE2 is likely to bridge with S477 and
T478 of SARS-CoV-2 spike via hydrogen bonds, which would stabilize the
ACE2-spike binding. However, L20 on mouse ACE2 and V20 on chicken ACE2
are both aliphatic aa that cannot form hydrogen bonds to support the
ACE2-spike binding, which may impair the utilization of these two ACE2s
by SARS-CoV-2. In SARS-CoV spike, the two aa residues are substituted by
G463 and K464. K464 can form hydrogen bonds with threonine, and G463 is a
non-polar aa that, together with the adjacent A461, can interact with
valine or leucine via hydrophobic interactions. Therefore, SARS-CoV spike can
interact with various N-terminal aa of ACE2s, and this may be the reason
why SARS-CoV can utilize mouse and chicken ACE2s while SARS-CoV-2
cannot. Pangolin CoV spike is highly similar to SARS-CoV-2 spike and
retains S477 and T478. However, Pangolin CoV spike harbors alkaline H498
that can interact with acidic Q42 and E42 of mouse and chicken ACE2,
respectively, via ionic attraction, which may additionally support the
ACE2-spike binding and allow the utilization of mouse and chicken ACE2
by Pangolin CoV. In contrast, SARS-CoV-2 spike substitutes H498 with
acidic Q498, leading to ionic repulsion with Q42 and E42 and blocking its
binding with mouse and chicken ACE2s. Nevertheless, all of these inferences
about the ACE2 aa residues key for spike-ACE2 binding are based on
sequence analysis, and mutational validation is necessary to
verify the roles of those sites.
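The reasoning about position 20 can be written down as a small rule-of-thumb lookup. The sketch below is an assumption-laden simplification, not a structural calculation: it ignores geometry and distances, it encodes only the side-chain properties named in the text (S, T and K as hydrogen-bond capable; G, A, L and V as non-polar), and all names (HBOND_SIDE_CHAINS, NONPOLAR, contact, predicted_contacts) are hypothetical. The spike residues listed are S477/T478 for SARS-CoV-2 and G463/K464 for SARS-CoV, as given above.

```python
# Side-chain classes taken from the argument in the text (a deliberate simplification).
HBOND_SIDE_CHAINS = {"S", "T", "K"}   # treated in the text as able to form hydrogen bonds
NONPOLAR = {"G", "A", "L", "V"}       # treated in the text as non-polar or aliphatic

# Spike residues facing ACE2 position 20, and the ACE2 residue at position 20.
SPIKE_NEIGHBOURS = {"SARS-CoV-2": ["S", "T"], "SARS-CoV": ["G", "K"]}
ACE2_RES20 = {"human": "T", "mouse": "L", "chicken": "V"}


def contact(ace2_res, spike_res):
    """Return the contact type suggested by side-chain class alone, or None."""
    if ace2_res in HBOND_SIDE_CHAINS and spike_res in HBOND_SIDE_CHAINS:
        return "hydrogen bond"
    if ace2_res in NONPOLAR and spike_res in NONPOLAR:
        return "hydrophobic"
    return None


def predicted_contacts(virus, species):
    """Collect the contact types between ACE2 residue 20 and the spike neighbours."""
    found = {contact(ACE2_RES20[species], s) for s in SPIKE_NEIGHBOURS[virus]}
    found.discard(None)  # drop pairs with no suggested contact
    return sorted(found)


for virus in SPIKE_NEIGHBOURS:
    for species in ACE2_RES20:
        print(virus, species, predicted_contacts(virus, species))
```

Under these assumptions the lookup reproduces the pattern argued in the text: SARS-CoV-2 spike has a suggested contact only with T20 of human ACE2, whereas SARS-CoV spike has a suggested contact with T20, L20 and V20 alike. It does not cover the position 42/498 argument.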
In summary, our results showed more limited ACE2 utilization by SARS-CoV-2
than by SARS-CoV and pangolin CoV, especially for murine and avian
ACE2s, indicating a narrower host range for SARS-CoV-2. Meanwhile, we found
that the N-terminal T20 and Q42 might be critical in determining the
differences in ACE2 utilization among the three SARSr-CoVs. Our findings
deepen the understanding of receptor utilization and the host
range of SARS-CoV-2, providing useful information for tracing virus
transmission routes and preventing future pandemics caused by CoVs.