Phylogenetic analysis of ACE2s and key amino acids for SARS-CoV-2 utilization
To evaluate the phylogenetic relationships among the 20 ACE2s assessed above, we built a phylogenetic tree based on the aa sequences of all the ACE2s (Figure 3A). No branch of the tree corresponded to the pattern of ACE2 utilization by the SARS-CoV-BJ01 or SARS-CoV-2 pseudoviruses, indicating that whole-sequence analysis could hardly reveal the key factors underlying receptor utilization by these viruses.
In our previous study, we predicted 9 key aa sites on human ACE2 potentially critical for receptor utilization, namely T20, K31, Y41, K68, Y83, K353, D355, R357 and M383, based on the SARS-CoV-2 utilization of human, bat, civet, swine and mouse ACE2s (Qiu, Zhao et al. 2020). Here we tested 15 more species to validate the role of the 9 sites. As shown in Figure 3A, Y/H41, D355, R357 and M383 were conserved in all ACE2s, and K68 was conserved in both mouse and chicken ACE2s, neither of which could be used by SARS-CoV-2, indicating that these sites do not determine receptor utilization by SARS-CoV-2. Q42 was conserved in mouse ACE2 but was substituted by E42 in chicken ACE2, indicating a complicated role for this site. In contrast, T20, K31 and Y83 in usable ACE2s were distinct from the corresponding aa in unusable ACE2s, indicating their critical role in determining SARS-CoV-2 utilization.
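The site-by-site comparison above can be sketched as a small set test: a site is a candidate determinant only if the residues seen in usable ACE2s never appear in unusable ACE2s. The sketch below is illustrative, not the analysis pipeline used in this study. It includes only the three species and the residues stated explicitly in the text; positions 31 and 83 are left out because the mouse and chicken residues at those positions are not given here. The names `RESIDUES`, `USABLE` and `discriminating_sites` are hypothetical.

```python
# Residues at selected human-ACE2-numbered positions, restricted to values
# stated in the text (assumption: only human, mouse and chicken are modeled).
RESIDUES = {
    "human":   {20: "T", 42: "Q", 68: "K", 355: "D", 357: "R"},
    "mouse":   {20: "L", 42: "Q", 68: "K", 355: "D", 357: "R"},
    "chicken": {20: "V", 42: "E", 68: "K", 355: "D", 357: "R"},
}

# Whether SARS-CoV-2 pseudovirus could use each ACE2, as reported in the text.
USABLE = {"human": True, "mouse": False, "chicken": False}


def discriminating_sites(residues, usable):
    """Return positions where usable and unusable ACE2s share no residue."""
    positions = sorted(next(iter(residues.values())))
    hits = []
    for pos in positions:
        in_usable = {aa[pos] for sp, aa in residues.items() if usable[sp]}
        in_unusable = {aa[pos] for sp, aa in residues.items() if not usable[sp]}
        # A site discriminates only if both groups exist and do not overlap.
        if in_usable and in_unusable and in_usable.isdisjoint(in_unusable):
            hits.append(pos)
    return hits


if __name__ == "__main__":
    print(discriminating_sites(RESIDUES, USABLE))
```

With this reduced input, only position 20 passes the test; position 42 fails because mouse ACE2 shares Q42 with human ACE2, which mirrors the "complicated role" noted for that site.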
Among the three key sites, substitutions of K31 and Y83 have been reported to abolish or strongly inhibit SARS-CoV binding, and the mechanism has been well documented (Li, Zhang et al. 2005, Wan, Shang et al. 2020). Remarkably, our study revealed T20 of ACE2 as a potential key aa residue for SARS-CoV-2 utilization. T20 has not been reported to affect SARS-CoV utilization, but mouse ACE2, which carries L20, and chicken ACE2, which carries V20, could not be used by SARS-CoV-2. T20 is located at the N terminus of most mature mammalian ACE2s. According to the structure of the SARS-CoV-2 receptor-binding domain complexed with human ACE2, the N-terminal T20 of ACE2 is close to S477 and T478 of the SARS-CoV-2 RBM (Figure 3B) (Shang, Ye et al. 2020). Both threonine and serine contain hydroxyl groups that allow them to form hydrogen bonds with each other. Thus, T20 of human ACE2 is likely to bridge with S477 and T478 of the SARS-CoV-2 spike via hydrogen bonds, which would stabilize ACE2-spike binding. In contrast, L20 of mouse ACE2 and V20 of chicken ACE2 are both aliphatic aa whose side chains cannot form hydrogen bonds to support ACE2-spike binding, which may impair the utilization of these two ACE2s by SARS-CoV-2. In the SARS-CoV spike, the two corresponding aa residues are G463 and K464. K464 can form hydrogen bonds with threonine, and G463 is a non-polar aa that, together with the adjacent A461, can interact with valine or leucine through hydrophobic interactions. Therefore, the SARS-CoV spike can interact with various N-terminal aa of ACE2s, which may be the reason why SARS-CoV can utilize mouse and chicken ACE2s while SARS-CoV-2 cannot. The pangolin CoV spike is highly similar to that of SARS-CoV-2 and retains S477 and T478. However, the pangolin CoV spike harbors alkaline H498, which can interact with acidic Q42 and E42 of mouse and chicken ACE2, respectively, via ionic attraction; this may additionally support ACE2-spike binding and allow the utilization of mouse and chicken ACE2 by pangolin CoV.
In contrast, the SARS-CoV-2 spike carries acidic Q498 in place of H498, leading to ionic repulsion with Q42 and E42 and blocking its binding to mouse and chicken ACE2s. Nevertheless, all of these speculations about the ACE2 aa residues key to spike-ACE2 binding are based on sequence analysis, and validation by mutagenesis is necessary to verify the roles of those sites.
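The position-20 part of this argument reduces to a comparison of side-chain classes, which can be written down as a minimal sketch. It is a coarse rule of thumb under stated assumptions, not a structural or energetic prediction: it only asks whether the ACE2 residue and at least one of its spike neighbours (S477, T478) carry a side-chain hydroxyl group. The groupings and the function name `hydroxyl_pairing_possible` are introduced here for illustration only.

```python
# Standard amino acids whose side chains carry a hydroxyl group.
HYDROXYL_SIDE_CHAIN = {"S", "T", "Y"}

# Residue at ACE2 position 20 for the three species discussed in the text.
ACE2_POSITION_20 = {"human": "T", "mouse": "L", "chicken": "V"}

# SARS-CoV-2 RBM residues reported to lie close to ACE2 position 20
# (S477 and T478), given as one-letter codes.
SPIKE_NEIGHBOURS = ("S", "T")


def hydroxyl_pairing_possible(ace2_residue, spike_residues=SPIKE_NEIGHBOURS):
    """True if both sides offer a side-chain hydroxyl (coarse heuristic)."""
    spike_has_hydroxyl = any(r in HYDROXYL_SIDE_CHAIN for r in spike_residues)
    return ace2_residue in HYDROXYL_SIDE_CHAIN and spike_has_hydroxyl


if __name__ == "__main__":
    for species, residue in ACE2_POSITION_20.items():
        print(species, residue, hydroxyl_pairing_possible(residue))
```

Under this heuristic, human T20 passes while mouse L20 and chicken V20 do not, which matches the pattern proposed above. As noted in the text, it still requires experimental confirmation.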
In summary, our results showed that SARS-CoV-2 utilized ACE2s less broadly than SARS-CoV and pangolin CoV, especially murine and avian ACE2s, indicating a narrower host range for SARS-CoV-2. Meanwhile, we found that the N-terminal T20 and Q42 might be critical in determining the differences in ACE2 utilization among the three SARSr-CoVs. Our findings deepen the understanding of the receptor utilization and host range of SARS-CoV-2, providing useful information for tracing virus transmission routes and preventing future pandemics caused by CoVs.