3.1 | Alpha-helical transmembrane microproteins
Intergenic regions of eukaryotic genomes are rich in A/T nucleotides relative to genes, which are G/C rich. When microproteins are expressed from “noncoding” regions, they therefore tend to contain predicted transmembrane helices, because codons rich in T/U correspond to hydrophobic and aromatic amino acids. This intergenic sequence bias shapes the amino acid composition of evolutionarily young, species-specific microproteins that arise de novo from previously noncoding regions of the genome. A recent study demonstrated that C-terminal hydrophobic patches tend to target evolutionarily young microproteins to the BAG6 membrane protein triage complex, resulting either in membrane insertion or, if mislocalized or improperly folded, proteasomal degradation. Interestingly, species-specific transmembrane microproteins that exhibit low expression can nonetheless confer a fitness advantage on cells, and examples have been shown to function in processes such as yeast mating. Not all membrane-associated microproteins are evolutionarily novel; a large and growing number of well-characterized, conserved microproteins are also predicted to contain transmembrane helices, such as SPAR, the lysosomal membrane-localized polypeptide regulator of mTORC1, and Myomixer, the plasma membrane-localized micropeptide that is required for myoblast fusion during skeletal muscle development. The class of alpha-helical transmembrane microproteins is therefore large and of outsize biological importance. In this section, we turn our attention to those membrane-associated microproteins that have been subjected to experimental structure determination.
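The codon-composition argument made at the start of this section can be checked with a short calculation. The sketch below is illustrative only and is not taken from any of the studies discussed here: it assumes the standard genetic code and the Kyte-Doolittle hydropathy scale, and the function name `mean_hydropathy_by_second_base` is hypothetical. It groups sense codons by their second base and averages the hydropathy of the encoded amino acids. It addresses only the T/U side of the argument; A/T-rich sequence is also A-rich, and codons with A at the second position encode hydrophilic residues.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy_by_second_base():
    """Average hydropathy of encoded residues, grouped by the codon's second base.

    Hypothetical helper for illustration; stop codons are skipped.
    """
    scores = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        scores[codon[1]].append(KYTE_DOOLITTLE[aa])
    return {base: sum(vals) / len(vals) for base, vals in scores.items()}


if __name__ == "__main__":
    for base, mean in sorted(mean_hydropathy_by_second_base().items()):
        print(f"second base {base}: mean hydropathy {mean:+.2f}")
```

Under these assumptions, codons with T/U at the second position encode only Phe, Leu, Ile, Met and Val, so this group has the highest mean hydropathy of the four. A fuller treatment would weight codons by their frequency in actual intergenic sequence rather than treating all codons equally.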
AcrZ, previously named YbhT, was reported in a seminal study that identified unannotated small protein genes in E. coli using computational tools that incorporate ribosome binding site prediction. AcrZ is a 49-amino acid microprotein that is conserved in many Gram-negative bacteria and localizes to the E. coli inner membrane by virtue of an N-terminal transmembrane helix. AcrZ binds to the AcrB subunit of the AcrAB-TolC multidrug efflux pump, increasing the efficiency of transport of (and, thus, resistance of E. coli to) a subset of its substrates. Multiple structures of AcrZ in complex with the AcrB homotrimer have been solved, including crystal structures of detergent-solubilized complexes, as well as a cryo-electron microscopy (cryo-EM) structure of the complex reconstituted in lipid discs (Figure 4A). AcrZ binds to a transmembrane groove within each molecule of AcrB. The cryo-EM structure revealed that AcrZ exhibits a pronounced bend between positions 10 and 15, conferred by a helix-breaking proline residue. Mutagenesis studies revealed that the proline is required for interaction of AcrZ with AcrB. At the same time, the proline, or an equally helix-breaking glycine residue, can be moved to any position within the AcrZ interaction motif without loss of association with AcrB. Several of the mutants that retain AcrB binding also recapitulate the selective drug transport-promoting phenotype of wild-type AcrZ. While the precise effects of AcrZ binding on cargo occupancy and transport are not fully clear, allosteric modulation of binding sites in AcrB is evident from a comparison of the AcrB and AcrBZ structures. Furthermore, AcrZ promotes cardiolipin association with AcrB, likely contributing to allosteric modulation of cargo binding pockets in the transporter. Taken together, these results indicate that the bend in the transmembrane helix of AcrZ, and not its precise amino acid sequence, is essential for interaction with and modulation of AcrB.
E. coli CydX was originally identified as YbgT, a predicted 37-amino acid microprotein encoded downstream of the cytochrome bd oxidase operon genes cydA and cydB. Cytochrome bd oxidases operate as terminal oxidases in the electron transport chain under hypoxic conditions due to their high oxygen affinity. The two canonical subunits, CydA and CydB, form a pseudosymmetric heterodimer, of which the CydA subunit contains all three heme cofactors responsible for reduction of molecular oxygen to water, as well as the Q loop that is responsible for binding the electron donor quinol. CydX is a single-pass alpha-helical transmembrane protein that copurifies with the CydAB complex and is required for the assembly, stability, and/or activity of cytochrome bd oxidase in multiple species. Several atomic structures of cytochrome bd oxidases have revealed the role of CydX homologs in the complex (Figure 4B). First, an unannotated CydX homolog, CydS, was serendipitously discovered in a crystal structure of cytochrome bd oxidase purified from the Gram-positive bacterium Geobacillus thermodenitrificans. CydS forms an alpha helix that binds between helices 5 and 6 of CydA, leading the authors to speculate that it may stabilize the heme cofactor when the Q loop undergoes dynamic movement during catalysis. A subsequent cryo-EM structure of the E. coli cytochrome bd oxidase revealed CydX bound to CydA between helices 1 and 6, again suggesting a structural role. Interestingly, the E. coli CydAB structure unexpectedly revealed the presence of another single-pass transmembrane microprotein, CydH, which is encoded by the ynhF gene, located outside the cytochrome bd oxidase operon. CydH binds between transmembrane helices 1 and 8 of CydA, on the opposite face of CydA relative to CydX.
CydH is proposed to occlude the oxygen-conducting channel inferred from the Geobacillus complex structure, which in the E. coli enzyme is replaced by a hydrophobic channel that traverses CydB directly to the heme d site. This rearrangement of the oxygen channel was proposed to be required because two heme cofactors occupy swapped positions in the E. coli enzyme relative to the Geobacillus structure; accordingly, CydH homologs are found in Proteobacteria. Overall, cytochrome bd oxidase is a unique system in which microproteins are required for the activity, structure and stability of a critical protein complex.
In another well-characterized example, a class of microproteins (also called micropeptides) termed “regulins” regulates the activity of the sarco/endoplasmic reticulum (SR/ER) calcium ATPase (SERCA). During muscle contraction, including contraction of the heart, and during calcium-dependent signaling processes, calcium is released from the SR/ER into the cytosol; then, to terminate signaling or contraction, SERCA pumps calcium back into the SR/ER against its concentration gradient using the energy of ATP hydrolysis. Regulins colocalize with SERCA in the SR/ER membrane, and each micropeptide is expressed in the same specific tissue as the SERCA isoform that it regulates. The first known regulins, phospholamban and sarcolipin, were identified as inhibitors of SERCA in cardiac and skeletal muscle, respectively. Structural analysis of these canonical regulins, both of which are <100 amino acids, reveals that they are small membrane proteins bearing a single transmembrane alpha-helix. The crystal structure of the SERCA-sarcolipin complex reveals that the micropeptide binds in a transmembrane groove of SERCA between helices 2, 6 and 9, where it allosterically alters the conformation of SERCA to decrease its apparent calcium affinity. Phospholamban binds to the same regulatory groove (Figure 4C). One seminal discovery of novel SERCA-regulating micropeptides came from a study in Drosophila. In this work, Couso and colleagues identified putative lncRNAs that associate with polysomes, suggesting that they are translated. One of these lncRNAs contained an sORF encoding a peptide predicted to be homologous to phospholamban and sarcolipin, which was accordingly given the name sarcolamban. Sarcolamban may have arisen via duplication of an ancestral phospholamban/sarcolipin gene in insects, which subsequently diverged to the sarcolamban sequence.
Sarcolamban was demonstrated to bind SERCA in flies, and its deletion caused heart arrhythmias, consistent with a role in regulating SERCA. Docking the predicted structure of sarcolamban onto SERCA was consistent with a binding mode similar to that observed for phospholamban and sarcolipin. Just as importantly, additional novel regulins have been discovered in mammals. In analyses of mammalian lncRNAs to identify potential micropeptides expressed in skeletal muscle and other tissues lacking known regulin expression, translated sORFs were identified that encode the novel SERCA-binding micropeptides myoregulin, endoregulin, and another-regulin, all of which bind to the same transmembrane groove of SERCA, inhibit SERCA to a similar extent as phospholamban, and are predicted to have similar single-pass transmembrane alpha-helical structures. Interestingly, an unannotated, SERCA-activating micropeptide, DWORF, was identified in yet another long noncoding RNA in mouse. DWORF is expressed in skeletal muscle, and ectopic over-expression of DWORF in heart tissue enhances contractility and reverses the disease phenotype in a model of heart failure. However, the mechanism by which DWORF activates SERCA was initially unclear, since it is predicted to bear a similar alpha-helical transmembrane domain and binds to the same SERCA groove as previously characterized regulins, which are all inhibitory. Some evidence from fluorescence resonance energy transfer suggests that DWORF binding can directly activate SERCA. A recent NMR structural study demonstrated that the alpha helix of DWORF is kinked at a unique proline residue, creating a significant bend in the transmembrane region without disrupting its binding to SERCA (Figure 4C). Mutating this proline residue diminished the bend angle between the two alpha-helical regions of DWORF, and not only prevented its activation of SERCA but converted it into a SERCA inhibitor.
Therefore, activation of SERCA by DWORF appears to require its proline-induced kink, and, by extension, inhibition of SERCA by phospholamban, sarcolipin, myoregulin, endoregulin and another-regulin may be hypothesized to require binding of their uninterrupted transmembrane helices to the regulatory groove of SERCA. It is also fascinating to note the parallels between DWORF and AcrZ (see above), both of which utilize kinked transmembrane alpha-helices to allosterically regulate the membrane transporters SERCA and AcrB, respectively.