Characterization of promoters in archaeal genomes based on DNA
structural parameters
Abstract
The transcription machinery of archaea can be roughly described as a
simplified version of that of eukaryotes. The basal transcription
factor machinery binds to the TATA-box found around 28 nucleotides
upstream of the transcription start site; however, some transcription
units lack a clear TATA-box yet are still bound by TBP/TFB.
This apparent absence of conserved sequences could be a consequence of
sequence divergence associated with the upstream region and with operon
and gene organization. Furthermore, earlier studies have found that
structural analysis yields more information than simple sequence
inspection. In this work, we evaluated and encoded 3630 archaeal
promoter sequences from three organisms (Haloferax volcanii,
Thermococcus kodakarensis, and Sulfolobus solfataricus) as DNA duplex
stability, enthalpy, curvature, and bendability parameters. We also
split our dataset into TATA-conserved and TATA-degenerated promoters in
order to identify differences between these two classes of promoters. The
structural analysis reveals variations in archaeal promoter
architecture: a distinctive signal is observed at the TFB, TBP, and TFE
binding sites regardless of whether the promoters are TATA-conserved or
TATA-degenerated. In addition, the promoter-finding method was
validated on upstream regions of 13 other archaea, suggesting that
these regions may contain promoter sequences. Therefore, we propose a
novel method for locating promoters within the genome of archaea based
on energetic/structural features.
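
The encoding of a sequence into a structural parameter, as described
above, can be sketched as follows. This is a minimal illustration under
stated assumptions, not the authors' pipeline: each dinucleotide step is
mapped to a duplex-stability value and the resulting profile is smoothed
with a sliding window. The lookup table uses the unified nearest-neighbor
free energies of SantaLucia (1998) as an assumption, since the abstract
does not state which parameter set was used. The function names and the
window size are hypothetical.

```python
# Unified nearest-neighbor free energies at 37 degrees C, in kcal/mol
# (SantaLucia, 1998). ASSUMPTION: the paper may rely on a different table.
# Each step and its reverse complement share the same value.
NN_DELTA_G = {
    "AA": -1.00, "TT": -1.00,
    "AT": -0.88,
    "TA": -0.58,
    "CA": -1.45, "TG": -1.45,
    "GT": -1.44, "AC": -1.44,
    "CT": -1.28, "AG": -1.28,
    "GA": -1.30, "TC": -1.30,
    "CG": -2.17,
    "GC": -2.24,
    "GG": -1.84, "CC": -1.84,
}


def dinucleotide_profile(seq, table=NN_DELTA_G):
    """Map a DNA sequence to one parameter value per dinucleotide step."""
    seq = seq.upper()
    profile = []
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        if step not in table:
            raise ValueError(f"no parameter for dinucleotide {step!r}")
        profile.append(table[step])
    return profile


def sliding_mean(values, window=15):
    """Average a profile over a sliding window (hypothetical default size)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window one step: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


if __name__ == "__main__":
    # Hypothetical upstream region with an AT-rich stretch in the middle.
    upstream = "GCGCGGCCGCTTTATATAAGCGGCCGCGGC"
    smoothed = sliding_mean(dinucleotide_profile(upstream), window=6)
    for position, value in enumerate(smoothed):
        print(position, round(value, 2))
```

In such a profile, AT-rich stretches such as a TATA-box appear as less
negative (less stable) values than GC-rich flanks. The same two functions
could be reused with an enthalpy, curvature, or bendability table passed
as the table argument.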