
Lincong Wang

and 2 more

The assignment of protein secondary structure elements (SSEs) underpins structural analysis and prediction. The backbone of a protein can be adequately represented by a pc-polyline that passes through the centers of its peptide planes. One salient feature of the pc-polyline representation is that the secondary structure of a protein becomes recognizable in a matrix whose elements are the pairwise distances between peptide plane centers. Thus a pc-polyline could in turn be used to assign SSEs. Using a convolutional neural network (CNN), we confirm here that a pc-polyline indeed contains enough information for the accurate assignment of six types of secondary structure elements: α-helix, β-sheet, β-bulge, 3₁₀-helix, turn and loop. Applications to three large data sets show that the assignments made by our CNN-based P2PSSE program agree very well with those by DSSP and STRIDE, and quite well with those by five other programs. The analyses of the assignments by P2PSSE and by the other programs raise some general questions about the characterization of protein secondary structure. In particular, the analyses illustrate the difficulty of giving a quantitative and consistent definition for each of the six SSE types, especially for 3₁₀-helix, β-bulge, turn and loop, in terms of backbone H-bond patterns, backbone dihedral angles, Cα-polylines or pc-polylines. This difficulty suggests that the SSE space, though dominated by the regions for the six SSE types, is to a certain degree continuous.
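The pairwise distance matrix described above can be illustrated with a minimal sketch. It is not the P2PSSE implementation: the function names are hypothetical, the coordinates are toy values rather than a real structure, and each peptide plane center is approximated as the midpoint of two consecutive Cα atoms, which is an assumption made here because the abstract does not state how the center is computed.

```python
import math

def peptide_plane_centers(ca_coords):
    """Return one approximate peptide plane center per consecutive residue pair.

    Assumption: the center is taken as the midpoint of C-alpha(i) and
    C-alpha(i+1); the source does not define the exact construction.
    """
    return [
        tuple((a + b) / 2.0 for a, b in zip(p, q))
        for p, q in zip(ca_coords, ca_coords[1:])
    ]

def distance_matrix(points):
    """Return the symmetric matrix of Euclidean distances between points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

# Toy C-alpha coordinates in angstroms (illustrative only, not a real protein).
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

centers = peptide_plane_centers(ca)   # N residues give N - 1 centers
matrix = distance_matrix(centers)     # (N - 1) x (N - 1) matrix

for row in matrix:
    print(" ".join(f"{d:6.2f}" for d in row))
```

A matrix of this kind, computed over a whole chain, is the sort of input in which the abstract says secondary structure becomes recognizable; how P2PSSE windows or normalizes it before the CNN is not specified here.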