Thomas Mulvaney

and 7 more

CASP assessments primarily rely on comparing predicted coordinates with experimental reference structures. However, errors in the reference structures can reduce the accuracy of the assessment. This issue is particularly prominent in cryoEM-determined structures; therefore, in the assessment of CASP15 cryoEM targets, we used density maps directly to evaluate the predictions. A method for ranking the quality of protein chain predictions based on rigid fitting to experimental density was found to correlate well with the CASP assessment scores. Overall, the evaluation against the density map indicated that the models are of high accuracy, although local assessment of predicted side chains in a 1.52 Å resolution map showed that side chains are sometimes poorly positioned. The top 136 predictions associated with 9 protein target reference structures were selected for refinement, in addition to the top 40 predictions for 11 RNA targets. To this end, we developed an automated hierarchical pipeline for refinement in cryoEM maps. For both proteins and RNA, the refinement of CASP15 predictions resulted in structures that are close to the reference target structure, including some regions with a better fit to the density. This refinement was successful even though large conformational changes and movements of secondary structure elements were often required, suggesting that predictions from CASP-assessed methods could serve as a good starting point for building atomic models in cryoEM maps for both proteins and RNA. Loop modeling continued to pose a challenge for predictors, with even short loops sometimes failing to be accurately modeled or refined. The lack of consensus amongst models suggests that modeling holds the potential for identifying more flexible regions within the structure.

Daniel Rigden

and 7 more

The results of tertiary structure assessment at CASP15 are reported. For the first time, recognising the outstanding performance of AlphaFold 2 (AF2) at CASP14, all single-chain predictions were assessed together, irrespective of whether a template was available. At CASP15 there was no single stand-out group, with most of the best-scoring groups (led by PEZYFoldings, UM-TBM and Yang Server) employing AF2 in one way or another. Many top groups paid special attention to generating deep Multiple Sequence Alignments (MSAs) and testing variant MSAs, thereby allowing them to successfully address some of the hardest targets. Such difficult targets, as well as lacking templates, were typically proteins with few homologues; small size, high α-helical content and monomeric structure were other likely aggravating factors. Local divergence between prediction and target correlated with localisation at crystal lattice or chain interfaces, and with regions exhibiting high B-factors in crystal structure targets, but should not necessarily be considered as representing error in the prediction. However, analysis of exposed and buried side chain accuracy showed room for improvement even in the latter. Nevertheless, a majority of groups, including those applying methods similar to those used to generate major resources such as the AlphaFold Protein Structure Database and the ESM Metagenomic Atlas, produced high-quality predictions for most targets, which are valuable for experimental structure determination, functional analysis and many other tasks across biology.