Ameya Harmalkar

and 8 more

Critical Assessment of PRediction of Interactions (CAPRI) rounds 47 through 55 introduced 49 targets comprising multistage assemblies, antibody-antigen complexes, and flexible interfaces. For these rounds, we combined various Rosetta docking approaches (RosettaDock, ReplicaDock, and SymDock) with deep learning approaches (AlphaFold2, IgFold, and AlphaRED). Since prior CAPRI rounds, we have developed methods to better capture conformational changes, updated our scoring function, and integrated structure prediction tools such as AlphaFold2 into our docking routines. Here, we highlight several notable CAPRI targets and address the major challenges in the blind prediction of protein-protein interactions, including binding-induced conformational changes, large multimeric proteins, and antibody-antigen interactions. Although predictors have achieved modest improvements in accuracy for simpler targets post-AlphaFold2, performance for more flexible complexes remains limited. We employed RosettaDock 4.0, ReplicaDock 2.0, and AlphaRED to enhance backbone conformational sampling for flexible complexes. Our docking routines improved the DockQ score (0.77 vs. 0.62 for AF2-multimer) for a GP2 bacteriophage protein (T194), effectively capturing binding-induced conformational changes. Additionally, we introduce a fold-and-dock approach for predicting the assembly of a surface-layer SAP protein derived from Bacillus anthracis (T160), a large hetero-multimer comprising six distinct subunits. For large symmetric complexes, we used Rosetta-based SymDock 2.0, successfully predicting a human DNA repair protein complex with A10 stoichiometry (T230) with a high CAPRI-quality ranking. We also address the challenges in modeling antibody/nanobody-antigen interactions, particularly through the integration of deep learning tools and docking methods.
Despite advances with tools like IgFold and AlphaFold2, accurately predicting CDR H3 loops and antibody-antigen binding interfaces remains challenging. Combining ReplicaDock 2.0 with deep learning highlights these difficulties and underscores the need for extensive sampling and CDR-focused strategies to improve prediction accuracy.
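
To put the DockQ values quoted above (0.77 vs. 0.62) in context, the following minimal sketch maps a DockQ score onto the CAPRI-style quality bands. It assumes the cutoffs commonly cited from the DockQ publication (0.23, 0.49, and 0.80); the function name `dockq_band` is hypothetical and is not part of any Rosetta or DockQ API.

```python
# Assumed DockQ cutoffs (as commonly cited from the DockQ publication):
#   >= 0.80 high, >= 0.49 medium, >= 0.23 acceptable, otherwise incorrect.
THRESHOLDS = [(0.80, "high"), (0.49, "medium"), (0.23, "acceptable")]


def dockq_band(score: float) -> str:
    """Return the CAPRI-style quality band for a DockQ score in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"DockQ score must lie in [0, 1], got {score}")
    # Thresholds are ordered from strictest to most lenient.
    for cutoff, label in THRESHOLDS:
        if score >= cutoff:
            return label
    return "incorrect"


if __name__ == "__main__":
    # The two T194 values quoted in the abstract.
    for name, score in [("docking routines", 0.77), ("AF2-multimer", 0.62)]:
        print(f"{name}: DockQ {score:.2f} -> {dockq_band(score)}")
```

Under these assumed cutoffs both T194 values fall in the medium band, with 0.77 lying just below the 0.80 boundary for high quality.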

Marc Lensink

and 112 more

We present the results for CAPRI Round 54, the 5th joint CASP-CAPRI protein assembly prediction challenge. The Round offered 37 targets, including 14 homo-dimers, 3 homo-trimers, 13 hetero-dimers including 3 antibody-antigen complexes, and 7 large assemblies. On average, ~70 CASP and CAPRI predictor groups, including more than 20 automatic servers, submitted models for each target. A total of 21,941 models submitted by these groups and by 15 CAPRI scorer groups were evaluated using the CAPRI model quality measures and the DockQ score consolidating these measures. The prediction performance was quantified by a weighted score based on the number of models of acceptable quality or higher submitted by each group among their 5 best models. Results show substantial progress achieved across a significant fraction of the 60+ participating groups. High-quality models were produced for about 40% of the targets compared to 8% two years earlier, a remarkable improvement resulting from the wide use of the AlphaFold2 and AlphaFold-Multimer software. Creative use was made of the deep learning inference engines, affording the sampling of a much larger number of models and enriching the multiple sequence alignments with sequences from various sources. Wide use was also made of the AlphaFold confidence metrics to rank models, permitting top-performing groups to exceed the results of the public AlphaFold-Multimer version used as a yardstick. This notwithstanding, performance remained poor for complexes with antibodies and nanobodies, where evolutionary relationships between the binding partners are lacking, and for complexes featuring conformational flexibility, clearly indicating that the prediction of protein complexes remains a challenging problem.