Abstract
Critical Assessment of PRediction of Interactions (CAPRI) rounds 47
through 55 introduced 49 targets comprising multistage assemblies,
antibody-antigen complexes, and flexible interfaces. For these rounds,
we combined various Rosetta docking approaches (RosettaDock,
ReplicaDock, and SymDock) with deep learning approaches (AlphaFold2,
IgFold, and AlphaRED). Since prior CAPRI rounds, we have developed
methods to better capture conformational changes, updated our scoring
function, and integrated structure prediction tools such as AlphaFold2
into our docking routines. Here, we highlight several notable CAPRI
targets and address the major challenges in the blind prediction of
protein-protein interactions, including binding-induced conformational
changes, large multimeric proteins, and antibody-antigen interactions.
Although predictors have achieved modest improvements in accuracy on
simpler targets post-AlphaFold2, performance for more flexible complexes
remains limited. We employed RosettaDock 4.0, ReplicaDock 2.0, and
AlphaRED to enhance backbone conformational sampling for flexible
complexes. For a GP2 bacteriophage protein (T194), our docking routines
improved the DockQ score over AF2-multimer (0.77 vs. 0.62), effectively
capturing binding-induced conformational changes. Additionally, we
introduce a fold-and-dock approach for predicting the assembly of a
surface-layer SAP protein derived from Bacillus anthracis (T160),
a large hetero-multimer comprising six distinct subunits. For large
symmetric complexes, we used Rosetta-based SymDock 2.0, successfully
predicting a human DNA repair protein complex with A10 stoichiometry
(T230) with a high CAPRI-quality ranking. We also address the challenges
in modeling antibody/nanobody-antigen interactions, particularly through
the integration of deep learning tools and docking methods. Despite
advances with tools like IgFold and AlphaFold2, accurately predicting
CDR H3 loops and antibody-antigen binding interfaces remains
challenging. Combining ReplicaDock 2.0 with deep learning highlights
these difficulties and underscores the need for extensive sampling and
CDR-focused strategies to improve prediction accuracy.