This new paper describes our ClusPro web server, a widely used tool for protein-protein docking. ClusPro provides a simple interface for basic use, but it also offers a number of advanced options to modify the search. These include the removal of unstructured protein regions, application of attraction or repulsion, accounting for pairwise distance restraints, construction of homo-multimers, consideration of small-angle X-ray scattering (SAXS) data, and location of heparin-binding sites. Applications of ClusPro include docking X-ray or NMR structures of proteins, modeling antibody-antigen interactions, constructing the structure of multidomain proteins, building homo-oligomers, peptide docking, homology model docking, and more.
This Nature Protocols paper describes in detail the series of web servers developed based on FTMap, including FTSite, to predict ligand-binding sites; FTFlex, to account for side chain flexibility; FTMap/param, to parameterize additional probes; and FTDyn, for mapping ensembles of protein structures. Applications of the FTMap family of servers include determining the druggability of proteins, identifying ligand moieties that are most important for binding, finding the most bound-like conformation in ensembles of unliganded protein structures, and providing input for fragment-based drug design. FTMap is more accurate than classical mapping methods such as GRID and MCSS, and it is much faster than the more recent approaches to protein mapping based on mixed molecular dynamics.
CAPRI (Critical Assessment of Predicted Interactions) is a community-wide experiment devoted to the prediction of protein complexes based on the structures of the component proteins.
The results for targets 43-58 were evaluated at the Fifth CAPRI Evaluation Meeting in Utrecht in April 2013, for 63 predictor groups and 12 automated docking servers.
The automatic protein docking server ClusPro v2.0, developed by the groups of Dima Kozakov and Sandor Vajda, was the best in the server category. In particular, the server’s performance was comparable to that of the best human predictor groups, although the latter had access to all information available in the literature. A summary of the results is shown below. For each predictor group, the table shows the number of acceptable or better predictions and, among those, the number of high quality models, indicated by three stars, as well as the number of medium quality solutions, indicated by two stars.
|1||CLUSPRO (Boston University)||6/4**/2*|
|2||HADDOCK (Utrecht University)||4/1***/2**|
|3||SWARMDOCK (London Research I)||4/1**/3*|
|4||PIE-DOCK (U Texas)||3/1**/2*|
|1||A. Bonvin (Utrecht University)||9/1***/3**|
|2||P. Bates (London Research I)||8/2**/6*|
|3||I. Vakser (University of Kansas)||7/1***/6*|
|4||D. Kozakov / S. Vajda (Boston University)||6/2***/4**|
|5||Y. Shen (TTIC)||6/1***/3**|
|6||Fernandez-Recio (Barcelona SC)||6/1***/3**|
|7||CLUSPRO (server, Boston University)||6/4**/2*|
|8||X. Zou (University of Missouri)||6/1***/2**|
|9||M. Zacharias (Jacobs University)||6/1***/5*|
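The compact score notation in the table can be unpacked programmatically. The snippet below is a minimal sketch, not part of the CAPRI evaluation itself: the function name `parse_capri_summary` is hypothetical, and it assumes that a single star denotes an acceptable model and that any predictions not listed explicitly with stars are acceptable ones.

```python
import re
from typing import Dict

# Star count -> quality label. Three and two stars are defined in the text;
# one star = "acceptable" is an assumption made for this sketch.
STAR_LABELS = {3: "high", 2: "medium", 1: "acceptable"}


def parse_capri_summary(summary: str) -> Dict[str, int]:
    """Parse a summary such as '9/1***/3**' into per-quality counts."""
    parts = [p.strip() for p in summary.strip().split("/")]
    total = int(parts[0])  # number of acceptable-or-better predictions
    counts = {"total": total, "high": 0, "medium": 0, "acceptable": 0}
    acceptable_listed = False

    for part in parts[1:]:
        match = re.fullmatch(r"(\d+)(\*{1,3})", part)
        if match is None:
            raise ValueError(f"Unrecognized token: {part!r}")
        label = STAR_LABELS[len(match.group(2))]
        counts[label] = int(match.group(1))
        if label == "acceptable":
            acceptable_listed = True

    # Assumption: when no one-star count is given, the remainder is acceptable.
    if not acceptable_listed:
        counts["acceptable"] = total - counts["high"] - counts["medium"]
    return counts


if __name__ == "__main__":
    for entry in ("6/4**/2*", "9/1***/3**"):
        print(entry, parse_capri_summary(entry))
```

Under these assumptions, the ClusPro server entry `6/4**/2*` reads as six acceptable or better predictions, of which four are medium quality and two are acceptable.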
Thanks to its successful CAPRI participation, ClusPro v2.0 enjoys heavy usage by the academic community. In the last 4 years it ran more than 50,000 jobs for 4,000 registered and around 3,000 unregistered users. Although the number of CAPRI targets is still too small for any significant conclusion, we believe that our results provide some information on the current state of automated protein docking.
Our main observations are as follows.
- ClusPro reliably yields correct predictions for the relatively “easy” targets with at most moderate conformational changes in the backbone. In addition to unbound proteins of known structure, such “easy” targets may include designed proteins obtained by mutating a few residues. Targets T50 and T53 were in this category, and ClusPro provided good results. The CAPRI community submitted many good predictions for targets T47, T48, T49, T50, T53, and T57, that is, exactly for the ones ClusPro also predicted well, confirming that these targets are relatively easy. Based on this logic we should have obtained an acceptable or better model for an additional target, T58, but the change in the backbone conformation of a lysozyme loop was too large for ClusPro. Other groups using rigid-body methods such as GRAMM were able to produce an acceptable model, but only in manual submissions. The three other targets, T46, T51, and T54, which were difficult for ClusPro, were also difficult for the entire CAPRI community, resulting in very few acceptable submissions. As will be further discussed, all these targets required homology modeling.
- The quality of automated docking by ClusPro is very close to that of the best human predictor groups, including our own. We consider this very important, because servers have to submit results within 48 h and the predictions should be reproducible by the server, whereas human predictors have several weeks and can use any type of information. In Rounds 22–27 three predictor groups (Bonvin, Bates, and Vakser) did extremely well, and submitted acceptable or better predictions for more than six targets. These three were followed by six groups that had good predictions for six targets: Vajda (2*** + 3** + 1*), Fernandez-Recio (1*** + 3** + 2*), Shen (1*** + 3** + 2*), Zou (1*** + 2** + 3*), Zacharias (1*** + 5*), and ClusPro (4** + 2*). The only difference between ClusPro and the other five groups is due to the ability of the human predictors to obtain high accuracy predictions for T47 by template-based modeling. Since ClusPro does not have this option, it had to use direct docking, and produced only a medium accuracy model. We emphasize that in the earlier rounds of CAPRI, server predictions were substantially inferior to those of the human predictors; this is definitely not the case for ClusPro 2.0 in Rounds 22–27. However, ClusPro seems to be an exception, as for most other groups the manual submissions are generally much better than the submissions from their servers.
- As mentioned, our manual submissions were obtained by refining the ClusPro results using “stability analysis”, requiring a large number of relatively short MCM runs. In spite of substantial computational efforts, the improvements due to the refinement are moderate. Apart from T47, where obtaining high accuracy predictions was trivial, the refinement improved predictions only for two targets, T53 and T57. However, it appears that refining predictions to high accuracy was generally very difficult for all targets (again, not considering T47). In fact, the only high accuracy model submitted by any group for any target in Rounds 22–27 was our manual submission for target T53.
- A new development, not seen in previous rounds of CAPRI, is that the top ranked model M01 provided by ClusPro was of acceptable or better quality for all six targets that ClusPro was able to predict. M01 was also the highest quality model for five of these six targets. The only exception was T48, where models M06 and M07 were medium quality, while model M01 was only acceptable. Due to the very small number of targets the generality of this observation is not at all clear, but it suggests that ranking predictions based on cluster size can reliably identify the highest accuracy models.
- The most difficult targets, T46, T51, and T54, required the construction of homology models based on templates with moderate sequence identity. The poor results for these targets, whether by ClusPro or by the rest of the CAPRI community, show that the quality of homology models plays a critical role in docking. For example, while ClusPro did not produce any prediction for target T54 with the models we constructed, an acceptable submission was found by the Shen group, who also relied on ClusPro for the initial docking, but used a better homology model. Thus, there is a need for methods that are specifically designed for docking homology models, for example, by further reducing the sensitivity of the scoring function to steric clashes involving mutated side chains and predicted loop regions.
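The idea of ranking predictions by cluster size, mentioned in the observations above, can be illustrated with a small greedy clustering routine. This is only a sketch under stated assumptions, not the ClusPro implementation: the function name `rank_by_cluster_size`, the precomputed pairwise distance matrix, and the `radius` value are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def rank_by_cluster_size(dist: Sequence[Sequence[float]],
                         radius: float) -> List[List[int]]:
    """Greedily cluster poses and return clusters from largest to smallest.

    dist[i][j] is a pairwise distance (for example an RMSD) between poses
    i and j. Each returned cluster lists its center pose first.
    """
    remaining = set(range(len(dist)))
    clusters: List[List[int]] = []

    while remaining:
        # Pick the unassigned pose with the most unassigned neighbors within
        # the radius; ties go to the lowest index because of the sort.
        center = max(
            sorted(remaining),
            key=lambda i: sum(1 for j in remaining if dist[i][j] <= radius),
        )
        members = sorted(j for j in remaining if dist[center][j] <= radius)
        clusters.append([center] + [j for j in members if j != center])
        remaining -= set(members)

    return clusters


if __name__ == "__main__":
    # Toy example: five poses placed on a line at hypothetical coordinates.
    coords = [0.0, 1.0, 2.0, 10.0, 11.0]
    matrix = [[abs(a - b) for b in coords] for a in coords]
    for rank, cluster in enumerate(rank_by_cluster_size(matrix, 1.5), start=1):
        print(f"M{rank:02d}: center pose {cluster[0]}, size {len(cluster)}")
```

Because each step removes the largest remaining neighborhood, the clusters come out in non-increasing size order, so the first cluster center plays the role of the top ranked model in this toy setting.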
Lensink MF, Wodak SJ. 2013. Docking, scoring, and affinity prediction in CAPRI. Proteins: Structure, Function, and Bioinformatics.
Kozakov D, Beglov D, Bohnuud T, Mottarella SE, Xia B, Hall DR, Vajda S. 2013. How good is automated protein docking? Proteins: Structure, Function, and Bioinformatics.
Seventeen-year-old Eric Chen won both the 17-18 age category and the Grand Prize in this year’s Google Science Fair with his project, Computer-aided Discovery of Novel Influenza Endonuclease Inhibitors to Combat Flu Pandemic. As part of a large body of computational studies and biological assays, he used the FTMap program developed by the Vajda and Kozakov groups.
FTMap software enables high school student research in drug discovery against flu, and gets him to Google Science Fair finals
Google Science Fair is an online science competition for 13-18 year old students around the globe, sponsored by Google, Lego, CERN, National Geographic and Scientific American. This year one of the fifteen finalists selected across the world was Eric Chen (USA) with the project “Computer-aided Discovery of Novel Influenza Endonuclease Inhibitors to Combat Flu Pandemic”. The key result of the research was the identification of a number of novel, potent endonuclease inhibitors, which can serve as leads for a new type of anti-flu medicine, effective against all influenza viruses including pandemic strains. One of the key elements of the inhibitor discovery protocol was the FTMap server and software developed by the Vajda and Kozakov groups.
See Business Insider’s coverage of the story here.
A paper on the connection between druggable and alanine scanning hotspots, published by the Structural Bioinformatics lab, was among the top 10 most read papers published in JCIM in the 3rd quarter of 2012 (list copied below).
What are your colleagues reading in the Journal of Chemical Information and Modeling? The articles below represent the most read from Journal of Chemical Information and Modeling between July and September 2012.
- John J. Irwin, Teague Sterling, Michael M. Mysinger, Erin S. Bolstad, Ryan G. Coleman. DOI: 10.1021/ci3001277
- Thomas Scior, Andreas Bender, Gary Tresadern, José L. Medina-Franco, Karina Martinez-Mayorga, Thierry Langer, Karina Cuanalo-Contreras, Dimitris K. Agrafiotis. DOI: 10.1021/ci200528d
- Richard D. Smith, Alaina L. Engdahl, James B. Dunbar, Heather A. Carlson. DOI: 10.1021/ci200612f
- Noé Sturm, Jérémy Desaphy, Ronald J. Quinn, Didier Rognan, Esther Kellenberger. DOI: 10.1021/ci300196g
- What is Wrong with Quantitative Structure-Property Relations Models Based on Three-Dimensional Descriptors? M. Hechinger, K. Leonhard, W. Marquardt. DOI: 10.1021/ci300246m
- Brandon S. Zerbe, David R. Hall, Sandor Vajda, Adrian Whitty, Dima Kozakov. DOI: 10.1021/ci300175u
- Laura Silvestri, Flavio Ballante, Antonello Mai, Garland R. Marshall, Rino Ragno. DOI: 10.1021/ci300160y
- Daniel N. Santiago, Yuri Pevzner, Ashley A. Durand, MinhPhuong Tran, Rachel R. Scheerer, Kenyon Daniel, Shen-Shu Sung, H. Lee Woodcock, Wayne C. Guida, Wesley H. Brooks. DOI: 10.1021/ci300073m
- Marco Pasi, Matteo Tiberti, Alberto Arrigoni, Elena Papaleo. DOI: 10.1021/ci300213c
- Johannes Kirchmair, Mark J. Williamson, Jonathan D. Tyzack, Lu Tan, Peter J. Bond, Andreas Bender, Robert C. Glen. DOI: 10.1021/ci200542m
In this study we developed a new version of FTMap for mapping DNA structures, which successfully identified the binding hot spots in the minor groove of B-DNA. We also provide some insight into how the recently discovered high-frequency Hoogsteen flipping of base pairs could affect DNA’s reactivity with formaldehyde.
This work has been accepted by Nucleic Acids Research and chosen as one of the Featured Articles, which “represent the top 5% of papers in terms of originality, significance and scientific excellence”.
Please read the paper, Bohnuud T, Beglov D, Ngan CH, Zerbe B, Hall DR, Brenke R, Vajda S, Frank-Kamenetskii MD, Kozakov D. 2012. Computational mapping reveals dramatic effect of Hoogsteen breathing on duplex DNA reactivity with formaldehyde. Nucleic Acids Research.
Paper on druggability of Protein-Protein interactions was featured in PNAS, and highlighted in Nature Reviews Drug Discovery
Despite the growing number of examples of small-molecule inhibitors that disrupt protein–protein interactions (PPIs), the origin of druggability of such targets is poorly understood. To identify druggable sites in protein–protein interfaces we combine computational solvent mapping, which explores the protein surface using a variety of small “probe” molecules, with a conformer generator to account for side-chain flexibility. Applications to unliganded structures of 15 PPI target proteins show that the druggable sites comprise a cluster of binding hot spots, distinguishable from other regions of the protein due to their concave topology combined with a pattern of hydrophobic and polar functionality. This combination of properties confers on the hot spots a tendency to bind organic species possessing some polar groups decorating largely hydrophobic scaffolds. Thus, druggable sites at PPI are not simply sites that are complementary to particular organic functionality, but rather possess a general tendency to bind organic compounds with a variety of structures, including key side chains of the partner protein. Results also highlight the importance of conformational adaptivity at the binding site to allow the hot spots to expand to accommodate a ligand of drug-like dimensions. The critical components of this adaptivity are largely local, involving primarily low energy side-chain motions within 6 Å of a hot spot. The structural and physicochemical signature of druggable sites at PPI interfaces is sufficiently robust to be detectable from the structure of the unliganded protein, even when substantial conformational adaptation is required for optimal ligand binding.
This study was published and featured in PNAS, and highlighted in Nature Reviews Drug Discovery.
For more information, please read the paper:
Kozakov D, Hall DR, Chuang G-Y, Cencic R, Brenke R, Grove LE, Beglov D, Pelletier J, Whitty A, Vajda S. 2011. Structural conservation of druggable hot spots in protein-protein interfaces. Proceedings of the National Academy of Sciences. 108(33):13528-13533.
In the January 18 issue of PNAS, Cencic et al. reported the results of an ultra-high-throughput screen for inhibitors of the translation initiation complex eIF4F. This screen resulted in the identification of a compound that prevents the formation of this complex from its components eIF4E and eIF4G. Blocking this interaction sensitizes many cancer types to the apoptotic response to DNA damage.
To better understand the action of this molecule, FTMap was used to characterize the hot spots in the interface and predict the binding mode of the inhibitor.
For more information, please read the paper at PNAS.