Analysis of Protein-Metabolite Interactions

Sandor Vajda, Andrew Emili, and Daniel Segre

Supported by:
Supplement to NIH R35 GM118078 “Analysis and Prediction of Molecular Interactions” (PI: Sandor Vajda) for the purchase of a Mass Spectrometer.

The Emili lab performs mass spectrometry pull-down experiments that identify small molecules bound to specific cellular proteins. The data from these experiments, performed in a rigorous quantitative manner, will be used in a number of applications relevant to our research program. According to preliminary mass spectrometry data from E. coli, pull-downs of most cellular proteins yield a number of physically associated small molecules. If no exogenous compounds are added, these molecules most likely represent endogenous metabolites. However, in most cases the results point to a group of metabolites with similar molecular weight and charge rather than to a specific compound. The general goal of our collaborative analysis is to identify the compounds and their roles using bioinformatics and structural approaches. This will include identifying the metabolic pathways in which the proteins participate, finding potential binding sites on the proteins, and determining which sites the metabolites are likely to engage and which specific compounds most likely bind there. In collaboration with the Segre lab, the interactions will be matched against known regulatory interactions and known metabolic pathways to identify potential biological relationships and pathway crosstalk. The structural analysis will involve placing the molecules by docking and molecular dynamics simulations, and estimating the structural basis and strength of the interactions by energy and free energy calculations. The analysis will be largely developed and tested on E. coli, for which fairly complete protein structure and metabolic network information is already available. The steps of the analysis will be as follows.

  1. Large-scale analysis of the native ligands and potential binding sites on all E. coli proteins, considering both known structures and homology models. This will include determining the druggability of the predicted sites.
  2. Matching metabolites from pull-down experiments with the predicted binding sites. Although our FTMap/Param server can already place user-specified molecules into the hot spots, we plan to improve the specificity of fragment placement by the ongoing re-design of the protein mapping scoring function.
  3. Further improving the location of bound metabolites by docking methods.
  4. Determining the stability of interactions by Monte Carlo and molecular dynamics simulations.
  5. Experimental validation of the most likely interactions.
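
The ambiguity that motivates step 2 can be illustrated with a minimal sketch. It assumes that each pull-down signal has already been converted to a neutral monoisotopic mass and that candidates are drawn from a small reference library; the names `Metabolite`, `candidates_for_mass`, and `LIBRARY`, as well as the 10 ppm tolerance, are hypothetical choices for illustration and not part of the planned pipeline. The sketch shows why a single observed mass typically returns several isomeric candidates, which is the point at which structure-based placement would be needed to discriminate among them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metabolite:
    name: str
    monoisotopic_mass: float  # neutral monoisotopic mass in Da


# Illustrative library (assumption): masses computed from the formulas
# C6H8O7 (citrate, isocitrate) and C6H12O6 (glucose, fructose).
LIBRARY = [
    Metabolite("citrate", 192.02700),
    Metabolite("isocitrate", 192.02700),
    Metabolite("glucose", 180.06339),
    Metabolite("fructose", 180.06339),
]


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates_for_mass(observed_mass, library, tol_ppm=10.0):
    """Return (metabolite, ppm error) pairs within the tolerance,
    sorted by absolute mass error."""
    hits = []
    for metabolite in library:
        error = ppm_error(observed_mass, metabolite.monoisotopic_mass)
        if abs(error) <= tol_ppm:
            hits.append((metabolite, error))
    return sorted(hits, key=lambda hit: abs(hit[1]))


if __name__ == "__main__":
    # A single observed mass matches both isomers, so mass alone
    # cannot decide which compound is bound.
    for metabolite, error in candidates_for_mass(192.0275, LIBRARY):
        print(f"{metabolite.name}: {error:+.1f} ppm")
```

In this toy case the observed mass matches both citrate and isocitrate, so the candidate list would have to be narrowed by other evidence, for example by how well each candidate fits a predicted binding site.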

We expect to identify specific protein-ligand interactions, resulting in the potential discovery of novel regulatory and allosteric sites and the determination of their druggability. The drawback of this approach is that it requires X-ray structures or good homology models of the target proteins, but this type of information is becoming increasingly available for proteins of biomedical significance.

Once the methodology has been developed and tested for the analysis of protein-metabolite interactions (PMIs) in E. coli, we plan to apply it to human cells. The analysis will use the Human Metabolome Database (HMDB), a web-enabled metabolomic database containing comprehensive information about human metabolites along with their biological roles, physiological concentrations, disease associations, chemical reactions, and metabolic pathways. The database lists 114,100 metabolites, over 7,000 of them with disease links, and a large number of pathway maps. However, HMDB does not attempt to specify the proteins with which the metabolites are likely to interact, and information on such interactions is generally very limited in the literature. In view of this limitation, we plan to target specific proteins of known structure and biomedical significance and to determine their interacting metabolites in pull-down LC/MS experiments. The interactions that are most likely functional will be identified as in the E. coli study. Again, the primary goal is to find novel regulatory and allosteric sites as well as compounds that interact with such sites with substantial affinity. This analysis also provides a platform for drug discovery. Once fairly strong binders are found, we will use the structures of the metabolites as starting points in ligand-based methods, such as the ROCS program (OpenEye Scientific Software), to identify similar compounds that may have stronger affinity. This platform can be particularly important when applied to proteins and metabolites that occur in tumor cells or are associated with specific diseases.
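
The ligand-based step can be illustrated with a minimal sketch of similarity ranking. ROCS compares three-dimensional shape and chemical features; the sketch below substitutes a much simpler Tanimoto coefficient on sets of precomputed structural feature identifiers, purely to show the ranking logic. The function names, the feature sets, and the 0.5 cutoff are hypothetical and do not reflect the ROCS interface.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient |A & B| / |A | B| of two feature sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_similar(query, library, min_similarity=0.5):
    """Rank library compounds by similarity to the query metabolite.

    `library` maps compound names to feature sets (assumed to be
    precomputed by some fingerprinting tool). Returns (name, score)
    pairs at or above the cutoff, most similar first.
    """
    scored = [(name, tanimoto(query, features))
              for name, features in library.items()]
    kept = [(name, score) for name, score in scored
            if score >= min_similarity]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical feature identifiers for a bound metabolite
    # and three library compounds.
    metabolite = frozenset({1, 2, 3, 4})
    compounds = {
        "compound_A": frozenset({1, 2, 3, 4, 5}),
        "compound_B": frozenset({1, 2, 7, 8}),
        "compound_C": frozenset({2, 3, 4}),
    }
    for name, score in rank_similar(metabolite, compounds):
        print(f"{name}: {score:.2f}")
```

Compounds that pass the cutoff would then be candidates for docking and free energy calculations against the site where the original metabolite binds.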

Vajda S, Emili A. Mapping global protein contacts. Science. 2019 Jul 12;365(6449):120-121. doi: 10.1126/science.aay1440.