Department of Cellular and Molecular Pharmacology and The California Institute for Quantitative Biomedical Research, University of California, San Francisco, California 94158
Abstract: Defining protein complexes is critical to virtually all aspects of cell biology. Two recent affinity purification-mass spectrometry studies in Saccharomyces cerevisiae have vastly increased the available protein interaction data. The practical utility of such high throughput interaction sets, however, is substantially decreased by the presence of false positives. Here we created a novel probabilistic metric that takes advantage of the high density of these data, including both the presence and absence of individual associations, to provide a measure of the relative confidence of each potential protein-protein interaction. This analysis largely overcomes the noise inherent in high throughput immunoprecipitation experiments. For example, of the 12,122 binary interactions in the general repository of interaction data (BioGRID) derived from these two studies, we marked 7504 as being of substantially lower confidence. Additionally, applying our metric and a stringent cutoff, we identified a set of 9074 interactions (including 4456 that were not among the 12,122 interactions) with accuracy comparable to that of conventional small scale methodologies. Finally, we organized proteins into coherent multisubunit complexes using hierarchical clustering. This work thus provides a highly accurate physical interaction map of yeast in a format that is readily accessible to the biological community.
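The interaction counts quoted above can be cross-checked with simple arithmetic. The short sketch below is illustrative only: the variable names are ours, and it uses nothing beyond the four numbers stated in the abstract. It computes how many BioGRID interactions were not marked as lower confidence and how many high-confidence interactions were already among the BioGRID set.

```python
# Counts quoted in the abstract (variable names are illustrative, not from the paper).
biogrid_total = 12_122     # binary interactions in BioGRID derived from the two studies
flagged_low = 7_504        # of those, marked as substantially lower confidence
high_conf_total = 9_074    # interactions passing the stringent cutoff
high_conf_novel = 4_456    # of those, not among the 12,122 BioGRID interactions

# BioGRID interactions that were NOT marked as lower confidence.
biogrid_not_flagged = biogrid_total - flagged_low

# High-confidence interactions that were already among the BioGRID interactions.
high_conf_in_biogrid = high_conf_total - high_conf_novel

print(biogrid_not_flagged, high_conf_in_biogrid)  # prints: 4618 4618
```

Both differences come to 4618, so the two statements are numerically consistent, which suggests (though the abstract does not say so explicitly) that the same set of interactions is being described from two directions. Under that reading, roughly 62% of the BioGRID interactions from these studies were marked as lower confidence, and roughly 49% of the high-confidence set is absent from that BioGRID collection.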