A Proteome-Scale Map of the Human Interactome Network


Resource
Thomas Rolland,1,2,19 Murat Taşan,1,3,4,5,19 Benoit Charloteaux,1,2,19 Samuel J. Pevzner,1,2,6,7,19 Quan Zhong,1,2,8,19 Nidhi Sahni,1,2,19 Song Yi,1,2,19 Irma Lemmens,9 Celia Fontanillo,10 Roberto Mosca,11 Atanas Kamburov,1,2
Susan D. Ghiassian,1,12 Xinping Yang,1,2 Lila Ghamsari,1,2 Dawit Balcha,1,2 Bridget E. Begg,1,2 Pascal Braun,1,2
Marc Brehme,1,2 Martin P. Broly,1,2 Anne-Ruxandra Carvunis,1,2 Dan Convery-Zupan,1,2 Roser Corominas,13
Jasmin Coulombe-Huntington,1,14 Elizabeth Dann,1,2 Matija Dreze,1,2 Amélie Dricot,1,2 Changyu Fan,1,2 Eric Franzosa,1,14 Fana Gebreab,1,2 Bryan J. Gutierrez,1,2 Madeleine F. Hardy,1,2 Mike Jin,1,2 Shuli Kang,13 Ruth Kiros,1,2 Guan Ning Lin,13 Katja Luck,1,2 Andrew MacWilliams,1,2 Jörg Menche,1,12 Ryan R. Murray,1,2 Alexandre Palagi,1,2 Matthew M. Poulin,1,2 Xavier Rambout,1,2,15 John Rasla,1,2 Patrick Reichert,1,2 Viviana Romero,1,2 Elien Ruyssinck,9 Julie M. Sahalie,1,2
Annemarie Scholz,1,2 Akash A. Shah,1,2 Amitabh Sharma,1,12 Yun Shen,1,2 Kerstin Spirohn,1,2 Stanley Tam,1,2 Alexander O. Tejeda,1,2 Shelly A. Trigg,1,2 Jean-Claude Twizere,1,2,15 Kerwin Vega,1,2 Jennifer Walsh,1,2
Michael E. Cusick,1,2 Yu Xia,1,14 Albert-László Barabási,1,12,16 Lilia M. Iakoucheva,13 Patrick Aloy,11,17
Javier De Las Rivas,10 Jan Tavernier,9 Michael A. Calderwood,1,2,20 David E. Hill,1,2,20 Tong Hao,1,2,20
Frederick P. Roth,1,3,4,5,18,* and Marc Vidal1,2,*
1Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
2Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
3Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON M5S 3E1, Canada
4Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
5Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, ON M5G 1X5, Canada
6Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
7Boston University School of Medicine, Boston, MA 02118, USA
8Department of Biological Sciences, Wright State University, Dayton, OH 45435, USA
9Department of Medical Protein Research, VIB, 9000 Ghent, Belgium
10Cancer Research Center (Centro de Investigación del Cancer), University of Salamanca and Consejo Superior de Investigaciones Científicas, Salamanca 37008, Spain
11Joint IRB-BSC Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), Barcelona 08028, Spain
12Center for Complex Network Research (CCNR) and Department of Physics, Northeastern University, Boston, MA 02115, USA
13Department of Psychiatry, University of California, San Diego, La Jolla, CA 92093, USA
14Department of Bioengineering, McGill University, Montreal, QC H3A 0C3, Canada
15Protein Signaling and Interactions Lab, GIGA-R, University of Liege, 4000 Liege, Belgium
16Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
17Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona 08010, Spain
18Canadian Institute for Advanced Research, Toronto M5G 1Z8, Canada
19Co-first author
20Co-senior author
*h@utoronto.ca (F.P.R.), marc_vidal@dfci.harvard.edu (M.V.)
/10.ll.2014.10.050
SUMMARY
Just as reference genome sequences revolutionized human genetics, reference maps of interactome networks will be critical to fully understand genotype-phenotype relationships. Here, we describe a systematic map of ~14,000 high-quality human binary protein-protein interactions. At equal quality, this map is ~30% larger than what is available from small-scale studies published in the literature in the last few decades. While currently available information is highly biased and only covers a relatively small portion of the proteome, our systematic map appears strikingly more homogeneous, revealing a “broader” human interactome network than currently appreciated. The map also uncovers significant interconnectivity between known and candidate cancer gene products, providing unbiased evidence for an expanded functional cancer landscape, while demonstrating how high-quality interactome models will help “connect the dots” of the genomic revolution.

INTRODUCTION
Since the release of a high-quality human genome sequence a decade ago (International Human Genome Sequencing Consortium, 2004), our ability to assign genotypes to phenotypes has exploded. Genes have been identified for most Mendelian disorders (Hamosh et al., 2005) and over 100,000 alleles have been implicated in at least one disorder (Stenson et al., 2014). Hundreds of susceptibility loci have been uncovered for numerous complex traits (Hindorff et al., 2009) and the genomes of a few thousand human tumors have been nearly fully sequenced (Chin et al., 2011). This genomic revolution is poised to generate a complete description of all relevant genotypic variations in the human population.
Cell 159, 1212–1226, November 20, 2014 ©2014 Elsevier Inc.
Genomic sequencing will, however, if performed in isolation, leave fundamental questions pertaining to genotype-phenotype relationships unresolved (Vidal et al., 2011). The causal changes that connect genotype to phenotype remain generally unknown, especially for complex trait loci and cancer-associated mutations. Even when identified, it is often unclear how a causal mutation perturbs the function of the corresponding gene or gene product. To “connect the dots” of the genomic revolution, functions and context must be assigned to large numbers of genotypic changes.
Complex cellular systems formed by interactions among genes and gene products, or interactome networks, appear to underlie most cellular functions (Vidal et al., 2011). Thus, a full understanding of genotype-phenotype relationships in human will require mechanistic descriptions of how interactome networks are perturbed as a result of inherited and somatic disease susceptibilities. This, in turn, will require high-quality and extensive genome and proteome-scale maps of macromolecular interactions such as protein-protein interactions (PPIs), protein-nucleic acid interactions, and posttranslational modifiers and their targets.
First-generation human binary PPI interactome maps (Rual et al., 2005; Stelzl et al., 2005) have already provided network-based explanations for some genotype-phenotype relationships, but they remain incomplete and of insufficient quality to derive accurate global interpretations (Figure S1A available online). There is a dire need for empirically-controlled (Venkatesan et al., 2009) high-quality proteome-scale interactome reference maps, reminiscent of the high-quality reference genome sequence that revolutionized human genetics.
The challenges are manifold. Even considering only one splice variant per gene, approximately 20,000 protein-coding genes (Kim et al., 2014; Wilhelm et al., 2014) must be handled and ~200 million protein pairs tested to generate a comprehensive binary reference PPI map. Whether such a comprehensive network could ever be mapped by the collective efforts of small-scale studies remains uncertain. Computational predictions of protein interactions can generate information at proteome scale (Zhang et al., 2012) but are inherently limited by biases in currently available knowledge used to infer such interactome models. Should interactome maps be generated for all individual human tissues using biochemical cocomplex association data, or would “context-free” information on direct binary biophysical interaction for all possible PPIs be preferable? To what extent would these approaches be complementary? Even with nearly complete, high-quality reference interactome maps of biophysical interactions, how can the biological relevance of each interaction be evaluated under physiological conditions? Here, we begin to address these questions by generating a proteome-scale map of the human binary interactome and comparing it to alternative network maps.
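As a quick check on the scale quoted above, the number of unordered pairs among n proteins is n(n-1)/2. The sketch below assumes exactly 20,000 proteins (one per gene) and, by default, excludes self-pairs; the function name and the `include_self` option are illustrative and do not come from the study.

```python
from math import comb


def unordered_pairs(n_proteins, include_self=False):
    """Number of distinct protein pairs to test in an all-by-all screen.

    include_self=True also counts each protein paired with itself
    (possible homodimers); this option is an assumption of the sketch.
    """
    pairs = comb(n_proteins, 2)
    return pairs + n_proteins if include_self else pairs


# Roughly 20,000 protein-coding genes, one splice variant each.
print(unordered_pairs(20_000))  # 199990000, i.e. about 200 million
```

The quadratic growth is the practical point: doubling the number of proteins considered roughly quadruples the number of pairs to test.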
RESULTS
Vast Uncharted Interactome Zone in Literature
To investigate whether small-scale studies described in the literature are adequate to qualitatively and comprehensively map the human binary PPI network, we assembled all binary pairs identified in such studies and available as of 2013 from seven public databases (Figure S1B, see Extended Experimental Procedures, Section 1). Out of the 33,000 literature binary pairs extracted, two thirds were reported in only a single publication and detected by only a single method (Lit-BS pairs), thus potentially presenting higher rates of curation errors than binary pairs supported by multiple pieces of evidence (Lit-BM pairs; Tables S1A, S1B, and S1C) (Cusick et al., 2009). Testing representative samples from both of these sets using the mammalian protein-protein interaction trap (MAPPIT) (Eyckerman et al., 2001) and yeast two-hybrid (Y2H) (Dreze et al., 2010) assays, we observed that Lit-BS pairs were recovered at rates that were only slightly higher than the randomly selected protein pairs used as negative control (random reference set; RRS) and significantly lower than Lit-BM pairs (Figure 1A and Table S2A; see Extended Experimental Procedures, Section 2). Lit-BS pairs co-occurred in the literature significantly less often than Lit-BM pairs as indicated by STRING literature mining scores (Figure 1A and Figure S1C; see Extended Experimental Procedures, Section 2) (von Mering et al., 2003), suggesting that these pairs were less thoroughly studied. Therefore, use of binary PPI information from public databases should be restricted to interactions with multiple pieces of evidence in the literature. In 2013, this corresponded to 11,045 high-quality protein pairs (Lit-BM-13), more than an order of magnitude below current estimates of the number of PPIs in the full human interactome (Stumpf et al., 2008; Venkatesan et al., 2009).
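The Lit-BS/Lit-BM split described above can be sketched as a simple grouping step. This is a minimal illustration and not the authors' curation pipeline: it assumes each evidence record is a hypothetical tuple (protein A, protein B, publication identifier, detection method), and it treats any pair with more than one distinct publication or more than one distinct method as having multiple pieces of evidence.

```python
from collections import defaultdict


def classify_literature_pairs(evidence_records):
    """Split binary pairs into Lit-BS (single evidence) and Lit-BM (multiple).

    evidence_records: iterable of (protein_a, protein_b, publication_id, method).
    The record layout is an assumption made for this sketch.
    """
    publications = defaultdict(set)
    methods = defaultdict(set)
    for protein_a, protein_b, publication_id, method in evidence_records:
        pair = frozenset((protein_a, protein_b))  # A-B and B-A are the same pair
        publications[pair].add(publication_id)
        methods[pair].add(method)

    lit_bs, lit_bm = set(), set()
    for pair in publications:
        single = len(publications[pair]) == 1 and len(methods[pair]) == 1
        (lit_bs if single else lit_bm).add(pair)
    return lit_bs, lit_bm


records = [
    ("A", "B", "pub1", "Y2H"),
    ("B", "A", "pub2", "Y2H"),      # same pair, second publication
    ("C", "D", "pub3", "MAPPIT"),   # one publication, one method
]
lit_bs, lit_bm = classify_literature_pairs(records)
```

Using an unordered `frozenset` as the key keeps evidence reported in either orientation attached to the same pair.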
The relatively low number of high-quality binary literature PPIs may reflect inspection bias inherent to small-scale studies. Some genes such as RB1 are described in hundreds of publications while most have been mentioned only in a few (e.g., the unannotated C11orf21 gene). To investigate the effect of such bias on the current coverage of the human interactome network, we organized the interactome search space by ranking proteins according to the number of publications in which they are mentioned (Figure 1B). Interactions between highly studied proteins formed a striking “dense zone” in contrast to a large sparsely populated zone, or “sparse zone,” involving poorly studied proteins. Candidate gene products identified in genome-wide association studies (GWAS) or associated with Mendelian disorders distribute homogeneously across the publication-ranked interactome space (Figure 1B and Figure S1D), demonstrating a need for unbiased systematic PPI mapping to cover this uncharted territory.
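The publication-ranked view of the search space can be sketched as a binned adjacency matrix. The code below is an illustrative reconstruction under stated assumptions: proteins are sorted from most to least studied (the direction is assumed), ties are broken by name, and the default bin size of 350 follows the Figure 1B legend. The function name and input layout are hypothetical.

```python
def binned_interaction_counts(publication_counts, interactions, bin_size=350):
    """Count interactions between publication-ranked bins of proteins.

    publication_counts: dict mapping protein -> number of publications.
    interactions: iterable of (protein_a, protein_b) pairs.
    Returns a symmetric matrix (list of lists) of interaction counts per bin pair.
    """
    # Rank proteins from most to least studied; ties broken by name.
    ranked = sorted(publication_counts, key=lambda p: (-publication_counts[p], p))
    bin_of = {protein: index // bin_size for index, protein in enumerate(ranked)}
    n_bins = (len(ranked) + bin_size - 1) // bin_size

    matrix = [[0] * n_bins for _ in range(n_bins)]
    for protein_a, protein_b in interactions:
        if protein_a not in bin_of or protein_b not in bin_of:
            continue  # skip proteins without a publication count
        i, j = bin_of[protein_a], bin_of[protein_b]
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix
```

In such a matrix, a literature-derived map would be expected to concentrate counts in the corner corresponding to the most-studied bins, which is the "dense zone" described in the text.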
A Proteome-wide Binary Interactome Map
Based on literature-curated information, the human interactome appears to be restricted to a narrow dense zone, suggesting that half of the human proteome participates only rarely in the interactome network. Alternatively, the zone that appears sparse in the literature could actually be homogeneously populated by PPIs that have been overlooked due to sociological or experimental biases.
To distinguish between these possibilities and address other fundamental questions outlined above, we generated a new proteome-scale binary interaction map. By acting on all four parameters of our empirically-controlled framework (Venkatesan et al., 2009), we increased the coverage of the human binary interactome with respect to our previous human interactome data set obtained by investigating a search space defined by ~7,000 protein-coding genes (“Space I”) and published in 2005 (HI-I-05)
Figure 1. Vast Uncharted Interactome Zone in Literature and Generation of a Systematic Binary Data Set
(A) Validation of binary literature pairs extracted from public databases (Bader et al., 2003; Berman et al., 2000; Chatr-Aryamontri et al., 2013; Kerrien et al., 2012; Licata et al., 2012; Keshava Prasad et al., 2009; Salwinski et al., 2004). Fraction of pairs recovered by MAPPIT at increasing RRS recovery rates (top left) and at 1% RRS recovery rate (bottom left), found to co-occur in the literature as reported in the STRING database (upper right), and recovered by Y2H (lower right). Shading and error bars indicate standard error of the proportion. p values, two-sided Fisher's exact tests. For n values, see Table S6.
(B) Adjacency matrix showing Lit-BM-13 interactions, with proteins in bins of ~350 and ordered by number of publications along both axes. Upper and right histograms show the median number of publications per bin. The color intensity of each square reflects the total number of interactions between proteins for the corresponding bins. Total number of interactions per bin (lower histogram). Number of products from GWAS loci (Hindorff et al., 2009), Mendelian disease (Hamosh et al., 2005), and Sanger Cancer Gene Census (Cancer Census) (Futreal et al., 2004) genes per bin (circles).
(Rual et al., 2005) (Figures 1C and 1D; see Extended Experimental Procedures, Section 3). A search space consisting of all pairwise combinations of proteins encoded by ~13,000 genes (“Space II”; Table S2B) was systematically probed, representing a 3.1-fold increase with respect to the HI-I-05 search space. To gain in sensitivity, we performed the Y2H assay in different strain backgrounds that showed increased detection of pairs of a positive reference set (PRS) composed of high-quality pairs from the literature without increasing the detection rate of RRS pairs. To increase our sampling, the entire search space was screened twice independently. Pairs identified in this first pass were subsequently tested pairwise in quadruplicate starting from fresh yeast colonies. To ensure reproducibility, only pairs testing positive at least three times out of the four attempts and with confirmed identity were considered interacting pairs, resulting in ~14,000 distinct interacting protein pairs.
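The reproducibility rule at the end of this paragraph (positive in at least three of four pairwise retests, with confirmed identity) can be expressed as a small filter. The data layout below is an assumption made for illustration: a dictionary of retest outcomes per pair and a set of pairs whose identity was confirmed. The names are hypothetical.

```python
def select_interacting_pairs(retest_outcomes, identity_confirmed, min_positives=3):
    """Keep pairs that retest positive often enough and have confirmed identity.

    retest_outcomes: dict mapping pair -> list of booleans (one per retest).
    identity_confirmed: set of pairs whose identity was confirmed.
    min_positives: minimum number of positive retests (3 of 4 in the text).
    """
    selected = set()
    for pair, outcomes in retest_outcomes.items():
        n_positive = sum(1 for outcome in outcomes if outcome)
        if n_positive >= min_positives and pair in identity_confirmed:
            selected.add(pair)
    return selected


outcomes = {
    ("A", "B"): [True, True, True, False],   # 3 of 4 positive
    ("C", "D"): [True, True, False, False],  # only 2 of 4 positive
    ("E", "F"): [True, True, True, True],    # positive but identity not confirmed
}
confirmed = {("A", "B"), ("C", "D")}
print(select_interacting_pairs(outcomes, confirmed))  # {('A', 'B')}
```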
We validated these binary interactions using three binary protein interaction assays that rely on different sets of conditions than the Y2H assay: (1) reconstituting a membrane-bound receptor complex in mammalian cells using MAPPIT, (2) in vitro using the well-based nucleic acid programmable protein array (wNAPPA) assay (Braun et al., 2009; Ramachandran et al., 2008), and (3) reconstituting a fluorescent protein in Chinese hamster ovary cells using a protein-fragment complementation assay (PCA) (Nyfeler et al., 2005) (see Extended Experimental Procedures, Section 4). The Y2H pairs exhibited validation rates that were statistically indistinguishable from a PRS of ~500 Lit-BM interactions while significantly different from an RRS of ~700 pairs with all three orthogonal assays and over a large range of score thresholds (Figure 1D, Tables S2A and S2C), demonstrating the quality of the entire data set. Using three-dimensional cocrystal structures available for protein complexes in the Protein Data Bank (Berman et al., 2000) and for domain-domain interactions (Stein et al., 2011) (Figure S2 and Tables S2D, S2E, and S2F; see Extended Experimental Procedures, Sections 5 and 6), we also demonstrated that our binary interactions reflect direct biophysical contacts, a conclusion in stark contrast to a previous report suggesting that Y2H interactions are inconsistent with structural data (Edwards et al., 2002). Our results also suggested that Y2H sensitivity correlates with the number of residue-residue contacts and thus presumably with interaction affinity. The corresponding human interactome data set covering Space II and reported in 2014 (HI-II-14; Table S2G) is the largest experimentally-determined binary interaction map yet reported, with 13,944 interactions among 4,303 distinct proteins.
Overall Biological Significance
To assess the overall functional relevance of HI-II-14, we combined computational analyses with a large-scale experimental approach. We first measured enrichment for shared Gene Ontology (GO) terms and phenotypic annotations and observed that HI-II-14 shows significant enrichments that are similar to those of Lit-BM-13 (Figures 2A and 2B; see Extended Experimental Procedures, Section 7). Second, we measured how much binary interactions from HI-II-14 reflect membership in larger protein complexes as annotated in CORUM (Ruepp et al., 2010) or reported in a cocomplex association map (Woodsmith and Stelzl, 2014). In both cases, we observed a significant enrichment for binary interactions between protein pairs that belong to a common complex (p < 0.001; Figure 2B). Third, we performed a similar analysis using tissue-specific mRNA expression data across the 16 human tissues of the Illumina Human Body Map 2.0 project as well as cellular compartment localization annotations from the GO Slim terms. Again, HI-II-14 was enriched for interactions mediated by protein pairs present in at least one common compartment or cell type (Figures 2C and 2D). Finally, we measured the overlap of HI-II-14 with specific biochemical relationships, as represented by kinase-substrate interactions. Both HI-II-14 and Lit-BM-13 contained significantly more PPIs reflecting known kinase-substrate relationships (Hornbeck et al., 2012) than the corresponding degree-controlled randomized networks (Figure 2E). In addition, HI-II-14 tended to connect tyrosine and serine/threonine kinases (Manning et al., 2002) to proteins with tyrosine or serine/threonine phospho-sites (Hornbeck et al., 2012; Olsen et al., 2010), respectively (Figure S3A), pointing to the corresponding interactions being genuine kinase-substrate interactions. In short, our systematic interactome map, which was generated independently from any pre-existing biological information, reveals functional relationships at levels comparable to those seen for the literature-based interaction map.
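The comparison against degree-controlled randomized networks can be sketched as follows. This is an illustrative reconstruction, not the procedure from the Extended Experimental Procedures: it assumes a simple undirected network without self-interactions, randomizes it by repeated double-edge swaps (which preserve every protein's degree), and uses a conservative empirical p value with a +1 correction. All function names and the number of swaps are assumptions of the sketch.

```python
import random


def count_shared(edges, annotations):
    """Count edges whose two proteins share at least one annotation term."""
    return sum(
        1 for a, b in edges
        if annotations.get(a, set()) & annotations.get(b, set())
    )


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize a simple undirected network by double-edge swaps.

    Each accepted swap turns edges (a, b) and (c, d) into (a, d) and (c, b),
    so every node keeps its degree. Self-loops and duplicate edges are rejected.
    """
    edge_list = [tuple(edge) for edge in edges]
    if len(edge_list) < 2:
        return edge_list
    edge_set = {frozenset(edge) for edge in edge_list}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if a == d or c == b:
            continue  # would create a self-loop
        new_1, new_2 = frozenset((a, d)), frozenset((c, b))
        if new_1 == new_2 or new_1 in edge_set or new_2 in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new_1, new_2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def empirical_p_value(observed, null_values):
    """Fraction of randomized values at least as large as the observed one.

    The +1 in numerator and denominator avoids reporting p = 0;
    this correction is a choice made for the sketch.
    """
    at_least = sum(1 for value in null_values if value >= observed)
    return (at_least + 1) / (len(null_values) + 1)


def enrichment_test(edges, annotations, n_networks=1000, seed=0):
    """Compare the observed count with counts in randomized networks."""
    rng = random.Random(seed)
    edges = list(edges)
    observed = count_shared(edges, annotations)
    null_values = [
        count_shared(degree_preserving_shuffle(edges, 10 * len(edges), rng), annotations)
        for _ in range(n_networks)
    ]
    return observed, empirical_p_value(observed, null_values)
```

Here `annotations` could hold, for example, the set of cellular compartments per protein. Preserving degrees matters because highly connected proteins would otherwise inflate or deflate the expected overlap regardless of any biological signal.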
To further investigate the overall biological relevance of HI-II-14, we used an experimental approach that compares the impact of mutations associated with human disorders to that of common variants with no reported phenotypic consequences on biophysical interactions (Figure 3). Our rationale is that a set of interactions corresponding to genuine functional relationships should more likely be perturbed by disease-associated mutations than by common variants. The following example will illustrate this concept. Mutations R24C and R24H in CDK4 are clearly associated with melanoma by conferring resistance to CDKN2A inhibition (Wölfel et al., 1995), whereas N41S and S52N mutations are of less clear clinical significance (Zhong et al., 2009) and have remained functionally uncharacterized. HI-II-14 contains five CDK4 interactors: two inhibitors (CDKN2C and CDKN2D), two cyclins (CCND1 and CCND3), and HOOK1, a novel interacting partner and a potential phosphorylation target
(C) Improvements from first-generation to second-generation interactome mapping based on an empirically-controlled framework (Venkatesan et al., 2009). Completeness: fraction of all pairwise protein combinations tested; Assay sensitivity: fraction of all true biophysical interactions that are identifiable by a given assay; Sampling sensitivity: fraction of identifiable interactions that are detected in the experiment; Precision: fraction of reported pairs that are true positives. PRS: positive reference set; RRS: random reference set.
(D) Experimental pipeline for identifying high-quality binary protein-protein interactions (left). ORF: open reading frame. Fraction of HI-II-14, PRS, and RRS pairs (right) recovered by MAPPIT, PCA, and wNAPPA at increasing assay stringency. Shading indicates standard error of the proportion. p > 0.05 for all assays when comparing PRS and HI-II-14 at 1% RRS, two-sided Fisher's exact tests. For n values, see Table S6.
See also Figures S1 and S2 and Tables S1 and S2.
of CDK4 (Figure S3B). In agreement with previous reports, the comparative interaction profile shows that R24C and R24H, but not N41S and S52N, specifically perturb CDK4 binding to CDKN2C (Figure 3).

In total, we identified 32 human genes for which: (1) the corresponding gene product is reported to have binary interactors in HI-II-14, (2) germline disease-associated missense mutations have been reported, and (3) common coding missense variants unlikely to be involved in any disease have been identified in the 1000 Genomes Project (1000 Genomes Project Consortium, 2012). To avoid overrepresentation of certain genes, we selected a total of 115 variants, testing up to four disease and four common variants per disease gene for their impact on the ability of the corresponding proteins to interact with known interaction partners (see Extended Experimental Procedures, Section 8). Disease variants were 10-fold more likely to perturb interactions than nondisease variants (Figure 3 and Table S3). Strikingly, more than 55% of the 107 HI-II-14 interactions tested were perturbed by at least one disease-associated variant, and the same trend was observed when considering only mutants with evidence of expression in yeast as indicated by their ability to mediate at least one interaction (Figure S3C). Examples of novel specifically perturbed interactions include AANAT-BHLHE40 and RAD51D-IKZF1 (Figure 3). In the first case, the A129T mutation in AANAT is known to be associated with a delayed sleeping phase syndrome and specifically perturbs an interaction between AANAT and BHLHE40, the product of a gene reported to function in circadian rhythm regulation (Nakashima et al., 2008). In the second case, the breast-cancer-associated RAD51D E233G mutation perturbs interactions with a number of partners, including the known cancer gene product IKZF1 (Futreal et al., 2004).

Altogether, these computational and experimental results provide strong evidence that HI-II-14 pairs correspond to biologically relevant interactions and represent a valuable resource to further our understanding of the human interactome and its perturbations in human disease.

Figure 2. Overall Biological Significance
[Figure 2 panels. Axes: enrichment odds ratio; fraction in same compartment (%); fraction in same cell type (%); number of known kinase-substrate interactions; frequency in randomized networks. Panel titles: shared GO terms, shared phenotypes, cocomplex membership, coexpressed, colocalized, kinase-substrate relationship.]
(A) Schematic of the method to assess biological relevance of binary maps.
(B) Enrichment of binary interactome maps for functional relationships (left) and cocomplex memberships (right). Error bars indicate 95% confidence intervals. BP: Biological process; MF: Molecular function; CC: Cellular component. Mouse phenotypes: Shared phenotypes in mouse models by orthology mapping. MS: Mass-spectrometry-based map. Enrichments: p ≤ 0.05 for all annotations and maps, two-sided Fisher's exact tests. For n values, see Table S6.
(C) and (D) Fraction of binary interactions between proteins localized in a common cellular compartment and proteins copresent in at least one cell type (arrows) compared to those in 1,000 degree-controlled randomized networks. Empirical p values. For n values, see Table S6.
(E) Number of known kinase-substrate interactions found in binary maps (arrows) compared to those in 1,000 randomized networks. Empirical p values are shown.
See also Figure S3.

A “Broader” Interactome
Unlike literature-curated interactions, HI-II-14 protein pairs are distributed homogeneously across the interactome space (Figure 4A), indicating that sociological biases, and not fundamental biological properties, underlie the existence of a densely populated zone in the literature. Since 1994, the number of high-quality binary literature PPIs has grown roughly linearly to reach ~11,000 interactions in 2013 (Figure 4B), while systematic data sets are punctuated by a few large-scale releases. Although the sparse territory of the literature map gradually gets populated, interaction density in this zone continues to lag behind that of the dense zone (Figure 4B). In terms of proteome coverage, the expansion rate is faster for systematic maps than for literature maps, especially in the sparse territory (Figure 4C and Figure S4A; see Extended Experimental Procedures,