arXiv:1003.2424v1 [physics.soc-ph] 11 Mar 2010
Signed Networks in Social Media
Jure Leskovec Stanford University jure@cs.stanford.edu
Daniel Huttenlocher Cornell University ll.edu
Jon Kleinberg Cornell University ll.edu
ABSTRACT
Relations between users on social media sites often reflect a mixture of positive (friendly) and negative (antagonistic) interactions. In contrast to the bulk of research on social networks, which has focused almost exclusively on positive interpretations of links between people, we study how the interplay between positive and negative relationships affects the structure of on-line social networks. We connect our analyses to theories of signed networks from social psychology. We find that the classical theory of structural balance tends to capture certain common patterns of interaction, but that it is also at odds with some of the fundamental phenomena we observe, particularly those related to the evolving, directed nature of these on-line networks. We then develop an alternate theory of status that better explains the observed edge signs and provides insights into the underlying social mechanisms. Our work provides one of the first large-scale evaluations of theories of signed networks using on-line datasets, as well as providing a perspective for reasoning about social media sites.
Author Keywords
signed networks, structural balance, status theory, positive edges, negative edges, trust, distrust.
ACM Classification Keywords
H.5.3 Information Systems: Group and Organization Interfaces (Web-based interaction).
General Terms
Human Factors, Measurement, Design.
INTRODUCTION
Social network analysis provides a useful perspective on a range of social computing applications. The structure of networks arising in such applications offers insights into patterns of interactions, and reveals global phenomena at scales that may be hard to identify when looking at a finer-grained resolution. At the same time, there is an ongoing challenge in adapting such network approaches to the study of social computing: users develop rich relationships with one another in these settings, while network analyses generally reduce these complex relationships to the existence of simple pairwise links. It is a fundamental research problem to bridge the gap between the richness of the existing relationships and the stylized nature of network representations of these relationships.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CHI 2010, April 10–15, 2010, Atlanta, Georgia, USA. Copyright 2010 ACM 978-1-60558-929-9/$10.00.
The main focus of our work here is to examine the interplay between positive and negative links in social media, a dimension of on-line social network analysis that has been largely unexplored. With relatively few exceptions (e.g., [1, 15, 16]), research in on-line social networks has focused on contexts in which the interactions have largely only positive interpretations, that is, connecting people to their friends, fans, followers, and collaborators. But in many settings it is important to also explicitly take negative relations into consideration, especially when studying interactions in social media: discussion lists are filled with controversy and disagreement, and social-networking sites harbor antagonism alongside amity. The richness of a social network in such cases generally consists of a mixture of both positive and negative interactions, co-existing in a single structure.
We aim to develop a better understanding of the role that network structure plays when some links between people are positive while others are negative. For instance, in on-line rating sites such as Epinions, people can give both positive and negative ratings not only to items but also to other raters. In on-line discussion sites such as Slashdot, users can tag other users as “friends” and “foes”. Our approach here is to adapt and extend theories from social psychology to analyze these types of signed networks as they arise in social computing applications. These theories enable us to characterize the differences between the observed and predicted configurations of positive and negative links in on-line social networks. We also use contrasts between the theories to draw inferences about how links are being used in particular social computing applications. In addition to insights into the applications themselves, our studies provide, to the best of our knowledge, some of the first large-scale evaluations of these social-psychological theories via on-line datasets.
Positive and negative links in on-line data. To carry out such an investigation, we need two fundamental ingredients: (i) large-scale datasets from social applications where the sign of each link (whether it is positive or negative) can be reliably determined, and (ii) theories of signed networks that help us reason about how different patterns of positive and negative links provide evidence for the expression of different kinds of relationships across these applications.
Figure 1. Undirected signed triads T3 (+++), T1 (+−−), T2 (++−), and T0 (−−−). Based on the number of positive edges, we label triads with an odd number of pluses as balanced (T3, T1), and triads with an even number of positive edges (T2, T0) as unbalanced.
We investigate social network structures from three widely used Web sites. The first is the trust network of Epinions, where users create signed directed relations to each other indicating trust or distrust. The second is the social network of the technology blog Slashdot, where users designate others as “friends” or “foes.” The third is the network defined by votes for Wikipedia admin candidates. When a Wikipedia user is considered for a promotion to the status of an admin, the community is able to cast public votes in favor of or against the promotion of this admin candidate. We view a positive vote as corresponding to a positive link from the voter to the candidate, and a negative vote as a negative link. The Epinions and Slashdot networks are explicitly presented to users as social networking features of the sites, whereas in the case of Wikipedia the network interpretation is implicit.
The meanings of positive and negative signs are different across these settings, and this is precisely the point: we wish to use theories of signed edges to evaluate how the positive and negative edges are being used in each setting, and to identify commonalities and differences in the underlying networks in relatively different application contexts. Moreover, while the current work focuses on domains in which the signs of edges are overtly denoted (either explicitly by direct linking, or implicitly through actions such as voting on Wikipedia), we believe the underlying issues reach more broadly into any application where positive and negative attitudes between users can be conveyed, such as through sentiment in text [20].
Theories of signed networks: Balance. We analyze these on-line signed networks using two different theories, and a central issue in our study is the extent to which each of these theories provides a plausible explanation for the structure and dynamics of the observed networks.
The first of these theories is structural balance theory, which originated in social psychology in the mid-20th century. As formulated by Heider in the 1940s [14], and subsequently cast in graph-theoretic language by Cartwright and Harary [4], structural balance considers the possible ways in which triangles on three individuals can be signed, and posits that triangles with three positive signs (three mutual friends, Figure 1 T3) and those with one positive sign (two friends with a common enemy, Figure 1 T1) are more plausible, and hence should be more prevalent in real networks, than triangles with two positive signs (two enemies with a common friend, T2) or none (three mutual enemies, T0). Balanced triangles with three positive edges exemplify the principle that “the friend of my friend is my friend,” whereas those with one positive and two negative edges capture the notions that “the friend of my enemy is my enemy,” “the enemy of my friend is my enemy,” and “the enemy of my enemy is my friend.”
Structural balance theory has been developed extensively in the time since this initial work [21], including the formulation of a variant, weak structural balance, proposed by Davis in the 1960s as a way of eliminating the assumption that “the enemy of my enemy is my friend” [7]. In particular, weak structural balance posits that only triangles with exactly two positive edges are implausible in real networks, and that all other kinds of triangles should be permissible.
Theories of signed networks: Status. Balance theory can be viewed as a model of likes and dislikes. However, as Guha et al. observe in the context of Epinions [13], a signed link from A to B can have more than one possible interpretation, depending on A’s intention in creating the link. In particular, a positive link from A may mean, “B is my friend,” but it also may mean, “I think B has higher status than I do.” Similarly, a negative link from A to B may mean “B is my enemy” or “I think B has lower status than I do.”
Here we develop this idea into a new theory of status, which provides a different organizing principle for directed networks of signed links. In this theory of status, we consider a positive directed link to indicate that the creator of the link views the recipient as having higher status; and a negative directed link indicates that the recipient is viewed as having lower status. These relative levels of status can then be propagated along multi-step paths of signed links, often leading to different predictions than balance theory.
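Returning to the two balance criteria defined earlier in this section (Heider's and Davis's), a minimal sketch may help make the distinction concrete. The function names and the encoding of edge signs as +1 and -1 are assumptions made for illustration; they are not notation from the paper.

```python
def count_positive(signs):
    """Number of positive edges among the three edge signs of a triad."""
    return sum(1 for s in signs if s > 0)


def is_balanced(signs):
    """Heider-style balance: three positive edges (T3) or exactly one (T1)."""
    return count_positive(signs) in (1, 3)


def is_weakly_balanced(signs):
    """Davis-style weak balance: only triads with exactly two positive
    edges (T2) are ruled out; all other triads are permitted."""
    return count_positive(signs) != 2


# The four undirected triad types, keyed by their label in Figure 1.
TRIADS = {
    "T3": (+1, +1, +1),
    "T1": (+1, -1, -1),
    "T2": (+1, +1, -1),
    "T0": (-1, -1, -1),
}

for name, signs in TRIADS.items():
    print(name, is_balanced(signs), is_weakly_balanced(signs))
```

The only triad on which the two criteria disagree is T0 (three mutual enemies), which is unbalanced in the first sense but permitted in the second.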
Comparing the two theories. To give a sense for how these differences between status and balance arise, consider the situation in which a user A links positively to a user B, and B in turn links positively to a user C. If C then forms a link to A, what sign should we expect this link to have? Balance theory predicts that since C is a friend of A’s friend, we should see a positive link from C to A. Status theory, on the other hand, predicts that A regards B as having higher status, and B regards C as having higher status, so C should regard A as having low status and hence be inclined to link negatively to A. In other words, the two theories suggest opposite conclusions in this case.
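The contrast in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's procedure: signs are encoded as +1 and -1, each link is assumed to shift status by one unit (a simplification introduced here), and both function names are hypothetical.

```python
def balance_prediction(sign_ab, sign_bc):
    """Sign of the C-A edge that makes the triangle balanced.

    A triangle is balanced when the product of its three signs is
    positive, so the closing edge must equal the product of the other
    two. Edge directions are ignored, as in balance theory."""
    return sign_ab * sign_bc


def status_prediction(sign_ab, sign_bc):
    """Sign of a directed link from C to A implied by status.

    Assumption for this sketch: a positive link U -> V places V one
    unit above U, and a negative link places V one unit below U.
    Returns 0 when the path A -> B -> C leaves the relative status of
    A and C undetermined."""
    status_of_c_relative_to_a = sign_ab + sign_bc
    if status_of_c_relative_to_a > 0:
        return -1  # C is above A, so C is inclined to link negatively
    if status_of_c_relative_to_a < 0:
        return +1  # C is below A, so C is inclined to link positively
    return 0


# The case discussed above: A -> B positive and B -> C positive.
print(balance_prediction(+1, +1))  # balance expects a positive C -> A link
print(status_prediction(+1, +1))   # status expects a negative C -> A link
```

Under these assumptions the two functions return opposite signs for the all-positive chain, which mirrors the disagreement between the theories described in the text.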
Thus balance theory predicts that certain types of triads, such as all-positive cycles, should be overrepresented compared to chance, whereas status theory makes predictions that often differ. We study all the possible types of signed triads and the predictions made by the different theories. In doing so we consider several experimental conditions, including both directed and undirected networks, as well as both respecting and ignoring the order in which edges were created.
For each such experimental condition we consider whether the observed number of triads of each type is overrepresented or underrepresented compared to chance, and contrast that with the predictions made by the balance and status theories. This analysis gives us a picture of the aggregate patterns of links in these social networks, and the degree to which they are explained in terms of each theory.
Summary of Findings: Comparison of Balance and Status. Both of these theories concern relationships between people; adapted to our on-line network datasets, they provide potentially informative perspectives on the link structures we find there.
Balance theory was initially intended as a model for undirected networks, although it has been commonly applied to directed networks by simply disregarding the directions of the links [21]. When we do this, we find significant alignment between the observed network data and Davis’s notion of weak structural balance: triangles with exactly two positive edges are massively underrepresented in the data relative to chance, while triangles with three positive edges are massively overrepresented. In two of the three datasets, triangles with three negative edges are also overrepresented, which is at odds with Heider’s formulation of balance theory.
These findings are already intriguing, since it has traditionally been difficult to evaluate the predictions of structural balance theory on large network datasets. Rather, empirical investigations to date have generally focused on small networks where social relations can be observed through direct interaction with the individuals involved (see [8]). The trouble with assessing structural balance at small scales is that one expects its predictions to be aggregate rather than absolute: one expects to see certain kinds of triangles as statistically more abundant or less abundant in the data, and the significance of such biases towards certain kinds of triangles can stand out much more clearly when they are accumulated over a large amount of data.
Ultimately, however, we would like to understand the networks in these on-line systems as directed structures that evolve over time. When we view the network data in this way, our main conclusion is that the theory of status is more effective at explaining local patterns of signed links, and that it naturally extends to capture richer aspects of user behavior, including heterogeneity in their linking tendencies. For example, in the case offered as an illustration above, where user A links positively to user B and user B links positively to user C, we find that negative links from C to A are massively overrepresented relative to chance, with positive links correspondingly underrepresented.
Implications. There are several potentially interesting implications of our results. First, the comparison of balance and status provides insights into ways in which people use linking mechanisms in social computing applications. In particular, there are important domains, such as rating reviewers on Epinions and voting for admins on Wikipedia, in which such links appear, in aggregate, to be used more dominantly for expressions of status than for expressions of likes and dislikes.
The contrast between balance and status is also related to the distinction between undirected and directed interpretations of links. Our findings suggest that it is important to understand the roles of different theories in both undirected and directed representations of networks. Indeed, the theory of status only makes sense with directed links, since it posits a status differential from the creator of a link to its recipient, while the theory of balance has been applied in both undirected and directed settings (see [21]). The fact that (weak) balance is broadly consistent with the undirected representation of our network data, while status is more consistent with the directed representation, shows that it is possible for different theories to be appropriate to different levels of resolution in the representation of a single network.
In the final part of the paper, we describe further structural investigations that provide insight into ways in which signed links are used in these applications. First, we find that aspects of the theory of balance hold more strongly on the subset of links in these networks that are reciprocated, that is, consisting of directed links in both directions between two users. This suggests that reciprocal link formation may follow a different pattern of use in these systems than unreciprocated link formation. However, it is important to note that such reciprocal relations account for only a small proportion of the links between people on these sites.
Second, we find a connection between the sign of a link and the extent to which it is embedded [12], i.e., with the two endpoints having links to many common neighbors. A link is significantly more likely to be positive when its two endpoints have multiple neighbors (of either sign) in common. This observation is consistent with qualitative notions of social capital [3, 5]: users with common neighbors have relations that are “on display” in a social sense, and hence have greater implicit pressure to remain positive. Indeed, in the three different social applications that we study, this effect is strongest in the case of voting for Wikipedia admins, which is the setting that makes the relations most prominently visible to users. This suggests some of the ways in which the presence of common neighbors, and more overt forms of public display, can have an effect on the use of signed links.
These findings about aggregate structural properties also begin to address a broad and largely open issue, which is to understand the sources of individual variation in linking behavior. While reciprocation and embeddedness are only two dimensions along which to explore such variation, we believe that the definitions and analysis pursued here can help in framing further investigation of questions regarding individual variation.
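Embeddedness, as used earlier in this section, counts the neighbors that the two endpoints of a link have in common. The sketch below shows one way this could be computed from a list of signed directed edges. The edge format (source, target, sign), the function name, and the choice to ignore both direction and sign when collecting neighbors are assumptions of this illustration.

```python
from collections import defaultdict


def embeddedness(edges, u, v):
    """Number of nodes linked (in either direction, with either sign)
    to both u and v."""
    neighbors = defaultdict(set)
    for source, target, _sign in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)
    # Exclude the endpoints themselves from the common-neighbor set.
    return len((neighbors[u] & neighbors[v]) - {u, v})


# Toy data: X and Y are each linked to both A and B; Z only to A.
toy_edges = [
    ("A", "B", +1),
    ("X", "A", +1),
    ("X", "B", -1),
    ("A", "Y", +1),
    ("B", "Y", +1),
    ("Z", "A", -1),
]
print(embeddedness(toy_edges, "A", "B"))
```

In the toy data the A-B link has two common neighbors (X and Y), so it is more embedded than, for example, the Z-A link, whose endpoints share none.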
RELATED WORK
There is by now a large and rapidly growing literature on the analysis of social networks arising in on-line domains [18]; as we noted at the outset, this line of work has almost exclusively treated networks as implicitly having positive signs only. For example, portions of our analysis can be viewed as variants on the problem of link prediction [17] and tie-strength prediction [10], but in each case adapted to take the signs of links into account.
Two recent papers in the analysis of on-line social networks stand out as taking the signs of links into account. Brzozowski et al. study the positive and negative relationships that exist on ideologically oriented sites such as Essembly [1], but with the goal of predicting outcomes of group votes rather than the broader organization of the social network. Kunegis et al. study the friend/foe relationships on Slashdot, and compute global network properties [15], but do not evaluate theories of balance and status as we do here.
[Table fragment; row labels lost. Column headers: Epinions, Wikipedia. Values: 82,144; 549,202; 77.4%; 22.6%; 1,508,105.]
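Table 2. Symbols used throughout the paper.
Symbol     Meaning
T_i        Signed triad, also the number of triads of type T_i
∆          (meaning lost in this copy)
p          Fraction of positive edges in the network
p(T_i)     Fraction of triads of type T_i
p0(T_i)    A priori prob. of T_i (based on sign distribution)
E[T_i]     Expected number of triads of type T_i
s(T_i)     Surprise, s(T_i) = (T_i − E[T_i]) / (standard deviation of the number of type-T_i triads under random shuffling)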
Table 3. Number of balanced and unbalanced undirected triads. (-- marks a value missing from this copy.)
Triad T_i     |T_i|        p(T_i)   p0(T_i)   s(T_i)
Epinions
T3 (+++)      --           0.870    --        1881.1
T1 (+−−)      --           0.071    --        249.4
T2 (++−)      --           0.052    --        -2104.8
T0 (−−−)      --           0.007    --        227.5
Slashdot
T3 (+++)      1,266,646    --       0.464     --
T1 (+−−)      109,303      --       0.119     --
T2 (++−)      115,884      --       0.406     --
T0 (−−−)      16,272       --       0.012     --
Wikipedia
T3 (+++)      --           0.702    --        379.6
T1 (+−−)      --           0.207    --        289.1
T2 (++−)      --           0.080    --        -572.6
T0 (−−−)      --           0.011    --        10.8
... into account. In this context, we can evaluate the predictions of structural balance theory by considering the frequencies of different types of signed triads: sets of three nodes with signed edges among all pairs.
Table 3 gives the counts of the four possible signed undirected triads, while Table 2 summarizes the symbols we use throughout the paper. Let p denote the fraction of positive edges in the network. The four possible signed undirected triads are denoted T0, T1, T2, and T3 (Figure 1). Among all triads in the data, the number that are of type T_i is denoted |T_i| and the fraction of type T_i is denoted p(T_i). Now, we would like to compare this empirical frequency of triad types to the corresponding frequencies if edge signs were produced at random from the same background distribution of positive and negative signs. Thus, we shuffle the signs of all edges in the graph (keeping the fraction p of positive edges the same), and we let p0(T_i) denote the expected fraction of triads that are of type T_i after this shuffling.
If p(T_i) > p0(T_i), then triads of type T_i are overrepresented in the data relative to chance; if p(T_i) < p0(T_i), then they are underrepresented. We also want to measure how significant this over- or underrepresentation is. Thus, we define the surprise s(T_i) to be the number of standard deviations by which the actual quantity of type-T_i triads differs from the expected number under the random-shuffling model. Due to the Central Limit Theorem, the distribution of s(T_i) is approximately a standard normal distribution, and so we would expect surprise on the order of tens to already be significant (s(T_i) = 6 gives a p-value of ≈ 10^−8). However, the values of surprise we find in our data are typically much larger. This means that, due to the scale of the data and the large number of triads, almost all our observations are statistically significant, with p-values practically equal to zero.
We find that the all-positive triad T3 is heavily overrepresented in all three datasets, and the triad T2 consisting of two enemies with a common friend is heavily underrepresented. Based on the relative magnitudes of p(T_i) and p0(T_i), we see that T3 tends to be overrepresented by about 40% in all three datasets. Similarly, the unbalanced triad T2 is underrepresented by about 75% in Epinions and Slashdot and 50% in Wikipedia. These observations so far fit well into Heider’s original notion of structural balance.
However, the relative abundances of triad types T1 (single positive edge) and T0 (all negative edges) differ between the datasets, and none of the datasets follow Heider’s theory in both having T1 overrepresented and T0 underrepresented. Thus, the picture is more consistent with Davis’s weaker notion of balance, where T2 is viewed as implausible but there is no a priori reason to favor one of T1 or T0 over the other.
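The comparison against the shuffled baseline described in this section can be sketched as follows, using the Slashdot triad counts from Table 3. Several things here are assumptions rather than the paper's stated method: the a priori probabilities treat the three edge signs of a triad as independent draws with probability p, the standard deviation is taken to be the binomial one, sqrt(N · p0 · (1 − p0)) with N the total number of triads, and the value p = 0.774 is attributed to Slashdot because it reproduces the p0 column of Table 3. The function names are hypothetical.

```python
import math


def a_priori_probabilities(p):
    """Probability of each triad type when every edge is positive
    independently with probability p. Keys are the number of positive
    edges, so key 3 is T3, key 2 is T2, and so on."""
    q = 1.0 - p
    return {3: p ** 3, 2: 3 * p ** 2 * q, 1: 3 * p * q ** 2, 0: q ** 3}


def surprise(counts, p):
    """Number of (assumed binomial) standard deviations by which each
    observed triad count differs from its expected count."""
    total = sum(counts.values())
    p0 = a_priori_probabilities(p)
    result = {}
    for triad_type, observed in counts.items():
        expected = p0[triad_type] * total
        std = math.sqrt(total * p0[triad_type] * (1.0 - p0[triad_type]))
        result[triad_type] = (observed - expected) / std
    return result


# Slashdot counts |T_i| from Table 3, keyed by number of positive edges.
slashdot_counts = {3: 1_266_646, 1: 109_303, 2: 115_884, 0: 16_272}
slashdot_p0 = a_priori_probabilities(0.774)
slashdot_surprise = surprise(slashdot_counts, p=0.774)

for triad_type in (3, 1, 2, 0):
    print(triad_type, round(slashdot_p0[triad_type], 3),
          round(slashdot_surprise[triad_type], 1))
```

With these inputs the computed p0 values round to the 0.464, 0.119, 0.406 and 0.012 listed for Slashdot in Table 3, the surprise for T3 is positive, and the surprise for T2 is negative, in line with the over- and underrepresentation described in the text.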
ANALYSIS OF EVOLVING DIRECTED NETWORKS
We now consider the networks in these systems as directed graphs, incorporating the fact that the links being created go from one user to another, with the sign of a link from A to B being generated by A. In the introduction, we discussed how the theories of balance and status offer competing interpretations for how we should expect such directed links to be signed. For example, as noted there, positive cycles (that is, directed triads with positive links from A to B to C to A) are underrepresented in the data. This conflicts with balance theory, but is consistent with status theory.
Timing and Diversity: Generative and Receptive Baselines. Beyond just the directionality of links, there are additional features of the data that we take into account when evaluating the models. First, links are created at specific points in time, so rather than thinking of directed triads as existing in a static snapshot of the network, we consider the order in which links are added to the network. Thus, we study how directed triads form, as follows. When a user A links to a user B, suppose there is already a user X with the property that X has links to or from A, and also to or from B. This means there is a two-step semi-path from A to B through X (a path in which the directions of the edges do not matter), and the formation of the A-B link adds a short-cut to this path, producing a directed triad on A, B, and X.
Second, different users make use of positive and negative signs differently. At the most basic level, some users produce links almost exclusively of one sign or the other, while others produce a relatively even mix of both positive and negative links. We will refer to the overall fraction of positive signs that a user creates, considering all her links, as her generative baseline. Similarly, some users receive links that are almost exclusively of one sign or the other, while others receive a mix of signs. We will refer to the overall fraction of positive signs in the links a user receives as his receptive baseline. Given this, we should compare the abundance of positive and negative links to the generative and receptive baselines of the users producing and receiving these links.
Once we incorporate these aspects of the data, we discover further mysteries, beyond just the scarcity of positive cycles, that seem to call for alternatives to balance theory. For example, consider the case of joint positive endorsement: a situation in which a node X links positively to each of two nodes A and B. Suppose that in this case, A now forms a link to B (triad t9 of Figure 2); should we expect there to be an elevated probability of the link being positive, or a reduced probability of the link being positive?
In fact, in our data, the question turns out to have a more subtle answer than either of these alternatives. The link that is produced in this situation is more likely to be positive than the generative baseline of A, but at the same time less likely to be positive than the receptive baseline of B. Balance theory, of course, makes a much more naive prediction: since A and B are both friends of X, they should be friends of each other. Can status theory explain this dual and opposite pair of deviations from the baselines of A and B?
We now show that in fact it can,and explaining how this
works forms the motivation for a theory of how status effects can influence the signs of directed links.
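The generative and receptive baselines used in the comparison above are simple per-user fractions, and a short sketch may make them concrete. The edge format (source, target, sign) and the function name are assumptions of this illustration; users who create no links get no generative baseline, and users who receive none get no receptive baseline.

```python
from collections import defaultdict


def baselines(edges):
    """Return (generative, receptive) dictionaries mapping each user to
    the fraction of positive signs among the links that user creates
    and receives, respectively."""
    created = defaultdict(int)
    created_positive = defaultdict(int)
    received = defaultdict(int)
    received_positive = defaultdict(int)
    for source, target, sign in edges:
        created[source] += 1
        received[target] += 1
        if sign > 0:
            created_positive[source] += 1
            received_positive[target] += 1
    generative = {u: created_positive[u] / created[u] for u in created}
    receptive = {u: received_positive[u] / received[u] for u in received}
    return generative, receptive


# Toy data: A creates one positive and one negative link;
# B receives two positive links and one negative link.
sample_edges = [
    ("A", "B", +1),
    ("A", "C", -1),
    ("X", "B", +1),
    ("C", "B", -1),
]
generative, receptive = baselines(sample_edges)
print(generative["A"], receptive["B"])
```

In this toy data the generative baseline of A is 1/2 and the receptive baseline of B is 2/3, so an A-to-B link could plausibly fall between the two, which is the kind of two-sided comparison the text describes.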
Formulating a Theory of Status
Since the phenomenon we are trying to capture is subtle but in the end familiar from everyday life, we begin with a hypothetical example to motivate the subsequent definitions.
A Motivating Example. Suppose we were to interview the players on a college soccer team: for certain players A, and certain teammates B of A, we ask, “How do you think the skill of player B compares to yours?” Suppose further that the players roughly agree on a ranking of each other by skill, which serves as an approximate (though not perfect) ranking of the team members by status. From the results of these interviews, we could produce a signed directed graph whose nodes are the players, and with a directed edge from A to B if we asked A for her opinion of B. A positive link from A to B would indicate that A thinks highly of B’s skill relative to her own, while a negative link would indicate that A thinks she is better than B.
If we were just given this signed directed graph, and knew nothing else about the soccer team, then we could still make inferences about the signs of links that we haven’t yet observed, using the context provided by the rest of the network. Suppose for example that we are about to ask player A’s opinion of another player B, but we don’t currently have A’s answer and hence don’t yet know the sign of the link from A to B. We can nonetheless make predictions about it from the links whose signs we do know, as follows. Suppose that we know from the data already collected that A and B have each received a positive evaluation from a third player X. Here is a pair of facts we could conjecture about the link from A to B, given the positive links from X to A and B.
• Since B has been positively evaluated by another team member, B is more likely than not to have above-average skill. Therefore, the evaluation that A gives B should be more likely to be positive than an evaluation given by A to a random team member.
• Since A has been positively evaluated by another team member, A is also more likely than not to have above-average skill. Therefore, the evaluation that A gives B should be less likely to be positive than an evaluation received by B from a random team member.
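The two conjectures above can be checked in a deliberately idealized simulation. The model is an assumption introduced here, not the paper's: every player has a hidden skill drawn uniformly from [0, 1], an evaluation from U to V is positive exactly when V's skill exceeds U's, and baselines are measured against a random teammate. The function name, sample size, and seed are arbitrary choices for this sketch.

```python
import random


def simulate_joint_endorsement(num_samples=200_000, seed=0):
    """Estimate, over triples (X, A, B) in which X rates both A and B
    positively: the chance that A rates B positively, A's average
    generative baseline, and B's average receptive baseline."""
    rng = random.Random(seed)
    kept = 0
    a_rates_b_positive = 0
    generative_sum = 0.0
    receptive_sum = 0.0
    for _ in range(num_samples):
        x, a, b = rng.random(), rng.random(), rng.random()
        if a > x and b > x:  # X evaluated both A and B positively
            kept += 1
            a_rates_b_positive += b > a
            # Chance that A rates a random teammate positively.
            generative_sum += 1.0 - a
            # Chance that a random teammate rates B positively.
            receptive_sum += b
    return (a_rates_b_positive / kept,
            generative_sum / kept,
            receptive_sum / kept)


p_positive, generative_a, receptive_b = simulate_joint_endorsement()
print(round(generative_a, 3), round(p_positive, 3), round(receptive_b, 3))
```

Under this model the three quantities are, in expectation, 3/8, 1/2 and 5/8, so the A-to-B link is more likely to be positive than A's generative baseline and less likely than B's receptive baseline, matching the pair of conjectures in the list.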
There are several subtleties here. First, we’re using the indirection provided by a third party X to make inferences about the relation between A and B, based on assumptions about status. Second, the context provided by X causes the sign of the A-B link to deviate from a random baseline in different