Eigensolver Methods for Progressive
Multidimensional Scaling of Large Data
Ulrik Brandes and Christian Pich
Department of Computer & Information Science, University of Konstanz, Germany
{Ulrik.Brandes,Christian.Pich}@uni-konstanz.de
Abstract. We present a novel sampling-based approximation technique for classical multidimensional scaling that yields an extremely fast layout algorithm suitable even for very large graphs. It produces layouts that compare favorably with other methods for drawing large graphs, and it is among the fastest methods available. In addition, our approach allows for progressive computation, i.e., a rough approximation of the layout can be produced even faster, and then be refined until satisfaction.
1 Introduction
The term multidimensional scaling (MDS) refers to a family of techniques for dimensionality reduction that are used to represent high-dimensional data in low-dimensional space while approximately preserving distances. For drawing graphs, methods based on the objective function of distance scaling are used widely, but the classical scaling approach has only occasionally been recognized as a useful alternative [7,21,24]. Indeed, the computational complexity of this method is quadratic in the input size and thus prohibitive for large graphs.
In this paper we propose a sampling-based approximation technique to overcome this restriction and to reduce time and space complexity essentially to linearity. The proposed algorithm is simple to implement, yet extremely fast and therefore applicable to very large graphs. Moreover, it allows for progressive computation by very quickly producing a rough approximation of the layout, which can then be improved by successive refinement.
This paper is organized as follows. Background on multidimensional scaling and derived methods is provided in Section 2. In Section 3 we introduce two variants of the eigensolver approach, which are evaluated and compared to each other in Section 4. Section 5 concludes our contribution.
2 Related Work
The first MDS algorithm is due to Torgerson [30] and is nowadays referred to as classical MDS or classical scaling. Its objective is a low-dimensional representation of high-dimensional data by fitting inner products; it has a global optimum which can be directly computed by spectral decomposition. The method we propose in this paper is an efficient approximation of classical scaling.
Another MDS variant, best known and most widely used today, has been proposed by Kruskal [22] and is sometimes distinguished as distance scaling. The objective is to directly fit Euclidean distances in the drawing to the given graph-theoretical distances, typically by minimizing a stress measure. It is performed by iterative replacement according to a spring model of attracting and repelling forces or an energy model, as widely known in the graph drawing literature [18], or by iterative algebraic techniques [12]. Due to their time and space complexity, straightforward implementations of distance scaling methods are restricted to data sets of moderate cardinality.
In the graph drawing literature, methods based on linear algebra have become popular in recent years. Examples are High-Dimensional Embedding (HDE) [15], fast multiscale approaches based on eigenvectors of the Laplacian [20], subspace-restricted layout [19], and stress majorization [12].
Poor scalability to large data sets due to quadratic complexity is a well-known problem of all MDS algorithms. It was addressed as early as the 1960s [23], and since then, many approaches to speeding up spring-force computations have been devised [6,16,25,26]. Likewise, methods for speeding up the spectral methods have been proposed [11,31]. Closest to our approach is Landmark MDS [10]; we give an experimental comparison in Sect. 4. Relationships between the approaches are discussed in [2,27]. For more general surveys on sparse techniques for dimensionality reduction and related spectral methods see [5,28].
MDS seems to have been the first computerized layout method used for drawing social networks [17] towards the end of the 1960s. Even in this restricted application domain there are many extensions and variants, such as incremental or interactive MDS [1,4,8,32]. For further information about MDS, its history, and other applications we refer the reader to recent textbooks [3,9].
3 Multidimensional Scaling and Its Approximation
Let Δ ∈ R^{n×n} denote a symmetric matrix of metric dissimilarities or distances δ_ij between items i, j ∈ {1, ..., n}. The goal of multidimensional scaling is to find positions x_i ∈ R^d in d-dimensional space, d ≪ n, such that ‖x_i − x_j‖ ≈ δ_ij, i.e., distances are represented well in this low-dimensional space. Note that for notational convenience we write positions x_i as column vectors, and that d ∈ {2, 3} for visualization purposes. With Δ^(2) we denote the matrix Δ with squared entries, [Δ^(2)]_ij = [Δ]_ij^2.
In graph drawing and network analysis, Δ frequently consists of shortest-path distances (see, e.g., [8] for an alternative graph distance). In other contexts it is often induced by a high-dimensional feature space with an associated distance function.
In this section, we briefly describe a standard technique for multidimensional scaling, a recently introduced method for its fast approximation, and our new variant of this approximation. It turns out that, technically, our method is very similar to one of the fastest algorithms for drawing large graphs [15], but eliminates some of its shortcomings. This is outlined in Sect. 3.5.
3.1 Classical MDS
We briefly describe the scaling method known as Classical MDS [30]. Recall that we are looking for an embedding in d-dimensional space, i.e., a matrix X ∈ R^{n×d} with X = [x_1, ..., x_n]^T, such that δ_ij ≈ ‖x_i − x_j‖. Since this implies

\[
\delta_{ij}^2 \approx \|x_i - x_j\|^2 = (x_i - x_j)^T (x_i - x_j) = x_i^T x_i - 2 x_i^T x_j + x_j^T x_j,
\]

consider the matrix B = XX^T of inner products b_ij = x_i^T x_j. While we do not know X, it can be shown that

\[
b_{ij} = -\frac{1}{2} \left( \delta_{ij}^2 - \frac{1}{n} \sum_{r=1}^{n} \delta_{rj}^2 - \frac{1}{n} \sum_{s=1}^{n} \delta_{is}^2 + \frac{1}{n^2} \sum_{r=1}^{n} \sum_{s=1}^{n} \delta_{rs}^2 \right),
\]

so that B can also be obtained by double-centering the matrix of squared dissimilarities Δ^(2), i.e., each column and each row of B sums to zero.
Knowing B, positions X are reasonably reconstructed using the eigendecomposition B = VΛV^T, where Λ is the diagonal matrix of the eigenvalues of B, and V is the orthonormal matrix of its eigenvectors. Simply let

\[
X = V_{(d)} \Lambda_{(d)}^{1/2},
\]

where Λ_(d) ∈ R^{d×d} is the diagonal matrix of the d largest eigenvalues of B and V_(d) ∈ R^{n×d} is the n×d matrix of associated eigenvectors. Thus, the essence of classical scaling is to fit inner products rather than distances as in distance scaling.
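As a minimal illustration of the two steps above (not part of the original paper; a sketch in Python with NumPy, with a dense eigensolver standing in for the power iteration discussed next), classical scaling can be computed from a full distance matrix as follows.

```python
import numpy as np

def classical_mds(D, d=2):
    """Classical MDS sketch: double-center the squared distances and
    reconstruct positions from the top-d eigenpairs of B."""
    n = D.shape[0]
    D2 = D.astype(float) ** 2
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ D2 @ J                    # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)     # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:d]      # indices of the d largest eigenvalues
    scale = np.sqrt(np.maximum(eigvals[idx], 0.0))
    return eigvecs[:, idx] * scale           # X = V_(d) Lambda_(d)^{1/2}
```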
It is important to note that the two or three required eigenvectors can be computed by power iteration, i.e., by repeatedly multiplying a starting vector x ∈ R^n with B. The iterate is periodically normalized; further eigenvectors are found by orthogonalization against previously computed eigenvectors. See, e.g., [13] for background on matrix computations.
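A possible realization of this scheme (again an illustrative Python/NumPy sketch, not the authors' implementation) computes the leading eigenvectors one at a time, normalizing each iterate and orthogonalizing it against the eigenvectors already found.

```python
import numpy as np

def power_iterate(B, d=2, iters=100, rng=None):
    """Compute d leading eigenvectors of a symmetric matrix B by power
    iteration with orthogonalization against previously found vectors."""
    rng = np.random.default_rng() if rng is None else rng
    vecs = []
    for _ in range(d):
        v = rng.standard_normal(B.shape[0])
        for _ in range(iters):
            v = B @ v                        # multiply with B
            for u in vecs:                   # orthogonalize against earlier eigenvectors
                v -= (v @ u) * u
            v /= np.linalg.norm(v)           # normalize the iterate
        vecs.append(v)
    return np.column_stack(vecs)
```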
The running time for drawing an unweighted graph with n vertices and m edges by performing classical MDS on its matrix Δ of shortest-path distances is thus O(nm) for computing Δ using breadth-first search, Θ(n²) for constructing B, and another O(n²) per iteration. Running times and also storage requirements are therefore prohibitive for large graphs.
3.2 Landmark MDS
Landmark MDS (LMDS) [10] is a fast method for approximating the results of Classical MDS using a sparsification of the transformed distance matrix. It is based on distinguishing a few items as landmarks, and computing the eigendecomposition only on the double-centered matrix of squared distances among those landmarks. Positions of non-landmarks are then determined as linear combinations of landmark positions, i.e., each item is placed in the weighted barycenter of all landmarks, where the weights are derived from the original distances.
The rationale is that a set of appropriate reference points is sufficient to determine the projection into low-dimensional space. To be representative, the k landmarks, d < k ≪ n, should be distributed well. Common experience shows that a MaxMin strategy, in which the next landmark maximizes the minimum distance to the previous landmarks, yields satisfactory results. Note that this corresponds to a well-known 2-approximation of the k-center problem in facility location. We have tried other simple strategies such as MaxSum, random selection, and hybrids, but none proved to be superior consistently. More advanced techniques are proposed in [29].
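The MaxMin strategy can be sketched as follows (Python/NumPy, illustrative only; `bfs_distances` is an assumed helper returning single-source shortest-path distances for all vertices): each new landmark is the vertex farthest from all landmarks chosen so far.

```python
import numpy as np

def maxmin_landmarks(graph, k, bfs_distances, start=0):
    """Greedy MaxMin selection: the next landmark maximizes the minimum
    distance to the previously chosen landmarks (2-approximation of k-center)."""
    chosen = [start]
    cols = [bfs_distances(graph, start)]       # distances from the first landmark
    min_dist = cols[0].copy()                  # distance to the nearest landmark so far
    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))         # farthest vertex from current landmarks
        chosen.append(nxt)
        col = bfs_distances(graph, nxt)
        cols.append(col)
        min_dist = np.minimum(min_dist, col)
    return chosen, np.column_stack(cols)       # landmark indices and n-by-k distances
```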
Time and space complexity of LMDS are significantly smaller than for Classical MDS. Landmark selection and distance computations are carried out in O(k·|E|) time, each power iteration step requires only O(k²) time, and the final positioning is done in O(kn) time. Since, in general, choosing k < 100 yields satisfactory results on most practical instances, LMDS can be regarded as a linear-time algorithm. Moreover, it is only necessary to store the Θ(kn) distances to landmarks.
3.3 Pivot MDS
We now introduce a new variant of sparse MDS which we call Pivot MDS (PMDS). It is motivated by a potential shortcoming of the LMDS strategy to position landmarks only with respect to each other: it is possible that the (already available) distance information to non-landmarks can be utilized to improve the quality of the result.
Recall that Classical MDS is based on an eigendecomposition of the double-centered n×n matrix of squared distances B, and that Landmark MDS is based on the corresponding decomposition of the double-centered k×k submatrix of squared distances among selected items only. Pivot MDS is based on the double-centered n×k submatrix C of squared distances from every item to those selected, having entries

\[
c_{ij} = -\frac{1}{2} \left( \delta_{ij}^2 - \frac{1}{n} \sum_{r=1}^{n} \delta_{rj}^2 - \frac{1}{k} \sum_{s=1}^{k} \delta_{is}^2 + \frac{1}{nk} \sum_{r=1}^{n} \sum_{s=1}^{k} \delta_{rs}^2 \right),
\]

where i ∈ {1, ..., n} and j ∈ {1, ..., k}, and thus contains all distance information available.
Note that the n-dimensional left singular vectors of C ∈ R^{n×k} are equal to the eigenvectors of CC^T ∈ R^{n×n}. If they are computed using power iteration, an iteration consists of two steps: first, positions of pivots are determined using the current positions of all items (multiplication with C^T ∈ R^{k×n}), and then all items are positioned relative to the pivots (multiplication with C ∈ R^{n×k}).¹

¹ This interpretation motivates the name "pivot," in contrast to "landmarks," which are first assigned their final location and then used to determine the position of all other items.
An intuitive interpretation is that the eigenvectors of CC^T approximate the eigenvectors of B², and thus of B. This follows from the assumption

\[
\left[ B^2 \right]_{ij} = \left[ B B^T \right]_{ij} = \sum_{\ell=1}^{n} b_{i\ell} b_{j\ell} \approx \sum_{\ell=1}^{k} c_{i\ell} c_{j\ell} = \left[ C C^T \right]_{ij},
\]

so matrix entries [B²]_ij and [CC^T]_ij represent the same type of transformed distance sums, though in the latter case with a truncated list of intermediaries. If these are sufficiently well distributed, the relative size of entries in CC^T is representative for those in B².
At face value the iteration time of PMDS is O(kn). However, we can rewrite (CC^T)^i = C(C^T C)^{i−1} C^T, so that the iteration is performed only on the k×k matrix C^T C. The initial multiplication with C^T can be omitted (in Sect. 3.4 we will argue, though, that it is sometimes desirable), since the starting vector is arbitrary. The final multiplication with C is similar to the final projection step of LMDS. The algorithm is summarized in Alg. 1. Except for the additional O(kn + k²n) cost of double-centering and computing C^T C, the running time is therefore essentially the same as in LMDS.

Algorithm 1: Pivot MDS
  Input: undirected graph G = (V, E), number k ∈ N of pivots
  Output: coordinates x, y ∈ R^n
  select k pivots from V
  for i ∈ {1, ..., k} do
      i-th column of Δ^(k) ← BFS(i-th pivot)
  C ← doublecenter((Δ^(k))^(2))
  (v_1, v_2) ← poweriterate(C^T C)      // 2 largest eigenvectors
  x ← C v_1,  y ← C v_2
3.4 Progressive MDS

When using pivot approximation there is a natural trade-off between running time and memory usage; users might have to experiment with various numbers of pivots and different strategies. Instead of iteratively re-executing the algorithm with a larger set of pivots for layout improvement, we propose to use a progressive form of MDS computation that we shall describe in the following.

Let Δ^(k) ∈ R^{n×k} denote a submatrix of the matrix of pairwise distances, and let x ∈ R^n be a component in the placement computed by PMDS based on it. To improve approximation quality, Δ^(k) can be extended by a certain number of new pivot columns to Δ^(k′) ∈ R^{n×k′} (k′ ≥ k). Note that all operations for computing the new columns in Δ^(k′), double-centering of (Δ^(k′))^(2) to obtain C′, and determination of the matrix C′^T C′ can be implemented to run in O((k′ − k)·|E|). The new vector x′ ∈ R^n is computed by replacing C with C′ in Algorithm 1.
Fig. 1. Progressively drawing the finan512 graph (|V| = 74752, |E| = 261120) with increasing pivot set (k = 3, 6, 12, 25, 50, 100) using the MaxMin strategy.

To prevent artificial effects through rotation and reflection in the transition from x to x′ due to indeterminacies in the basis of the eigenspace of C′^T C′, the initial solution y ∈ R^{k′} for the power iteration is derived from the previous layout by y = C′^T x. Compared to random initialization, the iteration process for computing the new layout x′ is thus more likely to converge towards a solution close to x, and we have observed that transitions between intermediate layouts tend to become smoother and visually more pleasing.
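A rough sketch of this warm-started refinement step (Python/NumPy, illustrative only; the extended, double-centered matrix C′ is assumed to be given, and the QR-based re-orthonormalization is just one possible way to keep the iterated columns independent) could look like this.

```python
import numpy as np

def refine_layout(C_new, X_prev, iters=50):
    """Progressive refinement sketch: warm-start the power iteration on
    C'^T C' with Y = C'^T X_prev derived from the previous layout."""
    Y = C_new.T @ X_prev            # pull the previous layout back to pivot space
    M = C_new.T @ C_new             # small k'-by-k' iteration matrix
    for _ in range(iters):
        Y = M @ Y
        Y, _ = np.linalg.qr(Y)      # re-orthonormalize the iterated columns
    return C_new @ Y                # new positions x' = C' y, one column per dimension
```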
For smaller graphs, pivots may be added in batches before computing the layout, while it can make more sense for very large graphs to extend Δ^(k) column by column, computing the layout anew after each insertion. In Sect. 4 our experiments indicate that most of the running time of Pivot MDS is consumed by the distance computations, while a layout based on these distances can be computed quickly. It is thus worthwhile to progressively compute the layout until the quality does not improve significantly.
3.5 Pivot MDS vs. HDE
In retrospect, our proposed method is reminiscent of another fast algorithm for drawing large graphs, the high-dimensional embedder (HDE) of [15].
HDE proceeds as follows: From a set of k selected nodes (the pivots), distances to all other nodes are determined. These distances are, however, neither squared nor double-centered, but directly interpreted as coordinates in a k-dimensional space. In this space, they are centered to place the mean at the zero coordinate, and yield a high-dimensional embedding X ∈ R^{n×k}. This k-dimensional embedding is then projected into two dimensions by Principal Component Analysis (PCA),