Riemannian Manifold Learning for Nonlinear
Dimensionality Reduction
Tony Lin1,⋆, Hongbin Zha1, and Sang Uk Lee2

1 National Laboratory on Machine Perception,
Peking University, Beijing 100871, China
{lintong,zha}@cis.pku.edu
2 School of Electrical Engineering,
Seoul National University, Seoul 151-742, Korea
sanguk@ipl.snu.ac.kr
Abstract. In recent years, nonlinear dimensionality reduction (NLDR) techniques have attracted much attention in visual perception and many other areas of science. We propose an efficient algorithm called Riemannian manifold learning (RML). A Riemannian manifold can be constructed in the form of a simplicial complex, and thus its intrinsic dimension can be reliably estimated. Then the NLDR problem is solved by constructing Riemannian normal coordinates (RNC). Experimental results demonstrate that our algorithm can learn the data's intrinsic geometric structure, yielding uniformly distributed and well organized low-dimensional embedding data.
1 Introduction
In visual perception, a human face image of size 64 × 64 pixels is often represented by a vector in a 4096-dimensional space. Obviously, the 4096-dimensional vector space is too large to allow any efficient image processing. A typical way to avoid this "curse of dimensionality" problem [1] is to use dimensionality reduction techniques. Classical linear methods, such as Principal Component Analysis (PCA) [2] and Multidimensional Scaling (MDS) [3], can only see flat Euclidean structures, and fail to discover the curved and nonlinear structures of the input data. Previous nonlinear extensions of PCA and MDS, including Autoencoder Neural Networks [4], SOM [5], Elastic Nets [6], GTM [7], and Principal Curves [8], suffer from difficulties in designing cost functions and training too many free parameters, or are limited to low-dimensional data sets. In recent years some nonlinear manifold learning techniques have been developed, such as Isomap [9,10], LLE [11], Laplacian Eigenmaps [12,13], Hessian Eigenmaps [14], SDE [15], manifold charting [16], LTSA [17], and diffusion maps [18]. Due to their nonlinear nature, geometric intuition, and computational practicability, these nonlinear manifold learning techniques have attracted extensive attention
of the researchers from different disciplines. The basic assumption is that the input data lie on or close to a smooth low-dimensional manifold [19].

⋆ This work was supported by NSFC (60302005), NSFC (60333010), NKBRP (2004CB318005), and FANEDD (200038), China.
Each manifold learning algorithm attempts to preserve a different geometrical property of the underlying manifold. Local approaches (LLE [11], Laplacian Eigenmaps [12], LTSA [17]) aim to preserve the local geometry of the data. They are also called spectral methods, since the low-dimensional embedding task is reduced to solving a sparse eigenvalue problem under the unit covariance constraint. However, due to this imposed constraint, the aspect ratio is lost and the global shape of the embedded data cannot reflect the underlying manifold. In contrast, global approaches like Isomap [9] attempt to preserve metrics at all scales and therefore give a more faithful embedding. However, Isomap, or isometric mapping, can only be applied to intrinsically flat 2D developable surfaces (cylinders, cones, and tangent surfaces). Conformal mapping [10,20] appears to be a promising direction.
We propose a general framework called Riemannian manifold learning (RML). The problem is formulated as constructing local coordinate charts for a Riemannian manifold. The most widely used chart is the Riemannian normal coordinates (RNC) chart. In [21] Brun et al. presented a method for manifold learning directly based on the concept of RNC. In order to calculate the geodesic directions, high sampling density is required, and the second-order polynomial interpolation is computationally expensive. In this paper, we propose a more efficient method to calculate RNC. The basic idea is to preserve geodesic distances and directions only in a local neighborhood. We also describe a novel method for estimating the intrinsic dimension of a Riemannian manifold. Our method is derived by reconstructing the manifold in the form of a simplicial complex, whose dimension is determined as the maximal dimension of its simplices.
2 Mathematical Preliminaries
In this section we briefly review some basic concepts of Riemannian geometry [22]. A bijective map is called a homeomorphism if it is continuous in both directions. A (topological) manifold $M$ of dimension $m$ is a Hausdorff space for which every point has a neighborhood $U$ homeomorphic to an open set $V$ of $\mathbb{R}^m$ with $\varphi: U \to V \subset \mathbb{R}^m$. $(U, \varphi)$ is called a local coordinate chart. An atlas for $M$ means a collection of charts $\{(U_\alpha, \varphi_\alpha) \mid \alpha \in J\}$ such that $\{U_\alpha \mid \alpha \in J\}$ is an open cover of $M$. A manifold $M$ is called a differential manifold if there is an atlas of $M$, $\{(U_\alpha, \varphi_\alpha) \mid \alpha \in J\}$, such that for any $\alpha, \beta \in J$, the composite $\varphi_\alpha \varphi_\beta^{-1}: \varphi_\beta(U_\alpha \cap U_\beta) \to \mathbb{R}^m$ is differentiable of class $C^\infty$. A differential manifold $M$ endowed with a smooth inner product (called a Riemannian metric) $g(u, v)$ or $\langle u, v \rangle$ on each tangent space $T_pM$ is called a Riemannian manifold $(M, g)$.

An exponential map $\exp_p(v)$ is a transform from a tangent vector $v \in T_pM$ into a point $q \in \gamma \subset M$ such that $\mathrm{dist}(p, q) = \|v\| = \langle v, v \rangle^{1/2}$, where $\gamma$ is the unique geodesic traveling through $p$ such that its tangent vector at $p$ is $v$. A geodesic is a smooth curve which locally joins its points along the shortest path.
All the geodesics passing through $p$ are called radial geodesics. The local coordinates defined by the chart $(U, \exp_p^{-1})$ are called Riemannian normal coordinates (RNC) with center $p$. Note that the RNC mapping preserves the distances on radial geodesics. A simple example is paring an orange, which maps a sphere onto a plane while preserving the distances on the great circles of the sphere.
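To make the orange-paring picture concrete, here is a small worked example (ours, not from the paper) of RNC on the unit sphere $S^2$ with center $p$ at the north pole. A tangent vector $v = t(\cos\alpha, \sin\alpha) \in T_pS^2$ is mapped to the point reached by traveling the distance $t = \|v\|$ along the great circle leaving $p$ in direction $\alpha$:
$$\exp_p\big(t(\cos\alpha, \sin\alpha)\big) = (\sin t \cos\alpha,\ \sin t \sin\alpha,\ \cos t),$$
so $\mathrm{dist}(p, \exp_p(v)) = t = \|v\|$: distances along the radial geodesics (the great circles through $p$) are preserved, which is exactly the RNC property, while distances between different radial geodesics are distorted.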
3 Manifold Assumption
Most manifold learning algorithms [9,11,19] assume that a set of image data may generate a low-dimensional manifold in a high-dimensional image space. Here we present a simple geometric imaging model (shown in Fig. 1) for human face images to clarify this assumption. Varying poses and lighting conditions are considered in this model, as they are two important factors in face detection and recognition. The model may be adapted to image data of other objects (e.g., cars), if similar imaging conditions are encountered.
Fig. 1. A geometric imaging model for human face images

We model the head of a human as a unit sphere $S^2$, where the frontal hemisphere is the human face. Different poses are obtained by moving the camera, while the human face is kept stationary. The focal length is assumed to be unchanged in the imaging process. We also assume that the distance from the camera to the face is fixed, so the face images have similar scales. Commonly, the center axis of the camera is set to pass through the center of the sphere. The camera is allowed to have some degree of planar rotation. The lighting is simply modeled with a point light source far away from the sphere. Under these variations, this set of face images generates a 5-dimensional manifold, which is homeomorphic to
$$M = \{PQe \mid P \in S^2,\ Q \in S^2,\ e \in S^1\},$$
where $P$ and $Q$ are the two intersection points of $S^2$ with the center axis of the camera and with the lighting ray, respectively, and $e$ is a unit vector giving the planar rotation angle of the camera. This representation is just a simple extension of the parametric representation of a surface, $r = r(u, v)$, where $(u, v)$ are two varying parameters. If the lighting variation is ignored, a 3-dimensional manifold may be generated:
$$M' = \{Pe \mid P \in S^2,\ e \in S^1\}.$$
This is called a circle bundle on a sphere, which is one of the simplest tangent bundles. This manifold can be visualized as the earth running in its circular orbit in 4-dimensional space-time.
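For illustration only, a point of $M$ can be encoded by five angles: two spherical angles for the camera point $P$, two for the lighting point $Q$, and one for the in-plane rotation $e$. The short sketch below (our own; the function name and the uniform angle sampling are not from the paper) simply enumerates these five degrees of freedom:

import numpy as np

def sample_face_parameters(rng):
    """Draw one point of the model M = {PQe}: a camera direction P on S^2,
    a light direction Q on S^2, and an in-plane camera rotation e on S^1.
    Angles are drawn uniformly, which is enough to exhibit the 5 degrees
    of freedom (it is not an area-uniform sampling of the sphere)."""
    theta_P, phi_P = rng.uniform(0, np.pi), rng.uniform(0, 2 * np.pi)  # P on S^2
    theta_Q, phi_Q = rng.uniform(0, np.pi), rng.uniform(0, 2 * np.pi)  # Q on S^2
    psi = rng.uniform(0, 2 * np.pi)                                    # e on S^1
    return np.array([theta_P, phi_P, theta_Q, phi_Q, psi])

# Example: sample_face_parameters(np.random.default_rng(0)) returns a
# 5-vector, matching the intrinsic dimension of M.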
4 Manifold Reconstruction
The $m$-manifold $M$ generated from a set of data points in $\mathbb{R}^n$ is modeled with an approximating simplicial complex, whose dimension serves as a reliable estimate of the intrinsic dimension of $M$. Our manifold reconstruction is a simplified version of Freedman's method [23], which involves a computationally expensive optimization for convex hulls.
The key to the reconstruction problem from unstructured sample points is to recover the edge connections within a local neighborhood. The neighborhood of a point $p \in M$, denoted $NBD(p)$, is defined as the $K$ nearest points to $p$. $K$ is often set as $c \times \hat{m}$, where $c$ is a constant between 2 and 5, and $\hat{m}$ is an initial estimate of the intrinsic dimension $m$. Then we select $k$ ($1 \le k \le K$) edge points from the $K$ neighbors, such that edge connections are built between $p$ and each edge point. Note that the number of edge points, $k$, varies with $p$. A point $q$ is said to be an edge point of $p$ if no other point $r$ separates $p$ and $q$ by the normal plane passing through $r$ and perpendicular to the line $(p, r)$. Formally, the edge point set of $p$ is defined as
$$EP(p) = \{\, q \in NBD(p) \mid \langle p - r,\ q - r \rangle \ge 0,\ \forall r \in NBD(p) \,\}.$$
It is easy to show that, by this definition, the angle between any two adjacent edges is acute or right, while obtuse angles are prohibited. This property guarantees well-shaped simplices, which are the basic building blocks used to construct the target simplicial complex. The underlying reason for this property is explained by the simple example shown in Fig. 2. It is often believed that the 1D reconstruction in (b) is much better than the 2D reconstruction in (c). The points are more likely to be sampled from a 1D curve, rather than a 2D surface. The width of the 2D complex in (c) is too small and thus can be ignored. In fact, any thin rope in the physical world can be modeled as a 1D curve by ignoring its radius. This definition of an edge point permits edge connections like (b), while (c) is prohibited.
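A minimal sketch of this edge-point test (ours), assuming the data are rows of a NumPy array X and the K nearest neighbors of p have already been found; the names edge_points and nbd_idx are illustrative:

import numpy as np

def edge_points(X, p_idx, nbd_idx):
    """Indices q in nbd_idx that are edge points of X[p_idx]: no neighbor r
    separates p and q by the plane through r perpendicular to the segment
    (p, r), i.e. <p - r, q - r> >= 0 for every r in NBD(p)."""
    p = X[p_idx]
    edges = []
    for q_idx in nbd_idx:
        q = X[q_idx]
        if all(np.dot(p - X[r_idx], q - X[r_idx]) >= 0 for r_idx in nbd_idx):
            edges.append(q_idx)
    return edges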
Simplices in each dimension are constructed by grouping adjacent edges. For example, if $(p, q)$ is an edge and $r$ is any other point, a triangle $(p, q, r)$ is generated when the two edges $(p, r)$ and $(q, r)$ also exist. This procedure repeats from low dimension to high dimension, until no new simplices are generated. The target simplicial complex is composed of all these simplices. The dimension of the complex is a good estimate of the intrinsic dimension of $M$.
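This bottom-up grouping can be sketched as follows (our own illustration, assuming the edges found above are given as pairs of point indices; the cap max_dim is ours):

from itertools import combinations

def build_simplices(edges, max_dim=10):
    """Grow simplices dimension by dimension: a candidate (d+1)-simplex is
    accepted when every one of its d-dimensional faces is already present.
    Returns {d: set of d-simplices}; max(result) estimates the intrinsic
    dimension."""
    simplices = {1: {frozenset(e) for e in edges}}
    vertices = set().union(*simplices[1])
    d = 1
    while d < max_dim:
        new_level = set()
        for s in simplices[d]:
            for v in vertices - s:
                candidate = s | {v}
                # every d-face of the candidate must already be a simplex
                faces = {frozenset(f) for f in combinations(candidate, d + 1)}
                if faces <= simplices[d]:
                    new_level.add(candidate)
        if not new_level:
            break
        d += 1
        simplices[d] = new_level
    return simplices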
Fig. 2. Reconstruction of five points sampled from a curve. (a) Unorganized points. (b) 1D reconstruction. (c) 2D reconstruction.
5 Manifold Charting
Manifold charting consists of two steps: (1) compute the tangent space and set up a Cartesian coordinate system; (2) use Dijkstra's algorithm to find single-source shortest paths, and calculate the Riemannian normal coordinates (RNC) of the end point of each shortest path.
In principle, the base point $p$ for an RNC chart may be freely selected. Here we choose the base point close to the center of the input data: for each candidate point, the maximal geodesic distance (called the geodesic radius) is computed using Dijkstra's algorithm, and the point with the minimal geodesic radius is taken as the optimal base point.
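A sketch of this base-point selection (ours), assuming the neighborhood graph is available as a SciPy sparse adjacency matrix with edge lengths as weights and is connected; unlike the paper, the sketch simply scans all vertices rather than only candidates near the data center:

import numpy as np
from scipy.sparse.csgraph import dijkstra

def select_base_point(graph):
    """Return the vertex whose geodesic radius (largest shortest-path
    distance to any other vertex) is minimal."""
    dist = dijkstra(graph, directed=False)   # all-pairs shortest path lengths
    radii = dist.max(axis=1)                 # geodesic radius of each vertex
    return int(np.argmin(radii))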
A local coordinate chart is set up by computing the tangent space $T_pM$:
$$x_0 + \mathrm{span}\{x_1 - x_0, \ldots, x_m - x_0\},$$
where $\{x_0, x_1, \ldots, x_m\}$ are $(m+1)$ geometrically independent edge points (or nearest neighbors) of $p$. Any point on the tangent space can be represented as
$$x_0 + \sum_{i=1}^{m} \lambda_i (x_i - x_0).$$
An orthonormal frame, denoted $(p; e_1, \ldots, e_m)$, is computed from the vectors $\{x_1 - x_0, \ldots, x_m - x_0\}$ by Gram-Schmidt orthogonalization.
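A sketch of the frame construction (ours; edge_points is assumed to be an (m+1) x n array whose rows are x_0, ..., x_m):

import numpy as np

def tangent_frame(edge_points):
    """Orthonormal frame e_1, ..., e_m for the tangent space, obtained by
    Gram-Schmidt orthogonalization of {x_1 - x_0, ..., x_m - x_0}.
    Rows of the returned array are the frame vectors."""
    x0 = edge_points[0]
    frame = []
    for v in edge_points[1:] - x0:
        w = v - sum(np.dot(v, e) * e for e in frame)   # remove projections
        norm = np.linalg.norm(w)
        if norm > 1e-10:                               # skip dependent vectors
            frame.append(w / norm)
    return np.array(frame)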
Then Dijkstra's algorithm [24] is exploited to find single-source shortest paths in the graph determined by the simplicial complex. Each time a new shortest path is found, we compute the RNC of the end point of this path. If the end point $q$ is an edge point of $p$, we directly compute the projection of $q$, denoted $q' \in \mathbb{R}^m$, onto the tangent space frame $(p; e_1, \ldots, e_m)$ by solving the following least squares problem
$$\min_{X \in \mathbb{R}^m} \Big\| q - \Big( p + \sum_{i=1}^{m} x_i e_i \Big) \Big\|^2,$$
where $X = (x_1, x_2, \ldots, x_m)$ are the projection coordinates of $q$ in the tangent space. The RNC of $q$ is given by
$$\frac{\| q - p \|_{\mathbb{R}^n}}{\| X \|_{\mathbb{R}^m}}\, X,$$
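Putting the two formulas together, a sketch of this RNC computation for an edge point q of the base point p (ours, assuming q differs from p; frame is the orthonormal array returned by the Gram-Schmidt step above, with rows e_1, ..., e_m):

import numpy as np

def rnc_of_edge_point(q, p, frame):
    """Solve the least-squares projection of q onto the tangent frame at p,
    then rescale X so that ||RNC(q)|| equals the ambient distance ||q - p||,
    preserving the distance along the radial geodesic from p to q."""
    # min_X || (q - p) - frame.T @ X ||^2
    X, *_ = np.linalg.lstsq(frame.T, q - p, rcond=None)
    return (np.linalg.norm(q - p) / np.linalg.norm(X)) * X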