P AR M E T I S∗
Parallel Graph Partitioning and Spar Matrix Ordering
Library
Version4.0llol
George Karypis and Kirk Schloegel
University of Minnesota,Department of Computer Science and Engineering
Minneapolis,MN55455
karypis@cs.umn.edu
March30,2013
∗P AR M E T I S is copyrighted by the regents of the University of Minnesota.
Contents乖孩子英文
1Introduction3 2Changes Across Key Releas3
2.1Changes between4.0and3.2 (3)
2.2Changes between3.2and3.1 (3)
2.3Changes between3.0/3.1and2.0 (4)
3Algorithms Ud in P AR M E T I S5
3.1Unstructured Graph Partitioning (5)
3.2Partitioning Meshes Directly (6)
ohmygod3.3Partitioning Adaptively Refined Meshes (7)
3.4Partition Refinement (8)
3.5Partitioning for Multi-pha and Multi-physics Computations (8)
3.6Partitioning for Heterogeneous Computing Architectures (9)
3.7Computing Fill-Reducing Orderings (10)
4P AR M E T I S’API10
4.1Headerfiles (10)
4.2Input and Output Formats ud by P AR M E T I S (10)bring的过去式
4.2.1Format of the Input Graph (10)
4.2.2Format of Vertex Coordinates (12)
4.2.3Format of the Input Mesh (12)
4.2.4Format of the Computed Partitionings and Orderings (13)
4.3Numbering and Memory Allocation (14)
5Calling Sequence of the Routines in P AR M E T I S15
5.1Graph Partitioning (16)
ParMETIS V3PartKway (16)
ParMETIS V3PartGeomKway (18)
ParMETIS V3PartGeom (20)
ParMETIS V3PartMeshKway (21)
5.2Graph Repartitioning (23)
ParMETIS V3AdaptiveRepart (23)
5.3Partitioning Refinement (25)
ParMETIS V3RefineKway (25)
5.4Fill-reducing Orderings (27)
ParMETIS V3NodeND (27)
ParMETIS V32NodeND (28)
5.5Mesh to Graph Translation (30)
ParMETIS V3Mesh2Dual (30)
6Restrictions&Limitations31 7Hardware&Software Requirements,and Contact Information31 8Copyright&Licen Notice31
月光传说1Introduction
P AR M E T I S is an MPI-bad parallel library that implements a variety of algorithms for partitioning and repartitioning unstructured graphs and for computingfill-reducing orderings of spar matrices.P AR M E T I S is particularly suited for parallel numerical simulations involving large unstructured meshes.In this type of computation,P AR M E T I S dramati-cally reduces the time spent in communication by computing mesh decompositions such that the numbers of interface elements are minimized.
The algorithms in P AR M E T I S are bad on the multilevel partitioning andfill-reducing ordering algorithms that are implemented in the widely-ud rial package M E T I S[5].However,P AR M E T I S extends the functionality provided by M E T I S and includes routines that are especially suited for
parallel computations and large-scale numerical simulations. In particular,P AR M E T I S provides the following functionality:
奥斯卡获奖影片
•Partition unstructured graphs and meshes.
视频剪辑培训•Repartition graphs that correspond to adaptively refined meshes.
•Partition graphs for multi-pha and multi-physics simulations.
•Improve the quality of existing partitionings.
•Computefill-reducing orderings for spar direct factorization.
•Construct the dual graphs of meshes
The rest of this manual is organized as follows.Section2briefly describes the differences between major versions of P AR M E T I S.Section3describes the various algorithms that are implemented in P AR M E T I S.Section4.2describes the format of the basic parameters that need to be supplied to the routines.Section5provides a detailed description of the calling quences for the major routines in P AR M E T I S.Finally,Section7describes software and hardware requirements and provides contact information.
2Changes Across Key Releas
2.1Changes between4.0and
3.2
The4.0relea of P AR M E T I S reprents a major code refactoring to allow full support of64bit architectures.As part of that re-factoring,no additional capabilities have been added to the library.However,since the4.0relea relies on the latest version of M E T I S,it allows for better support of multi-constraint partitioning.Here is the list of the major changes in version4.0:
•Support for64bit architectures by explicitly defining the width of the scalar“integer”data type(idx t)ud to store the adjancency structure of the graph.
•It is bad on the5.0distribution of M E T I S,which itlf contains many enhancements over the previous version.
•A complete re-write of its internal memory management,which resulted in lower memory requirements.
•Better quality partitionings for multi-constraint partitioning problems.
2.2Changes between
3.2and3.1
The major change in version3.2is its better support for computingfill-reducing orderings of spar matrices.Specifi-cally,version3.2contains the following enhancements/additions:
•A new parallel parator refinement algorithm that leads to smaller parators and lessfill-in.
•Parallel orderings can now be computed on non power-of-two processors.
•It provides support for computing multiple parators at each level(both during the parallel and the rial phas).The smallest parator among the multiple runs is lected.
•There is a new API routine,ParMETIS V32NodeND that expos additional parameters to the ur in order to better control various aspects of the algorithm.The old API routine(ParMETIS V3NodeND)is still valid and is mapped to the new ordering routine.
The end results of the enhancements is that the quality of the orderings computed by P AR M E T I S are now compa-rable to tho computed by M E T I S’nested disction routines.In addition,versisasha blonde
on3.2contains a number of bug-fixes and documentation corrections.Note that changes in the documentation are marked using change-bars.
2.3Changes between
3.0/3.1and2.0
Version3.x contains a number of changes over the previous major relea(version2.x).The changes include the following:
Version1.0Version2.0Version3.0
PARKMETIS ParMETIS PartKway ParMETIS V3PartKway
PARGKMETIS ParMETIS PartGeomKway ParMETIS V3PartGeomKway
PARGMETIS ParMETIS PartGeom ParMETIS V3PartGeom
PARGRMETIS Not available Not available
PARRMETIS ParMETIS RefineKway ParMETIS V3RefineKway
PARUAMETIS ParMETIS RepartLDiffusion
PARDAMETIS ParMETIS RepartGDiffusion
ParMETIS V3AdaptiveRepart
Not available ParMETIS RepartRemap
Not available ParMETIS RepartMLRemap
PAROMETIS ParMETIS NodeND ParMETIS V3NodeND
Not available Not available ParMETIS V3PartMeshKway
Not available Not available ParMETIS V3Mesh2Dual
Table1:The relationships between the names of the routines in the different versions of P AR M E T I S.
•The names and calling quence of all the routines have changed due to expanded functionality that has been provided in this relea.Table1shows how the names of the various routines map from
version to version.Note that Version3.0is fully backwards compatible with all previous versions of P AR M E T I S.That is,the old API calls have been mapped to the new routines.However,the expanded functionality provided with this relea is only available by using the new calling quences.
加油的英文怎么写
•The four adaptive repartitioning routines:ParMETIS RepartLDiffusion,ParMETIS RepartGDiffusion, ParMETIS RepartRemap,and ParMETIS RepartMLRemap have been replaced by a(single)implementa-tion of a unified repartitioning algorithm[15],ParMETIS V3AdaptiveRepart,that combines the best features of the previous routines.
•Multiple vertex weights/balance constraints are supported for most of the routines.This allows P AR M E T I S to be ud to partition graphs for multi-pha and multi-physics simulations.
•In order to optimize partitionings for specific heterogeneous computing architectures,it is now possible to specify the target sub-domain weights for each of the sub-domains and for each balance constraint.This feature, for example,allows the ur to compute a partitioning in which one of the sub-domains is twice the size of all of the others.
•The number of sub-domains has been de-coupled from the number of processors in both the static and the adaptive partitioning schemes.Hence,it is now possible to u the parallel partitioning and re
partitioning
algorithms to compute a k-way partitioning independent of the number of processors that are ud.Note that Version2.0provided this functionality for the static partitioning schemes only.
爱你单词•Routines are provided for both directly partitioning afinite element mesh,and for constructing the dual graph of a mesh in parallel.In version3.1the routines have been extended to support mixed element meshes.
3Algorithms Ud in P AR M E T I S
P AR M E T I S provides a variety of routines that can be ud to compute different types of partitionings and repartitionings as well asfill-reducing orderings.Figure1provides an overview of the functionality provided by P AR M E T I S as well as a guide to its u.
Figure1:A brief overview of the functionality provided by P AR M E T I S.The shaded boxes correspond to the actual routines in P AR M E T I S that implement each particular operation.
3.1Unstructured Graph Partitioning
ParMETIS V3PartKway is the routine in P AR M E T I S that is ud to partition unstructured graphs.This routine takes a graph and computes a k-way partitioning(where k is equal to the number of sub-domains desired)while attempting to minimize the number of edges that are cut by the ,the edge-cut).ParMETIS V3PartKway makes no assumptions on how the graph is initially distributed among the processors.It can effectively partition a graph that is randomly distributed as well as a graph that is well distributed1.If the graph is initially well distributed among the processors,ParMETIS V3PartKway will take less time to run.However,the quality of the computed partitionings 1The reader should note the difference between the terms graph distribution and graph partition.A partitioning is a mapping of the vertices to the processors that results in a distribution.In other words,a partitioning specifies a distribution.In order to partition a graph in parallel,an initial distribution of the nodes and edges of the graph among the processors is required.For example,consider a graph that corresponds to the dual of a finite-element mesh.This graph could initially be partitioned simply by mapping groups of n/p concutively numbered elements to each processor where n is the number of elements and p is the number of processors.Of cour,this naive approach is not likely to result in a very good distribution becau elements that belong to a number of different regions of the mesh may get mapped to the same processor.(That is,each processor may get a number of small sub-domains as oppod to a single c
ontiguous sub-domain).Hence,you would want to compute a new high-quality partitioning for the graph and then redistribute the mesh accordingly.Note that it may also be the ca that the initial graph is well distributed,as when meshes are adaptively refined and repartitioned.