A survey on clustering algorithms for wireless nsor networkssao paulo
Ameer Ahmed Abbasi
a,*
,Mohamed Younis
b
a
Department of Computing,Al-Hussan Institute of Management and Computer Science,Dammam 31411,Saudi Arabia
b
Department of Computer Science and Electrical Engineering,University of Maryland,Baltimore County,Baltimore,MD 21250,USA
Available online 21June 2007
Abstract
The past few years have witnesd incread interest in the potential u of wireless nsor networks (WSNs)in applications such as disaster management,combat field reconnaissance,border protection and curity surveillance.Sensors in the applications are expected to be remotely deployed in large numbers and to operate autonomously in unattended environments.To support scalability,nodes are often grouped into disjoint and mostly non-overlapping clusters.In this paper,we prent a taxonomy and general classification of pub-lished clustering schemes.We survey different clustering algorithms for WSNs;highlighting their objectives,features,complexity,etc.We also compare of the clustering algorithms bad on metrics such as convergence rate,cluster stability,cluster overlapping,location-awareness and support for node mobility.Ó2007Published by Elvier B.V.
xocecoKeywords:Wireless nsor networks;Clustering algorithms;Scalability;Network architecture;Energy aware design
1.Introduction
Recent advances in miniaturization and low-power design have led to the development of small-sized bat-tery-operated nsors that are capable of detecting ambient conditions such as temperatur
e and sound.Sensors are generally equipped with data processing and communica-tion capabilities.The nsing circuitry measures parameters from the environment surrounding the nsor and trans-forms them into an electric signal.Processing such a signal reveals some properties about objects located and/or events happening in the vicinity of the nsor.Each nsor has an onboard radio that can be ud to nd the collected data to interested parties.Such technological development has encouraged practitioners to envision aggregating the lim-ited capabilities of the individual nsors in a large scale network that can operate unattended [1–7].Numerous civil and military applications can be leveraged by networked nsors.A network of nsors can be employed to gather meteorological variables such as temperature and pressure.
The measurements can be then ud in preparing fore-casts or detecting harsh natural phenomena.In disaster management situations such as earthquakes,nsor net-works can be ud to lectively map the affected regions directing emergency respon units to survivors.In military situations (Fig.1),nsor networks can be ud in surveil-lance missions and can be ud to detect moving targets,chemical gas,or the prence of micro-agents.
One of the advantages of wireless nsors networks (WSNs)is their ability to operate unattended in harsh envi-ronments in which contemporary human-in-the-loop mon-itoring schemes are risky,ineffici
ent and sometimes infeasible.Therefore,nsors are expected to be deployed randomly in the area of interest by a relatively uncontrolled dropped by a helicopter,and to collectively form a network in an ad-hoc manner [8,9].Given the vast area to be covered,the short lifespan of the battery-oper-ated nsors and the possibility of having damaged nodes during deployment,large population of nsors are expected in most WSNs applications.It is envisioned that hundreds or even thousands of nsor nodes will be involved.Designing and operating such large size network would require scalable architectural and management strat-egies.In addition,nsors in such environments are energy
0140-3664/$-e front matter Ó2007Published by Elvier B.V.doi:10.2007.05.024
*
Corresponding author.
E-mail address:ameer_abbasi@hussan.edu.sa (A.A.Abbasi),younis@ce.umbc.edu (M.Younis).
/locate/comcom
Computer Communications 30(2007)
2826–2841
constrained and their batteries cannot be recharged.There-fore,designing energy-aware algorithms becomes an important factor for extending the lifetime of nsors.Other application centric design high fidel-ity target detection and classification,are also considered [10].
Grouping nsor nodes into clusters has been widely pur-sued by the rearch community in order to achieve the net-work scalability objective.Every cluster would have a leader,often referred to as the cluster-head (CH).Although many clustering algorithms have been propod in the liter-ature for ad-hoc networks [11–15],the objective was mainly to generate stable clusters in environments with mobile nodes.Many of such techniques care mostly about node reachability and route stability,without much concern about critical design goals of WSNs such as network longev-ity and coverage.Recently,a number of clustering algo-rithms have been specifically designed for WSNs [16–20].The propod clustering techniques widely vary depending on the node deployment and bootstrapping schemes,the pursued network architecture,the characteristics of the CH nodes and the network operation model.A CH may be elected by the nsors in a cluster or pre-assigned by the network designer.A CH may also be just one of the nsors or a node that is richer in resources.The cluster membership may be fixed or variable.CHs may form a cond tier net-work or may just ship t
he data to interested a ba-station or a command center.
In addition to supporting network scalability,clustering has numerous advantages.It can localize the route t up within the cluster and thus reduce the size of the routing table stored at the individual node [21].Clustering can also conrve communication bandwidth since it limits the scope of inter-cluster interactions to CHs and avoids redundant exchange of messages among nsor nodes [22].Moreover,clustering can stabilize the network topol-ogy at the level of nsors and thus cuts on topology main-tenance overhead.Sensors would care only for connecting
with their CHs and would not be affected by changes at the level of inter-CH tier [23].The CH can also implement optimized management strategies to further enhance the network operation and prolong the battery life of the indi-vidual nsors and the network lifetime [22].A CH can schedule activities in the cluster so that nodes can switch to the low-power sleep mode most of the time and reduce the rate of energy consumption.Sensors can be engaged in a round-robin order and the time for their transmission and reception can be determined so that the nsors reties are avoided,redundancy in coverage can be limited and medium access collision is prevented [24–27].Furthermore,a CH can aggregate the data collected by the nsors in its cluster and thus decrea the number of relayed packets [28].
In this paper,we opt to categorize clustering algorithms propod in the literature for WSNs.We report on the state of the rearch and summarize a collection of published schemes stating their features and shortcomings.We also compare the different approaches and analyze their appli-cability.In the next ction,we discuss the different classi-fications of clustering techniques and enumerate a t of attributes for categorizing published algorithms.In Section 3,we summarize a collection of clustering algorithms for WSNs and prent classification of the various approaches pursued.Finally,Section 4concludes the paper.2.Taxonomy of clustering attributes
Clustering techniques for WSNs propod in the litera-ture can be generally classified bad on the overall net-work architectural and operation model and the objective of the node grouping process including the desired count and properties of the generated clusters.In this ction we discuss the different classifications and prent taxon-omy of a clustering attributes.We later u such attributes to categorize and compare the surveyed clustering algorithms.
2.1.Classifying clustering techniques
2.1.1.Network model
Different architectures and design goals/constraints have been considered for various applications of
WSNs.The following enlists some the relevant architectural parameters and highlight their implications on network clustering.
•Network dynamics:Basically WSNs consist of three main components:nsor nodes,ba-station and moni-tored events.Aside from the few tups that utilize mobile nsors [29,30],most of the network architectures assume that nsor nodes are stationary [19,31,32].Sometimes it is deemed necessary to support the mobility of ba-station or CHs.Node mobility would make clustering very chal-lenging since the node membership will dynamically change,forcing clusters to evolve over time.On the other hand,the events monitored by a nsor can be either
Command Node
Command Node
Command Node
Sensor nodes Ba-station
Command Node
Command Node
A.A.Abbasi,M.Younis /Computer Communications 30(2007)2826–28412827
intermittent or continual depending on the application.For instance,in a target detection/tracking application,the event(phenomenon)is dynamic whereas forest monitoring for earlyfire prevention is an example of intermittent events.Monitoring intermittent events allows the network to work in a reactive mode,simply generating traffic when reporting.Continual events in most applications require periodic reporting and conquently generate significant traffic to be routed to the sink.Although continual events would mostly make the clusters stable,it may unevenly load CHs relative to the nodes in the cluster and a rotation of the CH role may be required if the CH is randomly picked from the nsor population.Intermittent events would favor adaptive clustering strategies if the number of events significantlyfluctuates.
•In-network data processing:Since nsor nodes might generate significant redundant data,similar packets from multiple nodes can be aggregated so that the number of transmissions would be reduced.Data aggregation com-bines data from different sources by using functions such as suppression(eliminating duplicates),min,max and aver-age[33].Some of the functions can be performed either partially or fully in each nsor node,by allowing nsor nodes to conduct in-network data reduction.Recognizing that computation would be less energy consuming than commu
厦门广告设计nication,substantial energy savings can be obtained through data aggregation.This technique has been ud to achieve energy efficiency and traffic optimization in a num-ber of routing protocols.In some network architectures,all aggregation functions are assigned to more powerful and specialized nodes.Data aggregation is also feasible through signal processing techniques.In that ca,it is referred as data fusion where a node is capable of producing a more accurate signal by reducing the noi and using some tech-niques such as beamforming to combine the signals[20].It will be intuitive to expect CHs to perform such data aggre-gation/fusion which may restrict the choice of CH to only specialized node or require limiting the number of nsors per cluster in order to ensure that CHs are not overbur-dened[16].It is worth noting that sometimes it may be nec-essary to assign backup CHs for a cluster or rotate the role of being CH among the nsors in the cluster[20,34].Obvi-ously,such design choices/constraints influence the cluster-ing scheme.
•Node deployment and capabilities:Another consider-ation is the topological deployment of nodes.This is application dependent and affects the need and objective of the network clustering.The deployment is either deter-ministic or lf-organizing.In deterministic situations,the nsors are manually placed and data is routed through pre-determined paths.Therefore,clustering is such tup is also pret or unnecessary.However in lf-organizing systems,the nsor nodes ar
e scattered randomly creating an infrastructure in an ad hoc manner[8,19,20].In that infrastructure,the position of the ba-station or the CH is also crucial in terms of energy efficiency and perfor-mance.When the distribution of nodes is not uniform,optimal clustering becomes a pressing issue to enable energy efficient network operation.In addition,in some tups different functionalities can be associated with the deployed nodes and the CH lection may be constrained. In networks of homogenous nsor all having equal capacity in terms of computation,communication and power,CHs are picked from the deployed nsors [20,35,36].Often in that ca,CHs are carefully tasked, luded from nsing duties,in order to avoid depleting their energy rather quickly.In addition,the communication range and the relative CH’s proximity to the ba-station may also be constraints/issues that have to be considered.Sensors’communication range is usually limited and a CH may not be able to reach the ba-station.Even if a node can directly communicate with the ba-station,it is still better to pursue multi-hop routes.Therefore,inter-CH connectivity becomes an important factor that affects the clustering scheme [17,37].On the other hand,heterogeneous WSNs may impo more constraints on the clustering process since some nodes may be designated for special tasks or empowered with distinct capabilities.It may then be required to either avoid such specific nodes to conrve their resources or limit the lection of CHs to a subt of the nodes.
2.1.2.Clustering objectives
Clustering algorithms in the literature varies in their objectives.Often the clustering objective is t in order to facilitate meeting the applications requirements.For exam-ple if the application is nsitive to data latency,intra and inter-cluster connectivity and the length of the data routing paths are usually considered as criteria for CH lection and node grouping.The following discussion highlights popular objectives for network clustering:
•Load balancing:Even distribution of nsors among the clusters is usually an objective for tups where CHs perform data processing or significant intra-cluster man-agement duties[16].Given the duties of CHs,it is intuitive to balance the load among them so that they can meet the expected performance goals[38].Load balancing is a more pressing issue in WSNs where CHs are picked from the available nsors[19].In such ca,tting equal-sized clus-ters becomes crucial for extending the network lifetime since it prevents the exhaustion of the energy of a subt of CHs at high rate and prematurely making them dysfunc-tional.Even distribution of nsors can also leverage data delay[37].When CHs perform data aggregation,it is imperative to have similar number of node in the clusters so that the combined data report becomes ready almost at the same time for further processing at the ba-station or at the next tier in the network.
•Fault-tolerance:In many applications,WSNs will be operational in harsh environments and thus nodes are usu-ally expod to incread risk of malfunction and physical damage.Tolerating the failure of CHs is usually necessary in such applications in order to avoid the loss of important
2828 A.A.Abbasi,M.Younis/Computer Communications30(2007)2826–2841
nsors’data.The most intuitive way to recover from a CH failure is to re-cluster the network.However,re-clustering is not only a resource burden on the nodes,it is often very dis-ruptive to the on-going operation.Therefore,contemporary fault-tolerance techniques would be more appropriate for that sake.Assigning backup CHs is the most notable scheme pursued in the literature for recovery from a CH failure.The lection of a backup and the role such spare CH will play during normal network operation varies.When CHs have long radio range,neighboring CHs can adapt the nsors in the failing cluster[34].Rotating the role of CHs among nodes in the cluster can also be a means for fault-tolerance in addition to their load balancing advantage[20].
后会无期台词
•Incread connectivity and reduced delay:Unless CHs have very long-haul communication capabilities, e.g.a satellite link,inter-CH connectivity is an important requirement in many applications.This is particularly true when CHs are picked from the nsors population. The connecti
vity goal can be just limited to ensuring the availability of a path from every CH to the ba-sta-tion[17]or be more restrictive by imposing a bound on the length of the path[40].When some of the nsors assume the CH role,the connectivity objective makes network clustering one of the many variant of the con-nected dominating t problem.On the other hand,when data latency is a concern,intra-cluster connectivity becomes a design objective or constraint.Delay is usually factored in by tting a maximum number of hops‘‘K’’allowed on a data path.K-hop clustering is K-dominat-ing t problem[41–43].
•Minimal cluster count:This objective is particularly common when CHs are specialized resource-rich nodes [39].The network designer often likes to employ the least number of the nodes since they tend to be more expensive and vulnerable than nsors.For example,if CHs are lap-top computers,robots or a mobile vehicle there will be inherently some limitation on the number of nodes.The limitation can be due to the complexity of deploying the types of when the WSN is to operate in a com-bat zone or a forest.In addition,the size of the nodes tends to be significantly larger than nsors,which makes them easily detectable.Node visibility is highly undesirable in many WSNs applications such as border protection,mil-itary reconnaissance and infrastructure curity.
•Maximal network longevity:Since nsor nodes are energy-constrained,the network’s lifetime is a major con-cern;especially for applications of WSNs in harsh environ-ments.When CHs are richer in resources than nsors,it is imperative to minimize the energy for intra-cluster commu-nication[22].If possible,CHs should be placed clo to most of the nsors in its clusters[39,44].On the other hand,when CHs are regular nsors,their lifetime can be extended by limiting their load as we mentioned earlier. Combined clustering and route tup has also been consid-ered for maximizing network’s lifetime[45].Adaptive clus-tering is also a viable choice for achieving network longevity[46,47].2.1.3.Taxonomy of clustering attributes
In this ction we opt to enumerate the t of attributes that can be u to categorizes and differentiate clustering algorithms of WSNs.Bad on the discussion above,we can identify the following attributes:
1.Cluster properties:Often clustering schemes strive to
lay backachieve some characteristics for the generated clusters.
Such characteristics can be related to the internal struc-ture of the cluster or how it relates to others.The follow-ing are the relevant attributes:
•Cluster count:In some published approaches the t of CHs are predetermined and thus the number of clusters are pret.Randomly picking CHs from the deployed nsors usually yields variable number of clusters.
•Stability:When the clusters count varies and the node’s membership evolves overtime,the clustering scheme is said to be adaptive.Otherwi,it is consid-eredfixed since nsors do not switch among clusters and the number of clusters stays the same throughout the network lifespan.
•Intra-cluster topology:Some clustering schemes are bad on direct communication between a nsor and its designated CH.However,multi-hop nsor-to-CH connectivity is sometimes required;especially when the nsor’s communication range is limited and/or the CH count is bounded.
•Inter-CH connectivity:When the CH does not have long haul communication capabilities,CHs connectivity to the ba-station has to be provisioned.In that ca, the clustering scheme has to ensure the feasibility of establishing an inter-CH route from every CH to the ba-station.Some of the published work assumes that CH would be able to directly reach the ba-station. 2.Cluster-head capabilities:As discusd earlier the
network model influences the clustering approach;par-ticularly the node capabilities and the scope of the in-network processing.The following attributes of the CH node are differentiating factors among clustering schemes:
•Mobility:When a CH is mobile,nsor’s membership dynamically changes and the clusters would need to be continuously maintained.On the other hand,station-ary CH tends to yield stable clusters and facilitate intra-and inter-cluster network management.Some-times,CHs can travel for limited distances to reposi-tion itlf for better network performance.
•Node types:As indicated earlier,in some tups a subt of the deployed nsors are designated as CHs while in others CHs are equipped with significantly more computation and communication resources.
•Role:A CH can simply act as a relay for the traffic generated by the nsors in its cluster or perform ag-gregation/fusion of collected nsors’data.Sometime,
a CH acts as a sink or a ba-station that takes actions
bad on the detected phenomena or targets.
gmd
A.A.Abbasi,M.Younis/Computer Communications30(2007)2826–28412829
3.Clustering process:The coordination of the entire clus-
tering process and the characteristics of the algorithms vary significantly among published clustering schemes.
The following attributes are deemed relevant:国旗下讲话期末考试
•Methodology:When CHs are just regular nsors nodes,clustering has to be performed in a distributed manner without coordination.In few approaches,a centralized authority partitions the nodes offline and controls the cluster membership.Hybrid schemes can also be found;especially when CHs are rich in resources.In the later ca,inter-CHs coordination is performed in a distributed manner,while each individual CH takes charge of forming its own cluster.
•Objective of node grouping:As discusd in the previ-ous ction,veral objectives have been pursued for forming clusters.Examples include fault-tolerance, load balancing,network connectivity,etc.
•Cluster-head lection:CHs can be pre-assigned or picked randomly from the deployed t of node
s.
•Algorithm complexity:Depending on the objective and the methodology,numerous clustering algorithms have been propod.The complexity and convergence rate of the algorithms can be constant or dependent on the number of CHs and/or nsors.
We would like to note that some of the attributes are mutually pret or variable cluster count, and some are not.For example,a clustering process may have multiple objectives.It is also worth noting that net-work clustering can influence or be influenced by the planned network and link layer protocols.We plan to hint on the implications of routing and MAC protocols when we summarize the published clustering schemes.Fig.2 summarizes the prented taxonomy of attributes.We u
this t of attributes in categorizing the clustering algo-rithms summarized in the next ction.
naples3.Clustering algorithms for WSNs
Generally,WSNs involve a large number of nsors rang-ing in the hundreds or even thousands.Clustering is an effec-tive mean for managing such high population of nodes.In this ctio
n we prent a literature survey of published dis-tributed algorithms for clustering WSNs.Given that scala-bility is regarded as the main advantage of network clustering,the surveyed algorithms are grouped according to their convergence rate into two subctions for variable and constant convergence time algorithms,respectively. 3.1.Variable convergence time algorithms
Time is a significant factor in the convergence of cluster-ing algorithms.Some of the propod clustering algorithms such as LCA[48],RCC[52]and CLUBS[53],have O(n) convergence time,where n reprent the number of nsor nodes in the network.It is thus practical to implement the types of clustering algorithms to the networks having small number of nodes.However,convergence time has enhanced dramatically in some recent algorithms, e.g.
[17],and showed their suitability for networks having large number of nodes.In general,variable convergence time algorithms enable more control of the cluster properties than the constant time ones.
Linked cluster algorithm(LCA):The work of Baker and Ephremides[48,49]is among the early ones on clustering of wireless networks.The focus is mainly on forming an effi-cient network topology that can handle the mobility of nodes.By clustering,CHs are hoped to form a backbone network to which clust
er members can connect while on the move.The Objective of the propod distributed algo-rithm is to form clusters such that a CH is directly con-nected to all nodes in its cluster.LCA is thus geared for maximizing network connectivity.The algorithm assumes synchronized nodes and time-bad medium access.A node is assigned the slot in the frame that matches its ID.First, each node broadcasts its ID and listens to transmission of
2830 A.A.Abbasi,M.Younis/Computer Communications30(2007)2826–2841
other nodes.In the next round,a node broadcast the t of neighbors that it heard from and thus every node will even-tually know its1-hop and2-hop neighbors.A node x becomes a CH if it has the highest ID among its neighbors or does not have the highest ID in its1-hop neighborhood, but there exists at least one neighboring node y such that x is the highest ID node in y’s1-hop neighborhood.Since LCA is found to yield excessive number of clusters,the approach is refined in[50].The idea is to pick a node x at random as thefirst CH and assign its neighbors to such first cluster.Then the node y with the lowest ID in the clus-ter is nominated as a CH.The neighbors of y that are not reachable to x would join the cond cluster.The proce-dure is repeated for the third cluster and so on.
Adaptive clustering:In[51],Lin and Gerla studied the efficient support of multimedia applications in the general multi-hop mobile ad-hoc networks using CDMA bad medium arbitration.To minimize the data delivery delay the network is clustered and distinct code is assigned to the cluster.Similar to[48]and[50],an ID-bad cluster lection scheme is employed.Like LCA,a single-hop intra-cluster topology is established.A CH arbitrates the lection of communication codes with neighboring CHs. The algorithm strives to optimally control the cluster size by balancing the interest in the spatial reus
e of channels, which is incread by having small clusters,and data deliv-ery delay,which gets reduced by avoiding inter-cluster large cluster sizes.Like LCA,TDMA is ud for intra-cluster communication.However,every cluster would u a distinct code resulting is simplified implemen-tation and great potential for meeting the QoS require-ments often found in multimedia applications.
服装店陈列
Random competition bad clustering(RCC):Although RCC[52]is designed for mobile ad hoc networks,it is also applicable to WSNs.RCC mainly focus at cluster stabil-ity in order to support mobile nodes.The RCC algorithm applies the First Declaration Wins rule,in which any node can‘‘govern’’the rest of the nodes in its radio coverage if it is thefirst to claim being a CH.After hearing the claim which is broadcasted by thefirst node,neighboring nodes join its cluster as member and give up their right to be a CH.To maintain clusters,every CH in the network broad-cast a CH claim packet periodically.Since there is a time delay between broadcasting a claim packet and receiving it,concurrent broadcast can possibly create a conflict. Being unaware of on-going claims,many neighboring nodes may broadcast CH claim packets concurrently.To avoid such a problem RCC explicitly employs a random timer and us the node ID for arbitration.Each node in the network ret its random time value,every time before broadcasting its CH claim packet.During t
his random time if it receives a broadcast message carrying CH claim packet from another node,it simply ceas the transmis-sion of its CH claim.Since random timer is not a complete solution,RCC resolve further the concurrent broadcast problems by using the node ID.If the conflict persists, node having lower ID will become the CH.Although fre-quent node mobility still has direct effect,RCC is shown to be more stable than conventional clustering schemes such as[51].A CH in adaptive clustering abandons its role when it hears a node with a lower ID,while,a CH in RCC only gives up its position when another CH moves near to it.
CLUBS:In[53],Nagpal and Coore propod CLUBS, an algorithm that forms clusters through local broadcast and converge in a time proportional to the local density of nodes.Basically,cluster formation in CLUBS is bad on the following three characteristics:
•Every node in the network must be connected to a cluster.
初三语文作文辅导•Maximum diameter of all clusters in the network should be same.
•Clusters should support the intra-cluster communica-tion,which means nodes in a cluster must be able to communicate with each others.
The algorithm forms clusters with a maximum of two hops.Each node in the network takes part in the cluster formation process by choosing a random number from a fixed integer range.Then it counts down from that number silently.If the count down was not interrupted from any other neighboring node and it reaches zero,it announces itlf CH and broadcasts a‘‘recruit’’message.When a neighboring node receives the recruit message that comes within two-hop diameter boundary,it stops the count down,accepts the invitation and joins the cluster.A node that has joined a cluster is called‘‘follower’’is no longer allowed to compete for being a CH.
Since CLUBS allows overlapping,follower nodes keep listening to additional recruit messages and can be follower of more than on CH.If a node that is competing to become a CH detects a collision or received a garbled message,it becomes a follower node and assumes that multiple CHs attempted to recruit it at the same time.It canfind out its CH later.The algorithm does not terminate unless all nodes in the network join some cluster as a CH or as a fol-lower.Fig.3,from[53],shows thefinal layout of the clus-tered network.
CLUBS can be implemented in the asynchronous envi-ronment without losing efficiency and simplicity.Further-more,CLUBS satisfies many constraints that are common in other distributed environment such as lim-ited/no topology knowledge or access to global IDs.The major problem of C
LUBS algorithm is the clusters having their CHs within1-hop range of each other.If this is the ca,both clusters will collap and CH election process will restart.
Hierarchical control clustering:Unlike most of the pub-lished schemes,the goal of Banerjee and Khuller is to form a multi-tier hierarchical clustering[37].Fig.4illustrate the concept of hierarchy of clusters.A number of cluster’s properties such as cluster size and the degree of overlap, which are uful for the management and scalability of
A.A.Abbasi,M.Younis/Computer Communications30(2007)2826–28412831