Modelling Routing in Wireless Ad Hoc Networks with Dynamic Bayesian Games
Petteri Nurmi
Helsinki Institute for Information Technology(HIIT)/Basic Rearch Unit
Department of Computer Science,P.O.Box68
FI-00014University of Helsinki,Finland
Email:petteri.nurmi@cs.helsinki.fi
Abstract—Mobile agents acting in wireless ad hoc networks are energy constrained,which leads to potential lfishness as nodes are not necessarily willing to forward packets for other nodes.Situations like this are traditionally analyzed using game theory and recently also the ad hoc networking community has witnesd game-theoretic approaches to especially routing. However,from a theoretical point-of-view the contemporary game-theoretic approaches have mainly ignored two important aspects:non-simultaneous decision making and incorporating history information into the decision making process.In this article we propo a new model thatfills the gaps and allows to analyze routing theoretically.
I.I NTRODUCTION
In wireless ad hoc networks[1]a collection of nodes, bile hosts,forms a lf-organizing network without any support from pre-established infrastructure.The lack of infrastructure support forces the nodes to implement all net-working tasks by themlves and for routing this means that packets must be routed using other nodes as intermediate relays.Simple routing schemes,such as the dynamic source routing protocol[2],are sufficient only if all nodes are willing to participate in the forwarding.However,nodes are energy constrained by their battery level and want to maximize their lifetime,which leads to potential lfishness as the nodes may refu to forward packets for other nodes.Simulations have shown that network throughput rate often degrades signif-icantly when simple routing schemes are ud and even a small portion of the nodes acts lfishly[3].For this reason methods for stimulating cooperation among the nodes are required.A method for cooperation stimulation should have afirm theoretical background.However,most of the existing approaches are mainly verified by experimental evaluation, which makes theoretical analysis of the systems difficult.
A routing decision is esntially a conflict situation that can be modelled using tools from game theory.The contemporary approaches have analyzed routing in a stage-wi manner using so-called static games[4],in which routing is modelled by looking at a single t of packets and where all relays
on the routing path act as if all nodes decide simultaneously whether to forward the packets or not.Clearly this kind of approach is unrealistic especially in multi-hop routing and in den ad hoc networks as there is delay between the 1Rearch supported by the Academy of Finland grant202203.nding of a packet by the source node and the packet processing in an intermediate node.Within the delay time,the energy level of an intermediate node usually decreas,which caus important model parameters to change and makes the assumption of simultaneous decision making no longer valid. Anotherflaw in the existing approaches is that they mainly ignore the temporal aspect(past actions)in game play or give no proper justification for the model that is ud.To overcome the deficiencies,we propo a new model that us dynamic Bayesian games[4,Chap.8].Dynamic Bayesian games allow formulating a generic model,in which no restrictions are put on the delay between nding a t of packets and forwarding them.Additionally,the new model allows analyzing optimality in terms of past actions and makes it possible to take into account various sources of uncertainty.The organization of this paper is as follows:in Section II we discuss related work on cooperation stimulation mechanisms and on game-theoretic routing models of ad hoc networks.In Section III we introduce the new model and in Section IV implementation issues related to our model are discusd.In Section V we conclude the paper and discuss future work.In the Appendix we prent game-theoretic material that is ud in the optimality proof of Section III.
II.R ELATED W ORK
The initial approaches for cooperation stimulation in ad hoc networks u either a reputation mechanism[5]or some kind of a virtual currency system.The approaches bad on reputation attempt to identify lfish nodes and to isolate non-cooperative nodes from the network.Among thefirst reputation approaches is the watchdog-mechanism in which the forwarding rate of neighbouring nodes is monitored[6]. If a neighbour does not forward messages,it is considered non-cooperative and information about the non-cooperative reputation is propagated in the network.The information about non-cooperative nodes is ud by a pathrater that rates paths between the source and the destination node.Together the watchdog and the pathrater methods make it possible to avoid paths with misbehaving nodes.From a theoretical perspective, the main problem with this approach is that misbehaviour is actually rewarded as no packets are routed through non-cooperative nodes,but the packets of the non-cooperative nodes are still forwarded.However,the general mechanism
of monitoring traffic from neighbouring nodes has been ud in many other approaches and it is also ud in the approach propod in this article.
To overcome the problems with the watchdog approach, more elaborate reputation mechanisms such as Core[7]and Confidant[8]have been propod.The methods extend the watchdog approach so that each node calculates a reputation value bad on the information it has obtained about the forwarding rate of another node.The lower the reputation value of a node,the less likely it is that a packet is forwarded for that node.The propod approaches differ what information is ud,how the information is ud,how hard misbehaviour is punished and how re-integration of temporally misbehaving nodes back in the network is performed.The main problem with the approaches is that they lack a formal model,which makes theoretical analysis difficult.In addition, the decision making mechanism is not asflexible as in our model.
The approaches using a virtual currency system model the forwarding problem as an economic market,where nding and forwarding cost money.The approach propod by But-tyán and Hubaux us a currency called NUGLET[9].Each node is assumed to have a parate cure hardware module that has a NUGLET counter.The counter increas when a node forwards a packet for another node,and decreas when a node nds a packet of its own.The NUGLET counter must always be positive,so the approach forces the nodes to forward at least the same amount of traffic as they nd themlves.
A similar kind of solution was prented by Anderegg and Eidenbenz[10],who derive a truthful Vickrey-Clarke-Groves mechanism(VCG)[11]–[13]for routing.In their approach, thefirst pha of routing is to ask the nodes for their costs to forward packets.Bad on this information the minimum energy path is calculated.Due to energy constraints,the nodes are attracted to cheating,but the authors prove that using premium ,additional pay to the nodes on the minimum energy path,it is in the best interest of a node to report the actual energy cost in thefirst pha.Another related approach was propod by Srinivasan,Nuggehalli,Chiasrini and Rao[14].Their approach us a generous TIT-FOR-TAT policy that guarantees(under certain conditions)cooperation in a game that is played repeatedly.Esntially the TIT-FOR-TAT policy[4,p.173]means that if a node does not cooperate, the other nodes will not forward packets for that node in the next time frame(packet ssion).The generous version us slightly milder punishments.
关于菊花的资料
All the virtual currency approaches u computational mechanism design[15]methods for cooperation stimulation. Although the authors of the articles do not usually formulate a proper game-theoretic model,the approaches can be en as static repeated games[4,Chap.5].This formulation does not properly take into account the uncertainty inherent in the network,which is due to node mobility,energy constraints etc. In addition,the assumption of simultaneous decision making is hard to justify especially in den ad hoc networks.
比比影院
In a recent approach[16]the authors argue that TIT-FOR-TAT strategies make threats that are not credible and that TIT-FOR-TAT strategies lead to unrealistic models.The solution the authors propo is to u milder punishments, which are"partially cooperative".This means that nodes act optimally according to their beliefs about the behaviour of other nodes,but that their actions do not necessarily lead to globally optimal behaviour,in which all resources are optimally allocated.
A properly formulated game-theoretic model was intro-duced by Urpi,Bonuccelli and Giordano[17],who u static Bayesian games[18]–[20]to model forwarding behaviour.In Bayesian games each node has a"cret type",which in this ca is the energy class(remaining energy)of a node.The cret type affects decision making,but is only known to the node itlf.However,nodes have beliefs about the types of neighbouring nodes and they ba their decision making on their beliefs.Although this model properly formulates the game the nodes are playing,it still is very unrealistic as the static framework does not allow non-simultaneous decision making.In addition,the strategies in this framework are not dependent on past behaviour.However,this approach is clost to our approach as we also u energy class and define beliefs over the energy class of neighbouring nodes.
As a conclusion we argue that the problems with existing approaches are twofold.First of all,the mod
els consider only stage-games,in which a single packet is nt.Secondly, the approaches assume that the stage-games are static by nature.In our approach we consider dynamic stage-games with incomplete information that are repeatedfinitely many times and in which the model parameters are updated after each stage-game ends.This way we can model the various sources of uncertainty and take into account the past actions.
III.D EFINITION OF THE G AME
In this ction we prent a new framework that allows non-simultaneous decision making and takes past actions into account.We attempt to keep the discussion as general as possible and for this reason we do notfix all the components of the model,but instead give the conditions that must be satisfied.We start by defining some notation.
Let be an arbitrary ad hoc network and afinite t of nodes(agents)belonging to;thus we define
and.An arbitrary node of the t
is indexed by the variable.We assume that nodes have topology information only about the nodes within the range of their transmitter(local topology),but not about the nodes outside this region.The n
odes that are within the range of the transmitter constitute the neighbourhood of a node.The variable is ud to denote the neighbourhood of node.For simplicity we assume that the neighbourhood topologies are ,.
The nodes are energy constrained as they have a limited amount of energy available.In addition,the nodes are energy-aware as they know their current energy level and try to minimize unnecessary energy consumption.A node is said to be rational if it maximizes throughput of its own messages
and minimizes unnecessary energy consumption.Moreover, we assume that the level of remaining energy can be measured
at reasonable accuracy and,without loss of generality,we assume that the energy level can be reprented with afinite t of possible values.Thefiniteness is achieved using a global discretization method so that all nodes have the same t of possible values.We call the discretized energy level the energy class of a node and u the variable to denote the energy class of node at an arbitrary time.In the rest of the paper, the energy class of a node is also called the type of a player (node).
Sending and forwarding decisions in a network are analyzed at discrete periods of time.We approach the situation from the point of view of an individual node and define for each a time period,,
so that a new period starts when the node generates some packets and decides whether to nd them to the network or to discard them.The number of packets generated by node at time period is denoted by,and the number of the generated packets that are actually nt to the network is denoted by.Thus,at each time period the relation holds.We define the action history of a nding node at time period to be a vector that contains the number of packets nt at time periods :
谨小慎微的意思
(1) For simplicity,we restrict the tting by assuming that each message nt by an arbitrary node is broadcasted to all nodes in the neighbourhood.However,every node decides individually whether to forward the packets or not. From the point of view of the nding node,the decision and the corresponding forwarding action by a node take place at time period and we can define to be the number of packets that node forwards for node at time period.The nder’s decision of how many packets to nd depends on its beliefs about the energy class of the neighbouring nodes;if it believes that all neighbouring nodes have ud all of their energy or that they are non-cooperative,it is not rational to nd anything as nding consumes energy.The energy class of the neighbouring nodes are not a priori known,but instead we assume that node has a probability distribution defined over the possible values of the energy class of a node.The probabilities of the energy class at time period depend on the joint history profile of the actions made by node and node.The history profile is
(4) where is the energy class of a nding node at time period and is the energy class of a node that is forwarding packets for at time step and is an arbitrary probability distribution.
By defining a conditional probability density in the way shown in Equation4,we construct a belief system for the node.The beliefs reflect the level of knowledge a node has in the beginning of a time period.In addition,the beliefs play an important role when we want to define optimality of our model in the Bayesian n.We return to this issue later in this ction.
Similarly as the decisions made by the nder depend on the nder’s beliefs about the energy class of the neighbouring nodes,the decisions of the forwarding nodes depend on the beliefs the nodes have about the energy class of the nder.To assure consistency of the belief system with respect to actions, the model is constructed in such a way that the probabilities depend on the number of packets nt by node.The definition of the probability distribution of a forwarding node is given in Equation5.
to denote the joint belief system of nodes and:
DEFINITION1:
A nding game is a5-tuple(I,,),where is the t of players,defines the action space of the game,
is the belief system of the game.
The t consists of two nodes and,where
属鼠几月出生最好
and the t consists of action pairs for which the relation holds.The utility function
where is the utility function of the nder and is the utility function of the forwarder for the messages arriving from nder node. The type space is equivalent to the t of possible energy class values for a node and the belief system
,where is a suitable probability distribution,is some nding node,is forwarding node and is either the node or the node.The variable denotes the action of and the variable
(8) Whereas the nder obtains new information each time the game advances to the next period,the forwarder obtains new information within the individual periods.We require that also the forwarder’s beliefs are kept in a consistent state meaning that the probabilities are updated using the Bayes’rule every time the forwarder obrves an action made by the nder. Again the posterior probabilities of period form the prior probabilities of stage and the actual calculations are carried out in a similar fashion as in Equation8.
We assume that the support of an individual node’s action space equals the complete action space.Thus,for all nodes ,we have,where the support is defined in Equation9.
(9) The assumption that the support equals the complete action space means that all the actions performed by the players have positive probabilities.This is realistic becau we look at the entire time span a node is connected to a network.Thus it is reasonable to assume that a node can fail or that a node can change its behaviour policy in the cour of time.This assumption significantly simplifies the analysis of the model as it makes the perfect Bayesian equilibrium a strong enough equilibrium concept.Otherwi stronger equilibrium concepts, such as the quential equilibrium[21],that allow unexpected actions must be ud.Moreover,if we assign positive prior
probabilities over the elements of the action space,the proba-bility of a totally unexpected action approaches zero in infinity, but the game is repeated only afinite number of times.Thus the probability remains instead a small positive value.
爱国征文In addition,we assume that the types of a player are sta-tistically independent and thus uncorrelated.The assumptions we have made are general assumptions that are usually made in game-theoretic systems that u dynamic Bayesian systems. In addition,from the definition of the be
lief system and the update rule,it follows that our model satisfies the Bayesian conditions given in the Appendix.
To analyze optimality of our model,we would like to apply subgame perfection[22]and especially its extension to perfect Bayesian equilibrium to our model.A game is said to be subgame perfect,if the restriction of strategies to a single stage(time period)constitutes a Nash-equilibrium.Thus if we"forget"the previous play and look only at the current situation,the actions the players perform must form an optimal agreement between the players.Perfect Bayesian equilibrium (PBE)extends subgame perfection to games with incomplete information.In PBE,each stage game played at a single period must constitute a Bayes-Nash equilibrium.In other words, when the actions are restricted to a single time period they must be optimal given the beliefs the players have at the beginning of that time period.In a Bayes-Nash equilibrium optimal strategies can be defined as behaviour strategies that maximize the expected utility of a player.According to this definition,the nder’s optimal strategy is the number of packets it should nd to the network in order to maximize its utility.However,before we can define the optimal actions in a broadcast model,we need to define optimality in a unicast model.
In a unicast communication model the nder and the forwarder communicate directly with each oth女人怎么补气血
er.In this model the optimal strategy of the nder is given by Definition 2.For notational simplicity,the time indices have been omitted from all variables in the remainder of this Section. DEFINITION2:The optimal behaviour policy of the nder at time period is
The probabilities were defined in Equation4and the term is a behaviour strategy of player in the game where node acts as the nder.The behaviour strategy tells the probability that node performs the action given the action of the nder.Finally,the term is the utility function of node which was defined in Equation7.
In the broadcast model it is rational to nd packets if some neighbouring node is willing to forward them.The optimal nder strategy in the broadcast model is defined as the maximum of the optimal strategies of individual"unicast" games.This modification is given in Definition3.
DEFINITION3:The optimal behaviour policy of the nder at time period in a broadcast model.
In Definition3we have made the assumption that packets are numbered and that the decision making process of forwarding nodes considers the individual packets in numbering order. The optimal strategy of the forwarder can be defined in a similar manner.The forwarder makes a decision only after the nder has already made some action.The action performed by the nder is obrvab
le so the forwarder has more information available than the nder.Again,the optimal behaviour strategy of the forwarder is the strategy that maximizes the expected utility given the current beliefs. The form of the utility should be such that it measures expected gain in future throughput.The formal definition of the optimal strategy is given in Definition4where the term was defined in Equation5.
DEFINITION4:Optimal behaviour policy of the for-warder at time period.
Here is the behaviour strategy of the nder which tells the probability that node nds packets(at time period )given her energy class.
Together with the belief system
全民国家安全教育日是哪一天
IV.R OUTING M ODEL AND I MPLEMENTATION I SSUES The theoretical model prented in the previous ction has two main contributions.First of all it allows theoretical analysis of various routing protocols.Secondly,it makes it possible to implement new"intelligent"routing protocols that can be theoretically justified.In this ction we discuss implementation issues related to our model by introducing a
pudo-protocol that implements all relevant aspects of our model.
形容大自然的成语
The pudo-protocol consists of two phas.In thefirst pha a node joins an ad hoc network by discovering its neighbours and by assigning prior probabilities over the energy class values of the neighbours.As in the previous ction, we assume that each packet is broadcasted to all neighbours and thus,to discover its neighbours,a nodefirst nds a JOIN message to the network.The nodes that respond form the neighbourhood.Additionally,the node constructs a prior probability distribution over the energy class values of the neighbours.The general form of the initialization pha is prented in Algorithm1.
Send JOIN message
while receives messages do
nder of message
Construct
end while Algorithm2Initialization of priors for
must perform the same operations.Thus each time a node receives a JOIN-message,it must construct a new probability distribution for the new node.
The cond pha of the pudo-protocol is a working pha. After a node has joined the network it is ready to work as an ordinary peer.This means that the node listens to incoming traffic and when it notices messages that require an action it decides what action to perform(if any).In addition,at certain periods of time the node generates traffic(packets of its own) that must be nt to the network.
The decisions made in the forwarding process depend on the type of the message.If the message is a JOIN-message, the node must respond by nding the corresponding respon message(e discussion above).If the message is a forward request,the node must update its belief system and decide whether to forward the message or not.The decision is bad on Definition4and the general form of the forwarding process is prented in Algorithm3.
The last type of message that needs to be discusd is that the message is a forwarded message that node has generated.This means that the stage game at period ends and that the node must update its beliefs about the energy class of the neighbour that forwarded the message.Even with
Input:messages m from node at period: Calculate posterior: