Modelling routing in wireless ad hoc networks with dynamic Bayesian games

更新时间:2023-07-23 12:25:26 阅读：评论：0

Modelling Routing in Wireless Ad Hoc Networks with Dynamic Bayesian Games

Petteri Nurmi

Helsinki Institute for Information Technology(HIIT)/Basic Rearch Unit

Department of Computer Science,P.O.Box68

FI-00014University of Helsinki,Finland

Email:petteri.nurmi@cs.helsinki.ﬁ

Abstract—Mobile agents acting in wireless ad hoc networks are energy constrained,which leads to potential lﬁshness as nodes are not necessarily willing to forward packets for other nodes.Situations like this are traditionally analyzed using game theory and recently also the ad hoc networking community has witnesd game-theoretic approaches to especially routing. However,from a theoretical point-of-view the contemporary game-theoretic approaches have mainly ignored two important aspects:non-simultaneous decision making and incorporating history information into the decision making process.In this article we propo a new model thatﬁlls the gaps and allows to analyze routing theoretically.

I.I NTRODUCTION

In wireless ad hoc networks[1]a collection of nodes, bile hosts,forms a lf-organizing network without any support from pre-established infrastructure.The lack of infrastructure support forces the nodes to implement all net-working tasks by themlves and for routing this means that packets must be routed using other nodes as intermediate relays.Simple routing schemes,such as the dynamic source routing protocol[2],are sufﬁcient only if all nodes are willing to participate in the forwarding.However,nodes are energy constrained by their battery level and want to maximize their lifetime,which leads to potential lﬁshness as the nodes may refu to forward packets for other nodes.Simulations have shown that network throughput rate often degrades signif-icantly when simple routing schemes are ud and even a small portion of the nodes acts lﬁshly[3].For this reason methods for stimulating cooperation among the nodes are required.A method for cooperation stimulation should have aﬁrm theoretical background.However,most of the existing approaches are mainly veriﬁed by experimental evaluation, which makes theoretical analysis of the systems difﬁcult.

A routing decision is esntially a conﬂict situation that can be modelled using tools from game theory.The contemporary approaches have analyzed routing in a stage-wi manner using so-called static games[4],in which routing is modelled by looking at a single t of packets and where all relays

on the routing path act as if all nodes decide simultaneously whether to forward the packets or not.Clearly this kind of approach is unrealistic especially in multi-hop routing and in den ad hoc networks as there is delay between the 1Rearch supported by the Academy of Finland grant202203.nding of a packet by the source node and the packet processing in an intermediate node.Within the delay time,the energy level of an intermediate node usually decreas,which caus important model parameters to change and makes the assumption of simultaneous decision making no longer valid. Anotherﬂaw in the existing approaches is that they mainly ignore the temporal aspect(past actions)in game play or give no proper justiﬁcation for the model that is ud.To overcome the deﬁciencies,we propo a new model that us dynamic Bayesian games[4,Chap.8].Dynamic Bayesian games allow formulating a generic model,in which no restrictions are put on the delay between nding a t of packets and forwarding them.Additionally,the new model allows analyzing optimality in terms of past actions and makes it possible to take into account various sources of uncertainty.The organization of this paper is as follows:in Section II we discuss related work on cooperation stimulation mechanisms and on game-theoretic routing models of ad hoc networks.In Section III we introduce the new model and in Section IV implementation issues related to our model are discusd.In Section V we conclude the paper and discuss future work.In the Appendix we prent game-theoretic material that is ud in the optimality proof of Section III.

II.R ELATED W ORK

The initial approaches for cooperation stimulation in ad hoc networks u either a reputation mechanism[5]or some kind of a virtual currency system.The approaches bad on reputation attempt to identify lﬁsh nodes and to isolate non-cooperative nodes from the network.Among theﬁrst reputation approaches is the watchdog-mechanism in which the forwarding rate of neighbouring nodes is monitored[6]. If a neighbour does not forward messages,it is considered non-cooperative and information about the non-cooperative reputation is propagated in the network.The information about non-cooperative nodes is ud by a pathrater that rates paths between the source and the destination node.Together the watchdog and the pathrater methods make it possible to avoid paths with misbehaving nodes.From a theoretical perspective, the main problem with this approach is that misbehaviour is actually rewarded as no packets are routed through non-cooperative nodes,but the packets of the non-cooperative nodes are still forwarded.However,the general mechanism

of monitoring trafﬁc from neighbouring nodes has been ud in many other approaches and it is also ud in the approach propod in this article.

To overcome the problems with the watchdog approach, more elaborate reputation mechanisms such as Core[7]and Conﬁdant[8]have been propod.The methods extend the watchdog approach so that each node calculates a reputation value bad on the information it has obtained about the forwarding rate of another node.The lower the reputation value of a node,the less likely it is that a packet is forwarded for that node.The propod approaches differ what information is ud,how the information is ud,how hard misbehaviour is punished and how re-integration of temporally misbehaving nodes back in the network is performed.The main problem with the approaches is that they lack a formal model,which makes theoretical analysis difﬁcult.In addition, the decision making mechanism is not asﬂexible as in our model.

The approaches using a virtual currency system model the forwarding problem as an economic market,where nding and forwarding cost money.The approach propod by But-tyán and Hubaux us a currency called NUGLET[9].Each node is assumed to have a parate cure hardware module that has a NUGLET counter.The counter increas when a node forwards a packet for another node,and decreas when a node nds a packet of its own.The NUGLET counter must always be positive,so the approach forces the nodes to forward at least the same amount of trafﬁc as they nd themlves.

A similar kind of solution was prented by Anderegg and Eidenbenz[10],who derive a truthful Vickrey-Clarke-Groves mechanism(VCG)[11]–[13]for routing.In their approach, theﬁrst pha of routing is to ask the nodes for their costs to forward packets.Bad on this information the minimum energy path is calculated.Due to energy constraints,the nodes are attracted to cheating,but the authors prove that using premium ,additional pay to the nodes on the minimum energy path,it is in the best interest of a node to report the actual energy cost in theﬁrst pha.Another related approach was propod by Srinivasan,Nuggehalli,Chiasrini and Rao[14].Their approach us a generous TIT-FOR-TAT policy that guarantees(under certain conditions)cooperation in a game that is played repeatedly.Esntially the TIT-FOR-TAT policy[4,p.173]means that if a node does not cooperate, the other nodes will not forward packets for that node in the next time frame(packet ssion).The generous version us slightly milder punishments.

关于菊花的资料

All the virtual currency approaches u computational mechanism design[15]methods for cooperation stimulation. Although the authors of the articles do not usually formulate a proper game-theoretic model,the approaches can be en as static repeated games[4,Chap.5].This formulation does not properly take into account the uncertainty inherent in the network,which is due to node mobility,energy constraints etc. In addition,the assumption of simultaneous decision making is hard to justify especially in den ad hoc networks.

比比影院

In a recent approach[16]the authors argue that TIT-FOR-TAT strategies make threats that are not credible and that TIT-FOR-TAT strategies lead to unrealistic models.The solution the authors propo is to u milder punishments, which are"partially cooperative".This means that nodes act optimally according to their beliefs about the behaviour of other nodes,but that their actions do not necessarily lead to globally optimal behaviour,in which all resources are optimally allocated.

A properly formulated game-theoretic model was intro-duced by Urpi,Bonuccelli and Giordano[17],who u static Bayesian games[18]–[20]to model forwarding behaviour.In Bayesian games each node has a"cret type",which in this ca is the energy class(remaining energy)of a node.The cret type affects decision making,but is only known to the node itlf.However,nodes have beliefs about the types of neighbouring nodes and they ba their decision making on their beliefs.Although this model properly formulates the game the nodes are playing,it still is very unrealistic as the static framework does not allow non-simultaneous decision making.In addition,the strategies in this framework are not dependent on past behaviour.However,this approach is clost to our approach as we also u energy class and deﬁne beliefs over the energy class of neighbouring nodes.

As a conclusion we argue that the problems with existing approaches are twofold.First of all,the mod

els consider only stage-games,in which a single packet is nt.Secondly, the approaches assume that the stage-games are static by nature.In our approach we consider dynamic stage-games with incomplete information that are repeatedﬁnitely many times and in which the model parameters are updated after each stage-game ends.This way we can model the various sources of uncertainty and take into account the past actions.

III.D EFINITION OF THE G AME

In this ction we prent a new framework that allows non-simultaneous decision making and takes past actions into account.We attempt to keep the discussion as general as possible and for this reason we do notﬁx all the components of the model,but instead give the conditions that must be satisﬁed.We start by deﬁning some notation.

Let be an arbitrary ad hoc network and aﬁnite t of nodes(agents)belonging to;thus we deﬁne

and.An arbitrary node of the t

is indexed by the variable.We assume that nodes have topology information only about the nodes within the range of their transmitter(local topology),but not about the nodes outside this region.The n

odes that are within the range of the transmitter constitute the neighbourhood of a node.The variable is ud to denote the neighbourhood of node.For simplicity we assume that the neighbourhood topologies are ,.

The nodes are energy constrained as they have a limited amount of energy available.In addition,the nodes are energy-aware as they know their current energy level and try to minimize unnecessary energy consumption.A node is said to be rational if it maximizes throughput of its own messages

and minimizes unnecessary energy consumption.Moreover, we assume that the level of remaining energy can be measured

at reasonable accuracy and,without loss of generality,we assume that the energy level can be reprented with aﬁnite t of possible values.Theﬁniteness is achieved using a global discretization method so that all nodes have the same t of possible values.We call the discretized energy level the energy class of a node and u the variable to denote the energy class of node at an arbitrary time.In the rest of the paper, the energy class of a node is also called the type of a player (node).

Sending and forwarding decisions in a network are analyzed at discrete periods of time.We approach the situation from the point of view of an individual node and deﬁne for each a time period,,

so that a new period starts when the node generates some packets and decides whether to nd them to the network or to discard them.The number of packets generated by node at time period is denoted by,and the number of the generated packets that are actually nt to the network is denoted by.Thus,at each time period the relation holds.We deﬁne the action history of a nding node at time period to be a vector that contains the number of packets nt at time periods :

谨小慎微的意思

(1) For simplicity,we restrict the tting by assuming that each message nt by an arbitrary node is broadcasted to all nodes in the neighbourhood.However,every node decides individually whether to forward the packets or not. From the point of view of the nding node,the decision and the corresponding forwarding action by a node take place at time period and we can deﬁne to be the number of packets that node forwards for node at time period.The nder’s decision of how many packets to nd depends on its beliefs about the energy class of the neighbouring nodes;if it believes that all neighbouring nodes have ud all of their energy or that they are non-cooperative,it is not rational to nd anything as nding consumes energy.The energy class of the neighbouring nodes are not a priori known,but instead we assume that node has a probability distribution deﬁned over the possible values of the energy class of a node.The probabilities of the energy class at time period depend on the joint history proﬁle of the actions made by node and node.The history proﬁle is

(4) where is the energy class of a nding node at time period and is the energy class of a node that is forwarding packets for at time step and is an arbitrary probability distribution.

By deﬁning a conditional probability density in the way shown in Equation4,we construct a belief system for the node.The beliefs reﬂect the level of knowledge a node has in the beginning of a time period.In addition,the beliefs play an important role when we want to deﬁne optimality of our model in the Bayesian n.We return to this issue later in this ction.

Similarly as the decisions made by the nder depend on the nder’s beliefs about the energy class of the neighbouring nodes,the decisions of the forwarding nodes depend on the beliefs the nodes have about the energy class of the nder.To assure consistency of the belief system with respect to actions, the model is constructed in such a way that the probabilities depend on the number of packets nt by node.The deﬁnition of the probability distribution of a forwarding node is given in Equation5.

to denote the joint belief system of nodes and:

DEFINITION1:

A nding game is a5-tuple(I,,),where is the t of players,deﬁnes the action space of the game,

is the belief system of the game.

The t consists of two nodes and,where

属鼠几月出生最好

and the t consists of action pairs for which the relation holds.The utility function

where is the utility function of the nder and is the utility function of the forwarder for the messages arriving from nder node. The type space is equivalent to the t of possible energy class values for a node and the belief system

,where is a suitable probability distribution,is some nding node,is forwarding node and is either the node or the node.The variable denotes the action of and the variable

(8) Whereas the nder obtains new information each time the game advances to the next period,the forwarder obtains new information within the individual periods.We require that also the forwarder’s beliefs are kept in a consistent state meaning that the probabilities are updated using the Bayes’rule every time the forwarder obrves an action made by the nder. Again the posterior probabilities of period form the prior probabilities of stage and the actual calculations are carried out in a similar fashion as in Equation8.

We assume that the support of an individual node’s action space equals the complete action space.Thus,for all nodes ,we have,where the support is deﬁned in Equation9.

(9) The assumption that the support equals the complete action space means that all the actions performed by the players have positive probabilities.This is realistic becau we look at the entire time span a node is connected to a network.Thus it is reasonable to assume that a node can fail or that a node can change its behaviour policy in the cour of time.This assumption signiﬁcantly simpliﬁes the analysis of the model as it makes the perfect Bayesian equilibrium a strong enough equilibrium concept.Otherwi stronger equilibrium concepts, such as the quential equilibrium[21],that allow unexpected actions must be ud.Moreover,if we assign positive prior

probabilities over the elements of the action space,the proba-bility of a totally unexpected action approaches zero in inﬁnity, but the game is repeated only aﬁnite number of times.Thus the probability remains instead a small positive value.

爱国征文In addition,we assume that the types of a player are sta-tistically independent and thus uncorrelated.The assumptions we have made are general assumptions that are usually made in game-theoretic systems that u dynamic Bayesian systems. In addition,from the deﬁnition of the be

lief system and the update rule,it follows that our model satisﬁes the Bayesian conditions given in the Appendix.

To analyze optimality of our model,we would like to apply subgame perfection[22]and especially its extension to perfect Bayesian equilibrium to our model.A game is said to be subgame perfect,if the restriction of strategies to a single stage(time period)constitutes a Nash-equilibrium.Thus if we"forget"the previous play and look only at the current situation,the actions the players perform must form an optimal agreement between the players.Perfect Bayesian equilibrium (PBE)extends subgame perfection to games with incomplete information.In PBE,each stage game played at a single period must constitute a Bayes-Nash equilibrium.In other words, when the actions are restricted to a single time period they must be optimal given the beliefs the players have at the beginning of that time period.In a Bayes-Nash equilibrium optimal strategies can be deﬁned as behaviour strategies that maximize the expected utility of a player.According to this deﬁnition,the nder’s optimal strategy is the number of packets it should nd to the network in order to maximize its utility.However,before we can deﬁne the optimal actions in a broadcast model,we need to deﬁne optimality in a unicast model.

In a unicast communication model the nder and the forwarder communicate directly with each oth女人怎么补气血

er.In this model the optimal strategy of the nder is given by Deﬁnition 2.For notational simplicity,the time indices have been omitted from all variables in the remainder of this Section. DEFINITION2:The optimal behaviour policy of the nder at time period is

The probabilities were deﬁned in Equation4and the term is a behaviour strategy of player in the game where node acts as the nder.The behaviour strategy tells the probability that node performs the action given the action of the nder.Finally,the term is the utility function of node which was deﬁned in Equation7.

In the broadcast model it is rational to nd packets if some neighbouring node is willing to forward them.The optimal nder strategy in the broadcast model is deﬁned as the maximum of the optimal strategies of individual"unicast" games.This modiﬁcation is given in Deﬁnition3.

DEFINITION3:The optimal behaviour policy of the nder at time period in a broadcast model.

In Deﬁnition3we have made the assumption that packets are numbered and that the decision making process of forwarding nodes considers the individual packets in numbering order. The optimal strategy of the forwarder can be deﬁned in a similar manner.The forwarder makes a decision only after the nder has already made some action.The action performed by the nder is obrvab

le so the forwarder has more information available than the nder.Again,the optimal behaviour strategy of the forwarder is the strategy that maximizes the expected utility given the current beliefs. The form of the utility should be such that it measures expected gain in future throughput.The formal deﬁnition of the optimal strategy is given in Deﬁnition4where the term was deﬁned in Equation5.

DEFINITION4:Optimal behaviour policy of the for-warder at time period.

Here is the behaviour strategy of the nder which tells the probability that node nds packets(at time period )given her energy class.

Together with the belief system

全民国家安全教育日是哪一天

IV.R OUTING M ODEL AND I MPLEMENTATION I SSUES The theoretical model prented in the previous ction has two main contributions.First of all it allows theoretical analysis of various routing protocols.Secondly,it makes it possible to implement new"intelligent"routing protocols that can be theoretically justiﬁed.In this ction we discuss implementation issues related to our model by introducing a

pudo-protocol that implements all relevant aspects of our model.

形容大自然的成语

The pudo-protocol consists of two phas.In theﬁrst pha a node joins an ad hoc network by discovering its neighbours and by assigning prior probabilities over the energy class values of the neighbours.As in the previous ction, we assume that each packet is broadcasted to all neighbours and thus,to discover its neighbours,a nodeﬁrst nds a JOIN message to the network.The nodes that respond form the neighbourhood.Additionally,the node constructs a prior probability distribution over the energy class values of the neighbours.The general form of the initialization pha is prented in Algorithm1.

Send JOIN message

while receives messages do

nder of message

Construct

end while Algorithm2Initialization of priors for

must perform the same operations.Thus each time a node receives a JOIN-message,it must construct a new probability distribution for the new node.

The cond pha of the pudo-protocol is a working pha. After a node has joined the network it is ready to work as an ordinary peer.This means that the node listens to incoming trafﬁc and when it notices messages that require an action it decides what action to perform(if any).In addition,at certain periods of time the node generates trafﬁc(packets of its own) that must be nt to the network.

The decisions made in the forwarding process depend on the type of the message.If the message is a JOIN-message, the node must respond by nding the corresponding respon message(e discussion above).If the message is a forward request,the node must update its belief system and decide whether to forward the message or not.The decision is bad on Deﬁnition4and the general form of the forwarding process is prented in Algorithm3.

The last type of message that needs to be discusd is that the message is a forwarded message that node has generated.This means that the stage game at period ends and that the node must update its beliefs about the energy class of the neighbour that forwarded the message.Even with

Input:messages m from node at period: Calculate posterior:

本文发布于:2023-07-23 12:25:26，感谢您对本站的认可！

本文链接：https://www.wtabcd.cn/fanwen/fan/89/1093259.html

上一篇：管理会计(英文版)课后习题答案(高等教育出版社)chapter 13

下一篇：中山大学新理5.0英语自主学习大厅第四册第三单元A套答案

标签：教育资料属鼠爱国补气血

留言与评论（共有 0 条评论）