Decentralized Model Predictive Control of
Cooperating UA Vs
Arthur Richards and Jonathan How
Aerospace Control Laboratory
Massachutts Institute of Technology
Cambridge MA02139
讲座海报
Email:{arthurr,jhow}@mit.edu
A BSTRACT
This paper implements robust Decentralized Model Pre-dictive Control(DMPC)for a team of cooperating Un-inhabited Aerial Vehicles(UA Vs).The problem involves vehicles with independent dynamics but with coupled con-straints to capture required cooperative behavior.Using a recently-developed form of DMPC,each vehicle plans only for its own actions,but feasibility of the sub-proble
ms and satisfaction of the coupling constraints are guaranteed throughout,despite the action of unknown but bounded disturbances.UA Vs communicate relevant plan data to en-sure that decisions are consistent across the team.Collision avoidance is ud as an example of coupled constraints and the paper shows how the speed,turn rate and avoidance distance limits in the optimization should be modified in order to guarantee robust constraint satisfaction.Integer programming is ud to solve the non-convex problem of path-planning subject to avoidance constraints.Numerical simulations compare computation time and target arrival time under decentralized and centralized control and inves-tigate the impact of decentralization on team performance. The results show that the computation required for DMPC is significantly lower than for its centralized counterpart and scales better with the size of the team,at the expen of only a small increa in UA Vflight times.
I.I NTRODUCTION
This paper combines Mixed-Integer Linear Program-ming(MILP)path-planning[2],[4]and Decentralized Model Predictive Control(DMPC)[1]to provide a de-centralized algorithm for co-operative guidance of Unin-habited Aerial Vehicles(UA Vs).This paper investigates its performance by simulating its behavior for multi-UA V collision avoidance problems.The avoidance constraints couple the behavior of the system and therefore exerci the ability of the controller to make cooper
ative decisions. The DMPC algorithm[1]was designed primarily to ensure satisfaction of constraints,and critical performance goals can be embedded in constraints.However,condary goals remain in the cost quantities which should be small but have no explicit limit.An important purpo of this paper is to investigate the effect of decentralization on the condary goals.The metrics of interest in the UA V example are theflight time,which appears in the cost function of the MPC optimizations,and computation time.In particular,we investigate how the algorithm scales with the number of UA Vs involved and compare with the behavior of the corresponding centralized MPC.
MPC is a feedback control scheme in which a trajectory optimization is solved on-line at each time step.Thefirst control input of the optimal quence is applied and the op-timization is repeated at each subquent step.Becau the on-line optimization explicitly includes the operating con-straints,MPC can operate clor to hard constraint bound-aries than traditional control schemes.MPC has been widely developed for constrained systems[3],with many results concerning stability[5]and robustness[7],[8],and has been applied to the co-operative control of multiple vehicles[8],
[10],[6]using centralized computation.However,solving
掩的成语a single optimization problem for the entire team typically requires significant computation,which scales poorly with the size of the he number of vehicles in the team).To address this computational issue,attention has recently focud on decentralized MPC[11],with various approaches including robustness to the actions of others[9], penalty functions[12],[15],partial grouping of compu-tations[16],loitering options for safety guarantees[13], and dynamic programming[14].The challenge of making decisions in a decentralized fashion is to ensure that the actions of each subsystem must be consistent with tho of the other subsystems,so that decisions taken independently do not lead to a violation of the coupling constraints.The decentralization of the control is further complicated when disturbances act on the subsystems making the prediction of future behavior uncertain.
This paper employs a recently-developed approach to DMPC[1]that address both of the difficulties.The key features of this algorithm are that each vehicle only solves a sub-problem for its own plan,and each of the sub-problems is solved only once per time step,without iteration.Under the assumption of a bounded disturbance, each of the sub-problems is guaranteed to be feasible[1], thus ensuring robust constraint satisfaction across the group. The method employs at each time step a quential solution procedure,outlined in Fig.1(a),in which the subsystems solve their planning problems one after the other.The plan
(a)Procedure Flow
(b)Information Flow
Fig.1:Overview of Decentralized Algorithm
data relevant to the coupled constraints is then communi-cated to the other subsystems.Fig.1(b)shows the infor-mation requirements for sub-problem p.Each sub-problem accommodates(a)the latest plans of tho subsystems earlier in the quence and(b)predicted plans of tho later in the quence.The disturbance is accommodated by including“margin”in the constraints[8],tightening them in a monotonic quence.At initialization,it is necessary to find a feasible solution to the centralized problem,although this need not be optimal.
热血飞扬The paper begins with the multi-UA V problem state-ment in Section II.Then Section III lays out the control formulation for this problem.Section IV gives the results of simulations using this controller for many different scenarios and team sizes.
II.UAV P ROBLEM S TATEMENT
The problem is to control multiple UA Vs,each assumed tofly at constant altitude and with restricted s
peed and rate of turn.Each UA V has a single pre-assigned goal point.The objective is for all UA Vs to reach their goals in minimum time without aintaining a minimum paration between every pair of UA Vs at all times.This problem statement exercis the ability of the decentralized method to handle non-convex constraints and coupling between the actions of the UA Vs.
Since the decentralized MPC method requires linear dynamics,we adopt the linear approximation of aircraft dynamics from[4].Each vehicle is modeled as a point mass subject to speed and force limits.A disturbance force is also included in the model to account for uncertainty.Let UA V p have position r p(t)∈ 2and velocity v p(t)∈ 2and be acted on by a control force f p(t)∈ 2and a disturbance force˜f p(t)∈ 2giving dynamics
˙r p(t)=v p(t)˙v p(t)=f p(t)+˜f p(t)
球的表面积和体积and constraints
v p(t) 2≤V p f p(t) 2≤F p
(This model becomes equivalent to Dubins’car model[17] with the addition of a minimum speed constraint.This can be implemented using MILP,but was not done in the experiments for simplicity.)The disturbance force is assumed to be unknown but bounded
˜f p(t) 2≤˜F p
The collision avoidance is expresd in terms of a square exclusion region of side2L around each UA V which no other UA V may enter
r p(t)−r q(t) ∞≥L∀t,p,q:q=p Finally,let the goal of UA V p be g p∈ 2.
III.DMPC FOR UAV S
This ction prents the robust MPC optimization prob-lems for the UA V problem described in Section II and the algorithm in which they are employed.Both centralized and decentralized optimizations are prented.The central-ized form is relevant becau it is ud to initialize the decentralized algorithm and also ud in comparison tests in Section IV.
The MPC algorithm requires a linear discrete-time state-space dynamics model and linear,although not necessarily convex,constraints.It is straightforward to convert the dynamics to the form
x p(k+1)=Ax p(k)+Bu p(k)+Ew p(k)(1) with state x p(k)=(r p(kT)T v p(kT)T)T,control input as u p(k)=f p(kT),disturbance w p(k)=˜f p(kT)and time step T.The speed and force limits are approximated by polyhedra and applied at each time step,along with the collision avoidance constraints
[0G]x p(k)≤1V p(2a)
Gu p(k)≤1F p(2b) [I0](x p(k)−x q(k)) ∞≥L ∀p,q:q=p(2c)
where the rows of G are unit vectors and L >L includes some additional margin to account for the the discrete-time application of the constraints.Ref.[4]showed the u of binary logical variables within Mixed-Integer Linear Programming(MILP)to encode non-convex avoidance con-straints such as(2c).Finally,the discrete-time disturbance is also converted to polyhedral form.
Gw p(k)≤1˜F p(3) Centralized Optimization
P C(x1(k),...,x P(k)):J∗C=min
u p,x p
P
p=1
T p
subject to∀p∈{1...P}∀j∈{0...(T p−1)∀q>p}
x p(k+j+1|k)=Ax p(k+j|k)+Bu p(k+j|k) x p(k|k)=x p(k)
[0G]x p(k)≤1(¯V p−c p(j))
Gu p(k)≤1(¯f p−d p(j))
[I0]x p(k+T p|k)−g p ∞≤a p(T p)
[I0]x p(k+j|k)−r q(k+j|k) ∞≥L +b pq(j) where the quantities a,b,c,d are modifications to the con-straints to provide margin for robustness
a p(0)=0
a p(1)= [1000]B
2˜F p
a p(j)=a p(1)+ [1000]LB 2˜F p∀j≥2
b pq(j)=a p(j)+a q(j)∀j
c p(0)=0
c p(1)= [0010]B 2˜F p √2
c p(j)=c p(1)+ [0010]LB 2˜F p √
2∀j≥2
d p(0)=0
d p(1)= [10]KB 2˜F p √2
写观察日记四年级d p(j)=d p(1)+ [10]KLB 2˜F p √
2∀j≥2
where K is a two step nilpotent controller for the sys-tem(A,B)and L=(A+BK).Hence the nilpotency requirement for K gives L×L=0.A suitable controller K can be found since the system is a combination of two independent double-integrators.
Decentralized Optimization(for UA V p)
怎么形容月亮P p(x p(k),˜r k+T|k)):J∗p=min
u p,x p
T p subject to∀j∈{0...T p−1}∀q=p
x p(k+j+1|k)=A p x p(k+j|k)+B p u p(k+j|k) x p(k|k)=x p(k)
Gv p≤1(¯V p−c p(j))
Gf p≤1(¯f p−d p(j))
[I0]x p(k+T p|k)−g p ∞≤a p(T p)
[I0]x p(k+j|k)−˜r pq(k+j|k) ∞≥L +ˆb pq(j)where˜r k+T|k)reprents the latest known inten-tions of other vehicles
˜r k+j|k)=
[I0]x q(k+j|k)q<p
[I0]x q(k+j|k−1)q>p This reprents the latest plans for tho UA Vs before p in the planning quence and projected plans,using the continuation of the previous plans,for tho after p.This data is exchanged by communication between UA Vs and appears as a constrant parameter in sub-problem p.The avoidance marginsˆb are modified for the decentralized problem
ˆb
pq
(0)=0q<p
ˆb
pq
(0)=a q(0)q>p
ˆb
pq
(1)=a p(0)+a q(0)q<p
ˆb
pq
(1)=a p(0)+a q(1)q>p
ˆb
pq
(j)=a p(1)+a q(1)q=p∀j≥2
This accounts for the additional uncertainty in the projected plans of later vehicles,as the actual plans of tho UA Vs will differ from the predictions due to the action of the disturbance.
Algorithm1(Decentralized MPC)
1)Find a solution to the initial centralized problem
P C(x1(0),...,x P(0)).If solution cannot be found,
stop(problem is infeasible).
2)Set k=0
3)Apply control u∗p(k|k)to each subsystem p
4)Increment k
5)For each subsystem p in order1,...,P:
a)Compile the plan
data˜r k+T|k)∀q=p from other
subsystems
b)Solve sub-problem
取名字寓意好的字P p(x p(k),˜r k+T|k))
6)Go to3
海航学院
Theorem1(Robust Feasibility)If a feasible solution to the initial problem P C(x1(0),...,x P(0)),solved in Step1of Algorithm1,can be found,then the system(1) controlled by Algorithm1and acted upon by disturbances obeying(3)will satisfy the constraints(2)and all sub-problem optimizations will be feasible.
This theorem is quoted from[1]and is not proven here. Briefly,the result relies on the ability to construct a feasible solution to each sub-problem by combining its previous solution and a perturbation,using the nilpotent control policy K to counteract the disturbance.
Remarks
Thefirst step of Algorithm1requiresfinding a solution to the centralized problem,but it is not necessary for this to be an optimal solution.Any method of generating a feasible solution can be employed.
Algorithm1ignores computation and communication delays.However,it can be extended to explicitly account for delays by propagating the state estimate forward[18].
Crucially,this does not introduce a delay equal to the entire solution quence.Rather,each subsystem accounts for a delay corresponding to the solution of its own sub-problem. Algorithm1can
be run as an“anytime”algorithm.The result in Theorem1follows from the ability to construct a feasible solution to each sub-problem before starting its computation.Therefore,an arbitrarily small computation time-limit can be enforced with the guarantee that a solution will always be available.
IV.S IMULATION R ESULTS
This ction prents results of numerical simulations demonstrating the application of the DMPC method and investigating its performance,including comparison to the centralized counterpart(CMPC).Fifty random instances of problems for each team size between two and ven were generated and simulated using decentralized MPC.In every ca,the UA Vs began on the line between(−5,5)and (−5,−5),heading in the+X direction.The goals were chon randomly on the line between(5,5)and(5,−5). Fig.2shows a reprentative example of a problem of each size.The simulation model included a randomly-generated disturbance force of up to10%of the control force.The simulations were run in Matlab on a desktop PC with Pentium2.2GHz processor and512MB RAM using CPLEX7.0for MILP optimization.The same computer was ud to solve all UA V sub-problems,consistent with the quential nature of the algorithm.For comparison,the samefifty instances for team sizes up tofive were also simulated,on the same computer,using the CMPC.The same initial solution was ud for both algorithms,with CPLEX configured to arch prim
arily for feasible,rather than optimal,solutions and taking thefirst one found by its arch procedure.For the on-line optimizations,CPLEX was configured to ek optimal solutions and subjected to a ten minute computation time-limit,after which the best feasible solution available was employed.The larger problems,with six and ven UA Vs,were not attempted using the centralized method as the computation times became prohibitively long.
In all the scenarios,all UA Vs reached their respective goals without incurring infeasibility and maintaining the required parations.The examples in Fig.2show how the UA Vs divert from their direct routes to their goals to avoid collisions.The order of planning,outlined in Fig.1(a),is marked in thefigures.The results indicate that the significance of planning order is unclear and highly problem specific.In the two-,three-and six-vehicle cas, UA V1goes straight to its goal,but in the four-andfive-vehicle cas,UA V1diverts to avoid others.Thus there is no apparent consistent advantage to planningfirst or last.
A.Computation Time Comparison
Fig.3and Table I prent data comparing the total computation times for the centralized and decentralized algorithms.Only computation times for the cond time
Fig.2:Example Trajectories for Simulations of Teams of 2-7UA Vs using Decentralized MPC.The numbers denote the planning order and circles mark the goals.
step are considered,as the problem complexity varies as UA Vs proceed towards their goals.The results for the decentralized algorithm are the summed times for all the sub-problems,since they are solved quentially as shown in Fig.1(a).Looking at the mean times(black dot)and the standard devation envelopes(dark gray bar),it is clear that the decentralized method offers a considerable compu-tational advantage,averaging over twenty times faster than centralized for problems offive UA Vs.The ranges(light gray bar)show that there are,however,some rare cas in which the DMPC controller can take around ten minutes to solve.The DMPC computation load still grows nonlinearly with team size,to be expected as the inherent complexity of the problem grows as the UA V“density”increas. However,it is clear that the rate of increa is much slower than for CMPC.Fig.4shows the average solution times for the decentralized MPC experiments broken down by sub-problem.This shows that there is no clear pattern to the distribution of computation between sub-problems.
Fig.3:Comparison of Computation Times for Randomly-Chon Instances.The plot shows the mean computation times(·),range of times(union of light and dark gray)and standard deviation about mean(dark gray).
TABLE I:Computation Times
Number of Computation Times(s)
Vehicles Min Mean Max St.Dev
Centralized
20.73 2.5113.14 2.64
3 1.7719.45600.5087.07
4 1.6771.91600.59134.14
58.00495.12601.25206.21
Decentralized
20.92 1.65 3.550.61
3 1.53 4.0732.49 3.51
4 2.11 6.8842.86 5.50
5 3.2520.59426.1445.19
6 6.4134.94611.7066.69
7 6.3878.14635.61112.75
B.Flight Time Comparison
The results in Section IV-A showed that the DMPC algo-rithm offers a considerable improvement over its centralized counterpart in terms of computation time.This ction compares the performance of the two he optimization cost function,in this ca the UA Vflight time from start to goal.Fig.5shows the differences inflight time between DMPC and CMPC for the samefifty random simultions described in Section IV-A.Fig.5(a)shows the mean and range of the average averaging across all UA Vs in the team for each simulation.For up to four UA Vs,totalflight-times using DMPC are no shorter than for CMPC,but only longer by about one time step per UA V.It is intuitive that DMPC can be no better than CMPC as both solve the same problem,but DMPC in a more constrained manner.However,for thefive-vehicle cas, some DMPC results are better than CMPC,occurring when the CMPC optimization is terminated by its computation time-limit with an inferior solution.Fig.5(b)shows the flight time comparison broken down by individual UA V. The trends in the meanflight time differences suggest a slight advantage to being early in the planning order,
but Fig.4:Mean Solution Times for DMPC with up to Seven UA Vs,showing Breakdown among Sub-Problems.
there is considerable variation and nofirm conclusion can be drawn.
C.In-Flight Goal Change
This ction shows an example involving a change of the goals part way through theflight.This class of uncertainty is not covered by Theorem1,but the example shows that DMPC can potentially handle
such a change and, crucially,does not require re-initializing using a centralized process.Figs.6(a)and6(b)show the trajectories for an example scenario withfixed goals using CMPC and DMPC, respectively.Obrve that the trajectories are similar for both controllers and that UA V3does not interact with the other two.In the simulations shown in Figs.6(c)and6(d), the goals were initially the same as in Figs.6(a)and6(b), but at the third time step the goal for UA V3was moved to the location shown,requiring UA V3to interact with the other two.Comparing Figs.6(a)and6(c),it can be en that all UA Vs are diverted to accommodate the changed goal.However,comparing Figs.6(b)and6(d),it can be en that only the path of UA V3changes when its goal is changed.This is irrespective of the planning order:when UA V3replans for its new goal,it must choo a path consistent with the pre-existing intentions of UA Vs1and2. Therefore,UA V3takes a more circuitous route to its new goal and there is never any cau to change the paths of UA Vs1and 2.This example suggests that DMPC can handle broader class of uncertainty than a simple disturbance action,such as ad-hoc team changes,adding or removing UA Vs or goal changes.
V.C ONCLUSIONS
Decentralized Model Predictive Control(DMPC)for teams of cooperative UA Vs has been developed and demon-strated.Simulation results for the reprentative example of multi-UA V collision avoidan
ce have been prented.DMPC guarantees constraint satisfaction,in this ca avoidance,