Mining Spatio-temporal Association Rules, Sources, Sinks, Stationary Regions and Thoroughfares in Ob

更新时间:2023-05-07 17:22:12 阅读: 评论:0

Mining Spatio-temporal Association Rules,Sources, Sinks,Stationary Regions and Thoroughfares in Object
Mobility Databas
Florian Verhein and Sanjay Chawla
School of Information Technologies,University of Sydney,NSW,Australia
{fverhein,chawla}@it.usyd.edu
Abstract.As mobile devices proliferate and networks become more location-
aware,the corresponding growth in spatio-temporal data will demand analysis
techniques to mine patterns that take into account the mantics of such data.As-
sociation Rule Mining has been one of the more extensively studied data mining
techniques,but it considers discrete transactional data(supermarket or quen-
tial).Most attempts to apply this technique to spatial-temporal domains maps
the data to transactions,thus losing the spatio-temporal characteristics.We pro-
vide a comprehensive definition of spatio-temporal association rules(STARs)
that describe how objects move between regions over time.We define support in
the spatio-temporal domain to effectively deal with the mantics of such data.
We also introduce other patterns that are uful for mobility data;stationary re-
gions and high traffic regions.The latter consists of sources,sinks and thorough-
fares.The patterns describe important temporal characteristics of regions and
we show that they can be considered as special STARs.We provide efficient al-
gorithms tofind the patterns by exploiting veral pruning properties1.
1Introduction
The need for spatio-temporal data mining and analysis techniques is growing.Some specific exampl
es include managing cell phone networks and dealing with the data gen-erated by Radio Frequency Identification Tags.Mining such data could detect patterns for applications as diver as intelligent traffic management,nsor networks,stock control and wildlife monitoring.For example,consider the movement of urs between cells of a mobile phone(or similar)network.Being able to predict where urs will go would make cell hand-over decisions easier and improve bandwidth management.Also, since most people own a mobile phone the days,the data could be ud for fast and inexpensive population movement studies.Local governments wouldfind the ability to answer questions such as“how much is this park being ud?”,“which areas are con-gested?”and“what are the main routes that people take through the city”uful.The latter query would help design better pedestrian and vehicle routes to take into account the mainflows of people.
We therefore consider a t of regions,which may be any shape or size,and a t of objects moving throughout the regions.We assume that it is possible to determine which objects are in a region,but we do not know precily where an object is in that 1This rearch was partially funded by the ARC Discovery Grant,Project ID:DP0559005.
M.L.Lee,K.L.Tan,and V.Wuwong(Eds.):DASFAA2006,LNCS3882,pp.187–201,2006.
c Springer-Verlag Berlin Heidelberg2006
188  F.Verhein and S.Chawla
region.We do not assume that objects are always somewhere in the region t,so in the example of a mobile phone network,turning the phone off pos no problems for our methods.We are interested infinding regions with uful temporal characteristics(thor-oughfares,sinks,sources,and stationary regions)and rules that predict how objects will move through the regions(spatio-temporal association rules).A source occurs when the number of objects leaving a region is high enough.A sink has a high number of objects entering it.A thoroughfare is a region through which many objects move-that is,many objects enter and leave.A stationary region is where many objects tend to stay over time, while a STAR describes how an object moves between regions.Together,the patterns describe many mobility characteristics and can be ud to predict future movements.
We take the approach of mining our patterns on a time window by time window basis.We think this is important becau it allows us to e the changing nature of the patterns over time,and allows for interactive mining-including changing the mining parameters.Even though the patterns we consider occur in a spatial ttings,they are all temporal patterns becau they describe objects movements over time,as well as cap-turing changes in the way the objects move over time.To understand this,consider each pattern tξi as capturing object movements over a‘short’period of time.In our alg
o-rithms this is the interval pair[T I i,T I i+1].That is,ξi captures how the objects move between the time intervals T I i and T I i+1.Then,as the algorithm process subquent time intervals,the patterns mined at the points will in general change,forming a -quence of pattern ts<ξi,ξi+1,...>.This change in the patterns that are output can be considered longer term changes.Such changes in the patterns describe the changes in the objects behavior over time.Another way to think about this is to consider the objects motion as a random process.If the process is stationary,we would expect the patterns to remain the same over time.If the process is not stationary,the patterns will change with time to reflect the change in the way the objects move.
There are a number of challenges when mining spatio-temporal data.First,dealing with the interaction of space and time is complicated by the fact that they have dif-ferent mantics.Secondly,we also need to deal with the mantics effectively.This includes considering the effects of area and the time-interval width not only on the the patterns we mine,but also in the algorithms thatfind tho patterns.Finally,handling updates efficiently in a dynamic environment is challenging-especially when the algo-rithm must be applied in real time.We adopt a data stream model where spatial data arrives continuously as a quence of snapshots S1,S2,...,and the model that we mine must keep up with this.Unless sampling techniques are ud,such algorithms cannot d
o better than scale linearly with time.Since processing the spatial snapshots is expensive in general,we focus our attention there.We deal with exact techniques in this paper, but it is possible to u probabilistic counting techniques together with our algorithms, as demonstrated in one of our experiments.
2Contributions
We make the following contributions:
–We give a rigorous definition of Spatio-Temporal Association Rules(STARs)that prerve spatial and temporal mantics.We define the concepts of spatial coverage,
Mining Spatio-temporal Association Rules189 spatial support,temporal coverage and temporal support.Becau the definitions retain the mantics of the spatial and temporal dimensions,it allows us to mine data with regions of any size without skewing the results.That is,we successfully extend association rules to the spatio-temporal domain.
–We define uful spatio-temporal regions that apply to objects moving through such regions.The are stationary regions and high traffic regions.The latter may be fur-ther broken into sources,sinks an
d thoroughfares.We stress that the are temporal properties of a spatial region t,and show that they are special types of STARs.
We also describe a technique for mining the regions efficiently by employing a pruning property.
–We propo a novel and efficient algorithm for mining the STARs by devising a pruning property bad on the high traffic regions.This allows the algorithm to prune as much of the arch space as possible(for a given datat)before doing the computationally expensive part.
Our algorithms do not assume or rely on any form of index(such as an R-tree,or aggregated R-tree)to function or to obtain time savings.If such an index is available, the algorithms will perform even better.Our time savings come about due to a t of pruning properties,which are spatial in nature,bad on the obrvation that only tho patterns that have a support and confidence above a threshold are interesting to a ur (in the n that they model the data).
The rest of the paper is organized as follows.In Section3we survey related work and place our contributions in context.In Section4we give veral related definitions of STARs that highlight some of the differences in interpreting STARs.We clo the ction with a detailed example to illustrate the subtleties.In Section5we define hot spots,stationary regions and high traffic regions.In Section6we d
escribe our algorithm for mining STARs.The results of our experiments on STAR mining are described in Section7.We conclude in Section8with a summary and directions for future work.
3Related Work
There has been work on spatial association rules(examples include[1,2])and temporal association rules(examples include[3,4])but very little work has addresd both spatial and temporal dimensions.Most of the work that does can be categorid as traditional association rule mining[5]or quential association rule mining applied to a spatio-temporal problem,such as in[6].
The work by Tao et al.[7]is the only rearch found that addresd the problem of STARs(albeit briefly)in the Spatial-Temporal domain.As an application of their work they show a brute force algorithm for mining specific spatio-temporal association rules.They consider association rules of the form(r i,τ,p)⇒r j,with the following interpretation:“If an object is in region r i at some time t,then with probability p it will appear in region r j by time t+τ”.Their algorithm is a brute force technique that takes time quadratic in the number of regions.They u FM-PCSA sketches[8]for speed, have a simple STAR definition and ignore the spatial and temporal mantics of the data(such as the area of the regions or the time interval width).
190  F.Verhein and S.Chawla
Other interesting work that deals with spatio-temporal patterns in the spatio-temporal domain includes[9,10,11,12,7].Mamoulis et al.[12]mine periodic patterns in ob-jects moving between regions.Wang et al.[10]introduce what they callflow patterns,
which describe the changes of events over space and time.They consider events oc-curring in regions,and how the events are connected with changes in neighbouring regions as time progress.Ishikawa et al.[11]describe a technique for mining object
mobility patterns in the form of markov transition probabilities from an indexed spa-tio temporal datat of moving points.In this ca,the transition probability p ij of an
(order1)markov chain is P(r j|r i)where r i and r j are regions,which is the confidence of a spatio-temporal association rule,although this is not mentioned by the authors. Tsoukatos et al.[9]mine frequent quences of non spatio-temporal values for regions.
The work we have listed above is quite different from ours.Tao et al.[7]consider a simple spatio-temporal association rule definition,and the algorithm forfinding the rules is brute force.Patterns that
can be interpreted as STARs are considered by[11,10], but they focus on very different rearch problems.The algorithm of[11]willfind all transition probabilities,even if they are small.Amongst other things,our algorithm makes u of the fact that urs will not be interested in rules below a support threshold, and us this to prune the arch space.And most importantly,none of the related work consider the spatial mantics of the regions,such as area,nor do they consider spatial support or similar concepts.
Dealing with the area of regions correctly is important for interpretation of the re-sults.Many authors implicitly assume that the spatial regions can be specified to suit the algorithm.However,this is usually not the ca.Cells in a mobile phone network arefixed,and have a wide range of sizes and geometries depending on geographic and population factors.Data mining applications have to be developed to work with the given region t,and we cannot ignore the influence of different sized regions.In the ca of mining mobility patterns of moving objects(including sources,sinks,station-ary regions,thoroughfares and STARs),ignoring area will skew the results in favour of larger regions becau they will have more objects in them on average.By taking the region sizes into account,we avoid skewing the results and make our STARs compara-ble across different sized regions.Finally,although it is possible to scale most patterns by the area after they have been mined,t
his is undesirable becau it prevents pruning of the arch space.Our algorithms deal with the spatio-temporal mantics such as area effectively throughout the mining process and prune the arch space as much as possible.
No previous work could be found,despite our efforts,that considers sources,sinks, stationary regions and thoroughfares.We think the patterns are very important be-cau they capture temporal aspects of the way that objects move in space.
4Spatio-temporal Association Rules
Given a datat T of spatio-temporal data,define a language L that is able to express properties or groupings of the data(in both time,space,and object attributes).Given two ntencesϕ1∈L andϕ2∈L that have no common terms,define a spatio-temporal as-sociation rule asϕ1⇒ϕ2.For example,the rule“late shift workers head into the city in
Mining Spatio-temporal Association Rules191 the evening”can be expresd as LateShiftW orker(x)∧InRegion(OutsideCity)∧T ime(Evening)⇒InRegion(City)∧T ime(Night).To evaluate whether such a spatio-temporal rule is interesting in T,a lection predicate q(T,ϕ1⇒ϕ2)maps the rule to{true,fal}.The lection predicate will in general be a combination of sup-port and confidence me
asures.For example,if the support and confidence of a rule R1 are above their respective thresholds,then q(T,R1)evaluates to true.
The language L can be arbitrarily complex.We consider the special ca where objects satisfying a query move between regions.A query q allows the expression of predicates on the t of non spatio-temporal attributes of the objects.We explore a number of definitions of such STARs in this ction to highlight subtleties.We deal only with the STAR of Definition2outside this ction,so the reader can safely focus on this on thefirst reading,without missing the main ideas of the paper.
Definition1.STAR(alternatives):Objects in region r i satisfying q at time t will:
(a)appear in region r j for thefirst time at time t+τ.Notation:(r i,t,@τ,q)⇒r j.
(b)be in region r j at time t+τ.Notation:(r i,t,τ,q)⇒r j.
(c)appear in region r j by(at or before)time t+τ.Notation:(r i,[t,τ],q)⇒r j. Note that(a)distinctly defines the time in r j at which the objects must arrive.(b)is less rigid and allows objects that arrived earlier than time t+τto be counted as long as they are still prent at time t+τ.(c)counts the objects as long as they have made an appearance in r j at any time within[t,t+τ].We generali(c)in ourfinal definition:
Definition2.STAR:Objects appearing in region r i satisfying q during time interval T I s will appear in region r j during time interval T I e,where T I s∩T I e=∅and T I s is before2T I e.Notation:(r i,T I s,q)⇒(r j,T I e).
Note that all the definitions are equivalent when T I s=t,T I e=t+1andτ=1. We are interested in the rules that have a high confidence and high support.We will u the notation r i⇒r j orζfor a STAR when we are not concerned with its exact definition.We will consider the problem of more rigorous support definitions that are more appropriate in a spatio-temporal tting later.
Definition3.Define support of a ruleζ,denoted byσ(ζ),to be the number of objects that follow the rule,and the support(with respect to q)of a region r during T I,denoted byσ(r,T I,q),to be the number of distinct objects within r during T I satisfying q.
Definition4.Define the confidence of a ruleζwho antecedent contains region r i dur-ing T I,denoted by c(ζ),as the conditional probability that the conquent is true given that the antecedent is true.This is the probability that the rule holds and is analogous to the traditional definition of confidence and is given by c(ζ)=σ(ζ)/σ(r i,T I,q). We illustrate the above definitions with an example.
Example1.Consider Figure1(a)which shows the movement of the t of objects S= {a,b,c,d,e,f,g}in th
e time-frame[t,t+3]captured at the four snapshots t,t+1,t+ 2,t+3.Assume that q=‘true so that all objects satisfy the query.
2We actually mine rules where T I s and T I e are adjacent,but it is easy to generali our methods so that this restriction does not hold.

本文发布于:2023-05-07 17:22:12,感谢您对本站的认可!

本文链接:https://www.wtabcd.cn/fanwen/fan/82/549795.html

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。

标签:
相关文章
留言与评论(共有 0 条评论)
   
验证码:
推荐文章
排行榜
Copyright ©2019-2022 Comsenz Inc.Powered by © 专利检索| 网站地图