Kernel-Based Object Tracking

Updated: 2023-05-28 07:47:22

Dorin Comaniciu, Visvanathan Ramesh, Peter Meer

Real-Time Vision and Modeling Department

Siemens Corporate Research

755 College Road East, Princeton, NJ 08540

Electrical and Computer Engineering Department

Rutgers University

94 Brett Road, Piscataway, NJ 08854-8058

Abstract

A new approach toward target representation and localization, the central component in visual tracking of non-rigid objects, is proposed. The feature histogram based target representations are regularized by spatial masking with an isotropic kernel. The masking induces spatially-smooth similarity functions suitable for gradient-based optimization; hence, the target localization problem can be formulated using the basin of attraction of the local maxima. We employ a metric derived from the Bhattacharyya coefficient as similarity measure, and use the mean shift procedure to perform the optimization. In the presented tracking examples the new method successfully coped with camera motion, partial occlusions, clutter, and target scale variations. Integration with motion filters and data association techniques is also discussed. We describe only a few of the potential applications: exploitation of background information, Kalman tracking using motion models, and face tracking.

Keywords: non-rigid object tracking; target localization and representation; spatially-smooth similarity function; Bhattacharyya coefficient; face tracking.

1 Introduction

Real-time object tracking is the critical task in many computer vision applications such as surveillance [44,16,32], perceptual user interfaces [10], augmented reality [26], smart rooms [39,75,47], object-based video compression [11], and driver assistance [34,4].

Two major components can be distinguished in a typical visual tracker. Target Representation and Localization is mostly a bottom-up process which also has to cope with the changes in the appearance of the target. Filtering and Data Association is mostly a top-down process dealing with the dynamics of the tracked object, learning of scene priors, and evaluation of different hypotheses. The way the two components are combined and weighted is application dependent and plays a decisive role in the robustness and efficiency of the tracker. For example, face tracking in a crowded scene relies more on target representation than on target dynamics [21], while in aerial video surveillance, e.g., [74], the target motion and the ego-motion of the camera are the more important components. In real-time applications only a small percentage of the system resources can be allocated for tracking, the rest being required for the preprocessing stages or for high-level tasks such as recognition, trajectory interpretation, and reasoning. Therefore, it is desirable to keep the computational complexity of a tracker as low as possible.

The most abstract formulation of the filtering and data association process is through the state space approach for modeling discrete-time dynamic systems [5]. The information characterizing the target is defined by the state sequence $\{x_k\}_{k=0,1,\ldots}$, whose evolution in time is specified by the dynamic equation $x_k = f_k(x_{k-1}, v_k)$. The available measurements $\{z_k\}_{k=1,\ldots}$ are related to the corresponding states through the measurement equation $z_k = h_k(x_k, n_k)$. In general, both $f_k$ and $h_k$ are vector-valued, nonlinear and time-varying functions. Each of the noise sequences, $\{v_k\}_{k=1,\ldots}$ and $\{n_k\}_{k=1,\ldots}$, is assumed to be independent and identically distributed (i.i.d.).

The objective of tracking is to estimate the state $x_k$ given all the measurements $z_{1:k}$ up to that moment, or equivalently to construct the probability density function (pdf) $p(x_k \mid z_{1:k})$. The theoretically optimal solution is provided by the recursive Bayesian filter which solves the problem in two steps. The prediction step uses the dynamic equation and the already computed pdf of the state at time $k-1$, $p(x_{k-1} \mid z_{1:k-1})$, to derive the prior pdf of the current state, $p(x_k \mid z_{1:k-1})$. Then, the update step employs the likelihood function $p(z_k \mid x_k)$ of the current measurement to compute the posterior pdf $p(x_k \mid z_{1:k})$.
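The two steps of the recursive Bayesian filter can be illustrated on a small discrete state space, where the integrals reduce to sums. The sketch below is only an illustration under stated assumptions: the three-cell world, the transition matrix, and the likelihood values are hypothetical and do not come from the paper.

```python
def predict(posterior_prev, transition):
    """Prediction step: prior[j] = sum_i transition[i][j] * posterior_prev[i]."""
    n = len(posterior_prev)
    return [sum(transition[i][j] * posterior_prev[i] for i in range(n))
            for j in range(n)]


def update(prior, likelihood):
    """Update step: posterior[j] is proportional to likelihood[j] * prior[j]."""
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    total = sum(unnormalized)
    return [v / total for v in unnormalized]


# Hypothetical 3-cell world in which the target tends to move one cell right.
# transition[i][j] plays the role of p(x_k = j | x_{k-1} = i).
transition = [
    [0.2, 0.8, 0.0],
    [0.0, 0.2, 0.8],
    [0.0, 0.0, 1.0],
]
posterior_prev = [1.0, 0.0, 0.0]  # p(x_{k-1} | z_{1:k-1}): target known to be in cell 0
likelihood = [0.1, 0.7, 0.2]      # p(z_k | x_k) evaluated for the observed measurement

prior = predict(posterior_prev, transition)  # p(x_k | z_{1:k-1})
posterior = update(prior, likelihood)        # p(x_k | z_{1:k})
```

With continuous states the same recursion holds, but the sums become integrals that generally have no closed form, which is why the filters discussed next rely on Gaussian assumptions or sampling.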

When the noise sequences are Gaussian and $f_k$ and $h_k$ are linear functions, the optimal solution is provided by the Kalman filter [5, p.56], which yields a posterior that is also Gaussian. (We will return to this topic in Section 6.2.) When the functions $f_k$ and $h_k$ are nonlinear, by linearization the Extended Kalman Filter (EKF) [5, p.106] is obtained, the posterior density being still modeled as Gaussian. A recent alternative to the EKF is the Unscented Kalman Filter (UKF) [42] which uses a set of discretely sampled points to parameterize the mean and covariance of the posterior density. When the state space is discrete and consists of a finite number of states, Hidden Markov Models (HMM) filters [60] can be applied for tracking. The most general class of filters is represented by particle filters [45], also called bootstrap filters [31], which are based on Monte Carlo integration methods. The current density of the state is represented by a set of random samples with associated weights and the new density is computed based on these samples and weights (see [23,3] for reviews). The UKF can be employed to generate proposal distributions for particle filters, in which case the filter is called Unscented Particle Filter (UPF) [54].
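For the linear Gaussian case, the prediction and update steps have a closed form. The following is a minimal sketch of one Kalman filter step for a scalar state, assuming the model x_k = a x_{k-1} + v_k and z_k = h x_k + n_k with noise variances q and r. The function name, parameter names, and numeric values are illustrative assumptions, not taken from the paper.

```python
def kalman_step(mean, var, z, a=1.0, h=1.0, q=0.01, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    mean, var: Gaussian posterior of the previous state.
    z: current measurement.
    a, h: linear dynamic and measurement coefficients (assumed known).
    q, r: variances of the process and measurement noise.
    """
    # Prediction: propagate the Gaussian through the linear dynamics.
    mean_pred = a * mean
    var_pred = a * a * var + q
    # Update: blend prediction and measurement using the Kalman gain.
    gain = var_pred * h / (h * h * var_pred + r)
    mean_new = mean_pred + gain * (z - h * mean_pred)
    var_new = (1.0 - gain * h) * var_pred
    return mean_new, var_new


# Hypothetical measurement sequence of a roughly constant quantity.
mean, var = 0.0, 1.0
for z in [1.2, 0.9, 1.1, 1.0]:
    mean, var = kalman_step(mean, var, z)
```

The posterior stays Gaussian throughout, so only a mean and a variance need to be stored. The EKF and UKF keep this representation and differ in how they handle nonlinear functions.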

When the tracking is performed in a cluttered environment where multiple targets can be present [52], problems related to the validation and association of the measurements arise [5, p.150]. Gating techniques are used to validate only measurements whose predicted probability of appearance is high. After validation, a strategy is needed to associate the measurements with the current targets. In addition to the Nearest Neighbor Filter, which selects the closest measurement, techniques such as the Probabilistic Data Association Filter (PDAF) are available for the single target case. The underlying assumption of the PDAF is that for any given target only one measurement is valid, and the other measurements are modeled as random interference, that is, i.i.d. uniformly distributed random variables. The Joint Probabilistic Data Association Filter (JPDAF) [5, p.222], on the other hand, calculates the measurement-to-target association probabilities jointly across all the targets. A different strategy is represented by the Multiple Hypothesis Filter (MHF) [63,20], [5, p.106] which evaluates the probability that a given target gave rise to a certain measurement sequence. The MHF formulation can be adapted to track the modes of the state density [13]. The data association problem for multiple target particle filtering is presented in [62,38].
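Gating followed by nearest neighbor association can be sketched in a few lines. This is a simplified illustration: it assumes a circular gate with Euclidean distance around the predicted position, whereas practical trackers typically derive the gate from the predicted measurement covariance. The function names and the sample coordinates are hypothetical.

```python
import math


def gate(predicted, measurements, radius):
    """Validation: keep only measurements within `radius` of the prediction."""
    return [z for z in measurements if math.dist(predicted, z) <= radius]


def nearest_neighbor(predicted, measurements, radius):
    """Association: return the validated measurement closest to the prediction,
    or None when no measurement falls inside the gate."""
    valid = gate(predicted, measurements, radius)
    if not valid:
        return None
    return min(valid, key=lambda z: math.dist(predicted, z))


# Hypothetical predicted target position and candidate detections.
predicted = (10.0, 5.0)
measurements = [(30.0, 40.0), (11.0, 5.5), (9.0, 2.0)]

validated = gate(predicted, measurements, radius=4.0)
chosen = nearest_neighbor(predicted, measurements, radius=4.0)
```

Here the far detection is rejected by the gate and the closest of the remaining two is selected. PDAF and JPDAF replace this hard choice with association probabilities over all validated measurements.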

The filtering and association techniques discussed above were applied in computer vision for various tracking scenarios. Boykov and Huttenlocher [9] employed the Kalman filter to track vehicles in an adaptive framework. Rosales and Sclaroff [65] used the Extended Kalman Filter to estimate a 3D object trajectory from 2D image motion. Particle filtering was first introduced in vision as the Condensation algorithm by Isard and Blake [40]. Probabilistic exclusion for tracking multiple objects was discussed in [51]. Wu and Huang developed an algorithm to integrate multiple target clues [76]. Li and Chellappa [48] proposed simultaneous tracking and verification based on particle filters applied to vehicles and faces. Chen et al. [15] used the Hidden Markov Model formulation for tracking combined with JPDAF data association. Rui and Chen proposed to track the face contour based on the unscented particle filter [66]. Cham and Rehg [13] applied a variant of MHF for figure tracking.

The emphasis in this paper is on the other component of tracking: target representation and localization. While the filtering and data association have their roots in control theory, algorithms for target representation and localization are specific to images and related to registration methods [72,64,56]. Both target localization and registration maximize a likelihood type function. The difference is that in tracking, as opposed to registration, only small changes are assumed in the location and appearance of the target in two consecutive frames. This property can be exploited to develop efficient, gradient based localization schemes using the normalized correlation criterion [6]. Since the correlation is sensitive to illumination, Hager and Belhumeur [33] explicitly modeled the geometry and illumination changes. The method was improved by Sclaroff and Isidoro [67] using robust M-estimators. Learning of appearance models by employing a mixture of stable image structure, motion information and an outlier process was discussed in [41]. In a different approach, Ferrari et al. [26] presented an affine tracker based on planar regions and anchor points. Tracking people, which raises many challenges due to the presence of large 3D, non-rigid motion, was extensively analyzed in [36,1,30,73]. Explicit tracking approaches of people [69] are time-consuming and often the simpler blob model [75] or adaptive mixture models [53] are also employed.

The main contribution of the paper is to introduce a new framework for efficient tracking of non-rigid objects. We show that by spatially masking the target with an isotropic kernel, a spatially-smooth similarity function can be defined and the target localization problem is then reduced to a search in the basin of attraction of this function. The smoothness of the similarity function allows application of a gradient optimization method which yields much faster target localization compared with the (optimized) exhaustive search. The similarity between the target model and the target candidates in the next frame is measured using the metric derived from the Bhattacharyya coefficient. In our case the Bhattacharyya coefficient has the meaning of a correlation score. The new target representation and localization method can be integrated with various motion filters and data association techniques. We present tracking experiments in which our method successfully coped with complex camera motion, partial occlusion of the target, presence of significant clutter and large variations in target scale and appearance. We also discuss the integration of background information and Kalman filter based tracking.

The paper is organized as follows. Section 2 discusses issues of target representation and the importance of a spatially-smooth similarity function. Section 3 introduces the metric derived from the Bhattacharyya coefficient. The optimization algorithm is described in Section 4. Experimental results are shown in Section 5. Section 6 presents extensions of the basic algorithm and the new approach is put in the context of computer vision literature in Section 7.

2 Target Representation

To characterize the target, first a feature space is chosen. The reference target model is represented by its pdf $q$ in the feature space. For example, the reference model can be chosen to be the color pdf of the target. Without loss of generality the target model can be considered as centered at the spatial location $0$. In the subsequent frame a target candidate is defined at location $y$, and is characterized by the pdf $p(y)$. Both pdf-s are to be estimated from the data. To satisfy the low computational cost imposed by real-time processing, discrete densities, i.e., $m$-bin histograms, should be used. Thus we have

target model: $\hat{q} = \{\hat{q}_u\}_{u=1 \ldots m}$, $\sum_{u=1}^{m} \hat{q}_u = 1$

target candidate: $\hat{p}(y) = \{\hat{p}_u(y)\}_{u=1 \ldots m}$, $\sum_{u=1}^{m} \hat{p}_u = 1$

The histogram is not the best nonparametric density estimate [68], but it suffices for our purposes. Other discrete density estimates can also be employed.

We will denote by

$$\hat{\rho}(y) \equiv \rho[\hat{p}(y), \hat{q}] \tag{1}$$

a similarity function between $\hat{p}$ and $\hat{q}$. The function $\hat{\rho}(y)$ plays the role of a likelihood and its local maxima in the image indicate the presence of objects in the second frame having representations similar to $\hat{q}$ defined in the first frame. If only spectral information is used to characterize the target, the similarity function can have large variations for adjacent locations on the image lattice and the spatial information is lost. To find the maxima of such functions, gradient-based optimization procedures are difficult to apply and only an expensive exhaustive search can be used. We regularize the similarity function by masking the objects with an isotropic kernel in the spatial domain. When the kernel weights, carrying continuous spatial information, are used in defining the feature space representations, $\hat{\rho}(y)$ becomes a smooth function in $y$.
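The concrete similarity function is introduced in Section 3 of the paper, which builds it on the Bhattacharyya coefficient. As a hedged illustration of what a similarity between two m-bin histograms looks like, the sketch below uses the standard discrete Bhattacharyya coefficient, the sum over bins of sqrt(p_u q_u), and the distance sqrt(1 - rho) derived from it. The function names and histogram values are assumptions made for this example.

```python
import math


def bhattacharyya_coefficient(p, q):
    """Discrete Bhattacharyya coefficient of two normalized histograms."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))


def bhattacharyya_distance(p, q):
    """Distance sqrt(1 - rho); max() guards against tiny negative rounding errors."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))


# Hypothetical 3-bin histograms: a model, a similar candidate, a disjoint one.
q_model = [0.5, 0.3, 0.2]
p_similar = [0.4, 0.4, 0.2]
p_disjoint = [0.0, 0.0, 1.0]
q_other = [1.0, 0.0, 0.0]

rho_same = bhattacharyya_coefficient(q_model, q_model)
rho_similar = bhattacharyya_coefficient(p_similar, q_model)
d_disjoint = bhattacharyya_distance(p_disjoint, q_other)
```

The coefficient is 1 for identical histograms and 0 for histograms with no common non-empty bin, so the derived distance lies between 0 and 1.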

2.1 Target Model

A target is represented by an ellipsoidal region in the image. To eliminate the influence of different target dimensions, all targets are first normalized to a unit circle. This is achieved by independently rescaling the row and column dimensions with $h_x$ and $h_y$.

Let $\{x_i^\star\}_{i=1 \ldots n}$ be the normalized pixel locations in the region defined as the target model. The region is centered at $0$. An isotropic kernel, with a convex and monotonic decreasing kernel profile $k(x)$ (see footnote 1), assigns smaller weights to pixels farther from the center. Using these weights increases the robustness of the density estimation since the peripheral pixels are the least reliable, being often affected by occlusions (clutter) or interference from the background.

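The function $b : R^2 \to \{1 \ldots m\}$ associates to the pixel at location $x_i^\star$ the index $b(x_i^\star)$ of its bin in the quantized feature space. The probability of the feature $u = 1 \ldots m$ in the target model is then computed as

$$\hat{q}_u = C \sum_{i=1}^{n} k(\|x_i^\star\|^2)\, \delta[b(x_i^\star) - u] \tag{2}$$

where $\delta$ is the Kronecker delta function. The normalization constant $C$ is derived by imposing the condition $\sum_{u=1}^{m} \hat{q}_u = 1$, from where

$$C = \frac{1}{\sum_{i=1}^{n} k(\|x_i^\star\|^2)} \tag{3}$$

since the summation of delta functions for $u = 1 \ldots m$ is equal to one.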
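Equations (2) and (3) amount to a histogram in which each pixel votes for its bin with a weight given by the kernel profile at its squared distance from the center. The sketch below assumes an Epanechnikov-style profile, k(x) = 1 - x for x <= 1 and 0 otherwise (constant factors cancel in the normalization), and 0-based bin indices. The function names and sample pixels are hypothetical.

```python
def epanechnikov_profile(x):
    """Assumed kernel profile, up to a constant: 1 - x on [0, 1], 0 beyond."""
    return 1.0 - x if x <= 1.0 else 0.0


def target_model(locations, bin_indices, m, profile=epanechnikov_profile):
    """Kernel-weighted m-bin histogram in the spirit of equations (2) and (3).

    locations: normalized pixel coordinates (x, y) relative to the center.
    bin_indices: 0-based bin index b(x_i) of each pixel's feature.
    """
    weights = [profile(x * x + y * y) for x, y in locations]
    total = sum(weights)  # equals 1 / C in equation (3)
    if total == 0.0:
        raise ValueError("no pixel falls inside the kernel support")
    q = [0.0] * m
    for w, u in zip(weights, bin_indices):
        q[u] += w / total  # the Kronecker delta selects bin u
    return q


# Hypothetical target: four pixels inside the unit circle, three feature bins.
locations = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.9), (-0.5, -0.5)]
bin_indices = [0, 0, 1, 2]
q_hat = target_model(locations, bin_indices, m=3)
```

The pixel at distance 0.9 from the center contributes far less to its bin than the central pixels do, which is the down-weighting of peripheral pixels described above.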

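2.2 Target Candidates

Let $\{x_i\}_{i=1 \ldots n_h}$ be the normalized pixel locations of the target candidate, centered at $y$ in the current frame. The normalization is inherited from the frame containing the target model. Using the same kernel profile $k(x)$, but with bandwidth $h$, the probability of the feature $u = 1 \ldots m$ in the target candidate is given by

$$\hat{p}_u(y) = C_h \sum_{i=1}^{n_h} k\left(\left\|\frac{y - x_i}{h}\right\|^2\right) \delta[b(x_i) - u] \tag{4}$$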

1 The profile of a kernel $K$ is defined as a function $k : [0, \infty) \to R$ such that $K(x) = k(\|x\|^2)$.
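The candidate histogram of equation (4) can be sketched in the same way as the target model, with the kernel now centered at the candidate location y and scaled by the bandwidth h. The example assumes an Epanechnikov-style profile, 0-based bin indices, and a normalization that makes the bins sum to one. All names and pixel values are illustrative.

```python
def epanechnikov_profile(x):
    """Assumed kernel profile, up to a constant: 1 - x on [0, 1], 0 beyond."""
    return 1.0 - x if x <= 1.0 else 0.0


def target_candidate(pixels, bin_indices, center, h, m,
                     profile=epanechnikov_profile):
    """Kernel-weighted m-bin histogram of the candidate centered at `center`."""
    cx, cy = center
    weights = [profile(((px - cx) / h) ** 2 + ((py - cy) / h) ** 2)
               for px, py in pixels]
    total = sum(weights)  # reciprocal of the normalization constant
    if total == 0.0:
        raise ValueError("no pixel falls inside the kernel support")
    p = [0.0] * m
    for w, u in zip(weights, bin_indices):
        p[u] += w / total
    return p


# Hypothetical row of four pixels: two of feature bin 0, then two of bin 1.
pixels = [(10.0, 10.0), (11.0, 10.0), (12.0, 10.0), (13.0, 10.0)]
bin_indices = [0, 0, 1, 1]

p_left = target_candidate(pixels, bin_indices, center=(10.5, 10.0), h=2.0, m=2)
p_right = target_candidate(pixels, bin_indices, center=(12.5, 10.0), h=2.0, m=2)
```

Moving the center from the left pair of pixels to the right pair shifts the histogram mass gradually from bin 0 to bin 1, because the weights vary continuously with y. This is the property that makes the similarity function smooth in y.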
