MOTION SALIENCY DETECTION USING LOW-RANK AND SPARSE DECOMPOSITION
⋆Yawen Xue§,†, ⋆Xiaojie Guo§ and Xiaochun Cao§
{yxue,xguo,xcao}@tju.edu
§School of Computer Science and Technology, Tianjin University, Tianjin 300072, China
†School of Computer Software, Tianjin University, Tianjin 300072, China
ABSTRACT
Motion saliency detection has an important impact on further video processing tasks, such as video segmentation, object recognition and adaptive compression. Different from image saliency, in videos, moving regions (objects) catch human beings' attention much more easily than static ones. Based on this observation, we propose a novel method of motion saliency detection, which makes use of the low-rank and sparse decomposition on video slices along the X-T and Y-T planes to separate foreground moving objects from backgrounds. In addition, we adopt the spatial information to preserve the completeness of the detected motion objects. By virtue of adaptive threshold selection and efficient noise elimination, the proposed approach is suitable for different video scenes, and robust to low-resolution and noisy cases. The experiments demonstrate the performance of our method compared with the state-of-the-art.
Index Terms—Motion Saliency Detection, Low-rank and Sparse Decomposition, Video Analysis
1. INTRODUCTION
Visual attention analysis provides an intuitive methodology for semantic content understanding and important information capture in both images and videos. Most primates, including humans, can divert their minds subconsciously to the salient objects in images or to the moving objects in videos. Such a remarkable ability enables them to sample the most “interesting” features and interpret complex scenes while spending only limited processing. In other words, visual saliency makes a distinguishing region stand out and thus catch special attention quickly, which provides an alternative solution to many tasks, such as video segmentation [1], adaptive content delivery [2] and adaptive compression [3].
Fig. 2. Illustration of the main stages of the proposed method: (a) input video, (b) X-T plane slice, (c) Y-T plane slice, (d) X-T saliency, (e) Y-T saliency, (f) saliency map, (g) output video.
When it comes to videos, however, the saliency detection techniques for images (mentioned above) are not available. The reason is that, different from image saliency detection, moving regions (objects) catch more of human beings' attention than static ones, even though the latter may have large contrast to their neighbors in static images. That is to say, the focal point changes from the regions with large contrast to their neighbors for images to those with motion discrimination for videos. Therefore, the contrast-based methods can hardly be applied to videos directly. An exception exists in [4], which extends the spectral residual analysis [9] in images to videos.
Actually, the goal of separating moving objects from the background is the same as that of motion saliency detection. A few solutions for separating foreground moving objects from backgrounds have been proposed, such as the Gaussian Mixture Model [5]. In this work, we introduce a novel method to detect motion saliency by using low-rank and sparse decomposition. Prior to detailing the stages of our proposed method, we first present an example of the performance comparison between our method and the state-of-the-art in Fig. 1. As can be seen, the result obtained by our method is significantly better than the others. (More experiments and results can be found in Sec. 4.)
2. LOW-RANK AND SPARSE DECOMPOSITION
Suppose we have a data matrix A ∈ R^{n×m}, and know that it can be decomposed as A = D + E, where D has low rank and E is sparse. Both D and E may be of arbitrary magnitude.
Recall that Classical Principal Component Analysis (CPCA) seeks the rank-k estimate of D to approximate A, or, in other words, to reduce the dimensionality of the observations in A, by optimally solving:

min ||A − D||,  s.t.  rank(D) ≤ k,    (1)

where ||·|| stands for the spectral norm, i.e., the largest singular value of the matrix. Note that CPCA is under the assumption that the observations in A are polluted by small noise, i.e., the absolute values of the elements in E are small. Otherwise, the solution of CPCA is far away from the optimal D. However, in real-world problems, data pollution is ubiquitous and arbitrary. As a consequence, CPCA may lose much of its power to deal with many real-world problems.
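To make Eq. (1) concrete, the rank-k estimate can be computed by truncating the singular value decomposition of A. The short sketch below (Python/NumPy, with a toy matrix of our own; it is not part of the paper) illustrates this, and also hints at why a few large corruptions would break the estimate.

```python
import numpy as np

def cpca_low_rank(A, k):
    """Rank-k estimate of A (Eq. 1): keep only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Toy example: a rank-2 matrix with small dense noise is approximated well;
# a handful of large (sparse) corruptions would instead bias every entry.
rng = np.random.default_rng(0)
A_true = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
A_obs = A_true + 0.01 * rng.standard_normal(A_true.shape)
D_hat = cpca_low_rank(A_obs, k=2)
print(np.linalg.norm(A_true - D_hat) / np.linalg.norm(A_true))  # small relative error
```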
To overcome the drawback of CPCA, Robust Principal Component Analysis (RPCA) was proposed by Candès et al. [12]. The goal of RPCA is to optimize the problem:

min rank(D) + λ||E||_0,  s.t.  A = D + E,    (2)

where ||·||_0 denotes the ℓ0-norm. But such a problem is intractable in polynomial time. Instead, one can solve its convex relaxation as follows:

min ||D||_* + λ||E||_1,  s.t.  A = D + E,    (3)

where λ is the coefficient controlling the weight of the sparse matrix E, and ||·||_* and ||·||_1 represent the nuclear norm and the ℓ1-norm of a matrix, respectively. This formulation performs well in practice, and recovers the true low-rank solution even when up to a third of the observations are grossly corrupted. More details about the proof of the low-rank and sparse decomposition using RPCA can be found in [12].
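The paper does not spell out which RPCA solver is used; as a non-authoritative sketch, the following implements the widely used inexact augmented Lagrange multiplier (IALM) scheme for Eq. (3), alternating singular value thresholding for D with soft shrinkage for E. The default λ = 1/√max(n, m) and the μ schedule are common choices from the RPCA literature, not settings taken from this paper.

```python
import numpy as np

def shrink(X, tau):
    """Soft thresholding: proximal operator of the l1-norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca_ialm(A, lam=None, tol=1e-7, max_iter=500):
    """Inexact ALM for  min ||D||_* + lam*||E||_1  s.t.  A = D + E  (Eq. 3)."""
    n, m = A.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n, m))            # common default weight
    norm_two = np.linalg.norm(A, 2)               # largest singular value
    mu, rho = 1.25 / (norm_two + 1e-12), 1.5      # penalty and its growth rate
    Y = A / max(norm_two, np.abs(A).max() / lam)  # scaled dual initialization
    E = np.zeros_like(A)
    norm_A = np.linalg.norm(A)
    for _ in range(max_iter):
        D = svd_threshold(A - E + Y / mu, 1.0 / mu)  # low-rank update
        E = shrink(A - D + Y / mu, lam / mu)         # sparse update
        R = A - D - E                                # constraint residual
        Y = Y + mu * R
        mu = rho * mu
        if np.linalg.norm(R) / (norm_A + 1e-12) < tol:
            break
    return D, E
```

In the slice-based setting of the next section, D plays the role of the background and E of the moving foreground.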
3. OUR METHOD
In this section, we first formulate the problem this work intends to solve, and then detail the stages of our proposed method (Fig. 2).
Problem Formulation. Different from static image saliency detection, in a video the motion regions intensively attract humans' attention, rather than the regions with large contrast in every single image. Due to the correlation between frames, the motion regions in the video can be identified from the background by low-rank and sparse decomposition. Note that foreground motion objects, such as cars and pedestrians, usually occupy only a fraction of the image pixels and hence can be treated as sparse errors. In this work, we stack the temporal slices along X-T and Y-T as the matrices S. Naturally, the low-rank component B corresponds to the background and the sparse component M captures the motion objects in the foreground. Figure 3(a) shows a temporal slice that confirms our observation. As shown in Fig. 3(b) and (c), our method significantly outperforms TSR in terms of capturing the motion. Moreover, we take adaptive threshold selection and refinement to reduce the effect of noise and missing pixels (caused by ignoring spatial information).

Fig. 3. Visual comparison of the intermediate results for motion saliency detection on a temporal slice. (a) shows a Y-T plane slice. (b) and (c) are the results of detecting motion saliency on (a) using TSR [4] and our method, respectively.
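For concreteness, the temporal slices described above are plain index slices of the video volume. The snippet below assumes a (frames × height × width) array layout and dummy data of our own choosing; only the indexing pattern matters.

```python
import numpy as np

# Assumed layout: V[t, y, x] is the intensity of pixel (x, y) in frame t.
V = np.random.rand(120, 90, 160)   # dummy 120-frame video of 160x90 pixels
T, H, W = V.shape

y0 = H // 2
slice_xt = V[:, y0, :]   # X-T slice at row y0: time along rows, x along columns
x0 = W // 2
slice_yt = V[:, :, x0]   # Y-T slice at column x0: time along rows, y along columns

# Each such slice is one matrix S to be decomposed as in Eq. (4) below.
```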
Decomposition. Based on the problem formulation, each X-T and Y-T slice S is decomposed, similarly to Eq. (3), as:

min ||B||_* + λ||M||_1,  s.t.  S = B + M.    (4)
Then, the motion saliency slices abs(M) obtained from the X-T (Y-T) slices are integrated together along X-Y-T as S_cubeXT (S_cubeYT). We then use norm(S_cubeXT .* S_cubeYT) to form the initial saliency map S_cube, where .* is the element-wise product operator and norm(·) represents normalization processing. In our experiments, the size of T equals the length of the video; it can also be defined as the size of a sub-video.

Refinement. To reduce the effect of missing pixels on the motion objects and to refine the results, we further take into account the spatial information. Intuitively, the pixels belonging to the same motion object are always locally coherent. This indicates that a pixel p_{i,j,k} is very likely to be a missing salient pixel when its neighbors in frame k are motion salient. Inspired by this observation, we use a Gaussian function to recall the missing pixels as follows (we omit the subscript k for short):

S_cube(i, j) = Σ_{||p_{x,y} − p_{i,j}||_2 < τ} S_cube(x, y) · f(||p_{x,y} − p_{i,j}||_2),    (5)

where τ is the radius of the support region centered at p_{i,j}, ||·||_2 is the ℓ2-norm, and f is a Gaussian function: f(d) = (1/(2πσ)) exp(−d²/(2σ²)).

To see more results, please visit: cs.tju.edu/orgs/vision/msd/results.htm
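Putting Eqs. (4) and (5) together, a minimal end-to-end sketch of the slice-wise decomposition, the X-Y-T fusion and the Gaussian refinement is given below. It reuses the rpca_ialm helper sketched in Sec. 2; the array layout, the per-frame refinement loop, and the values of λ, τ and σ are illustrative assumptions, and the adaptive threshold selection step is omitted since its exact rule is not given here.

```python
import numpy as np
# assumes rpca_ialm(...) from the Sec. 2 sketch is available in scope

def norm01(S):
    """norm(.): rescale a saliency volume to [0, 1]."""
    S = S - S.min()
    return S / (S.max() + 1e-12)

def initial_saliency(V, lam=None):
    """V[t, y, x] grey-level video -> initial saliency volume S_cube (Eq. 4)."""
    T, H, W = V.shape
    S_xt = np.zeros(V.shape)
    S_yt = np.zeros(V.shape)
    for y in range(H):                        # decompose every X-T slice
        _, M = rpca_ialm(V[:, y, :], lam)
        S_xt[:, y, :] = np.abs(M)
    for x in range(W):                        # decompose every Y-T slice
        _, M = rpca_ialm(V[:, :, x], lam)
        S_yt[:, :, x] = np.abs(M)
    return norm01(S_xt * S_yt)                # element-wise product, then norm(.)

def refine_frame(S, tau=5, sigma=2.0):
    """Eq. (5): Gaussian-weighted recall of missing pixels within radius tau."""
    H, W = S.shape
    ys, xs = np.mgrid[-tau:tau + 1, -tau:tau + 1]
    d = np.sqrt(ys ** 2 + xs ** 2)
    g = np.exp(-d ** 2 / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma)
    g[d >= tau] = 0.0                         # keep only the support region
    P = np.pad(S, tau)                        # zero padding at the borders
    out = np.empty_like(S)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(P[i:i + 2 * tau + 1, j:j + 2 * tau + 1] * g)
    return out

# Usage sketch: refine the initial saliency map frame by frame.
# S_cube = initial_saliency(V)
# S_refined = np.stack([refine_frame(S_cube[t]) for t in range(S_cube.shape[0])])
```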
Fig. 4. Experiment results. (a) original frames (OF) from different video scenes and types. (b)-(d) are the results of using frame difference (FD), GMM [5] and TSR [4], respectively. (e) shows the raw saliency maps obtained by our method. The final results of our method are shown in (f). For the analysis of the performance differences, please see the text.

4. EXPERIMENTS

Figure 4(e) shows the raw saliency maps obtained without executing the refinement and the adaptive threshold selection, while the final results of our method are displayed in Fig. 4(f); the difference reflects the effect of our adaptive threshold selection and refinement.
5. CONCLUSION
In this work, we proposed a novel motion saliency detection method based on low-rank and sparse decomposition, which provides many video processing tasks, such as video segmentation and adaptive content delivery, with a powerful video pre-processing technique. The proposed method is able to distinguish foreground motion objects from backgrounds without any background modeling procedure. Thanks to the spatial consideration, we further reduce the effect of incompleteness. In addition, by employing adaptive threshold selection and noise elimination, the method can automatically and robustly accomplish the task. The experiments carried out on different video qualities and scenes demonstrate that our proposed method outperforms the state-of-the-art.
6. REFERENCES
[1] K. Fukuchi, K. Miyazato, A. Kimura, S. Takagi, and J. Yamato, “Saliency-based video segmentation with graph cuts and sequentially updated priors,” in IEEE ICME, 2009, pp. 638–641.
[2] Y. Ma and H. Zhang, “Contrast-based image attention analysis by using fuzzy growing,” in ACM MM, 2003, pp. 374–381.
[3] C. Christopoulos, A. Skodras, A. Koike, and T. Ebrahimi, “The JPEG2000 still image coding system: An overview,” IEEE Trans. Consumer Electronics, vol. 46, no. 4, pp. 1103–1127, 2000.
[4] X. Cui, Q. Liu, and D. Metaxas, “Temporal spectral residual: Fast motion saliency detection,” in ACM MM, 2009, pp. 617–620.
[5] Z. Zivkovic, “Improved adaptive Gaussian mixture model for background subtraction,” in ICPR, 2004, pp. 28–31.
[6] L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual attention for rapid scene analysis,” IEEE Trans. PAMI, vol. 20, no. 11, pp. 1254–1259, 1998.
[7] R. Achanta, S. Hemami, F. Estrada, and S. Süsstrunk, “Frequency-tuned salient region detection,” in IEEE CVPR, 2009, pp. 1597–1604.
[8] J. Harel, C. Koch, and P. Perona, “Graph-based visual saliency,” in NIPS, 2007, pp. 545–552.
[9] X. Hou and L. Zhang, “Saliency detection: A spectral residual approach,” in IEEE CVPR, 2007, pp. 1–8.
[10] Z. Wang and B. Li, “A two-stage approach to saliency detection in images,” in IEEE ICASSP, 2008, pp. 965–968.
[11] M. Cheng, G. Zhang, N. Mitra, X. Huang, and S. Hu, “Global contrast based salient region detection,” in IEEE CVPR, 2011, pp. 409–416.
[12] E. Candès, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?,” Journal of the ACM, vol. 58, no. 3, pp. 1–37, 2011.