Learning in Region-Based Image Retrieval

Feng Jing1, Mingjing Li2, Lei Zhang2, Hong-Jiang Zhang2, Bo Zhang3

1 State Key Lab of Intelligent Technology and Systems

Beijing 100084, China

jingfeng00@mails.tsinghua.edu

2 Microsoft Research Asia

49 Zhichun Road, Beijing 100080, China

{mjli, i-lzhang, hjzhang}@

3 State Key Lab of Intelligent Technology and Systems

Beijing 100084, China

dcszb@mail.tsinghua.edu

Abstract. In this paper, several effective learning algorithms using global image representations are adjusted and introduced to region-based image retrieval (RBIR). First, the query point movement technique is considered. By assembling all the segmented regions of positive examples together and resizing the regions to emphasize the latest positive examples, a composite image is formed as the new query. Second, the application of support vector machines (SVM) in relevance feedback for RBIR is investigated. Both the one-class SVM as a class distribution estimator and the two-class SVM as a classifier are taken into account. For the latter, two representative display strategies are studied. Last, a region re-weighting algorithm is proposed, inspired by the feature re-weighting ones. Experimental results on a database of 10,000 general-purpose images demonstrate the effectiveness of the proposed learning algorithms.

1 Introduction

Most of the early research on content-based image retrieval (CBIR) has focused on developing effective global features [6][14][18]. While this research established the basis of CBIR, the retrieval performance is still far from users' expectations. The main reason is acknowledged to be the gap between low-level features and high-level concepts. To narrow this semantic gap, two techniques have been widely used: region-based features to represent the focus of the user's perceptions of image content [1][8][16] and learning techniques, e.g. relevance feedback (RF), to learn the user's intentions [4][7][10][12][15][17].

Many early CBIR systems perform retrieval based primarily on global features. Users accessing a CBIR system often look for objects, but such systems are likely to fail in this case, since a single signature computed for the entire image cannot sufficiently capture the important properties of individual objects. Region-based image retrieval (RBIR) systems [1][16] attempt to overcome this drawback of global features by representing images at the object level, which is intended to be close to the perception of the human visual system [16].

One of the interactive learning techniques is relevance feedback (RF), initially developed in text retrieval [13]. RF was introduced into CBIR during the mid-1990s and has been shown to provide a dramatic performance boost in retrieval systems [7][12][15][17]. The main idea is to let users guide the system. During the retrieval process, the user interacts with the system and rates the relevance of the retrieved images according to his or her subjective judgment. With this additional information, the system dynamically learns the user's intention and gradually presents better results.

Although RF has shown great potential in image retrieval systems that use global representations, it has seldom been introduced to RBIR systems. Minka and Picard performed pioneering work in this area by proposing the FourEyes system [10]. FourEyes contains three stages: grouping generation, grouping weighting and grouping collection.

The main purpose of this paper is to integrate region-based representations and learning techniques and allow them to benefit from each other. To do that, on the one hand, two RF methods are proposed. One is the query point movement (QPM) algorithm with speedup techniques. The other introduces three SVM schemes based on a new kernel. On the other hand, a novel region re-weighting scheme based on users' feedback information is proposed. The region weights that coincide with human perception improve the accuracy of both the initial query and the following relevance feedback. Furthermore, the region weights can not only be used in a query session, but also be memorized and accumulated for future queries.

The organization of the paper is as follows. Section 2 describes the basic elements of an RBIR system: image segmentation, image representation and image similarity measure. The RF strategies using QPM and SVM are described in Section 3 and Section 4 respectively. The region re-weighting scheme is presented in Section 5. In Section 6, we provide experimental results that evaluate all aspects of the learning schemes. Finally, we conclude in Section 7.

2 Region-Based Image Retrieval

2.1 Image Segmentation

The segmentation method we use is proposed in [9]. First, a criterion for the homogeneity of a certain pattern is proposed. Applying the criterion to local windows in the original image results in the H-image. The high and low values of the H-image correspond to possible region boundaries and region interiors respectively. Then, a region growing method is used to segment the image based on the H-image. Finally, visually similar regions are merged together to avoid over-segmentation.

2.2 Image Representation

Currently, the spatial relationship of regions is not considered, and an image is represented by a set of its regions. To describe a region, we use two properties: the features of the region and its importance weight. Two features are adopted. One is the color moment [14] and the other is the banded auto-correlogram [6]. For the former, we extract the first two moments from each channel of the CIE-LUV color space. For the latter, the HSV color space with inhomogeneous quantization into 36 colors [18] is adopted. Considering that the size of the regions may be small, we use b = d = 2 in the current computation [6]. Therefore, the resulting 36-dimensional feature suggests the local structure of colors. Since color moments measure the global information of colors, the two features are complementary and their combination lets them benefit from each other. The area percentage of a region is used as its importance weight in Sections 2, 3 and 4 temporarily. More satisfactory weighting methods are discussed in Section 5. The only requirement is that the sum of the importance weights of an image should be equal to 1.
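As an illustration of the region description above, the following minimal sketch computes a color-moment feature and area-percentage weights. It assumes that the "first two moments" are the per-channel mean and standard deviation, and that a region is given as a list of (L, U, V) pixel tuples; the function names `color_moments` and `area_weights` are hypothetical and not taken from the paper. The banded auto-correlogram is omitted.

```python
from math import sqrt


def color_moments(pixels):
    """Return [mean, std] for each of the three channels of a region.

    Assumption: the 'first two moments' are the mean and the standard
    deviation of each channel; pixels is a list of (L, U, V) tuples.
    """
    n = len(pixels)
    feature = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        feature.extend([mean, sqrt(variance)])
    return feature


def area_weights(region_sizes):
    """Area percentage of each region; the weights sum to 1."""
    total = sum(region_sizes)
    return [size / total for size in region_sizes]
```

For example, `area_weights([30, 10, 60])` gives weights close to 0.3, 0.1 and 0.6, which satisfies the requirement that the importance weights of an image sum to 1.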

2.3 Image Similarity Measure

Based on the image representation, the distance between two images is measured using the Earth Mover's Distance (EMD) [11]. EMD is based on the minimal cost that must be paid to transform one distribution into another. Considering that EMD matches perceptual similarity well and can operate on variable-length representations of the distributions, it is suitable as a region-based image similarity measure.

In this special case, a signature is an image with all the regions corresponding to clusters, and the ground distance is the L1 distance between the features of two regions. EMD incorporates the properties of all the segmented regions so that the information about an image can be fully utilized. By allowing many-to-many relationships between regions to be valid, EMD is robust to inaccurate segmentation.
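The distance described above can be sketched as a small transportation problem. The code below is only an illustration, not the implementation used in the experiments: it assumes that both images have importance weights summing to 1 (so the total flow is 1 and no normalization is needed), it uses SciPy's general-purpose `linprog` solver rather than a dedicated EMD solver, and the function name `emd` and its argument layout are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def emd(features_a, weights_a, features_b, weights_b):
    """EMD between two region signatures with an L1 ground distance.

    Assumption: weights_a and weights_b each sum to 1.
    """
    fa = np.asarray(features_a, dtype=float)
    fb = np.asarray(features_b, dtype=float)
    wa = np.asarray(weights_a, dtype=float)
    wb = np.asarray(weights_b, dtype=float)
    m, n = len(wa), len(wb)

    # Ground distance: L1 distance between every pair of region features.
    cost = np.abs(fa[:, None, :] - fb[None, :, :]).sum(axis=2)

    # Flow variables f[i, j] are flattened row-major (index i * n + j).
    a_eq = np.zeros((m + n, m * n))
    for i in range(m):
        a_eq[i, i * n:(i + 1) * n] = 1.0  # all flow leaving region i of A
    for j in range(n):
        a_eq[m + j, j::n] = 1.0           # all flow entering region j of B
    b_eq = np.concatenate([wa, wb])

    result = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq, bounds=(0, None))
    if not result.success:
        raise RuntimeError(result.message)
    return float(result.fun)
```

Because the flow may split one region across several regions of the other image, a region that was over-segmented in one image can still be matched at low cost, which is the many-to-many property mentioned above.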

3 Query Point Movement

3.1 The Optimal Query

Inspired by the query-point movement (QPM) method [12], a novel relevance feedback approach to region-based image retrieval is proposed [8]. The basic assumption is that every region could be helpful in retrieval. Based on this assumption, all the regions of both the initial query and the positive examples are assembled into a pseudo image, which is used as the optimal query at the next iteration of the retrieval and feedback process. The importance of the regions of the optimal query is normalized such that their sum is equal to 1. During the normalization, regions of the newly added positive examples, which reflect the user's latest query refinement more precisely, are emphasized by being given more importance. As more positive examples become available, the number of regions in the optimal query increases rapidly. Since the time required to calculate image similarity is proportional to the number of regions in the query, the retrieval speed will slow down gradually. To avoid this, regions that are similar in the feature space are merged into larger ones via clustering. This process is similar to region merging in an over-segmented image.

3.2 RF Using QPM

The RF process using QPM technique is summarized as follows.

The initial query is regarded as a positive example for the sake of simplicity. At the first iteration of feedback, all regions of the positive examples are assembled into a composite image, in which similar regions are grouped into clusters by the k-means algorithm. Regions within a cluster are merged into a new region. The feature of the new region is equal to the average feature of the individual regions, while its importance is set to the sum of the individual region importances divided by the number of positive examples. This composite image is used as the optimal query example.

In the following iterations, only the optimal query is treated as a prior positive example, with all other prior examples being ignored. That is, there is exactly one prior positive example, which is treated equally with the newly added positive examples. This implies that the importance of the prior positive examples gradually decays in the optimal query, and the importance of the newly added ones is emphasized accordingly.
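The merging step of the QPM procedure can be sketched as follows. This is a minimal illustration under stated assumptions: a plain Lloyd's k-means with squared Euclidean distance seeded by the first k regions (the paper does not specify the initialization or the number of clusters), and the hypothetical names `kmeans_labels` and `build_optimal_query`. Each positive example is a pair of region features and importance weights that sum to 1.

```python
import numpy as np


def kmeans_labels(points, k, iterations=20):
    """Plain Lloyd's k-means; seeding with the first k points is an assumption."""
    centers = points[:k].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iterations):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = points[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels


def build_optimal_query(positive_examples, k):
    """Merge the regions of all positive examples into one composite query.

    positive_examples: list of (features, weights) pairs, one per image.
    Returns the merged region features and their importance weights.
    """
    features = np.vstack([np.asarray(f, dtype=float) for f, _ in positive_examples])
    weights = np.concatenate([np.asarray(w, dtype=float) for _, w in positive_examples])
    labels = kmeans_labels(features, k)

    merged_features, merged_weights = [], []
    for c in range(k):
        mask = labels == c
        if not mask.any():
            continue  # skip clusters that ended up empty
        # New feature: average of the member regions' features.
        merged_features.append(features[mask].mean(axis=0))
        # New importance: sum of member importances / number of positive examples.
        merged_weights.append(weights[mask].sum() / len(positive_examples))
    return np.array(merged_features), np.array(merged_weights)
```

In later iterations, the previous optimal query would be passed as a single positive example alongside the newly marked images, so its share of the total importance shrinks with each round, as described above.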

4 SVM-Based RF

As a core machine learning technology, SVM has not only strong theoretical foundations but also excellent empirical success [5]. SVM has also been introduced into CBIR as a powerful RF tool, and performs fairly well in systems that use global representations [3][15][17].

Given the RF information, generally two kinds of learning can be done in order to boost performance. One is to estimate the distribution of the target images, while the other is to learn a boundary that separates the target images from the rest. For the former, the so-called one-class SVM was adopted [4]. A kernel-based one-class SVM as a density estimator for positive examples was shown in [4] to outperform the whitening-transform-based linear/quadratic method. For the latter, the typical form of SVM as a binary classifier is appropriate [15][17]. An SVM captures the query concept by separating the relevant images from the irrelevant images with a hyperplane in a projected space.

When SVM is used as a classifier in RF, there are two display strategies. One strategy is to display the most-positive (MP) images and use them as the training samples [17]. The MP images are chosen as the ones farthest from the boundary on the positive side, plus those nearest to the boundary on the negative side if necessary. The underlying assumption is that users are greedy and impatient and thus expect the best possible retrieval results after each feedback. It is also the strategy adopted by most early relevance feedback schemes. However, if we assume that users are cooperative, another strategy is more appropriate. In this strategy, both the MP images and the most-informative (MI) images are displayed. Additional user feedback, if any, is performed only on the MI images, while the MP images are shown as the final results. Tong and Chang [15] proposed an active learning algorithm that selects the samples that maximally reduce the size of the version space. Following the principle of maximal disagreement, the best strategy is to halve the version space each time. By taking advantage of the duality between the feature space and the parameter space, they showed that the points near the boundary can approximately achieve this goal. Therefore, the points near the boundary are used to approximate the MI points.

4.1 EMD-Based Kernel

Unlike global features, the region-based representations of images are of variable length, which means that neither the inner product nor the Lp norm is applicable. As a result, common kernels such as the polynomial kernel and the Gaussian kernel are inappropriate in this situation.

To resolve the issue, a generalization of Gaussian kernel is introduced:

$$k_{GGaussian}(x, y) = \exp\left(-\frac{d(x, y)}{2\sigma^2}\right) \qquad (1)$$

where d is a distance measure in the input space.

Since the distance measure here is EMD, a particular form of the generalized Gaussian kernel with d being EMD is considered. More specifically, the proposed kernel is:

$$k_{GEMD}(x, y) = \exp\left(-\frac{EMD(x, y)}{2\sigma^2}\right) \qquad (2)$$
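Given a matrix of pairwise image distances, the kernel values can be computed as in the following sketch. It assumes that the exponent has the form -d(x, y) / (2σ²) and that the distances (for example EMD values) were computed beforehand; the function name `generalized_gaussian_kernel` is hypothetical.

```python
import math


def generalized_gaussian_kernel(distance_matrix, sigma):
    """Turn pairwise distances into kernel values exp(-d / (2 * sigma^2)).

    Assumption: distance_matrix[i][j] holds the precomputed distance
    between image i and image j.
    """
    scale = 2.0 * sigma ** 2
    return [[math.exp(-d / scale) for d in row] for row in distance_matrix]
```

A matrix built this way could be handed to an SVM implementation that accepts precomputed kernels, such as scikit-learn's `SVC(kernel="precomputed")`, which avoids the need for fixed-length feature vectors.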

5 Region Re-weighting

Enlightened by the idea of feature re-weighting [12] and the TF*IDF (Term Frequency * Inverse Document Frequency) weighting in text retrieval, we designed an RF*IIF (Region Frequency * Inverse Image Frequency) weighting scheme. It uses users' feedback information to estimate the region importance of all positive images. The basic assumption is that important regions should appear more often in the positive images and less often in all the images of the database.

Before we go into details, we first introduce some notations and definitions that will be used to illustrate the region importance.

Two regions are deemed similar if the L1 distance between their feature vectors is less than a predefined threshold.

A region R and an image I are defined to be similar if at least one region of I is similar to R. We use s(R, I) to denote this relationship: s(R, I) = 1 if R is similar to I, while s(R, I) = 0 otherwise.

Assume that there are N images in total in the database, which are represented by $\{I_1, \ldots, I_N\}$. Also assume that we are calculating the region importance of I, which consists of regions $\{R_1, R_2, \ldots, R_n\}$.

I is actually one of the positive examples identified by a user in feedback. Note that the original query image is also considered to be a positive example. Let all the positive examples be $\{I_1^+, \ldots, I_k^+\}$.

For each region $R_i$, we define a measure of region frequency (RF), which reflects the extent to which it is consistent with other positive examples in the feature space. Intuitively, the larger the region frequency value, the more important this region is in representing the user's intention. The region frequency is defined in the following way.

$$RF(R_i) = \sum_{j=1}^{k} s(R_i, I_j^+) \qquad (3)$$

On the other hand, a region becomes less important if it is similar to many images in the database. To reflect the distinguishing ability of a region, we define a measure of inverse image frequency (IIF) for region $R_i$:

$$IIF(R_i) = \log\left(\frac{N}{\sum_{j=1}^{N} s(R_i, I_j)}\right) \qquad (4)$$

which is analogous to the IDF (inverse document frequency) in text retrieval.

Based on the above preparations, we now come to the definition of the region importance:

$$RI(R_i) = \frac{RF(R_i) * IIF(R_i)}{\sum_{j=1}^{n} \left(RF(R_j) * IIF(R_j)\right)} \qquad (5)$$

Basically, the importance of a region is its region frequency weighted by the inverse image frequency, and normalized over all regions in an image such that the sum of all region importance weights is equal to 1.
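The RF*IIF definitions can be put together in a short sketch. It assumes that an image is a list of region feature vectors, that the image being weighted is itself in the database (so the sum inside the logarithm is at least 1), and that the threshold is supplied by the caller. The function names are hypothetical, and the uniform fallback for an all-zero score is an added safeguard that is not part of the scheme as described.

```python
import math


def l1(a, b):
    """L1 distance between two region feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def similar(region, image, threshold):
    """s(R, I): 1 if some region of the image is within the threshold of R."""
    return 1 if any(l1(region, other) < threshold for other in image) else 0


def region_importance(image, positives, database, threshold):
    """RI of every region of `image`: RF * IIF, normalized to sum to 1."""
    scores = []
    for region in image:
        rf = sum(similar(region, pos, threshold) for pos in positives)
        matches = sum(similar(region, img, threshold) for img in database)
        iif = math.log(len(database) / matches)  # assumes matches >= 1
        scores.append(rf * iif)
    total = sum(scores)
    if total == 0:
        # Fallback (an assumption, not from the paper): uniform weights.
        return [1.0 / len(image)] * len(image)
    return [score / total for score in scores]
```

For a query image whose first region recurs in the other positive example and whose second region also appears in many irrelevant images, the first region receives the larger weight, which matches the intuition behind the scheme.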

Since the region importance is similar for “common” users, it can be accumulated for future use. More specifically, given a region $R_i$, its cumulative region importance (CRI) after l (l > 0) updates is defined as:

$$CRI(R_i, l) = \frac{(l - 1) * CRI(R_i, l - 1) + RI(R_i)}{l} \qquad (6)$$

where $RI(R_i)$ is the latest RI of $R_i$ calculated from formula (5) and $CRI(R_i, 0)$ is initialized to be the area percentage (AP) of $R_i$. Note that once the RI of a region is learned, its CRI no longer depends on its AP, because the factor (l - 1) is zero at the first update.
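The cumulative update is a running average over the RI values observed so far, which the following sketch makes explicit. It assumes the update has the form ((l - 1) * CRI(l - 1) + RI) / l; the function name `update_cri` is hypothetical.

```python
def update_cri(previous_cri, latest_ri, update_index):
    """Cumulative region importance after the update_index-th update.

    previous_cri is CRI(R, l - 1); for l = 1 it is the area percentage,
    which is multiplied by zero and therefore drops out.
    """
    if update_index < 1:
        raise ValueError("update_index must be at least 1")
    return ((update_index - 1) * previous_cri + latest_ri) / update_index
```

Starting from an area percentage of 0.3, a first learned RI of 0.8 gives a CRI of 0.8, and a second RI of 0.6 gives a CRI close to 0.7, the mean of the two learned values.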
