AN IMAGE SIGNATURE FOR ANY KIND OF IMAGE
H.Chi Wong,Marshall Bern,and David Goldberg
Xerox Palo Alto Research Center
3333 Coyote Hill Rd., Palo Alto, CA 94304
ABSTRACT
We describe an algorithm for computing an image signature, suitable for first-stage screening for duplicate images. Our signature relies on relative brightness of image regions, and is generally applicable to photographs, text documents, and line art. We give experimental results on the sensitivity and robustness of signatures for actual image collections, and also results on the robustness of signatures under transformations such as resizing, rescanning, and compression.
1. BACKGROUND AND MOTIVATION
Massive image databases are becoming increasingly common. Examples include document image databases such as declassified government documents [8], and photo archives such as the New York Times archive. Duplicate removal offers space and bandwidth savings, and more user-friendly search results. Despite some effort to cull duplicates, the image search service of Google [4] often retrieves a number of duplicate and near-duplicate images. Duplicate detection also finds application in copyright protection and authentication.
We believe that duplicate detection is most effectively solved in two distinct stages. A fast first stage reduces images to small signatures, with the property that signatures of two different versions of the same image have small vector distance relative to the typical distance between signatures of distinct images. A slow second stage then makes a detailed comparison of candidate duplicates identified in the first stage. Detailed comparison of document images can identify changes as small as moving a decimal point [16].
In this paper we give a fast and simple algorithm for the first stage. Our image signature encodes the relative brightness of different regions of the image; it can be applied quite generally to text document images, line art (such as cartoons), and continuous-tone images. Although there are a number of image signature schemes already in the literature, there is no one signature that applies to such a wide class of images. The main limitation of our signature is that it is not designed to handle large amounts of cropping or rotation. This design choice is appropriate for document image databases and Web search, but not for object recognition or for detecting copyright violations.
2. PREVIOUS WORK
Image signatures have already been used to address three different, but closely related, problems: similarity search, authentication, and duplicate detection. The three problems have slightly different characteristics that affect the design of the signature. Signatures for similarity search [10] (for example, color histograms) must preserve similarity, not just identity, and hence do not necessarily spread out non-duplicates very effectively. Signatures for authentication [1, 11, 13, 15] must be more sensitive, but can be much slower, than signatures for the first stage of duplicate detection.
Nevertheless, the techniques developed for search and authentication can be harnessed for duplicate detection. Our work adapts a method by O'Gorman and Rabinovich [11] for computing a signature for ID card photos. Their method places a grid of points on the ID card photo and computes a vector of 8 bits for each point in the grid; roughly speaking, each bit records whether a neighborhood of that point is darker or lighter than a neighborhood of its 8 diagonal or orthogonal neighbors. In the usage scenario, the image signature computed from the photo is compared with a reference signature written on the ID card and digitally signed with public-key cryptography.
Another technique worth considering lets the image content dictate the neighborhoods used in the signature [1, 14]. Schmid and Mohr [14] use a corner-point detector to define "interesting points" in the image. The signature then includes a statistical abstract of the neighborhood of each interesting point. Compared to grid-based methods such as ours, this approach has the advantage that it can handle large amounts of cropping, and if the statistics at each interesting point are rotation invariant, it can also handle arbitrary amounts of rotation. On the other hand, it seems to take several hundred interesting points to obtain a reliable signature, and this approach is likely to break down entirely for text documents or line art images, which may contain thousands of interesting points with very similar statistics.
Also available in the literature are specialized signature methods for document images, using optical character recognition [5, 8] or document-specific features such as paragraph layout, line lengths, and letter shapes [2, 3]. We expect that these specialized methods are more accurate than our own generic method. Our method, however, should perform quite well for first-stage filtering, reducing the number of possible duplicate pairs by a factor of hundreds or thousands, whereupon still more accurate and detailed full-image matching algorithms [9, 12, 16] can be used for the slow second-stage matching.
Fig. 1. "Naturally occurring" duplicates: two images of the Mona Lisa downloaded from the Web.
3. THE SIGNATURE
Ideally, we would like a signature that is small enough to allow efficient nearest-neighbor search, sensitive enough to effectively filter a database for possible duplicates, and yet robust enough to find duplicates that have been resized, rescanned, or lossily compressed. Figure 1 shows black-and-white versions of a typical duplicate pair. These are black-and-white versions of two different JPEG-compressed color images of the Mona Lisa downloaded from the Web. The images have different sizes, gray values, and cropping (look at the fingers).
We now describe the steps of our algorithm in detail. Although inspired by the signature algorithm of O'Gorman et al. [11], our algorithm differs at almost every step. For example, we added something to handle mild cropping, and expanded O'Gorman's two-level relative darkness values, simply "darker" or "lighter", to five levels to give greater sensitivity and robustness.
Step 1. If the image is color, we first convert it to 8-bit gray-scale using the standard color-conversion algorithms included in djpeg and ppmtopgm. Pure white is represented by 255 and pure black by 0.
Step 2. We next impose a 9x9 grid of points on the image. (For large databases, a bigger grid would give greater first-stage filtering.) We define the grid in a way that is robust to mild cropping, under the assumption that such cropping usually removes relatively featureless parts of the image, for example, the margins of a document image or the dark bottom of the Mona Lisa picture. For each column of the image, we compute the sum of absolute values of differences between adjacent pixels in that column. We compute the total of all columns, and crop the image at the 5% and 95% columns, that is, the columns such that 5% of the total sum of differences lies on either side of the cropped image. We crop the rows of the image the same way (using the sums of original uncropped rows).
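The cropping in Step 2 can be sketched as follows. The helper names and the exact boundary convention (where the 5% and 95% cutoffs fall) are our assumptions:

```python
import numpy as np

def crop_limits(diffs, lo=0.05, hi=0.95):
    """Given per-column (or per-row) sums of absolute adjacent-pixel
    differences, return a slice [a, b) such that roughly 5% of the
    total difference mass lies on either side."""
    c = np.cumsum(diffs) / diffs.sum()
    a = int(np.searchsorted(c, lo))
    b = int(np.searchsorted(c, hi)) + 1
    return a, b

def crop(gray):
    """Crop a 2-D gray-scale array at the 5% and 95% columns and rows,
    per Step 2 (both axes computed on the original uncropped image)."""
    # Per-column sums of |differences between vertically adjacent pixels|.
    col_diffs = np.abs(np.diff(gray.astype(int), axis=0)).sum(axis=0)
    # Per-row sums of |differences between horizontally adjacent pixels|.
    row_diffs = np.abs(np.diff(gray.astype(int), axis=1)).sum(axis=1)
    c0, c1 = crop_limits(col_diffs)
    r0, r1 = crop_limits(row_diffs)
    return gray[r0:r1, c0:c1]
```

On an image whose activity is concentrated in a central block surrounded by flat margins, this keeps essentially just the central block, as the step intends.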
Conceptually, we then divide the cropped image into a 10x10 grid of blocks. We round each interior grid point to the closest pixel (that is, integer coordinates), thereby setting a 9x9 grid of points on the image.
Step 3. At each grid point, we compute the average gray level of the P x P square centered at the grid point. We ran our experiments with P = max(2, round(min(n, m)/20)), where n and m are the dimensions of the image in pixels. The squares are slightly soft-edged, meaning that instead of using the pixels' gray levels themselves, we use an average of a small block centered at that pixel. Figure 2 shows the grid of squares for half-size versions of each of the images from Figure 1.
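A sketch of Step 3 (without the soft edges), assuming the 9x9 grid of Step 2 and our reading of the square size P; the function names are ours:

```python
import numpy as np

def grid_points(shape, n=9):
    """Interior points of an (n+1) x (n+1) subdivision of the image,
    rounded to integer pixel coordinates (Step 2 sketch)."""
    rows, cols = shape
    xs = [int(round(rows * i / (n + 1))) for i in range(1, n + 1)]
    ys = [int(round(cols * j / (n + 1))) for j in range(1, n + 1)]
    return xs, ys

def square_means(gray, n=9):
    """Average gray level of the P x P square around each grid point,
    with P = max(2, round(min(rows, cols) / 20)); hard-edged squares,
    omitting the paper's slight softening."""
    rows, cols = gray.shape
    p = max(2, int(round(min(rows, cols) / 20)))
    xs, ys = grid_points(gray.shape, n)
    means = np.empty((n, n))
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            x0, y0 = max(0, x - p // 2), max(0, y - p // 2)
            means[i, j] = gray[x0:x0 + p, y0:y0 + p].mean()
    return means
```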
Step 4. For each grid point, we compute an 8-element array whose elements give a comparison of the average gray level of the grid point's square with those of its eight neighbors. The result of a comparison can be "much darker", "darker", "same", "lighter", or "much lighter", represented numerically as -2, -1, 0, 1, and 2, respectively. The "same" values are those averages that differ by no more than 2 on a scale of 0 to 255. We set the boundary between "much darker" and "darker" so that the two values are equally popular; we do the same for "lighter" and "much lighter". The rationale in this step is that "same" may be very common in images with flat backgrounds (such as text documents), and hence it should not be included in the histogram equalization applied to the other values. Grid points in the first or last rows or columns have fewer than 8 neighbors; for simplicity we fill in the missing comparisons (there are 104 of them) in the array with zeros.
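Step 4's five-level quantization might be implemented as follows. Interpreting "equally popular" as splitting the non-"same" differences at their medians is our assumption, as is the function name:

```python
import numpy as np

def five_levels(diffs):
    """Map neighbor gray-level differences to {-2, -1, 0, 1, 2}.
    'Same' is |d| <= 2 on a 0-255 scale; the darker / much-darker
    boundary is the median of the remaining negative differences,
    and symmetrically for the lighter side."""
    d = np.asarray(diffs, dtype=float)
    out = np.zeros(d.shape, dtype=int)          # 'same' by default
    neg, pos = d[d < -2], d[d > 2]
    neg_cut = np.median(neg) if neg.size else -2
    pos_cut = np.median(pos) if pos.size else 2
    out[(d < -2) & (d >= neg_cut)] = -1         # darker
    out[d < neg_cut] = -2                       # much darker
    out[(d > 2) & (d <= pos_cut)] = 1           # lighter
    out[d > pos_cut] = 2                        # much lighter
    return out

print(five_levels([-100, -10, -3, 0, 3, 10, 100]))
# -> [-2 -1 -1  0  1  1  2]
```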
Step 5. The signature of an image is simply the concatenation of the 8-element arrays corresponding to the grid points, ordered left-to-right, top-to-bottom. Our signatures are thus vectors of length 648. We store them in 648-byte arrays, but because some of the entries for the first and last rows and columns are known to be zeros and because each byte is used to hold only one of 5 values, signatures could be represented by as few as 544 log2 5, or about 1263, bits.
Fig. 2. The grid squares for the Mona Lisa images.
4. NEAREST NEIGHBORS
The normalized distance between two image signatures b1 and b2 is ||b1 - b2|| / (||b1|| + ||b2||), where ||.|| represents the L2 norm of a vector, that is, its Euclidean length. Relative to the L1 norm (or Hamming distance, as used in [11]), the L2 norm gives greater emphasis to larger differences; this is appropriate because a difference of 1 at a coordinate may be due to roundoff, but a difference of 2 is meaningful.
It is necessary to normalize the distance, rather than just using ||b1 - b2||, in order to compare the distance against a fixed threshold. A threshold of about 0.6 seems to be a good choice, giving reasonable robustness to image transformations without admitting very many false matches. For text documents, a slightly lower threshold is better, because their signatures usually have many more zeros than do the signatures of continuous-tone images. An easy fix, which allows the use of a single threshold for any type of image, is to change the basic arithmetic: count the difference between a 0 and a 2 (or a -2) in ||b1 - b2|| as 3. We used this fix in the experiments reported below.
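The normalized distance is straightforward to compute; the function name is ours, and this sketch omits the 0-versus-2 arithmetic fix described above:

```python
import numpy as np

def normalized_distance(b1, b2):
    """||b1 - b2|| / (||b1|| + ||b2||): the L2-normalized distance
    between two signature vectors, as in Section 4."""
    b1, b2 = np.asarray(b1, float), np.asarray(b2, float)
    denom = np.linalg.norm(b1) + np.linalg.norm(b2)
    return np.linalg.norm(b1 - b2) / denom if denom else 0.0

print(normalized_distance([1, 0, -2], [1, 0, -2]))   # identical -> 0.0
print(normalized_distance([2, 2, 2], [-2, -2, -2]))  # opposite -> 1.0
```

Note that the normalization bounds the distance by 1 (it equals 1 exactly when b2 = -b1), which is what makes comparison against a single fixed threshold meaningful.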
Given a query signature q, we now need a way to find all signatures s in a database within the threshold normalized distance of q. The usual way to accomplish this task involves chopping up the signatures into (possibly non-contiguous and overlapping) words of k bytes, and finding all s's that exactly match q on some word. For each of the N word positions, an index table points to all the signatures with each given word. The parameters N and k are chosen so that any s within the threshold distance of q is sure to match on at least one word, and so that the space requirement of the index tables is not too large. This method solves the problem of nearest-neighbor search in high dimensions by finding exact matches in a number of lower-dimensional projections. There are also more sophisticated algorithms and analyses of nearest-neighbor search in high dimensions [6, 7].
In our case, a normalized distance within the threshold still allows s and q to differ at every byte, so there is no good choice of word length k. Therefore we propose that in the index tables, we lump together -2 and -1, that is, represent them both by -1, and similarly lump together 1 and 2. The table entries, although indexed by words over 3 possible letters, still point to the more precise 5-letter signatures. Now a reasonable choice would be N = 100 words of k = 10 letters each. A signature s within the threshold distance of the query is very likely to match on some word: if each letter of the lumped version of s has probability p of matching the corresponding letter in the lumped version of q, then each word has probability p^k of matching, and hence the chance that at least one of the 100 words matches is 1 - (1 - p^k)^100, which for the letter-match probabilities typical of true duplicates is greater than .993. A random signature s, however, is unlikely to match on some word: if each letter of lumped s has probability about 1/2 of matching the corresponding letter in lumped q, then each word has probability about 2^-10, roughly .001, of matching, and hence the chance that at least one of the 100 words matches is about 1 - (1 - .001)^100, or roughly .1. Hence only about 10% of the database signatures need to have their actual normalized distances to q computed. Because the computation of normalized distances is very fast, this should be quite tolerable for a million-image database. Larger values of N would give greater discriminating power but use more memory space. As N grows larger, signatures within the threshold would tend to have many more word matches than random signatures.
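The lumped word index can be sketched as follows. Choosing the N = 100 word positions of k = 10 letters uniformly at random (with a fixed seed, so every signature is chopped the same way) is our assumption, as are all the helper names:

```python
from collections import defaultdict
import numpy as np

def lump(sig):
    """Collapse {-2, -1} to -1 and {1, 2} to 1, giving a 3-letter
    alphabet for indexing (the index still points to full signatures)."""
    return np.sign(sig).astype(int)

def make_words(sig, n_words=100, word_len=10, seed=0):
    """Extract N words of k letters at fixed, possibly overlapping
    positions; the fixed seed makes the positions the same for
    every signature in the database."""
    rng = np.random.default_rng(seed)
    positions = [rng.choice(len(sig), size=word_len, replace=False)
                 for _ in range(n_words)]
    lumped = lump(np.asarray(sig))
    return [tuple(lumped[p]) for p in positions]

def build_index(signatures):
    """index[(word_position, word)] -> list of signature ids."""
    index = defaultdict(list)
    for sid, sig in enumerate(signatures):
        for pos, word in enumerate(make_words(sig)):
            index[(pos, word)].append(sid)
    return index

def candidates(index, query):
    """All signature ids that agree with the query on at least one word;
    only these need their normalized distance to the query computed."""
    hits = set()
    for pos, word in enumerate(make_words(query)):
        hits.update(index.get((pos, word), ()))
    return hits
```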
Notice that with this method for nearest-neighbor searching, the overall image duplicate detection process becomes a three-level scheme. The indexing on words finds candidate signature pairs, which in turn lead to candidate image pairs.
5. RESULTS
We tested the robustness (the "recall") of our algorithm on both synthetic and natural data. The synthetic data consist of a set of original images (600 photo CD images, belonging to 6 themes, such as Michelangelo, transportation, and cityscape, and 170 document images, pages from a Ph.D. thesis) and their synthetic duplicates, generated by rotating, resizing, and compressing/decompressing the originals. The bar charts below show the robustness of our algorithm in finding synthetic duplicates. The natural data consist of 10 sets of images downloaded from the Web, each containing a number of visually (close-to-)identical images of different sizes, with different amounts of cropping, and some even with slight color differences. Our algorithm handles the naturally occurring duplicates well, failing only in cases where there is a significant amount of cropping or color difference.
[Bar charts: percentage of duplicates found under rotation (in degrees) and resizing (scale factors 0.65 to 2.0), for the photo and document images.]
We also tested the sensitivity (the "precision") of our algorithm by running it on our original set of photo CD images and document images. The table below shows the average number of false positives returned for each type of image; for the document images, for example, the average was 4.05 (in a total of 170).
As future work, we plan to conduct studies comparing our method with other existing methods.
Acknowledgments. We would like to thank Paul Stewart and Russel Atkinson for their help with software tools.
6. REFERENCES
[1] S. Bhattacharjee and M. Kutter. Compression tolerant image authentication. Proc. ICIP, Vol. 1, pp. 435-439, 1998.
[2] V. Chalana, A. G. Bruce, and T. Nguyen. Duplicate document detection in DocBrowse. Proc. SPIE Document Recognition and Retrieval Conference, 1999.
[3] D. Doermann, H. Li, and O. Kia. The detection of duplicates in document image databases. Proc. Int. Conf. Document Analysis and Recognition, pp. 314-318, 1997. documents.cfar.umd.edu/LAMP/
[4] Google Advanced Image Search. google.com
[5] J. Hull. Document image similarity and equivalence detection. Pattern Recognition Letters, 2001, 545-550.
[6] P. Indyk and R. Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality. ACM Symp. Theory of Computing, 1999.
[7] P. Indyk, R. Motwani, P. Raghavan, and S. Vempala. Locality-preserving hashing in multidimensional spaces. ACM Symp. Theory of Computing, 1997.
[8] D. Lopresti. A comparison of text-based methods for detecting duplication in document image databases. Proc. Document Recognition and Retrieval VI, SPIE Electronic Imaging, Vol. 3967, pp. 210-221, 2000. /org/1133/Research/DocumentAnalysis/dup.html
[9] J. Mao and K. Mohiuddin. Form dropout using distance transformation. Proc. ICIP, Vol. 3, pp. 328-331, 1995.
[10] W. Niblack, R. Barber, and W. Equitz. The QBIC project: querying images by content using color, texture, and shape. IBM Research Report #RJ 9203, February 1993.
[11] L. O'Gorman and I. Rabinovich. Secure identification documents via pattern recognition and public-key cryptography. IEEE Trans. Pattern Analysis and Machine Intelligence 20 (1998), 1097-1102.
[12] H. Peng, F. Long, W.-C. Siu, Z. Chi, and D. Feng. Document image matching based on component blocks. Proc. ICIP, Vol. 2, pp. 601-604, 2000.
[13] M. Ruhl, M. Bern, and D. Goldberg. Secure notarization of paper text documents. Proc. 12th SIAM Symp. Disc. Algorithms, 2001.
[14] C. Schmid and R. Mohr. Image retrieval using local characterization. Proc. ICIP, pp. 781-784, 1996.
[15] R. Venkatesan, S.-M. Koon, M. H. Jakubowski, and P. Moulin. Robust image hashing. Proc. ICIP, 2000.
[16] M. Ye, M. Bern, and D. Goldberg. Document image matching and annotation lifting. Proc. Int. Conf. Document Analysis and Recognition, 2001.
