Linear Regression for Face Recognition
Imran Naseem, Roberto Togneri, Senior Member, IEEE, and Mohammed Bennamoun
Abstract—In this paper, we present a novel approach for face identification by formulating the pattern recognition problem in terms of linear regression. Using the fundamental concept that patterns from a single-object class lie on a linear subspace, we develop a linear model representing a probe image as a linear combination of class-specific galleries. The inverse problem is solved using the least-squares method and the decision is ruled in favor of the class with the minimum reconstruction error. The proposed Linear Regression Classification (LRC) algorithm falls in the category of nearest subspace classification. The algorithm is extensively evaluated on several standard databases under a number of exemplary evaluation protocols reported in the face recognition literature. A comparative study with state-of-the-art algorithms clearly reflects the efficacy of the proposed approach. For the problem of contiguous occlusion, we propose a Modular LRC approach, introducing a novel Distance-based Evidence Fusion (DEF) algorithm. The proposed methodology achieves the best results ever reported for the challenging problem of scarf occlusion.
Index Terms—Face recognition, linear regression, nearest subspace classification.
1 INTRODUCTION
Face recognition systems are known to be critically dependent on manifold learning methods. A gray-scale face image of order a×b can be represented as an ab-dimensional vector in the original image space. However, any attempt at recognition in such a high-dimensional space is vulnerable to a variety of issues often referred to as the curse of dimensionality. Therefore, at the feature extraction stage, images are transformed to low-dimensional vectors in the face space. The main objective is to find a basis function for this transformation which could distinguishably represent faces in the face space. A number of approaches have been reported in the literature, such as Principal Component Analysis (PCA) [1], [2], Linear Discriminant Analysis (LDA) [3], and Independent Component Analysis (ICA) [4], [5]. Primarily, these approaches are classified into two categories, i.e., reconstructive and discriminative methods. Reconstructive approaches (such as PCA and ICA) are reported to be robust for the problem of contaminated pixels [6], whereas discriminative approaches (such as LDA) are known to yield better results in clean conditions [7]. Apart from the traditional approaches, it has been shown recently that unorthodox features, such as downsampled images and random projections, can serve equally well. In fact, the choice of the feature space may no longer be so critical [8]. What really matters is the dimensionality of the feature space and the design of the classifier.
In this research, we propose a fairly simple but efficient linear regression-based classification (LRC) for the problem of face identification. Samples from a specific object class are known to lie on a linear subspace [3], [9]. We use this concept to develop class-specific models of the registered users simply using the downsampled gallery images, thereby defining the task of face recognition as a problem of linear regression. Least-squares estimation is used to estimate the vectors of parameters for a given probe against all class models. Finally, the decision rules in favor of the class with the most precise estimation. The proposed classifier can be categorized as a Nearest Subspace (NS) approach.
An important relevant work is presented in [8], where downsampled images from all classes are used to develop a dictionary matrix during the training session. Each probe image is represented as a linear combination of all gallery images, thereby resulting in an ill-conditioned inverse problem. With the latest research in compressive sensing and sparse representation, sparsity of the vector of coefficients is harnessed to solve the ill-conditioned problem using l1-norm minimization. In [10], the concept of Locally Linear Regression (LLR) is introduced specifically to tackle the problem of pose. The main thrust of the research is to indicate an approximate linear mapping between a nonfrontal face image and its frontal counterpart; the estimation of the linear mapping is further formulated as a prediction problem with a regression-based solution. For the case of severe pose variations, the nonfrontal image is sampled to obtain many overlapped local segments. Linear regression is applied to each small patch to predict the corresponding virtual frontal patch; the LLR approach has shown some good results in the presence of coarse alignment. In [11], a two-step approach has been adopted, fusing the concepts of wavelet decomposition and discriminant analysis to design a sophisticated feature extraction stage. The discriminant features are used to develop feature planes (for the Nearest Feature Plane—NFP classifier) and feature spaces (for the Nearest Feature Space—NFS classifier). The query image is projected onto the subspaces and the decision rules in favor of the subspace with the minimum distance. However, the proposed LRC approach, for the first time, simply uses the downsampled images in combination with linear regression classification to achieve superior results compared to the benchmark techniques.
Further, for the problem of severe contiguous occlusion, a modular representation of images is expected to solve the problem [12]. Based on this concept, we propose an efficient Modular LRC approach. The proposed approach segments a given occluded image and reaches individual decisions for each block. These intermediate decisions are combined using a novel Distance-based Evidence Fusion (DEF) algorithm to reach the final decision. The proposed DEF algorithm uses the distance metrics of the intermediate decisions to decide about the "goodness" of a partition. There are two major advantages of using the DEF approach. First, the nonface partitions are rejected dynamically; therefore, they do not take part in the final decision making. Second, the overall recognition performance is better than the best individual result of the combining partitions due to the efficient decision fusion of the face segments.
Algorithm: Linear Regression Classification (LRC)

Inputs: Class models $X_i \in \mathbb{R}^{q \times p_i}$, $i = 1, 2, \ldots, N$, and a test image vector $y \in \mathbb{R}^{q \times 1}$.
Output: Class of $y$

1. $\hat{\beta}_i \in \mathbb{R}^{p_i \times 1}$ is evaluated against each class model: $\hat{\beta}_i = (X_i^T X_i)^{-1} X_i^T y$, $i = 1, 2, \ldots, N$.
2. $\hat{y}_i$ is computed for each $\hat{\beta}_i$: $\hat{y}_i = X_i \hat{\beta}_i$, $i = 1, 2, \ldots, N$.
3. Distance calculation between original and predicted response variables: $d_i(y) = \|y - \hat{y}_i\|_2$, $i = 1, 2, \ldots, N$.
4. Decision is made in favor of the class with the minimum distance $d_i(y)$.
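The four steps above can be sketched in a few lines of NumPy (a minimal illustration, not the authors' implementation; the function name `lrc_classify` is ours, and `np.linalg.lstsq` is used in place of the explicit normal equations of step 1 for numerical robustness — the two coincide when each $X_i$ has full column rank):

```python
import numpy as np

def lrc_classify(class_models, y):
    """Linear Regression Classification: return the index of the class
    whose subspace reconstructs the probe vector y with minimum error.

    class_models: list of (q, p_i) arrays, one regressor matrix per class.
    y: probe image vector of shape (q,).
    """
    distances = []
    for X in class_models:
        # Step 1: least-squares estimate of the parameter vector beta_i
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        # Step 2: predicted response vector (projection of y onto span(X))
        y_hat = X @ beta_hat
        # Step 3: Euclidean distance between original and predicted responses
        distances.append(np.linalg.norm(y - y_hat))
    # Step 4: decide in favor of the minimum-distance class
    return int(np.argmin(distances))
```

For example, with two toy class models spanning disjoint coordinate subspaces, a probe lying in the first subspace is reconstructed exactly by the first model and assigned to class 0.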
The rest of the paper is organized as follows: In Section 2, the proposed LRC and Modular LRC algorithms are described. This is
. I. Naseem and R. Togneri are with the School of Electrical, Electronic and Computer Engineering, M018, The University of Western Australia, 35 Stirling Highway, Crawley, Western Australia 6009, Australia.
E-mail: {imran.naseem, roberto}@ee.uwa.edu.au.
. M. Bennamoun is with the School of Computer Science and Software Engineering, The University of Western Australia, 35 Stirling Highway, Crawley, Western Australia 6009, Australia.
E-mail: m.bennamoun@cs.uwa.edu.au.
Manuscript received 10 Oct. 2008; revised 13 Jan. 2009; accepted 11 July 2009; published online 1 July 2010.
Recommended for acceptance by S. Li.
For information on obtaining reprints of this article, please send e-mail to: tpami@computer.org, and reference IEEECS Log Number TPAMI-2008-10-0684.
Digital Object Identifier no. 10.1109/TPAMI.2010.128.
0162-8828/10/$26.00 © 2010 IEEE. Published by the IEEE Computer Society.
followed by extensive experiments using standard databases under a variety of evaluation protocols in Section 3. The paper concludes in Section 4.
2 LINEAR REGRESSION FOR FACE RECOGNITION

2.1 Linear Regression Classification Algorithm
Let there be $N$ distinguished classes with $p_i$ training images from the $i$th class, $i = 1, 2, \ldots, N$. Each gray-scale training image is of order $a \times b$ and is represented as $u_i^{(m)} \in \mathbb{R}^{a \times b}$, $i = 1, 2, \ldots, N$, $m = 1, 2, \ldots, p_i$. Each gallery image is downsampled to order $c \times d$ and transformed to a vector through column concatenation such that $u_i^{(m)} \in \mathbb{R}^{a \times b} \rightarrow w_i^{(m)} \in \mathbb{R}^{q \times 1}$, where $q = cd$, $cd < ab$. Each image vector is normalized so that the maximum pixel value is 1. Using the concept that patterns from the same class lie on a linear subspace [9], we develop a class-specific model $X_i$ by stacking the $q$-dimensional image vectors:

$$X_i = \big[\, w_i^{(1)} \;\; w_i^{(2)} \;\; \cdots \;\; w_i^{(p_i)} \,\big] \in \mathbb{R}^{q \times p_i}, \quad i = 1, 2, \ldots, N. \quad (1)$$
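The gallery preprocessing described above — downsample, column-concatenate, normalize, stack into $X_i$ — can be sketched as follows (an illustrative stand-in: the function name `build_class_model` is ours, and the crude grid subsampling below is a placeholder for whatever image-resizing scheme is actually used):

```python
import numpy as np

def build_class_model(images, out_shape=(10, 5)):
    """Stack downsampled, normalized gallery images of one class into the
    regressor matrix X_i of shape (q, p_i), where q = c*d.

    images: list of 2-D gray-scale arrays (one per gallery image).
    out_shape: target downsampled size (c, d).
    """
    c, d = out_shape
    columns = []
    for img in images:
        a, b = img.shape
        # crude downsample: sample a regular c-by-d grid of pixels
        row_idx = np.linspace(0, a - 1, c).astype(int)
        col_idx = np.linspace(0, b - 1, d).astype(int)
        small = img[np.ix_(row_idx, col_idx)].astype(float)
        w = small.flatten(order="F")   # column concatenation, shape (q,)
        w /= w.max()                   # maximum pixel value becomes 1
        columns.append(w)
    return np.column_stack(columns)    # X_i, shape (q, p_i)
```

With the paper's AT&T settings (112×92 images downsampled to 10×5), each class model would be a 50×$p_i$ matrix.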
Each vector $w_i^{(m)}$, $m = 1, 2, \ldots, p_i$, spans a subspace of $\mathbb{R}^q$, also called the column space of $X_i$. Therefore, at the training level, each class $i$ is represented by a vector subspace $X_i$, which is also called the regressor or predictor for class $i$. Let $z$ be an unlabeled test image; our problem is to classify $z$ as one of the classes $i = 1, 2, \ldots, N$. We transform and normalize the gray-scale image $z$ to an image vector $y \in \mathbb{R}^{q \times 1}$ as discussed for the gallery. If $y$ belongs to the $i$th class, it should be represented as a linear combination of the training images from the same class (lying in the same subspace), i.e.,

$$y = X_i \beta_i, \quad i = 1, 2, \ldots, N, \quad (2)$$
where $\beta_i \in \mathbb{R}^{p_i \times 1}$ is the vector of parameters. Given that $q \geq p_i$, the system of equations in (2) is well conditioned and $\beta_i$ can be estimated using least-squares estimation [13], [14], [15]:

$$\hat{\beta}_i = \big(X_i^T X_i\big)^{-1} X_i^T y. \quad (3)$$
The estimated vector of parameters, $\hat{\beta}_i$, along with the predictors $X_i$, are used to predict the response vector for each class $i$:

$$\hat{y}_i = X_i \hat{\beta}_i = X_i \big(X_i^T X_i\big)^{-1} X_i^T y = H_i y, \quad i = 1, 2, \ldots, N, \quad (4)$$
where the predicted vector $\hat{y}_i \in \mathbb{R}^{q \times 1}$ is the projection of $y$ onto the $i$th subspace. In other words, $\hat{y}_i$ is the closest vector, in the $i$th subspace, to the observation vector $y$ in the Euclidean sense [16]. $H_i$ is called a hat matrix since it maps $y$ into $\hat{y}_i$. We now calculate the distance measure between the predicted response vector $\hat{y}_i$, $i = 1, 2, \ldots, N$, and the original response vector $y$,

$$d_i(y) = \|y - \hat{y}_i\|_2, \quad i = 1, 2, \ldots, N, \quad (5)$$

and rule in favor of the class with minimum distance, i.e.,

$$\min_i \, d_i(y), \quad i = 1, 2, \ldots, N. \quad (6)$$
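The claim that $\hat{y}_i$ is the closest point of the subspace to $y$ follows from $H_i$ being an orthogonal projector: it is symmetric and idempotent, and the residual $y - \hat{y}_i$ is orthogonal to the column space of $X_i$. A quick numerical check (illustrative only, with a random full-rank class model):

```python
import numpy as np

rng = np.random.default_rng(0)
q, p = 8, 3
X = rng.standard_normal((q, p))           # a class model with q >= p
H = X @ np.linalg.inv(X.T @ X) @ X.T      # hat matrix of eq. (4)

# An orthogonal projector is symmetric and idempotent.
assert np.allclose(H, H.T)
assert np.allclose(H @ H, H)

y = rng.standard_normal(q)
y_hat = H @ y
# The residual y - y_hat is orthogonal to span(X), which is exactly why
# y_hat is the closest vector in the subspace to y (eq. (5) then measures
# the length of this residual).
assert np.allclose(X.T @ (y - y_hat), 0.0)
```

This orthogonality is the geometric content behind using the reconstruction error $d_i(y)$ as a distance-to-subspace measure.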
2.2 Modular Approach for the LRC Algorithm
The problem of identifying partially occluded faces can be efficiently dealt with using the modular representation approach [12]. Contiguous occlusion can safely be assumed local in nature in the sense that it corrupts only a portion of conterminous pixels of the image, the amount of contamination being unknown. In the modular approach, we utilize the neighborhood property of the contaminated pixels by dividing the face image into a number of subimages. Each subimage is then processed individually and a final decision is made by fusing information from all of the subimages. A commonly reported technique for decision fusion is majority voting [12]. However, a major pitfall with majority voting is that it treats noisy and clean partitions equally. For instance, if three out of four partitions of an image are corrupted, majority voting is likely to be erroneous no matter how significant the clean partition may be in the context of facial features. The task becomes even more complicated by the fact that the distribution of occlusion over a face image is never known a priori and therefore, along with face and nonface subimages, we are likely to have face portions corrupted with occlusion. Some sophisticated approaches have been developed to filter out the potentially contaminated image pixels (for example, [17]). In this section, we make use of the specific nature of distance classification to develop a fairly simple but efficient fusion strategy which implicitly deemphasizes corrupted subimages, significantly improving the overall classification accuracy. We propose using the distance metric as evidence of our belief in the "goodness" of intermediate decisions taken on the subimages; the approach is called "Distance-based Evidence Fusion." To formulate the concept, let us suppose that each training image is segmented into $M$ partitions and each partitioned image is designated as $v^{(n)}$, $n = 1, 2, \ldots, M$. The $n$th partition of all $p_i$ training images from the $i$th class is subsampled and transformed to vectors, as discussed in Section 2, to develop a class-specific and
partition-specific subspace $U_i^{(n)}$:

$$U_i^{(n)} = \Big[\, w_i^{(1)(n)} \;\; w_i^{(2)(n)} \;\; \cdots \;\; w_i^{(p_i)(n)} \,\Big], \quad i = 1, 2, \ldots, N. \quad (7)$$
Each class is now represented by $M$ subspaces and altogether we have $M \times N$ subspace models. Now a given probe image is partitioned into $M$ segments accordingly. Each partition is transformed to an image vector $y^{(n)}$, $n = 1, 2, \ldots, M$. Given that $i$ is the true class for the given probe image, $y^{(n)}$ is expected to lie on the $n$th subspace of the $i$th class, $U_i^{(n)}$, and should satisfy

$$y^{(n)} = U_i^{(n)} \beta_i^{(n)}. \quad (8)$$
The vector of parameters and the response vectors are estimated as discussed in this section:

$$\hat{\beta}_i^{(n)} = \Big[ \big(U_i^{(n)}\big)^T U_i^{(n)} \Big]^{-1} \big(U_i^{(n)}\big)^T y^{(n)}, \quad (9)$$

$$\hat{y}_i^{(n)} = U_i^{(n)} \hat{\beta}_i^{(n)}, \quad i = 1, 2, \ldots, N. \quad (10)$$
The distance measure between the estimated and the original response vector is computed:

$$d_i\big(y^{(n)}\big) = \big\| y^{(n)} - \hat{y}_i^{(n)} \big\|_2, \quad i = 1, 2, \ldots, N. \quad (11)$$

Now, for the $n$th partition, an intermediate decision called $j^{(n)}$ is reached with a corresponding minimum distance calculated as

$$d_{j^{(n)}} = \min_i \, d_i\big(y^{(n)}\big), \quad i = 1, 2, \ldots, N. \quad (12)$$
Therefore, we now have $M$ decisions $j^{(n)}$ with $M$ corresponding distances $d_{j^{(n)}}$, and we decide in favor of the class with minimum distance:

$$\text{Decision} = \arg\min_j \, d_{j^{(n)}}, \quad n = 1, 2, \ldots, M. \quad (13)$$
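Equations (9)-(13) can be sketched end to end as follows (a minimal illustration, not the authors' code; the function name `modular_lrc` is ours, and `np.linalg.lstsq` again stands in for the explicit normal equations of eq. (9)):

```python
import numpy as np

def modular_lrc(partition_models, probe_partitions):
    """Modular LRC with Distance-based Evidence Fusion (DEF).

    partition_models[n][i]: subspace matrix U_i^(n) for partition n, class i.
    probe_partitions[n]: probe vector y^(n) for partition n.
    Returns the final class decision of eq. (13).
    """
    best_classes, best_dists = [], []
    for U_n, y_n in zip(partition_models, probe_partitions):
        # distances d_i(y^(n)) of eq. (11) for every class i
        dists = []
        for U in U_n:
            beta_hat, *_ = np.linalg.lstsq(U, y_n, rcond=None)  # eq. (9)
            dists.append(np.linalg.norm(y_n - U @ beta_hat))    # eqs. (10)-(11)
        # intermediate decision j^(n) with its evidence d_j^(n), eq. (12)
        j = int(np.argmin(dists))
        best_classes.append(j)
        best_dists.append(dists[j])
    # DEF, eq. (13): trust the partition with the smallest minimum distance,
    # so occluded partitions (large residuals everywhere) are ignored.
    return best_classes[int(np.argmin(best_dists))]
```

In a toy run with one clean partition (probe lying exactly in the true class subspace) and one corrupted partition, the clean partition carries the smaller evidence distance and dictates the final decision.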
3 EXPERIMENTAL RESULTS
Extensive experiments were carried out to illustrate the efficacy of the proposed approach. Essentially, five standard databases, i.e., the AT&T [18], Georgia Tech [19], FERET [20], Extended Yale B [21], [22], and AR [23], have been addressed. These databases incorporate several deviations from the ideal conditions, including pose, illumination, occlusion, and gesture alterations. Several standard evaluation protocols reported in the face recognition literature have been adopted and a comprehensive comparison of the proposed approach with the state-of-the-art techniques has been presented. It is appropriate to indicate that the developed approach has been shown to perform well for the cases of severe gesture variations and contiguous occlusion with little change in pose, scale, illumination, and rotation. However, it is not meant to be robust to other deviations such as severe pose and illumination variations.
3.1 AT&T Database
The AT&T database is maintained at the AT&T Laboratories, Cambridge University; it consists of 40 subjects with 10 images per subject. The database incorporates facial gestures, such as smiling or nonsmiling, open or closed eyes, and alterations like glasses or no glasses. It also characterizes a maximum of 20 degrees of rotation of the face with some scale variations of about 10 percent (see Fig. 1).
We follow two evaluation protocols proposed quite often in the literature [24], [25], [26], [27]. Evaluation Protocol 1 (EP1) takes the first five images of each individual as a training set, while the last five are designated as probes. For Evaluation Protocol 2 (EP2), the "leave-one-out" strategy is adopted. All experiments are conducted by downsampling 112×92 images to an order of 10×5. A detailed comparison of the results for the two evaluation protocols is summarized in Table 1. For EP1, the LRC algorithm achieves a comparable recognition accuracy of 93.5 percent in a 50D feature space; the best results are reported for the latest Eigenfeature Regularization and Extraction (ERE) approach, which are 3.5 percent better than the LRC method. Note that the recognition error rate is converted to recognition success rate for [27]. Also, for EP2, the LRC approach attains a high recognition success of 98.75 percent in a 50D feature space; it outperforms the ICA approach by approximately 5 percent and is fairly comparable to the Fisherfaces, Eigenfaces, Kernel Eigenfaces, 2DPCA, and ERE approaches.
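The "leave-one-out" strategy of EP2 can be sketched generically (an illustrative helper, not tied to the paper's code; the function name `leave_one_out_accuracy` and its classifier-callback interface are ours):

```python
import numpy as np

def leave_one_out_accuracy(data, labels, classify):
    """Leave-one-out protocol (EP2): each sample in turn serves as the
    probe while all remaining samples form the gallery.

    data: (n_samples, n_features) array of feature vectors.
    labels: (n_samples,) array of class labels.
    classify: callable (gallery, gallery_labels, probe) -> predicted label.
    Returns the fraction of correctly classified probes.
    """
    data, labels = np.asarray(data), np.asarray(labels)
    hits = 0
    for k in range(len(data)):
        mask = np.arange(len(data)) != k      # hold out sample k as the probe
        pred = classify(data[mask], labels[mask], data[k])
        hits += pred == labels[k]
    return hits / len(data)
```

Any classifier with this signature (LRC built on the held-in gallery, nearest neighbor, etc.) can be plugged in, so the same loop serves every leave-one-out experiment in this section.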
The choice of dimensionality for the AT&T database is illustrated in Fig. 2a, which reflects that the recognition rate becomes fairly constant above a 40-dimensional feature space.
3.2 Georgia Tech (GT) Database
The Georgia Tech database consists of 50 subjects with 15 images per subject [19]. It characterizes several variations such as pose, expression, cluttered background, and illumination (see Fig. 3). Images were downsampled to an order of 15×15 to constitute a 225D feature space; the choice of dimensionality is depicted in Fig. 2b, which reflects a consistent performance above a 100D feature space. The first eight images of each subject were used for training, while the remaining seven served as probes [27]; all experiments were conducted on the original database without any cropping/normalization. Table 2 shows a detailed comparison of the LRC with a variety of approaches; all results are as reported in [27], with recognition error rates converted to recognition success rates. Also, results in [27] are shown for a large range of feature dimensions; for the sake of fair comparison, we have picked the best reported results. The proposed LRC algorithm outperforms the traditional PCAM and PCAE approaches by margins of 12 and 18.57 percent, respectively, achieving a high recognition accuracy of 92.57 percent. It is also fairly comparable to all other methods, including the latest ERE approaches.
TABLE 1: Results for EP1 and EP2 Using the AT&T Database

Fig. 1. A typical subject from the AT&T database.

Fig. 2. (a) Recognition accuracy for the AT&T database with respect to feature dimension for a randomly selected leave-one-out experiment. (b) Feature dimensionality curve for the GT database.

Fig. 3. Samples of a typical subject from the GT database.
3.3 FERET Database

3.3.1 Evaluation Protocol 1
The FERET database is arguably one of the largest publicly available databases [20]. Following [27], [28], we construct a subset of the database consisting of 128 subjects, with at least four images per subject. We, however, used four images per subject [27]. Fig. 4 shows images of a typical subject from the FERET database. It has to be noted that, in [27], the database consists of 256 subjects; 128 subjects (i.e., 512 images) are used to develop the face space, while the remaining 128 subjects are used for the face recognition trials. The proposed LRC approach uses the gallery images of each person to form a linear subspace; therefore, it does not require any additional development of the face space. However, it requires multiple gallery images for a reliable construction of linear subspaces. Using a single gallery image for each person is not substantial in the context of linear regression, as this corresponds to only a single regressor (or predictor) observation, leading to erroneous least-squares calculations.

Cross-validation experiments for LRC were conducted in a 42D feature space; for each recognition trial, three images per person were used for training, while the system was tested on the fourth one. The results are shown in Table 3. The frontal images fa and fb incorporate gesture variations with small pose, scale, and rotation changes, whereas ql and qr correspond to major pose variations (see [20] for details). The proposed LRC approach copes well with the problem of facial expressions in the presence of small pose variations, achieving high recognition rates of 91.41 and 94.53 percent for fa and fb, respectively. It outperforms the benchmark PCA and ICA I algorithms by margins of 17.19 and 17.97 percent for fa and 21.09 and 23.44 percent for fb, respectively. The LRC approach, however, shows degraded recognition rates of 78.13 and 84.38 percent for the severe pose variations of ql and qr, respectively; however, even with such major posture changes, it is substantially superior to the PCA and ICA I approaches. In an overall sense, we achieve a recognition accuracy of 87.11 percent, which compares favorably to the 83.00 percent recognition achieved by ERE [27] using single gallery images.
3.3.2 Evaluation Protocol 2
In this experimental setup, we validated the consistency of the proposed approach with a large number of subjects. We now have a subset of the FERET database consisting of 400 randomly selected persons. Cross-validation experiments were conducted as discussed above; results are reported in Table 3. The proposed LRC approach showed quite agreeable results with the large database as well. It persistently achieved high recognition rates of 93.25 and 93.50 percent for fa and fb, respectively. For the case of severe pose variations of ql and qr, we note a slight degradation in the performance, as expected. The overall performance is, however, quite comparable, with an average recognition success of 84.50 percent. For all case studies, the proposed LRC approach is found to be superior to the benchmark PCA and ICA I approaches.
3.4 Extended Yale B Database
Extensive experiments were carried out using the Extended Yale B database [21], [22]. The database consists of 2,414 frontal face images of 38 subjects under various lighting conditions. The database was divided into five subsets; subset 1, consisting of 266 images (seven images per subject) under nominal lighting conditions, was used as the gallery, while all others were used for validation (see Fig. 5). Subsets 2 and 3, each consisting of 12 images per subject, characterize slight-to-moderate luminance variations, while subset 4 (14 images per person) and subset 5 (19 images per person) depict severe light variations.
All experiments for the LRC approach were conducted with images downsampled to an order of 20×20; results are shown in Table 4. The proposed LRC approach showed excellent performance for moderate light variations, yielding 100 percent recognition accuracy for subsets 2 and 3. The recognition success, however, falls to 83.27 and 33.61 percent for subsets 4 and 5, respectively. The proposed LRC approach has shown better tolerance for considerable illumination variations compared to benchmark reconstructive approaches, comprehensively outperforming PCA and ICA I for all case studies. The proposed algorithm, however, could not withstand severe luminance alterations.
3.5 AR Database
The AR database consists of more than 4,000 color images of 126 subjects (70 men and 56 women) [23]. The database characterizes divergence from ideal conditions by incorporating various facial expressions (neutral, smile, anger, and scream), luminance alterations (left light on, right light on, and all side lights on), and occlusion modes (sunglasses and scarf). It has been used by researchers as a testbed to evaluate and benchmark face recognition algorithms. In this research, we address two fundamental challenges of face recognition, i.e., facial expression variations and contiguous occlusion.

TABLE 2: Results for the Georgia Tech Database

TABLE 3: Results for the FERET Database

Fig. 4. A typical subject from the FERET database; fa and fb represent frontal shots with gesture variations, while ql and qr correspond to pose variations.

Fig. 5. Starting from the top, each row illustrates samples from subsets 1, 2, 3, 4, and 5, respectively.
3.5.1 Gesture Variations
Facial expressions are defined as the variations in appearance of the face induced by internal emotions or social communications [29]. In the context of face identification, the problem of varying facial expressions refers to the development of face recognition systems which are robust to these changes. The task becomes more challenging due to the natural variations in head orientation with the changes in facial expressions, as depicted in Fig. 6. Most face detection and orientation normalization algorithms make use of facial features, such as the eyes, nose, and mouth. It has to be noted that for the case of adverse gesture variations such as "scream," the eyes of the subject are naturally closed (see Figs. 6d and 6h). Consequently, under such severe conditions, the eyes cannot be automatically detected and therefore face normalization is likely to be erroneous. Hence, there are two possible configurations for a realistic evaluation of robustness for a given face recognition algorithm: 1) by implementing an automatic face localization and normalization module before the actual face recognition module, or 2) by evaluating the algorithm using the original frame of the face image rather than a manually localized and aligned face. With this understanding, we validate the proposed LRC algorithm for the problem of gesture variations on the original, uncropped, and unnormalized AR database.
Out of 125 subjects of the AR database, a subset is generated by randomly selecting 100 individuals (50 males and 50 females). All 576×768 images are downsampled to an order of 10×10. The database characterizes four facial expressions: neutral, smile, anger, and scream.
Experiments are based on cross-validation, i.e., each time the system is trained using images of three different expressions (600 images in all), while the testing session is conducted using the left-out expression (200 images) [17]. The LRC algorithm achieves a high recognition accuracy for all facial expressions; the results for a 100D feature space are reported in Table 5, with an overall average recognition of 98.88 percent. For the case of screaming, the proposed approach achieves 99.50 percent recognition accuracy, which outperforms the results in [17] by 12.50 percent; noteworthy is the fact that the results in [17] are shown on a subset consisting of only 50 individuals.
3.5.2 Contiguous Occlusion
The problem of face identification in the presence of contiguous occlusion is arguably one of the most challenging paradigms in the context of robust face recognition. Commonly used objects, such as caps, sunglasses, and scarves, tend to obstruct facial features, causing recognition errors. Moreover, in the presence of occlusion, the problems of automatic face localization and normalization discussed in the previous section are magnified even further. Therefore, experiments on manually cropped and aligned databases make an implicit assumption of an evenly cropped and nicely aligned face, which is not available in practice.

The AR database consists of two modes of contiguous occlusion, i.e., images with a pair of sunglasses and a scarf. Fig. 7 reflects the two scenarios for two different sessions. A subset of the AR database consisting of 100 randomly selected individuals (50 men and 50 women) is used for empirical evaluation. The system is trained using Figs. 6a, 6b, 6c, 6d, 6e, 6f, 6g, and 6h for each subject, thereby generating a gallery of 800 images. Probes consist of Figs. 7a and 7b for sunglasses occlusion and Figs. 7c and 7d for scarf occlusion. The proposed approach is evaluated on the original database without any manual cropping and/or normalization.
For the case of sunglasses occlusion, the proposed LRC approach achieves a high recognition accuracy of 96 percent in a 100D feature space. Table 6 depicts a detailed comparison of the LRC approach with a variety of approaches reported in [8], consisting of Principal Component Analysis, Independent Component Analysis-architecture I (ICA I), Local Nonnegative Matrix Factorization (LNMF), least-squares projection onto the subspace spanned by all face images, and Sparse Representation-based Classification (SRC) (see [8] for details). NN and NS correspond to Nearest Neighbors and Nearest Subspace-based classification, respectively. The LRC algorithm comprehensively outperforms the best competitor (SRC) by a margin of 9 percent. To the best of our knowledge, the LRC approach achieves the best results for the case of sunglasses occlusion. Note that in [8], a comparable recognition rate of 97.5 percent has been achieved by a subsequent image partitioning approach.

TABLE 4: Results for the Extended Yale B Database

TABLE 5: Recognition Results for Gesture Variations Using the LRC Approach

Fig. 6. Gesture variations in the AR database. Note the changing position of the head with different poses. (a)-(d) and (e)-(h) correspond to two different sessions incorporating neutral, happy, angry, and screaming expressions, respectively.

Fig. 7. Examples of contiguous occlusion in the AR database.
For the case of severe scarf occlusion, the proposed approach gives a recognition accuracy of 26 percent in a 3,600D feature space. Fig. 8a shows the performance of the system with respect to an increasing dimensionality of the feature space. Although the LRC algorithm outperforms the classical PCA and ICA I approaches by margins of 14 and 11 percent, respectively, it lags the SRC approach by a margin of 33.5 percent.
We now demonstrate the efficacy of the proposed Modular LRC approach under severe occlusion conditions. As a preprocessing step, the AR database is normalized, both in scale and orientation, generating a cropped and aligned subset of images consisting of 100 subjects. Images are manually aligned using eye and mouth locations, as shown in Fig. 8b [30]; each image is cropped to an order of 292×240. Some images from the normalized database are shown in Fig. 9.

All images are partitioned into four blocks, as shown in Fig. 10a. The blocks are numbered in ascending order from left to right, starting from the top; the LRC algorithm for each subimage uses a 100D feature space, as discussed in the previous section. Fig. 11a elaborates on the efficacy of the proposed approach for a random probe image. In our proposed approach, we have used the distance measures $d_{j^{(n)}}$ as evidence of our belief in a subimage. The key factor to note in Fig. 11a is that corrupted subimages (i.e., blocks 3 and 4 in Fig. 10a) reach a decision with low evidence, i.e., high distance measures $d_{j^{(n)}}$. Therefore, in the final decision making, the corrupted blocks are rejected, thereby giving a high recognition accuracy of 95 percent. The superiority of the proposed approach is more pronounced by considering the individual recognition rates of the subimages in Fig. 11b. Blocks 1 and 2 yield high classification accuracies of 94 and 90 percent, respectively, whereas blocks 3 and 4 give 1 percent output each. Note that the effect of the proposed approach is twofold: First, it automatically deemphasizes the nonface partitions. Second, the efficient and dynamic fusion harnesses the complementary information of the face subimages to yield an overall recognition accuracy of 95 percent, which is better than the best of the participating face partitions.
Note that in Fig. 10a, the partitioning is such that the uncorrupted subimages (blocks 1 and 2) correspond to undistorted and complete eyes, which are arguably among the most discriminant facial features. Therefore, one can argue that this high classification accuracy is due to this coincidence and the approach might not work well otherwise. To remove this ambiguity, we partitioned the images into six and eight blocks, as shown in Figs. 10b and 10c, respectively. The blocks are numbered left to right, starting from the top. For Fig. 10b, partitions 1 and 2 give high recognition accuracies of 92.5 and 90 percent, respectively, while the remaining blocks, i.e., 3, 4, 5, and 6, yield 8, 7, 0, and 1 percent recognition, respectively. Interestingly, although the best block gives 92.5 percent accuracy, which is 1.5 percent less than the best block of Fig. 10a, the overall classification accuracy comes out to be 95.5 percent.
Similarly, in Fig. 10c, blocks 1, 2, 3, and 4 give classification accuracies of 88.5, 84.5, 80, and 77.5 percent, respectively, while the corrupted blocks 5, 6, 7, and 8 produce 3, 0.5, 1, and 1 percent classification. The proposed evidence-based algorithm yields a high classification accuracy of 95 percent. A key factor to note is that the best individual result is 88.5 percent, which lags the best individual result of Fig. 10a by 5.5 percent. However, the proposed integration of the combining blocks yields a comparable overall recognition. Interestingly, the eyebrow regions (blocks 1 and 2) in Fig. 10c have been found most useful.
To the best of our knowledge, the recognition accuracy of 95.5 percent achieved by the presented approach is the best result ever reported for the case of scarf occlusion, the previous best being 93.5 percent achieved by the partitioned SRC approach in [8].
TABLE 6: Recognition Results for Occlusion

Fig. 8. (a) The recognition accuracy versus feature dimension for scarf occlusion using the LRC approach. (b) A sample image indicating eye and mouth locations for the purpose of manual alignment.

Fig. 9. Samples of cropped and aligned faces from the AR database.

Fig. 10. Case studies for the Modular LRC approach for the problem of scarf occlusion.

Fig. 11. (a) Distance measures $d_{j^{(n)}}$ for the four partitions; note that nonface components make decisions with low evidences. (b) Recognition accuracies for all blocks.