Correspondences
Reduced-Reference Image Quality Assessment
With Visual Information Fidelity
Jinjian Wu, Weisi Lin, Senior Member, IEEE, Guangming Shi, Senior Member, IEEE, and Anmin Liu

Abstract—Reduced-reference (RR) image quality assessment (IQA) aims to use less data about the reference image while achieving higher evaluation accuracy. Recent research on brain theory suggests that the human visual system (HVS) actively predicts the primary visual information and tries to avoid the residual uncertainty for image perception and understanding. Therefore, the perceptual quality relies on the information fidelities of the primary visual information and the residual uncertainty. In this paper, we propose a novel RR IQA index based on visual information fidelity. We advocate that distortions on the primary visual information mainly disturb image understanding, and distortions on the residual uncertainty mainly change the comfort of perception. We separately compute the quantities of the primary visual information and the residual uncertainty of an image. Then the fidelities of the two types of information are separately evaluated for quality assessment. Experimental results demonstrate that the proposed index uses little reference data (30 bits) and achieves high consistency with human perception.
Index Terms—Image quality assessment, reduced-reference, internal generative mechanism, information fidelity.
I. INTRODUCTION
Objective image quality assessment (IQA) plays an important role in image and video processing, such as in information compression, transmission, restoration, and display [1]. During the last decade, many IQA indices have been introduced. Most of them are full-reference (FR) methods, which require the whole reference image for quality evaluation [2]. However, the reference images are not always available, so no-reference (NR) IQA indices are desirable. Because of the varied image contents and the individual distortion types, NR quality evaluation with no prior knowledge is an extremely difficult task [3].
As a compromise between FR and NR, reduced-reference (RR) IQA indices are designed to evaluate the perceptual quality by using partial information of the reference images. A successful RR IQA index is expected to use less data of the reference images and achieve higher evaluation accuracy [4]. To this end, some representative global features are extracted for quality evaluation. In [3], a wavelet-domain natural image statistic metric (WNISM) is introduced. Under the assumption that most real-world image distortions disturb image statistics, WNISM evaluates the perceptual quality by computing the distance between the probability distributions of wavelet coefficients. In [5], according to the distribution of wavelet coefficients, geometric information is extracted for quality assessment. In addition, by analyzing the DCT coefficient distributions, Ma et al. [6] reorganized the DCT coefficients into several representative subbands and evaluated the perceptual quality based on their city-block distance (ROCB). Recently, Soundararajan and Bovik [7] suggested measuring the quality degradation according to the entropies of wavelet coefficients and introduced the Reduced Reference Entropic Differences (RRED) algorithm. All of these algorithms operate in a subband domain and promote our understanding of quality assessment.

Manuscript received June 24, 2012; revised September 30, 2012 and December 28, 2012; accepted January 07, 2013. Date of publication June 04, 2013; date of current version October 11, 2013. This work was supported by the Major State Basic Research Development Program of China (973 Program) (No. 2013CB329402), NSF of China (No. 61033004, 61070138, 61072104, and 61227004), and the Fundamental Research Funds for the Central Universities (No. K50513100005 and K5051202034). The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Sheng-Wei (Kuan-Ta) Chen.
J. Wu and G. Shi are with Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, School of Electronic Engineering, Xidian University, Xi’an, China (e-mail: jinjian.wu@mail.xidian.edu.cn; gmshi@).
W. Lin (corresponding author) and A. Liu are with the School of Computer Engineering, Nanyang Technological University, Nanyang 639798, Singapore (e-mail: wslin@ntu.edu.sg; liua0002@ntu.edu.sg).
Digital Object Identifier 10.1109/TMM.2013.2266093
It is well known that the HVS is highly sensitive to spatial features (e.g., luminance contrast and structure) of an input image. In this letter, we directly analyze the quality degradation in the spatial domain according to the spatial features to which the HVS is highly sensitive. Recent research on brain theory (e.g., the Bayesian brain theory [8] and the free-energy principle [9]) suggests that the brain works with an internal generative mechanism (IGM) for visual perception and image understanding. With an input scene, the IGM actively predicts the primary visual information and attempts to avoid the residual uncertainty/disorder [9]. This mechanism reveals the inner processing of visual signals and provides a cue for quality assessment. According to this mechanism, Zhai et al. [10] introduced a free-energy-based distortion metric (FEDM), which measures the quality based on the change of the residual uncertainty. We advocate that distortions on the primary visual information and the residual uncertainty result in different visual content degradations. Therefore, we need to discriminatively evaluate the degradations on the two types of information.
In this letter, we introduce a novel RR IQA index by discriminatively evaluating the visual content fidelities of the primary visual information and the residual uncertainty. Inspired by the active prediction of the IGM, an autoregressive (AR) model is employed to predict the visual content and to decompose the input image into two portions: the orderly portion and the disorderly portion. The orderly portion possesses the primary visual information of the input scene, which is further processed by the HVS for image understanding and recognition [11]. The disorderly portion consists of the residual uncertainty, which the HVS avoids in further processing [9]. Therefore, we separately evaluate the information fidelities on the two portions. First, the quantities of information of the two portions are computed according to information theory (each feature is quantized into 15 bits). Then, we evaluate the information fidelity on each portion and combine the two results to obtain the overall quality score.
The rest of this letter is organized as follows. Section II gives a detailed description of the proposed index. The performance of the proposed index is demonstrated in Section III. Finally, conclusions are drawn in Section IV.
II. THE PROPOSED RR IQA INDEX
In this section, we introduce the proposed RR IQA index in detail; its deployment is illustrated in Fig. 1. Inspired by the IGM theory, we decompose the input image into two portions: the orderly visual information and the disorderly uncertainty. Then the quantity of information of each portion is computed. Finally, we evaluate the visual quality based on the information fidelity.

1520-9210 © 2013 IEEE

Fig. 1. Deployment of the proposed index for the original image and the contaminated image. Pixel values of the two disorderly uncertainty images are scaled to [0, 255] for a clear view.
A. Image Understanding Within the IGM
As an active signal processing system, the HVS helps us to understand the colorful world. Recent research on brain theory suggests that the HVS has an IGM for visual perception and understanding [8], [10]. For the retinal stimuli, the IGM actively predicts and explains the sensations according to inherent prior knowledge, and tries to avoid the remaining uncertainty in further processing [9]. Inspired by the active prediction in the IGM, we propose to decompose an input image into two portions, which we call the orderly and disorderly portions, for quality assessment. Both the Bayesian brain theory [8] and the free-energy principle [9] indicate that the IGM optimizes the input scene by minimizing the prediction error. From the perspective of a pixel, the IGM tries to predict the value of each pixel with minimum error. It is well known that pixels are highly correlated with their surroundings and jointly carry structural information [12]. Therefore, the correlated pixels possess high inter-pixel redundancy, and we can accurately predict pixels based on the relationships among them. In other words, according to the relationships between a central pixel and its surrounding pixels, the prediction error of the central pixel can be minimized by emphasizing the highly correlated pixels in the prediction procedure. To this end, an AR-based prediction model is introduced in [13], in which the central pixel is predicted based on its structural similarity with its surrounding pixels:
(1)
In (1), the predicted value of a pixel is computed from its local surroundings, weighted by a structure-related adjusting parameter, with an additive white-noise term. More details about (1) can be found in [13]. With the help of (1), an input image is decomposed into an orderly portion, formed by the predicted values, and a disorderly portion, formed by the remaining uncertainty. As shown in Fig. 1, the orderly portions of the original and contaminated images possess the primary visual information of those images, and the disorderly portions possess the residual uncertainty. Furthermore, the two portions contain different visual information and play different roles in image perception and understanding.
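The decomposition described above can be sketched in a few lines. This is only a minimal illustration and not the model of [13]: the neighbor weights are a hypothetical stand-in (a Gaussian of the intensity difference between the central pixel and each neighbor, normalized to sum to one), the 3x3 neighborhood and the `sigma` value are assumptions, the noise term is omitted, and the disorderly portion is assumed to be the plain difference between the image and its prediction.

```python
import math


def decompose(image, sigma=10.0):
    """Split a grayscale image (list of rows of floats) into an orderly
    (predicted) portion and a disorderly (residual) portion.

    Assumption: the weights below only stand in for the structure-related
    adjusting parameter of the AR model in [13].
    """
    height, width = len(image), len(image[0])
    orderly = [[0.0] * width for _ in range(height)]
    disorderly = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            weighted_sum = 0.0
            weight_total = 0.0
            # Visit the 8-connected neighbors that fall inside the image.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) == (0, 0):
                        continue
                    if not (0 <= rr < height and 0 <= cc < width):
                        continue
                    diff = image[rr][cc] - image[r][c]
                    weight = math.exp(-(diff * diff) / (2.0 * sigma * sigma))
                    weighted_sum += weight * image[rr][cc]
                    weight_total += weight
            # Normalized weights sum to one, so a flat region predicts itself.
            if weight_total > 0.0:
                prediction = weighted_sum / weight_total
            else:
                prediction = image[r][c]  # isolated pixel: nothing to predict from
            orderly[r][c] = prediction
            disorderly[r][c] = image[r][c] - prediction
    return orderly, disorderly
```

By construction the two returned portions add back up to the input image, which mirrors the idea that the disorderly portion holds whatever the prediction leaves unexplained.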
Visual quality is closely related to the information fidelities on the orderly and disorderly portions. Distortions on the orderly portion disturb the prediction of the primary visual information, for example by blurring structures and degrading edges, which directly affects image understanding. In contrast, distortions on the disorderly portion interfere little with the prediction of the primary visual information; they mainly arouse uncomfortable perception and have a limited effect on image understanding [12]. As shown in Fig. 1, distortions on the orderly portion blur the structure of the motorbike and the words on the clothes. Compared with the degradation on the orderly portion, the HVS cares less about the degradation on the disorderly portion (further analysis of distortions on the two portions is given in Section III-A). Therefore, distortions on the orderly and disorderly portions have different effects, and we should discriminatively evaluate the information fidelities on the two portions.
B. Information Fidelity Based Quality Assessment
The IGM performs as an active inference system that perceives an input scene by adjusting its inner configuration [14]. Hence, we assume the IGM to be a parametric system for visual stimuli processing, within which the quantity of information of an input image can be computed as
(2)
where the quantity is defined through the conditional probability of the input image given the parametric system.

Fig. 2. Edge filters for four directions.
As discussed in the above subsection, an image is decomposed into two portions (i.e., the orderly and disorderly portions) for processing. Since the two portions contain different visual information, it is reasonable to suppose that their contents are independent. The representation of the image in the HVS can be regarded as the union of the representations of the two portions. According to information theory [15], (2) can be rewritten as
(3)
where the two quantities are the quantities of information of the orderly portion and the disorderly portion, respectively.
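The step from (2) to (3) rests on the independence assumption. As a hedged illustration only: the symbols $I$, $I_o$, $I_d$, and $\theta$ below are placeholder notation introduced here (image, orderly portion, disorderly portion, and model parameters), not necessarily the notation of the original equations, and the quantity of information is assumed to be a negative log-probability. Under those assumptions, independence of the two portions turns the joint quantity into a sum:

```latex
\begin{align}
H(I) &= -\log P(I_o, I_d \mid \theta) \\
     &= -\log \bigl[ P(I_o \mid \theta)\, P(I_d \mid \theta) \bigr] \\
     &= -\log P(I_o \mid \theta) - \log P(I_d \mid \theta) \\
     &= H(I_o) + H(I_d)
\end{align}
```

The second line is the only place where independence is used; without it the product form, and hence the additive form, would not hold.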
The HVS is highly adapted to extract the luminance contrast, which can effectively represent the primary visual content and is often used to measure the degradation of image structure [16], [17]. Therefore, we employ the luminance contrast map to represent the primary visual information in the HVS, which is computed as in [17]:
(4)
(5)
where the four directional filters are as shown in Fig. 2, and the operator applied between a filter and the image denotes convolution.
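A contrast map of this kind can be sketched as the largest absolute response over four directional filters. The kernels below are hypothetical 3x3 stand-ins, since the actual filters of Fig. 2 are not reproduced in this text; any scaling constant is omitted and replicated borders are an assumption.

```python
# Hypothetical 3x3 directional kernels (horizontal, diagonal, vertical,
# anti-diagonal edges). They only stand in for the filters of Fig. 2.
KERNELS = [
    [[-1, -1, -1], [0, 0, 0], [1, 1, 1]],
    [[-1, -1, 0], [-1, 0, 1], [0, 1, 1]],
    [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    [[0, 1, 1], [-1, 0, 1], [-1, -1, 0]],
]


def filter_response(image, kernel, r, c):
    """Correlate one kernel with the 3x3 patch centered at (r, c).

    Border pixels are replicated where the patch leaves the image. Each
    kernel above changes only its sign under a 180-degree flip, so
    correlation and convolution give the same absolute response.
    """
    height, width = len(image), len(image[0])
    total = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr = min(max(r + dr, 0), height - 1)
            cc = min(max(c + dc, 0), width - 1)
            total += kernel[dr + 1][dc + 1] * image[rr][cc]
    return total


def contrast_map(image):
    """Per-pixel contrast: largest absolute response over the four directions."""
    return [
        [
            max(abs(filter_response(image, kernel, r, c)) for kernel in KERNELS)
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]
```

On a flat region every response is zero, while a vertical step edge is picked up most strongly by the third kernel.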
With (4) and (5), the contrast maps of the primary visual information (i.e., of the orderly portions of the reference and test images) are acquired. Then, the probability distribution of each contrast map is calculated based on the intensity of the contrast. Finally, according to Shannon entropy [15], the quantity of information of the primary visual information is obtained for each image. The fidelity of the primary visual information between the reference and test images is measured as
(6)
Since the disorderly portion consists of residual uncertainty, which is independent of the primary visual information, its intensity directly represents the degree of uncertainty [9]. Thus, we employ the intensity energy of the disorderly portion to represent its quantity of information,
(7)
where the normalizing term is the total number of pixels in the image. The information fidelity between the disorderly portions of the reference and test images is measured as
(8)
where the two quantities are the quantities of information of the disorderly portions of the reference and test images, respectively.
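The two quantities of information described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact formulas: the contrast histogram uses a fixed bin width (`bin_width` is a hypothetical parameter), and the "intensity energy" is assumed to be the mean squared intensity over all pixels. The fidelity measures of (6) and (8) are not reproduced.

```python
import math
from collections import Counter


def shannon_entropy(values, bin_width=1.0):
    """Shannon entropy (in bits) of a histogram built over `values`.

    Assumption: intensities are grouped into bins of fixed width.
    """
    counts = Counter(int(v // bin_width) for v in values)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mean_energy(values):
    """Mean squared intensity; an assumed form of the intensity energy."""
    values = list(values)
    return sum(v * v for v in values) / len(values)
```

A two-dimensional map can be flattened before use, for example `shannon_entropy([v for row in contrast for v in row])`, where `contrast` is any list of rows.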
Combining the two evaluation results (the fidelity of the primary visual information and the fidelity of the residual uncertainty), we deduce the overall visual quality as
(9)
where the two parameters denote the relative importance of the two parts. Since the fidelity of the primary visual information is more important than the fidelity of the residual uncertainty, in this letter we simply set the two parameters to fixed values (more discussion about the two parameters is given in Section III).
III. EXPERIMENTAL RESULTS
In this section, we first illustrate the effectiveness of the proposed index. Then we verify the proposed index by comparing it with three recent RR IQA indices and two classic FR IQA indices. In the experiments, the quantities of information of the orderly and disorderly portions are first quantized into [0, 255.99] (only two digits after the decimal point are retained, so each value can be represented within 15 bits) for further processing.
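The bit budget can be checked with a little arithmetic: a value in [0, 255.99] with two decimals maps to an integer code between 0 and 25599, and 25599 is below 2^15 = 32768, so 15 bits suffice per feature and two features take 30 bits. The sketch below illustrates this; the function names are hypothetical, and clipping plus rounding (rather than truncation) is an assumption, since the text does not say how the extra digits are discarded.

```python
def quantize_feature(value):
    """Clip to [0, 255.99], keep two decimals, and return an integer code."""
    clipped = min(max(value, 0.0), 255.99)
    return int(round(clipped * 100))


def dequantize_feature(code):
    """Recover the two-decimal feature value from its integer code."""
    return code / 100.0


def bits_needed(code):
    """Number of bits required to store a non-negative integer code."""
    return max(code.bit_length(), 1)
```

For example, `bits_needed(quantize_feature(255.99))` gives the width required by the largest possible code.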
A. Analysis on the Proposed Index
The proposed RR IQA index is based on visual information fidelity, which covers the content degradations on both the primary visual information and the disorderly uncertainty. Fig. 3 shows an example of the proposed RR IQA index (considering the resolution of the screen, we crop a part of the bikes image for a clear view). From the left column to the right, the images are the reference images, the white noise (WN) contaminated images, and the JPEG2000 (JP2K) contaminated images.
Different types of distortion result in different visual content degradations. As shown in Fig. 3(b), the white noise mainly increases the uncertainty/disorder of the image. Since the HVS can effectively filter out the white noise and actively predict the primary visual information, the WN in Fig. 3(b) mainly causes an uncomfortable sensation and has little effect on image understanding. Comparing Fig. 3(a) with (b), it can be seen that Fig. 3(b) possesses almost as much primary visual information as Fig. 3(a), such as the detailed structure of the tyre and the words on the steel tube. In summary, the WN in Fig. 3(b) increases the disorderly uncertainty and causes little degradation of the primary visual information. The JP2K distortion, however, mainly degrades the image structure. As shown in Fig. 3(c), the structure of the tyre is severely blurred, and the words on the steel tube have almost been erased. Although the two images have a similar level of error energy (represented by MSE), the primary visual information in Fig. 3(c) is much more severely distorted than that in Fig. 3(b). As a result, Fig. 3(c) has a lower perceptual quality than Fig. 3(b).
The proposed index discriminatively measures the fidelities of the primary visual information and the disorderly uncertainty. As shown in Fig. 3, the input image is first decomposed into two portions; the orderly images are located in the second row and the disorderly images in the bottom row. Under the WN, the orderly image Fig. 3(e) is highly similar to Fig. 3(d), and their entropies are almost the same. With the JP2K distortion, however, the orderly image (Fig. 3(f)) is much different from Fig. 3(d), and its entropy is much smaller than that of Fig. 3(d). On the other hand, the white noise increases the disorderly degree of Fig. 3(b), and the energy of its disorderly uncertainty portion (Fig. 3(h)) is larger than that of the reference image (Fig. 3(g)). With the blurring effect of the JP2K distortion, the disorderly uncertainty is decreased and its corresponding energy is also decreased (Fig. 3(i)).
Fig. 3. Visual information degradation analysis. From the top row to the bottom row: the original images, the primary visual information portions, and the disorderly uncertainty portions (their pixel values are scaled to [0, 255] for a clear view). Panels (a)-(i).

TABLE I
PERFORMANCE OF IQA INDICES ON LIVE DATABASE
With discriminative measurement of the information fidelities of the orderly portion and the disorderly portion, the proposed RR IQA index can accurately evaluate the degradations of the two contaminated images (i.e., Fig. 3(b) and (c)). According to the proposed index, the WN contaminated image (Fig. 3(b)) mainly increases the energy of the disorderly portion and has little effect on the orderly portion, whereas the JP2K contaminated image (Fig. 3(c)) has severe degradation on the orderly portion and slightly decreases the energy of the disorderly portion. As a result, the measurement results from the proposed index (Fig. 3(b) has a better quality than Fig. 3(c)) are consistent with the subjective evaluation results (represented by the DMOS values).
B. Performance Comparison
In order to make a comprehensive analysis, we verify the proposed RR IQA index on two large databases: the LIVE database [18], which comprises five prevailing distortion types (i.e., JPEG2000, JPEG, white noise, Gaussian blur, and fast fading) across 799 distorted images; and the TID database [19], which comprises 17 types of distortion across 1700 distorted images. Furthermore, three well-known and/or recent RR IQA indices (RRED [7], FEDM [10], and WNISM [3]) and two classical FR IQA indices (PSNR and MS-SSIM [2]) are chosen for comparison.
According to the performance evaluation standard proposed by the Video Quality Experts Group (VQEG) [20], three performance criteria, namely the Spearman rank-order correlation coefficient (SRCC), the Pearson linear correlation coefficient (PLCC), and the root mean squared error (RMSE), are adopted to evaluate the performance of the IQA methods. A better IQA index has higher SRCC and PLCC values and a lower RMSE value.
A successful RR IQA index is expected to use less data of the reference images and achieve higher evaluation accuracy. Therefore, the reference data of the IQA indices are first listed in Table I. As can be seen, the reference data of the proposed index, FEDM, WNISM, the single-scalar-based RRED, and the large-scalar-based RRED are 30, 32, 162, 32, and a quantity that depends on N (where N is the size of the image in pixels), respectively. For a fair comparison (the reference data of the large-scalar-based RRED are much larger than those of the proposed index, and the reference data of the single-scalar-based RRED are similar to those of the proposed index), we mainly compare the proposed index with the single-scalar-based RRED.
The evaluation results on the LIVE and TID databases are listed in Tables I and II. As shown in Table I, compared with the three RR IQA indices (i.e., FEDM, WNISM, and RRED), the proposed RR IQA index performs the best on all five types of distortion. In addition, the proposed RR IQA index outperforms the FR PSNR index on four out of five types of distortion (i.e., JPEG2000, JPEG, Gaussian blur, and fast fading); it performs similarly to the FR MS-SSIM index on Gaussian blur and fast fading, and a little worse on the other three types of distortion. On the TID database (as shown in Table II), the proposed index performs much better than WNISM and RRED on fourteen out of seventeen types of distortion, and is comparable with the two FR IQA indices (i.e., PSNR and MS-SSIM) on almost all of the types of distortion. In summary, the proposed index performs very well on each type of distortion.
The overall performance of the proposed index on the whole database is also given in Tables I and II. On the LIVE database, the overall performance of the proposed index is similar to that of WNISM and a little worse than that of RRED. On the TID database, the overall performances of the three RR IQA indices are similar (the proposed index is slightly better than WNISM and RRED). With further analysis, we have found that different types of distortion generate different degradations on the orderly and disorderly portions, so the two portions should not be combined with two fixed parameters. Instead, the two parameters should be determined based on the distortion type. We can improve the overall performance on the LIVE database with the help of a classification procedure (as used in [21]). For example, we randomly select 15 reference images (out of the 29 reference images in the LIVE database) and their corresponding distorted images for training. Through the training, the distortion classifier is acquired. Meanwhile, the two parameters for each type of distortion are obtained by minimizing the error between the computed scores and the DMOS values. Then, the remaining test images are classified and evaluated.
TABLE II
SRCC VALUES OF IQA INDICES ON TID DATABASE
We conducted this procedure 100 times (to ensure that the algorithm is robust), and the average SRCC value for the LIVE database is 0.867. However, with many more distortion types, the classification method introduced in [21] is not efficient enough for the TID database, and further studies are needed in this regard.
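The repeated random split can be sketched as below. This only illustrates the protocol of splitting by reference image (so that all distorted versions of one reference fall on the same side) and averaging over trials; the function names, the seeding scheme, and the caller-supplied `evaluate` callback, which would train the classifier and parameters and return a score such as SRCC, are all hypothetical.

```python
import random


def split_by_reference(reference_ids, n_train=15, seed=None):
    """Randomly pick `n_train` reference images for training; the rest test.

    Splitting on reference identifiers keeps every distorted version of a
    reference image in the same subset.
    """
    rng = random.Random(seed)
    ids = sorted(set(reference_ids))
    train = set(rng.sample(ids, n_train))
    test = set(ids) - train
    return train, test


def run_trials(reference_ids, evaluate, n_trials=100):
    """Average a caller-supplied score over repeated random splits.

    `evaluate(train_ids, test_ids)` is a hypothetical callback that trains
    on the first set and returns a score measured on the second.
    """
    scores = [
        evaluate(*split_by_reference(reference_ids, seed=trial))
        for trial in range(n_trials)
    ]
    return sum(scores) / len(scores)
```

With 29 reference images and 15 used for training, each trial leaves 14 reference images and their distorted versions for testing.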
IV. CONCLUSION
In this letter, inspired by recent brain theory, a visual information fidelity based RR IQA index is introduced. The IGM theory indicates that the HVS actively predicts the primary visual information and tries to avoid the residual uncertainty for image perception and understanding. Distortions therefore separately degrade the primary visual information and change the disorderly uncertainty, and we should discriminatively evaluate their degradations. We first decompose an image into predicted/orderly and uncertain/disorderly portions. Then, according to information theory, we compute the quantities of visual information of the two portions. Finally, we separately evaluate the information fidelities on the two portions and combine the two results to obtain the overall quality. Experimental results demonstrate that the proposed RR IQA index needs few bits to achieve high consistency with human perception.
REFERENCES
[1] W. Lin and C.-C. J. Kuo, “Perceptual visual quality metrics: A survey,” J. Visual Commun. Image Represent., vol. 22, no. 4, pp. 297–312, 2011.
[2] Z. Wang, E. Simoncelli, and A. Bovik, “Multiscale structural similarity for image quality assessment,” in Conf. Record 37th Asilomar Conf. Signals, Systems and Computers, 2003, vol. 2, pp. 1398–1402.
[3] Z. Wang and E. P. Simoncelli, “Reduced-reference image quality assessment using a wavelet-domain natural image statistic model,” in SPIE, 2005, vol. 5666, pp. 149–159.
[4] Q. Li and Z. Wang, “Reduced-reference image quality assessment using divisive normalization-based image representation,” IEEE J. Select. Topics Signal Process., vol. 3, no. 2, pp. 202–211, Apr. 2009.
[5] X. Gao, W. Lu, D. Tao, and X. Li, “Image quality assessment based on multiscale geometric analysis,” IEEE Trans. Image Processing, vol. 18, no. 7, pp. 1409–1423, Jul. 2009.
[6] L. Ma, S. Li, F. Zhang, and K. N. Ngan, “Reduced-reference image quality assessment using reorganized DCT-based image representation,” IEEE Trans. Multimedia, vol. 13, no. 4, pp. 824–829, Aug. 2011.
[7] R. Soundararajan and A. Bovik, “RRED indices: Reduced reference entropic differencing for image quality assessment,” IEEE Trans. Image Process., vol. 21, no. 2, pp. 517–526, Feb. 2012.