Color Image Processing Pipeline in Digital Still Cameras
Rajeev Ramanath (1), Wesley E. Snyder (1), Youngjun Yoo (2), Mark S. Drew (3)
(1) Dept. of Elec. and Comp. Engg., NC State University, Raleigh, NC 27695-7911
(2) Imaging and Audio Group, Texas Instruments Inc., Dallas, TX 75243
(3) School of Computing Science, Simon Fraser University, Vancouver, BC V5A 1S6, Canada
Abstract
Digital Still Color Cameras have gained significant popularity in recent years, with projected sales on the order of 44 million units by the year 2005. Such an explosive demand calls for an understanding of the processing involved and the implementation issues, bearing in mind the otherwise difficult problems these cameras solve. This article presents an overview of the image processing pipeline, first from a signal processing perspective and later from an implementation perspective, along with the trade-offs involved.
1 Image Formation
A good starting point to fully comprehend the signal processing performed in a digital still color camera (DSC) is to consider the steps by which images are formed and how each stage impacts the final rendered image. There are two distinct aspects of image formation: one that has a colorimetric perspective and another that has a generic imaging perspective, and we treat these separately.
In a vector space model for color systems, a reflectance spectrum r(λ) sampled uniformly in a spectral range [λ_min, λ_max] interacts with the illuminant spectrum L(λ) to form a projection onto the color space of the camera RGB c as follows:

    c = N(S^T L M r + n)    (1)
where S is a matrix formed by stacking the spectral sensitivities of the K color filters used in the imaging system column-wise, L is a diagonal matrix with samples of the illuminant spectrum along its diagonal, M is another diagonal matrix with samples of the relative spectral sensitivity of the CCD sensor, r is a vector corresponding to the relative surface spectral reflectance of the object, and n is an additive noise term (in fact, noise may even be multiplicative and signal-dependent, making matters mathematically much more complex) [1, 2]. N corresponds to the nonlinearity introduced in the system. A similar algebraic setting exists for color formation in the human eye, producing tristimulus values, where A is used to denote the equivalent of matrix S and t denotes the tristimulus values in the CIE XYZ space [3].

Let us denote by f an image with M rows, N columns, and K spectral bands. DSCs typically use K = 3, although one may conceive of a sensor with more spectral bands. In some recent cameras four sensors are used (red, green, blue and emerald, or cyan, magenta, yellow and green).^1 The image may also be considered as a two-dimensional array with vector-valued pixels. Each vector-valued pixel is formed according to the model in Eqn. 1, with values determined by the reflectance and illumination at the 3D world-point indexed by 2D camera pixel position. The image formed is then further modeled as follows:
    g = B(Hf)    (2)

where B is a color filter array (CFA) sampling operator, H is the point spread function (a blur) corresponding to the optical system, and f is a lexicographic representation of the full-color image in which each pixel is formed according to Eqn. 1.
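The two models can be sketched numerically. The following is a minimal toy simulation, not the authors' implementation: the spectra are random placeholders, the nonlinearity N is assumed to be a simple power law, and all dimensions (31 spectral samples, K = 3 filters) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 31 spectral samples (e.g. 400-700 nm at 10 nm), K = 3 filters.
n_lambda, K = 31, 3

S = rng.random((n_lambda, K))          # filter sensitivities stacked column-wise
L = np.diag(rng.random(n_lambda))      # illuminant spectrum along the diagonal
M = np.diag(rng.random(n_lambda))      # relative spectral sensitivity of the CCD sensor
r = rng.random(n_lambda)               # relative surface spectral reflectance
noise = 0.01 * rng.standard_normal(K)  # additive noise term n

linear = S.T @ L @ M @ r + noise       # the linear part of Eqn. 1

gamma = 1 / 2.2                        # a power-law stand-in for the nonlinearity N
c = np.clip(linear, 0.0, None) ** gamma  # camera RGB for one pixel

print(c.shape)  # (3,)
```

Each pixel of the full-color image f would be formed this way; Eqn. 2 then applies the optical blur H and the CFA sampling B to f as a whole.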
Although there is overlap with color processing problems for other devices, such as scanners and printers, working with DSCs is complicated by problems stemming from the manner in which the input image is captured: the spatial variation in the scene lighting, the non-fixed scene geometry (location and orientation of light source, camera and surfaces in the scene), varying scene illuminants (including combinations of different light sources in the same scene), and the use of a color filter array to obtain one color sample per sensor location, to name a few.
Further resources on DSC issues include an illustrative overview of some of the processing steps involved in a DSC by Adams et al. [4]. A recent book chapter by Parulski and Spaulding details some of these steps [5]. An excellent resource for DSC processing is a book chapter by Holm et al. [6]. The multitude of problems that still remain unsolved is a fascinating source of research in the community.
2 Pipeline
The signal flow chart shown in Fig. 1 briefly summarizes DSC processing. It should be noted that the sequence of operations differs from manufacturer to manufacturer.
Each of these blocks may be "fine-tuned" to achieve "better" systemic performance [7]; e.g., introducing
^1 Depending upon the choice of the spectral sensitivities, the colors captured form color gamuts of different sizes; for DSCs, it is typically important to capture skin tones with reasonable accuracy.
Figure 1: Image processing involved in a Digital Still Color Camera. (The flow chart stages include pre-processing, white balance, demosaic, post-processing, display on a preferred device, and compress and store.)
a small amount of blur using the lens system increases the correlation between neighboring pixels, which in turn may be used in the demosaicking step. Let us now consider each block in Fig. 1.
2.1 Sensor, Aperture and Lens
Although there is a need to measure three (or more) bands at each pixel location, doing so requires the use of more than one sensor and consequently drives up the cost of the camera. As a cheaper and more robust solution, manufacturers place a Color Filter Array (CFA) on top of the sensor element. Of the many CFA patterns available, the Bayer array is by far the most popular [8]. Control mechanisms interact with the sensor, shown in Fig. 1 as a red-green-blue checkered pattern (the CFA), to determine the exposure (aperture size, shutter speed and automatic gain control) and focal position of the lens. These parameters need to be determined dynamically based on scene content. It is conventional to include an infrared blocking filter, called a "hot mirror" as it reflects infrared energy, along with the lens system, since most of the filters used in CFAs are sensitive in the near-infrared part of the spectrum, as is the silicon substrate used in the sensor.
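The CFA sampling operator B of Eqn. 2 can be illustrated for the Bayer pattern. This is a minimal sketch assuming one common convention for the pattern layout (G R / B G; actual cameras vary); the function name and interface are hypothetical.

```python
import numpy as np

def bayer_sample(f):
    """Apply a Bayer CFA sampling operator B to a full-color image f (H x W x 3).

    Assumed pattern (one of several conventions): G R on even rows, B G on odd rows.
    Returns the single-channel mosaic g plus the binary masks, so that
    g = sum over k of mask_k * f[..., k].
    """
    H, W, _ = f.shape
    masks = np.zeros((H, W, 3), dtype=bool)
    masks[0::2, 1::2, 0] = True   # R at even rows, odd columns
    masks[0::2, 0::2, 1] = True   # G at even rows, even columns
    masks[1::2, 1::2, 1] = True   # G at odd rows, odd columns
    masks[1::2, 0::2, 2] = True   # B at odd rows, even columns
    mosaic = np.where(masks, f, 0.0).sum(axis=2)
    return mosaic, masks

f = np.random.default_rng(1).random((4, 4, 3))
g, masks = bayer_sample(f)
print(g.shape)      # (4, 4): one color sample per sensor location
print(masks.sum())  # 16: every pixel is sampled in exactly one band
```

Note that half the samples are green, matching the Bayer array's emphasis on the band that best approximates luminance; demosaicking must later reconstruct the two missing bands at each location.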
Exposure control usually requires characterization of the brightness (or intensity) of the image: an over- or under-exposed image will greatly affect output colors. Depending on the measured energy in the sensor, the exposure control system changes the aperture size and/or the shutter speed, along with a carefully calibrated automatic gain controller, to capture well-exposed images. Both the exposure and focus controls may be based on either the actual luminance component derived from the complete RGB image or simply the green channel data, which is a good estimate of the luminance signal.
The image is divided into blocks [9, 10] as shown in Fig. 2(a). The average luminance signal is measured in each one of these blocks and later combined to form a measure of exposure based upon the type of scene being imaged (a backlit or frontlit scene, a nature shot, etc.). In a typical image, the average luminance signal
Figure 2: Illustration of the division of the image into various blocks over which the luminance signal is measured for exposure control: (a) the image is divided into 24 blocks; (b) sample partitioning of the scene for contrast measurement.
is measured and compared to a reference level, and the amount of exposure is controlled to maintain a constant scene luminance. Backlit or frontlit scenes may be distinguished by measuring the difference between the average luminance signals in the blocks as shown in Fig. 2(b). If the image is excessively frontlit, the average energy in region A will be much higher than that in region B, and vice versa in the case of a backlit scene. The exposure is controlled so as to maintain the difference between the average signals in the two areas, an estimate of the object contrast.
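The block metering described above can be sketched as follows. This is an illustrative toy, not a camera's actual metering algorithm: the 4 x 6 grid matches the 24 blocks of Fig. 2(a), the luma weights are the familiar Rec. 601 values (the green channel alone is also a common proxy, as noted earlier), and the center-versus-surround split is an assumed stand-in for the A/B regions of Fig. 2(b).

```python
import numpy as np

def block_luminance(img, rows=4, cols=6):
    """Average luminance per block, for a rows x cols grid (24 blocks as in Fig. 2(a))."""
    # Rec. 601-style luma from RGB; img is H x W x 3 with values in [0, 1].
    y = 0.299 * img[..., 0] + 0.587 * img[..., 1] + 0.114 * img[..., 2]
    H, W = y.shape
    bh, bw = H // rows, W // cols
    # Crop to a multiple of the block size, then average within each block.
    return y[:rows * bh, :cols * bw].reshape(rows, bh, cols, bw).mean(axis=(1, 3))

def backlit_score(blocks):
    """Contrast estimate: mean of an assumed central region A minus its surround B."""
    center = blocks[1:-1, 1:-1].mean()
    mask = np.ones(blocks.shape, dtype=bool)
    mask[1:-1, 1:-1] = False
    surround = blocks[mask].mean()
    return center - surround  # strongly negative suggests backlit, positive frontlit

img = np.random.default_rng(2).random((120, 180, 3))
blocks = block_luminance(img)
print(blocks.shape)  # (4, 6)
```

An exposure controller would compare `blocks.mean()` to a reference level, and use the sign and magnitude of the contrast score to bias the exposure for backlit or frontlit scenes.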
Figs. 3(a)-(c) illustrate an under-exposed, over-exposed, and well-exposed image, respectively.
Figure 3: Images of a Macbeth ColorChecker chart showing exposure levels: (a) an under-exposed image; (b) an over-exposed version of the same image; (c) a well-exposed image. The images were taken using a manually controlled lens.
Outdoor images (and many indoor ones as well) taken with typical cameras suffer from the problem of limited dynamic range, particularly in the case of an excessively backlit or frontlit scene. Dynamic range refers to the contrast ratio between the brightest pixel and the darkest pixel in an image. The human visual system can adapt to about four orders of magnitude in contrast ratio, while the sRGB system and typical computer monitors and television sets have a dynamic range of about two orders of magnitude. This leads to spatial detail in darker areas becoming indistinguishable from black and spatial detail in bright areas becoming indistinguishable from white.

To address this problem, researchers have used the approach of capturing multiple images of the same scene at varying exposure levels and combining them to obtain a "fused" image that represents the highlight (bright) and shadow (dark) regions of an image in reasonable detail [11]. A detailed discussion of this topic is beyond the scope of this paper; however, we refer the interested reader to the appropriate references [12, 13, 14]. Another interesting approach to this problem is to "squeeze in" two sensors in the location of what was formerly one sensor element, each with a different sensitivity to light, capturing two images of different dynamic ranges and hence effectively increasing the net dynamic range of the camera. Commercial cameras are beginning to incorporate High Dynamic Range (HDR) imaging solutions into their systems, either in software through image processing routines or in hardware by modifying the actual sensor, to facilitate the capture of excessively front- or backlit scenes.
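The exposure-fusion idea can be illustrated with a deliberately naive weighting scheme. This sketch is not any of the cited methods [11-14]: it simply weights each pixel by its closeness to mid-gray so that well-exposed detail from each capture dominates the result, and the Gaussian width `sigma` is an arbitrary assumed parameter.

```python
import numpy as np

def fuse_exposures(stack, sigma=0.2):
    """Naive exposure fusion of grayscale captures in [0, 1].

    Each pixel is weighted by a Gaussian of its distance from mid-gray (0.5),
    so near-black and near-white (poorly exposed) values contribute little.
    """
    stack = np.stack(stack)                                # (n, H, W)
    w = np.exp(-((stack - 0.5) ** 2) / (2 * sigma ** 2))   # well-exposedness weights
    w /= w.sum(axis=0) + 1e-12                             # normalize across exposures
    return (w * stack).sum(axis=0)

rng = np.random.default_rng(3)
scene = rng.random((64, 64))
under = np.clip(scene * 0.3, 0.0, 1.0)  # simulated under-exposure (shadows crushed)
over = np.clip(scene * 3.0, 0.0, 1.0)   # simulated over-exposure (highlights clipped)
fused = fuse_exposures([under, over])
print(fused.shape)  # (64, 64)
```

Real fusion methods operate per color channel, add saturation and contrast terms to the weights, and blend across scales to avoid seams; the references above cover these refinements.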
Focus control may be performed using one of two approaches: active approaches, which typically use a pulsed beam of infrared light from a small source placed near the lens system (called an auto-focus assist lamp) to obtain an estimate of the distance to the object of interest, or passive approaches, which make use of the image formed in the camera to determine the "best" focus. Passive approaches may be further divided into two types: ones that analyze the spatial frequency content of the image and ones that use a phase detection technique to estimate the distance to the object.
Techniques that analyze the spatial frequency content of the image typically divide the image into various regions (much as in the case of exposure control), and the position of the lens is adjusted to maximize the high-spatial-frequency content in the region(s) of interest (the region labeled A in the case of a portrait image). In other words, given that sharp edges are preferred over smooth edges, in cases when the object of interest is in the central region (A) the position of the lens is adjusted so as to maximize the energy of the image gradient in this region. A digital high-pass filter kernel is used to measure the resulting energy as a measure of focus. There are other "smart" measures of focus discussed in the open literature [15, 16] and as intellectual property [17]. Clearly, such techniques require that the scene have high contrast, which is not always the case. Also, such techniques place high computational demands on the camera.
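A minimal sketch of such a high-pass focus measure follows. The choice of a 3 x 3 Laplacian kernel is an assumption for illustration (the text only specifies "a digital high-pass filter kernel"), and the function name and region interface are hypothetical.

```python
import numpy as np

def focus_measure(gray, region=None):
    """Energy of a high-pass response (here a 3x3 Laplacian) over a region of interest.

    Higher values indicate a sharper image; a focus controller would sweep the
    lens position and keep the position that maximizes this measure.
    region: optional (row_start, row_end, col_start, col_end) crop, e.g. region A.
    """
    if region is not None:
        r0, r1, c0, c1 = region
        gray = gray[r0:r1, c0:c1]
    # Laplacian via shifted copies; computed on the valid interior only.
    lap = (gray[:-2, 1:-1] + gray[2:, 1:-1] + gray[1:-1, :-2] + gray[1:-1, 2:]
           - 4.0 * gray[1:-1, 1:-1])
    return float((lap ** 2).sum())

rng = np.random.default_rng(4)
sharp = rng.random((32, 32))
# A crude 3x3 box blur stands in for a defocused version of the same scene.
blurred = (sharp[:-2, :-2] + sharp[:-2, 1:-1] + sharp[:-2, 2:] +
           sharp[1:-1, :-2] + sharp[1:-1, 1:-1] + sharp[1:-1, 2:] +
           sharp[2:, :-2] + sharp[2:, 1:-1] + sharp[2:, 2:]) / 9.0
print(focus_measure(sharp) > focus_measure(blurred))  # True
```

The comparison shows why the method needs scene contrast: on a flat, low-contrast region the measure is near zero at every lens position, giving the controller nothing to maximize.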
Techniques that use phase detection utilize the phase difference between the energy measured from the two halves of the lens, much like in the case of a split-image rangefinder used in film-based single-