In Proc. Document Analysis and Recognition, Seoul, Korea, 2005.
space via a centroid and a certain number of base vectors can be computed by applying Principal Component Analysis (PCA) to the face images and retaining only a small number of dimensions. The discrimination of individuals, i.e. between different face classes, can be achieved by using the projection vectors as features.
Besides PCA, many different methods for computing abstract representations from example images only have been proposed for appearance-based object recognition (cf. [6]), including LDA, ICA, or Wavelet representations. All approaches rely on a good localization of characteristic object features in the example images. For face recognition, for example, the centers of eyes and mouth are usually aligned prior to the analysis.
The easiest way of using appearance-based techniques for handwriting recognition is to apply them to the classification of isolated characters or numerals (cf. [5]). Frame images extracted from text lines in sliding-window approaches can also be subjected to an appearance-based analysis, e.g. by computing an Eigenspace representation via PCA. In [4] this approach was applied to the recognition of cursively written isolated words. However, more recent publications on handwriting recognition no longer consider purely appearance-based features. In particular, a similar approach has not yet been investigated for the challenging task of unconstrained handwritten text recognition.
Due to the much wider variability in shape and style of handwritten texts, the characteristic features of characters that are usually captured by analyzing geometric primitives are less well localized within the character or frame images. Consequently, appearance-based methods need to take into account a much larger degree of appearance variability, only a minor part of which is relevant for the discrimination of different characters. It can therefore be expected that preprocessing and normalization methods have a much greater impact on the performance of appearance-based feature representations than on those relying on some sort of geometric abstraction.
3. Reference Recognition System
The system for unconstrained handwritten text recognition that we use as a reference for our experiments is a state-of-the-art segmentation-free recognition system based on HMMs, which was successfully applied to challenging writer-independent recognition tasks [12, 13, 14].
After text line extraction the handwriting is normalized with respect to skew, baseline orientation, and slant. Additionally, the line images are re-sized in order to normalize the character width: the image is scaled such that the average distance between local minima of the text contour equals a given parameter (25 pixels). After binarization of the normalized text lines, frames of constant width (4 pixels) and of the (varying) height of the text line are extracted with some overlap (2 pixels). On each of the frames 9 geometric features (see [12] for details) together with a discrete approximation of their first-order derivatives are computed. The handwriting model consists of semi-continuous HMMs with Bakis topology and a varying number of states for context-independent characters (both upper and lower case), numerals, punctuation symbols, and white space (75 models in total). The emissions of the models in the 18-dimensional feature space are described by state-specific continuous mixture densities based on a shared set of component densities (Gaussians with diagonal covariance matrices).
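The sliding-window frame extraction described above can be illustrated with a minimal sketch. It assumes that the line image is given as a list of pixel rows and that incomplete frames at the right border are simply dropped; the function names `frame_starts` and `extract_frames` are hypothetical and not taken from the reference system.

```python
def frame_starts(line_width, frame_width=4, overlap=2):
    """Return the left column index of every complete frame in a text line."""
    step = frame_width - overlap  # horizontal shift between successive frames
    if step <= 0:
        raise ValueError("overlap must be smaller than frame_width")
    return list(range(0, line_width - frame_width + 1, step))


def extract_frames(image, frame_width=4, overlap=2):
    """Cut a line image (list of pixel rows) into overlapping frames.

    Every frame keeps the full height of the line image, as in the
    reference system; only the horizontal extent is restricted.
    """
    line_width = len(image[0])
    return [
        [row[x:x + frame_width] for row in image]
        for x in frame_starts(line_width, frame_width, overlap)
    ]
```

With a frame width of 4 pixels and an overlap of 2 pixels the window advances by 2 pixels per frame, so a line image of width 10 yields frames starting at columns 0, 2, 4, and 6.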
4. Appearance-Based Features
In a segmentation-free text recognition framework appearance-based analysis methods can be applied for extracting features from the individual frame images that result from sliding-window processing of the text lines. In order for an analytic transformation, e.g. PCA, to produce useful results on such data, it needs to be ensured that related elements of the writing appear at roughly the same position in the frame images. Therefore, when extracting frames from normalized text lines, the position of the estimated baseline is mapped to a specific position in the frame image.
Due to variation in writing style, size normalization based on estimated parameters of the writing, e.g. average character width or core height, will still produce normalized text-line images with a large variation in overall height. As appearance-based analysis techniques require input images of constant size, the height variations have to be coped with during frame extraction. We investigated two possibilities. In the first configuration, for which the majority of experiments were performed, a scaling factor for the mapping of normalized text images to frames was determined such that all image content above the baseline was mapped exactly to the upper portion of the extracted frame image. The same scaling factor was used for mapping the descenders accordingly. If the descender area was not big enough to fill the corresponding area in the extracted frame completely, the remaining pixels were assigned the maximum grey value in the source image, i.e. the background intensity. In the second configuration the frame images were not re-scaled but merely cropped from the normalized text lines. The vertical position of the cropped image region was determined by the baseline estimate. Pixels not defined via the source image were again mapped to the background intensity.
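The second (cropping) configuration can be sketched as follows. This is only an illustration under stated assumptions: the image is a list of grey-value rows, the background intensity is 255 (the maximum of an 8-bit image), and the baseline is placed at a fixed fraction of the frame height. The name `crop_frame` and its parameters are hypothetical.

```python
def crop_frame(image, x0, width, baseline_row, frame_height=128,
               baseline_fraction=0.67, background=255):
    """Crop a frame of constant size whose position follows the baseline.

    The estimated baseline (row index `baseline_row` in the source image)
    is mapped to row round(baseline_fraction * frame_height) of the frame.
    Pixels that fall outside the source image get the background value.
    """
    target_row = int(round(baseline_fraction * frame_height))
    top = baseline_row - target_row  # source row that maps to frame row 0
    frame = []
    for r in range(frame_height):
        src = top + r
        if 0 <= src < len(image):
            row = image[src]
            frame.append([row[x] if 0 <= x < len(row) else background
                          for x in range(x0, x0 + width)])
        else:
            frame.append([background] * width)  # above or below the line image
    return frame
```

Because no re-scaling takes place, tall ascenders or long descenders may be cut off, while short ones leave part of the frame filled with background.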
Both frame extraction procedures described above generate a sequence of frame images of constant size from the normalized text lines. These can then be directly subjected to an analytic image transform in order to compute appearance-based feature sets. In the work reported here we considered Principal Component Analysis (PCA) and the Discrete Wavelet Transform (DWT).
For PCA the frame images are considered as vectors in a high-dimensional space. From the training data their mean and covariance matrix are computed. The Eigenvectors of the covariance matrix belonging to the largest Eigenvalues represent those directions in frame-image space along which the data varies most. Those variations are also considered to be the most characteristic aspects of the frame images with respect to the recognition of handwriting. Therefore, the projections of the frame images onto the first few Eigenvectors can be used as features.
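The PCA steps named above (mean, covariance matrix, leading Eigenvectors, projection) can be sketched in plain Python. The sketch assumes the sample covariance with divisor n-1 and approximates the Eigenvectors by power iteration with deflation; this is an illustrative choice, not necessarily the procedure used for the experiments, and for real frame dimensions a linear algebra library would be preferable. All function names are hypothetical.

```python
import math
import random


def mean_and_covariance(vectors):
    """Return the mean vector and the sample covariance matrix."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(d)]
    centered = [[v[i] - mean[i] for i in range(d)] for v in vectors]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return mean, cov


def mat_vec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def leading_eigenvectors(cov, k, iterations=500, seed=0):
    """Approximate the k Eigenvectors with the largest Eigenvalues."""
    d = len(cov)
    rng = random.Random(seed)
    work = [row[:] for row in cov]  # deflated copy of the covariance matrix
    result = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = mat_vec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left in the deflated matrix
                break
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(work, v)))
        result.append((eigenvalue, v))
        # Deflation: remove the component that was just found.
        for i in range(d):
            for j in range(d):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return result


def project(vector, mean, eigenvectors):
    """Project a mean-centered frame vector onto the given Eigenvectors."""
    return [sum((x - m) * e for x, m, e in zip(vector, mean, ev))
            for ev in eigenvectors]
```

The values returned by `project` for the first few Eigenvectors correspond to the feature vector of a frame; note that the sign of each Eigenvector, and hence of each feature, is arbitrary.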
An interesting aspect of PCA is that the Eigenvectors used for the analysis can be visualized easily as if they were elements of the source data, i.e. as frame images. Such a visualization of the first 50 Eigenvectors, where the vector components were re-scaled to the range of 256 grey values, is shown in Fig. 1. Especially in the first few of the “Eigenframes” one can easily see the structures corresponding to the core area of the writing analyzed.
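The re-scaling used for such a visualization can be sketched as a linear mapping of the vector components to the grey values 0 to 255. The row-major pixel layout assumed in `as_image` and both function names are assumptions made for illustration only.

```python
def to_grey_values(vector, levels=256):
    """Linearly map the components of an Eigenvector to 0 .. levels-1."""
    lo, hi = min(vector), max(vector)
    if hi == lo:
        return [0] * len(vector)  # constant vector: nothing to visualize
    scale = (levels - 1) / (hi - lo)
    return [int(round((x - lo) * scale)) for x in vector]


def as_image(values, width):
    """Reshape a flat list into pixel rows, assuming row-major order."""
    return [values[i:i + width] for i in range(0, len(values), width)]
```

For an Eigenvector computed from 8x128 pixel frames, `as_image(to_grey_values(v), 8)` would yield 128 rows of 8 grey values each.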
Discrete Wavelet analysis of 2-dimensional data is based on a certain type of mother Wavelet (we use Daubechies of 2nd order) and produces a representation split up into approximation and detail coefficients for the vertical and horizontal directions (cf. [10]). For a source image four blocks of coefficients, each one fourth the image size, are obtained. The Wavelet analysis can be applied recursively to the approximation coefficients obtained. As the frame images considered here are only a few pixels wide (8x128 pixel frames), only two steps of this multi-resolution analysis could reasonably be performed. Usually, when applying Wavelet transforms for feature extraction, the approximation coefficients together with some of the detail coefficients are used as features. In order to obtain a certain target feature vector dimension in a more flexible way, we performed a PCA on the Wavelet coefficients themselves. The projection vectors obtained from this final transform were used as features.
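One step of such a 2-dimensional analysis can be sketched as follows. The sketch assumes that "Daubechies of 2nd order" refers to the four-tap Daubechies filter and that the signal is extended periodically at the borders; the border treatment actually used is not stated in the text. The function names and the block labels `ll`, `lh`, `hl`, `hh` are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Four-tap Daubechies low-pass (scaling) filter.
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
       (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Matching high-pass (wavelet) filter: g[k] = (-1)^k * h[3 - k].
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]


def dwt_1d(signal):
    """One analysis step with periodic extension: (approximation, detail)."""
    n = len(signal)
    if n % 2:
        raise ValueError("signal length must be even")
    approx, detail = [], []
    for i in range(0, n, 2):
        window = [signal[(i + k) % n] for k in range(4)]
        approx.append(sum(h * x for h, x in zip(LOW, window)))
        detail.append(sum(g * x for g, x in zip(HIGH, window)))
    return approx, detail


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def split_columns(block):
    """Apply dwt_1d to every column; return the low and high blocks."""
    a_cols, d_cols = [], []
    for col in transpose(block):
        a, d = dwt_1d(col)
        a_cols.append(a)
        d_cols.append(d)
    return transpose(a_cols), transpose(d_cols)


def dwt_2d(image):
    """One 2-D step: filter the rows first, then the columns.

    Returns four blocks, each one fourth the size of the input:
    ll (approximation) and lh, hl, hh (detail coefficients).
    """
    lows, highs = [], []
    for row in image:
        a, d = dwt_1d(row)
        lows.append(a)
        highs.append(d)
    ll, lh = split_columns(lows)
    hl, hh = split_columns(highs)
    return ll, lh, hl, hh
```

Applied to a frame of 128 rows and 8 columns, each block has 64 rows and 4 columns. Applying `dwt_2d` again to `ll` gives blocks that are only 2 pixels wide, which is already narrower than the four-tap filter.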
5. Results
In order to evaluate the performance of different appearance-based feature extraction methods we conducted a series of writer-independent recognition experiments on the IAM database of handwritten texts [8]. The database consists of several hundred documents scanned at 300 dpi, which were generated by having subjects write short paragraphs of text from several different text categories. The documents collected represent truly unconstrained handwriting, as no instructions concerning the writing style were given.
As in our previous experiments (cf. [12, 13, 14]) we used all documents from text categories A to D (485 documents, 4222 extracted text lines) for training and the documents from categories E and F (129 documents, 1076 extracted text lines) for testing.
After feature extraction semi-continuous HMMs with a codebook of approximately 2k densities were trained for the 75 symbol models used. During recognition the use of a lexicon or a statistical language model was deliberately avoided in order to be able to observe the effect of different feature representations without a possible bias resulting from higher-order models. Consequently, no restrictions were imposed on the hypothesized character sequences. As performance measure we computed the Character Error Rate (CER) of the recognition results with respect to the reference transcription of the data.
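The exact definition of the CER is not spelled out in the text. A common definition, assumed in the following sketch, divides the number of character substitutions, insertions, and deletions (the edit distance between hypothesis and reference) by the number of characters in the reference transcription. The function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, start=1):
        cur = [i]
        for j, h in enumerate(hypothesis, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Total edit distance divided by the total reference length."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total
```

Under this definition a hypothesis "abxd" for the reference "abcd" contains one substitution in four characters, i.e. a CER of 25%.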
The results of the extensive experiments are summarized in Table 1. In the upper section various configurations of appearance-based feature extraction methods are listed. The results of the reference system using geometric features are shown in the lower section of the table. Compared to the best configuration of the reference system with a CER of only 26% all appearance-based feature extraction methods
No.  Method    Size normalization  Frame extraction  Frame size  Feature dim.  CER
 1.  PCA       avg. char. width    baseline          1x128       25            41.5%
 2.            (25 pixels)         at 75%,                       25+25∆        37.5%
 3.                                scaling           4x128       25            38.2%
 4.                                                              50            38.7%
 5.                                                  8x128       25            33.8%
 6.                                                              25+25∆        33.7%
 7.                                                              50            34.5%
 8.                                                              128           35.1%
 9.  DWT+PCA                                         8x128       25            34.0%
10.                                                              25+25∆        32.8%
11.                                                              50            33.6%
12.  PCA       core height         baseline          8x128       25            40.8%
13.            (30 pixels)         at 67%,                       25+25∆        37.7%
15.                                                              50+50∆        38.0%
Table 1. Comparison of different appearance-based feature sets
1 Both normalization methods produce approximately the same total number of frames for the training set.
2 The rather high character error rates reported in the experiments are due to the extremely challenging task of unconstrained handwritten text recognition considered and the fact that no restrictions whatsoever were imposed on the hypothesized character sequences.
tional context is considered via dynamic features (10% relative reduction from experiment 2 to 6). However, for the geometric features an increase of the frame size to 8 pixels width decreases the performance. Due to the nature of this feature set, frames of a single pixel width cannot be used at all.
The use of dynamic features always improves performance, though this effect is much more pronounced for geometric features (10% relative reduction of CER from experiment 17 to 18) than for the best performing appearance-based configuration using PCA-transformed Wavelet coefficients (only 4% improvement from experiment 9 to 10).
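Dynamic features extend every feature vector by an approximation of its first-order derivative along the sequence of frames, so that a 25-dimensional vector becomes a 50-dimensional one (written as 25+25∆ in Table 1). How the derivative is approximated is not specified here; the following sketch assumes a simple central difference with the border frames repeated, and the function name is hypothetical.

```python
def add_delta_features(frames):
    """Append a central-difference derivative estimate to each vector.

    `frames` is the sequence of feature vectors of one text line. At the
    borders the first and last vector are repeated.
    """
    n = len(frames)
    extended = []
    for t, vec in enumerate(frames):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        delta = [(b - a) / 2.0 for a, b in zip(prev, nxt)]
        extended.append(list(vec) + delta)
    return extended
```

Through the derivative components every feature vector carries some information about its neighbouring frames.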
Compared to the rather simple PCA-based feature sets, the improvement achieved by applying a Wavelet transform is rather small (only approximately 3% improvement from experiment 6 to 10). The reason for this is most likely that the multi-resolution analysis cannot be exploited fully due to the extremely small width of the frame images analyzed.
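The relative improvements quoted for the appearance-based configurations can be checked against the CER values listed in Table 1. The short sketch below uses only the values of experiments 2, 6, 9, and 10; the function name `relative_reduction` is hypothetical.

```python
def relative_reduction(cer_before, cer_after):
    """Relative CER reduction in percent."""
    return (cer_before - cer_after) / cer_before * 100.0


# CER values (in percent) of experiments 2, 6, 9, and 10 from Table 1.
cer = {2: 37.5, 6: 33.7, 9: 34.0, 10: 32.8}

frame_width_gain = relative_reduction(cer[2], cer[6])    # about 10.1
dynamic_gain = relative_reduction(cer[9], cer[10])       # about 3.5
wavelet_gain = relative_reduction(cer[6], cer[10])       # about 2.7
```

The three values agree with the rounded figures of 10%, 4%, and approximately 3% given in the text.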
The most important observation is, however, that the combination of appropriate size normalization and frame extraction methods is crucial for the performance of both appearance-based and geometric feature sets. When normalizing the average core height instead of the average character width, the performance of geometric features degrades by more than 45% relative (experiments 18 to 16). With this normalization and cropped frame images, PCA-based features even outperform the geometric feature set (experiment 13). This observation suggests that more research is required with respect to a robust method for size normalization and frame extraction that optimally complements the appearance-based features computed from the frame images by PCA or DWT on highly varying writer-independent handwriting data.
6. Conclusion
In this paper we presented an experimental analysis of different methods for computing appearance-based feature sets, namely using PCA or discrete Wavelet transforms, for the writer-independent recognition of handwritten texts in a segmentation-free framework based on HMMs. The extensive experiments performed show that promising results with respect to the state-of-the-art reference system using geometric features could be achieved. The still existing performance gap and the observed strong impact of normalization steps indicate that these aspects need to be optimized together with the appearance-based methods applied in order to reach the performance of the geometric feature set. This would make it possible to design powerful feature sets for handwriting recognition in a completely data-driven manner.
7. Acknowledgment
We would like to thank the Institute of Informatics and Applied Mathematics, University of Bern, namely Horst Bunke and Urs-Viktor Marti, who allowed us to use the IAM database of handwritten forms [8] for our recognition experiments.
References
[1] K. Aas and L. Eikvil. Text page recognition using grey-level features and hidden Markov models. Pattern Recognition, 29(6):977–985, 1996.
[2] I. Bazzi, R. Schwartz, and J. Makhoul. An omnifont open-vocabulary OCR system for English and Arabic. IEEE Trans. on Pattern Analysis and Machine Intelligence, 21(6):495–504, 1999.
[3] H. Bunke, M. Roth, and E. G. Schukat-Talamazzini. Off-line cursive handwriting recognition using Hidden Markov Models. Pattern Recognition, 28(9):1399–1413, 1995.
[4] W. Cho, S.-W. Lee, and J. H. Kim. Modeling and recognition of cursive words with hidden Markov models. Pattern Recognition, 28(12):1941–1953, 1995.
[5] S. E. N. Correia, J. M. de Carvalho, and R. Sabourin. On the performance of wavelets for handwritten numerals recognition. In Proc. Pattern Recognition, volume 3, pages 127–130, Québec, 2002.
[6] R. B. Fisher. CVonline: The evolving, distributed, non-proprietary, on-line compendium of computer vision. homepages.inf.ed.ac.uk/rbf/CVonline/.
[7] U.-V. Marti and H. Bunke. Handwritten sentence recognition. In Proc. Pattern Recognition, volume 3, pages 467–470, Barcelona, 2000.
[8] U.-V. Marti and H. Bunke. The IAM-database: An English sentence database for offline handwriting recognition. Int. Journal on Document Analysis and Recognition, 5(1):39–46, 2002.
[9] A. W. Senior and A. J. Robinson. An off-line cursive handwriting recognition system. Pattern Analysis and Machine Intelligence, 20(3):309–321, 1998.
[10] E. J. Stollnitz, T. D. DeRose, and D. H. Salesin. Wavelets for computer graphics: A primer, part 1. IEEE Computer Graphics and Applications, 15(3):76–84, 1995.
[11] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuro Science, 3(1):71–86, 1991.
[12] M. Wienecke, G. A. Fink, and G. Sagerer. Experiments in unconstrained offline handwritten text recognition. In Proc. 8th Int. Workshop on Frontiers in Handwriting Recognition, Niagara on the Lake, Canada, August 2002.
[13] M. Wienecke, G. A. Fink, and G. Sagerer. Towards automatic video-based whiteboard reading. In Proc. Int. Conf. on Document Analysis and Recognition, pages 87–91, Edinburgh, 2003.
[14] M. Wienecke, G. A. Fink, and G. Sagerer. Video-based whiteboard reading. Int. Journal on Document Analysis and Recognition, to appear.