Vol. 33, No. 5 ACTA AUTOMATICA SINICA May, 2007
Study of Feature Extraction Based on Autoregressive Modeling in ECG Automatic Diagnosis
GE Ding-Fei¹ HOU Bei-Ping¹ XIANG Xin-Jian¹
Abstract This article explores the ability of the multivariate autoregressive (MAR) model and the scalar AR model to extract features from two-lead electrocardiogram (ECG) signals in order to classify certain cardiac arrhythmias. The classification performance of four different ECG feature sets based on the model coefficients is shown. The data in the analysis, including normal sinus rhythm, atrial premature contraction, premature ventricular contraction, ventricular tachycardia, ventricular fibrillation and supraventricular tachycardia, are obtained from the MIT-BIH database. The classification is performed using a quadratic discriminant function. The results show that the MAR coefficients produce the best results among the four ECG representations and that MAR modeling is a useful classification and diagnosis tool.
Key words Autoregressive model, ECG features, classification, automatic diagnosis
1 Introduction
One of the most important tasks in automatic monitoring and diagnosis is the reliable detection and classification of arrhythmias. Among the threatening arrhythmias, ventricular tachycardia (VT) and ventricular fibrillation (VF) are the most dangerous because they produce haemodynamic deterioration. Other arrhythmias, such as premature ventricular contraction (PVC), are less lethal but are still important for diagnosing heart diseases. Various methods have been proposed for the classification of cardiac arrhythmias, such as analysis of peaks in the short-term autocorrelation function [1], time-frequency analysis [2], nonlinear dynamical modeling [3, 4], a total least squares based Prony modeling algorithm [5], correlation waveform analysis [6], and artificial neural networks for decimated ECG analysis [7]. Generally, these techniques classify only two or three arrhythmias; there is therefore a need to extend the identification technique to a larger number of arrhythmias while keeping real-time implementation easy.
Multivariate autoregressive (MAR) modeling provides an approach to analyse bio-signals. For example, MAR modeling has been widely used to model heart rate (HR), blood pressure (BP) and respiration (RESP) for assessment of the interaction between them [8]. MAR modeling has also been used to extract features from the human electroencephalogram with which mental tasks can be discriminated [9]. However, in the study of ECG arrhythmia recognition, little research has been done using the MAR model and multiple-lead ECGs. Scalar autoregressive (AR) modeling has been widely utilized to model bio-signals for the purpose of analysis, for example AR modeling of scalar time signals based on the Kalman filter for calculating instantaneous measures of linear dependence [10], AR modeling of heart rate variability (HRV) and power spectrum estimation of ECG and HRV signals [11], and AR coefficients used as ECG features for classification of cardiac arrhythmias using fuzzy ARTMAP [12]. It is noted that normal ECG QRS complexes are usually prominent in ECG lead II, whereas normal beats are frequently difficult to discern in ECG lead V1, although ectopic beats will often be more prominent there. Thus, two-lead ECG signals contain more information than one-lead ECG signals, and the classification results can be improved significantly by using two-lead ECG signals.
Received January 16, 2006; in revised form May 24, 2006
Supported by Natural Science Foundation of Zhejiang Province of P. R. China (Y104284)
1. School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310012, P. R. China
DOI: 10.1360/aas-007-0462
The purpose of the present work is to explore the feasibility of MAR and AR modeling for extracting classification features from two-lead ECG signals in order to classify more types of cardiac arrhythmias with higher accuracy. In this study, MAR and AR modeling were performed on ECG data including normal sinus rhythm (NSR), atrial premature contraction (APC), PVC, VT, VF and supraventricular tachycardia (SVT). Four ECG representations based on the model coefficients were formed, and the classification was performed using a quadratic discriminant function (QDF) based classifier. Three hundred sample patterns from each of the six classes were selected for analysis. The training data set consisted of 150 sample patterns from each of the six classes, and the remaining data were used for testing. The results showed that the MAR coefficients classified better than the other three representations. Thus, MAR modeling is a useful classification and diagnosis tool for cardiac arrhythmias.
2 Methods
2.1 Preprocessing
The data in the analysis were obtained from the MIT-BIH database. The NSR, PVC and APC were sampled at 360 Hz, the VT and VF were sampled at 250 Hz, and the SVT was sampled at 128 Hz. The NSR, PVC, APC and SVT data were resampled so that all the two-lead ECG signals in the analysis had a sampling frequency of 250 Hz. All ECG data were filtered to remove noise, including respiration artifacts, baseline drift and wandering. The high-pass filter has a linear phase characteristic and is designed for the sampling frequency of 250 Hz. The cutoff frequency of the high-pass filter is 2 Hz. Thus, the drift caused by respiration at about 0.2 Hz is sufficiently removed. Other noise caused by electrode motion is also minimized.
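The filtering step can be illustrated with a minimal sketch. The paper does not give the filter design, so everything below is an assumption made for illustration: a symmetric (hence linear-phase) FIR high-pass built by subtracting a Hamming-windowed sinc low-pass from a unit impulse, with a hypothetical length of 251 taps. Only the 250 Hz sampling frequency and the 2 Hz cutoff come from the text.

```python
import numpy as np

def highpass_fir(fs=250.0, cutoff=2.0, num_taps=251):
    """Linear-phase FIR high-pass (assumed design, not the authors' filter)."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd for a symmetric high-pass FIR")
    n = np.arange(num_taps) - (num_taps - 1) / 2
    fc = cutoff / fs                      # cutoff in cycles per sample
    lowpass = 2 * fc * np.sinc(2 * fc * n) * np.hamming(num_taps)
    lowpass /= lowpass.sum()              # unit gain at DC
    highpass = -lowpass
    highpass[(num_taps - 1) // 2] += 1.0  # unit impulse minus low-pass
    return highpass

def remove_baseline(ecg, taps):
    """Filter one ECG lead; mode='same' compensates the filter delay."""
    return np.convolve(ecg, taps, mode="same")

# Usage: a 10 Hz component survives, a 0.2 Hz drift is suppressed.
taps = highpass_fir()
t = np.arange(2000) / 250.0
clean = np.sin(2 * np.pi * 10 * t)
drift = np.sin(2 * np.pi * 0.2 * t)
filtered = remove_baseline(clean + drift, taps)
```

Because the taps are symmetric, every frequency is delayed by the same number of samples, which is what keeps the QRS shape undistorted. The first and last 125 samples of the output are affected by the zero padding of the convolution and would normally be discarded.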
The R peaks of the ECGs were detected using Tompkins' algorithm [13]. A normal ECG refers to the usual case in healthy adults where the heart rate is 60∼100 beats per minute. In the current study, the length of each segment was 0.9 seconds: 0.3 seconds before the R peak and 0.6 seconds after the R peak were picked for modeling. This is adequate to capture most of the information from a particular cardiac cycle.
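As a small illustration of the segmentation, the sketch below cuts a 0.9-second window around a given R-peak index. At 250 Hz this is 75 samples before and 150 samples after the peak, i.e. 225 samples in total. The function name and the choice to return `None` for peaks too close to the record boundary are assumptions; R-peak detection itself is not shown.

```python
import numpy as np

def extract_segment(ecg, r_index, fs=250, pre_s=0.3, post_s=0.6):
    """Return the window [R - 0.3 s, R + 0.6 s) of a (samples, leads) array.

    Returns None when the window would extend past the record boundaries.
    """
    pre = int(round(pre_s * fs))    # 75 samples at 250 Hz
    post = int(round(post_s * fs))  # 150 samples at 250 Hz
    start, stop = r_index - pre, r_index + post
    if start < 0 or stop > ecg.shape[0]:
        return None
    return ecg[start:stop]

# Usage with a synthetic two-lead record and a hypothetical R peak at sample 400.
record = np.zeros((1000, 2))
segment = extract_segment(record, r_index=400)
```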
2.2 MAR and scalar AR modeling
A common form of a MAR model of order P is given by [8, 9]
$$X(k) = -\sum_{i=1}^{P} A(i)\,X(k-i) + e(k) \qquad (1)$$
where X(k) is a 2-dimensional column vector of observations at time k, e(k) is a 2-dimensional column vector of unknown, zero-mean, uncorrelated random variables, and A(i), for i = 1, 2, ..., P, are the 2×2 matrices of MAR model coefficients to be estimated.
It is important to determine the model order which best fits the data when constructing a MAR model. In this research the model was estimated from 225 points of data (0.9 seconds) from two ECG leads. The model order selection was performed on the six types of two-lead ECG signals included in the analysis. Candidate model orders from one to eight were investigated. Burg's algorithm was used to estimate the MAR coefficients. The criterion used to evaluate the model order was the sum-squared error (SSE) [10].
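The order scan can be sketched as follows. Note the substitution: the paper estimates the coefficients with Burg's algorithm, whereas this sketch uses ordinary least squares, which is simpler to write down and fits the same model form of equation (1). The function name is hypothetical, and the SSE here is the sum of squared one-step prediction residuals over the window.

```python
import numpy as np

def fit_mar_least_squares(X, P):
    """Fit X(k) = -sum_i A(i) X(k-i) + e(k) by least squares (not Burg).

    X has shape (N, M): N samples of M leads. Returns A with shape
    (P, M, M), in the sign convention of equation (1), and the SSE.
    """
    N, M = X.shape
    # Row for time k holds [X(k-1), X(k-2), ..., X(k-P)], for k = P..N-1.
    Z = np.hstack([X[P - i:N - i] for i in range(1, P + 1)])
    Y = X[P:]
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)   # Y is approximated by Z @ B
    residual = Y - Z @ B
    sse = float(np.sum(residual ** 2))
    # Y = Z @ B means X(k) = sum_i B_i^T X(k-i), hence A(i) = -B_i^T.
    A = np.stack([-B[i * M:(i + 1) * M].T for i in range(P)])
    return A, sse

# Usage on a synthetic two-lead segment of 225 samples (placeholder data).
rng = np.random.default_rng(0)
segment = rng.normal(size=(225, 2))
sse_by_order = {P: fit_mar_least_squares(segment, P)[1] for P in range(1, 9)}
features = fit_mar_least_squares(segment, 4)[0].reshape(-1)  # 4P = 16 values
```

Flattening the P matrices of size 2×2 gives the 4P-dimensional feature vector used for classification.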
Scalar AR modeling was performed on each of the two ECG leads for the six types of ECG signals. The AR model order was estimated based on the SSE, which was calculated over all estimates in the 225-point window segmented from a single lead.
2.3 ECG features
In this study, four different representations of ECG signals were used for classification: the MAR coefficients, the K-L MAR coefficients, the scalar AR coefficients based on two-lead ECG, and the scalar AR coefficients based on single-lead ECG.
2.3.1 ECG features based on MAR and K-L MAR coefficients
A MAR process of order P was applied to the two-lead ECG signals from the six classes. The number of MAR coefficients representing a two-lead ECG segment was 4P.
In order to reduce the redundancy of features, K-L MAR coefficients were computed and used as features. The K-L transform can reduce the dimension of the feature space by projecting the original feature vectors onto a small number of eigenvectors. The K-L transform in this study was performed as follows [14]:
1) Calculate the within-class scatter matrix.
2) Calculate the eigenvalues and eigenvectors of the within-class scatter matrix.
3) Choose the set of m eigenvectors that correspond to the m largest eigenvalues to transform the original data. In this study the corresponding eigenvectors were determined by the index i for which r_i / r_max ≤ 0.001, where i = 1, 2, ..., 4P and the r_i are in descending order.
4) Generate the K-L transform by projecting each 4P-dimensional pattern onto the chosen eigenvectors.
Thus the dimension of the features based on K-L MAR coefficients was m.
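The four steps above can be sketched as follows. Two choices are assumptions rather than details from the paper: the within-class scatter matrix is taken as the equally weighted average of the per-class covariance matrices, and the number of retained eigenvectors m is passed in directly instead of being derived from the eigenvalue-ratio rule. Function names are hypothetical.

```python
import numpy as np

def kl_fit(features, labels, m):
    """Eigen-decompose the within-class scatter and keep m eigenvectors.

    features: (n_samples, d) array, labels: (n_samples,) array.
    Returns the eigenvalues in descending order and a (d, m) projection.
    """
    classes = np.unique(labels)
    d = features.shape[1]
    scatter = np.zeros((d, d))
    for c in classes:
        class_rows = features[labels == c]
        centered = class_rows - class_rows.mean(axis=0)
        scatter += centered.T @ centered / len(class_rows)
    scatter /= len(classes)                 # equal class weights (assumed)
    eigvals, eigvecs = np.linalg.eigh(scatter)
    order = np.argsort(eigvals)[::-1]       # largest eigenvalue first
    return eigvals[order], eigvecs[:, order[:m]]

def kl_project(features, projection):
    """Project 4P-dimensional patterns onto the chosen eigenvectors."""
    return features @ projection

# Usage with placeholder data: 6 classes, 16 MAR coefficients, m = 10.
rng = np.random.default_rng(0)
feats = rng.normal(size=(60, 16))
labs = np.repeat(np.arange(6), 10)
eigenvalues, W = kl_fit(feats, labs, m=10)
reduced = kl_project(feats, W)
```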
2.3.2 ECG features based on scalar AR coefficients
A scalar AR process of order P was applied to each ECG lead from the six classes. The scalar AR coefficients were estimated from each lead and concatenated to form the feature vectors for classification. The number of scalar AR coefficients representing a two-lead ECG segment was 2P, and the number representing a single-lead ECG segment was P.
2.4 QDF-based classification
The ECG features described above were utilized to classify the cardiac arrhythmias. In the current research, the various cardiac arrhythmias were classified by a stage-by-stage QDF-based algorithm. The QDF is given by [14]
$$y_i = X_i\beta + \varepsilon_i \qquad (2)$$
where x = [x_1, x_2, ..., x_d] represents a d-dimensional ECG feature vector, y_i is an observed response, ε_i is the QDF error, and β is a (d(d+3)/2+1)-dimensional column vector. X_i is a (d(d+3)/2+1)-dimensional row vector, that is
$$X_i = [1,\ x_1, x_2, \ldots, x_d,\ x_1^2, x_2^2, \ldots, x_d^2,\ 2x_1x_2, 2x_1x_3, \ldots, 2x_1x_d,\ 2x_2x_3, 2x_2x_4, \ldots, 2x_2x_d,\ \ldots,\ 2x_{d-1}x_d]$$
The ECG feature vector of a particular ECG segment was mapped to a response (1 or -1). Assume that the total number of ECG segments used for classification at a particular stage is D. Then the following equation holds:
$$\tilde{Y} = A\beta + E \qquad (3)$$
where Ỹ = [y_1, y_2, ..., y_D]^T is a D-dimensional column vector of the observed responses, made up of "1" and "-1", which correspond to the different classes respectively, A = [X_1, X_2, ..., X_D]^T is a D×(d(d+3)/2+1) matrix, and E = [ε_1, ε_2, ..., ε_D]^T is a D-dimensional column vector of the errors.
The least squares estimator is
$$\beta = (A^T A)^{-1} A^T \tilde{Y} \qquad (4)$$
The quadratic discriminant function of the classifier is
$$Y_i = X_i\beta \qquad (5)$$
Table 1 shows the classification algorithm for the MAR and K-L MAR coefficients. A similar classification algorithm can be constructed for the scalar AR coefficients. A criterion based on standard deviation and Euclidean center distance (SDECD) was used to measure the separability between two classes. The value of SDECD was computed to determine the groupings of the classes at each stage in order to perform the stage-by-stage classification. The SDECD can be expressed as [14]
$$J = \frac{\sqrt{\sum_{i=1}^{d} (\mu_{1i} - \mu_{2i})^2}}{3\left(\frac{1}{d}\sum_{i=1}^{d}\sigma_{1ii} + \frac{1}{d}\sum_{i=1}^{d}\sigma_{2ii}\right)} \qquad (6)$$
where σ_1ii and σ_2ii (i = 1, 2, 3, ..., d) represent the standard deviations of the variables, and µ_1 = [µ_11, µ_12, ..., µ_1d]^T and µ_2 = [µ_21, µ_22, ..., µ_2d]^T are the expected vectors.
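A short sketch of the SDECD computation follows. It assumes that equation (6) reads as the Euclidean distance between the two class mean vectors divided by three times the sum of the two classes' average per-feature standard deviations, and that population (not sample) standard deviations are used. Both points are assumptions, as is the function name.

```python
import numpy as np

def sdecd(class1, class2):
    """Separability J between two classes of (n_samples, d) feature arrays."""
    mu1, mu2 = class1.mean(axis=0), class2.mean(axis=0)
    sigma1, sigma2 = class1.std(axis=0), class2.std(axis=0)
    center_distance = np.linalg.norm(mu1 - mu2)
    return center_distance / (3.0 * (sigma1.mean() + sigma2.mean()))

# Usage on two tiny hand-made classes with d = 2.
a = np.array([[0.0, 0.0], [2.0, 2.0]])   # mean (1, 1), std (1, 1)
b = np.array([[4.0, 5.0], [6.0, 7.0]])   # mean (5, 6), std (1, 1)
J = sdecd(a, b)                          # sqrt(16 + 25) / (3 * 2)
```

A larger J means the class centers are far apart relative to the spread of the classes, which is the property used to decide which classes are grouped together at each stage.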
During the training phase, the estimator β was computed by equation (4) using the selected training sets at each stage of the classification. During the testing phase, the output response at each stage of the classification was computed by equation (5) using the feature vectors and the previously estimated β. A threshold value of zero was used to classify the output response at a particular stage. The average sensitivity and specificity were computed for all the classes to measure the performance of the classification [15].
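The training and testing steps for a single stage can be sketched as below. The expansion follows the row vector X_i of equation (2); `numpy.linalg.lstsq` is used instead of forming the explicit inverse in equation (4), which gives the same estimator whenever A^T A is invertible and is numerically safer. The function names and the toy data are illustrative assumptions.

```python
import numpy as np

def quadratic_expand(x):
    """Map a d-vector to [1, x_i, x_i^2, 2 x_i x_j (i < j)]."""
    x = np.asarray(x, dtype=float)
    d = x.size
    cross = [2.0 * x[i] * x[j] for i in range(d) for j in range(i + 1, d)]
    return np.concatenate(([1.0], x, x ** 2, cross))

def train_qdf(features, responses):
    """Least squares estimate of beta from responses coded as +1 / -1."""
    A = np.vstack([quadratic_expand(x) for x in features])
    beta, *_ = np.linalg.lstsq(A, responses, rcond=None)
    return beta

def qdf_output(features, beta):
    """Stage output Y = X beta; its sign decides the group (threshold 0)."""
    A = np.vstack([quadratic_expand(x) for x in features])
    return A @ beta

# Toy usage: +1 inside a small disc, -1 on a surrounding ring.
rng = np.random.default_rng(0)
angle = rng.uniform(0, 2 * np.pi, size=200)
radius = np.concatenate([0.5 * np.sqrt(rng.uniform(size=100)),
                         rng.uniform(2.0, 2.5, size=100)])
points = np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])
targets = np.concatenate([np.ones(100), -np.ones(100)])
beta = train_qdf(points, targets)
accuracy = np.mean(np.sign(qdf_output(points, beta)) == targets)
```

For d = 16 MAR coefficients the expanded vector has 16·19/2 + 1 = 153 entries, so each stage estimates 153 parameters.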
Table 1 Classification algorithm for MAR and K-L MAR coefficients
Stage 1                               | Stage 2                               | Stage 3                               | Stage 4
Groups   Membership  Decision-making  | Groups   Membership  Decision-making  | Groups   Membership  Decision-making  | Groups   Membership  Decision-making
NSR      1           Y1>0             | NSR      -1          Y2<0             | APC/NSR  -1          Y3<0             | NSR      1           Y5>0
APC      1           Y1>0             | APC      -1          Y2<0             | PVC/NSR  1           Y3>0             | PVC      -1          Y5<0
PVC      1           Y1>0             | PVC      -1          Y2<0             | VT       -1          Y4<0             | NSR      1           Y6>0
VT/VF    1           Y1>0             | VT/VF    1           Y2>0             | VF       1           Y4>0             | APC      -1          Y6<0
SVT      -1          Y1<0             |                                       |                                       |
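One way to read Table 1 as a decision procedure is sketched below. The routing between stages (which discriminant is evaluated after which outcome) is an interpretation of the table, not something the paper spells out, and the six discriminants are passed in as plain callables so the sketch stays independent of how they were trained. An output of exactly zero falls into the negative branch here, which is an arbitrary choice.

```python
from typing import Callable, Dict, Sequence

def classify_stagewise(x: Sequence[float],
                       Y: Dict[int, Callable[[Sequence[float]], float]]) -> str:
    """Walk the four stages of Table 1 with a zero threshold at each stage."""
    if not Y[1](x) > 0:          # stage 1: SVT against all other classes
        return "SVT"
    if Y[2](x) > 0:              # stage 2: VT/VF against APC/NSR/PVC
        return "VF" if Y[4](x) > 0 else "VT"      # stage 3 (Y4)
    if Y[3](x) > 0:              # stage 3 (Y3): PVC/NSR against APC/NSR
        return "NSR" if Y[5](x) > 0 else "PVC"    # stage 4 (Y5)
    return "NSR" if Y[6](x) > 0 else "APC"        # stage 4 (Y6)

def constant_outputs(values):
    """Build stand-in discriminants that ignore x (for illustration only)."""
    return {k: (lambda x, v=v: v) for k, v in values.items()}

# Usage with stand-in discriminant values for one feature vector.
stub = constant_outputs({1: 0.8, 2: -0.4, 3: -0.2, 4: 0.0, 5: 0.0, 6: -0.7})
label = classify_stagewise([0.0] * 16, stub)
```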
3 Results
3.1 MAR and scalar AR modeling results
In order to evaluate the performance of the MAR modeling, the SSE was computed over all estimates in the length of the modeled ECG signals. The results showed that the SSE decreased initially with the model order P, but remained almost constant for model orders greater than or equal to three. Nevertheless, a MAR model of order four was selected for extracting the features, because a higher order can capture details that might be missing from a lower-order model. On the other hand, the number of MAR coefficients and the amount of computation would increase rapidly for higher orders. So the MAR model of order 4 is a suitable selection.
Scalar AR modeling was performed for the purpose of classification. A scalar AR model of order 4 was found suitable to model the ECG using the SSE criterion calculated from single-lead ECGs over all estimates in the 225-point window. This result is consistent with other research on scalar AR model order selection [16].
3.2 Classification results
A MAR model of order 4 and a scalar AR model of order 4 were selected to model the ECG signals in the current research. The MAR coefficients computed with order 4, the K-L MAR coefficients and the scalar AR coefficients estimated with order 4 were used for QDF-based classification.
3.2.1 Classification results based on MAR and K-L MAR coefficients
The ECG features were extracted by applying a MAR process of order 4 to the two-lead ECG signals. This resulted in 16 MAR coefficients representing a two-lead ECG segment. Table 1 shows the classification algorithm for this case. The values of SDECD between the classes were computed to determine the groupings of classes at each stage. Table 2 shows the values of SDECD based on the MAR coefficients. One can see that APC/NSR/PVC, VT/VF and SVT each form one group, due to small values of SDECD within the same group and large values between different groups. Therefore, SVT was separated from APC/NSR/PVC and VT/VF in stage one (Y1). The membership of SVT was defined as "-1", and the membership of APC/NSR/PVC and VT/VF was defined as "+1". The least squares estimator β was computed by equation (4). The output response Y1 was computed by equation (5). The value of Y1 was used to determine the classes. Similarly, VT/VF and APC/NSR/PVC were distinguished from each other in the second stage (Y2). Stage three (Y3 and Y4) and stage four (Y5 and Y6) were used to differentiate between APC, NSR, PVC, VT and VF, as shown in Table 1.
One hundred and fifty cases from each of the six classes were selected at random to estimate β in the training phase, and the remaining cases were used in the testing phase. The classification results based on the MAR coefficients on the testing data are given in Tables 3 and 4. Table 3 shows the classification results based on the MAR coefficients for a sample training set. Table 4 shows the performance of classification based on the MAR coefficients for the various classes, averaged over 20 runs, each run with different training and testing data sets.
Table 2 Values of SDECD based on MAR coefficients between the different classes
Class   SVT     APC     PVC     NSR     VT      VF
SVT     0       1.6587  1.3775  1.5718  1.6287  2.8733
APC     1.6587  0       0.9669  1.2325  1.5397  2.8077
PVC     1.3775  0.9669  0       1.1739  1.4374  2.2122
NSR     1.5718  1.2325  1.1739  0       1.9631  2.2082
VT      1.6287  1.5397  1.4374  1.9631  0       1.0671
VF      2.8733  2.8077  2.2122  2.2082  1.0671  0
Table 3 Classification results based on MAR coefficients for a sample training set
Class   SVT   APC   NSR   PVC   VT    VF
SVT     148   0     0     2     0     0
APC     0     147   3     0     0     0
NSR     0     1     149   0     0     0
PVC     0     0     0     149   1     0
VT      0     0     0     0     150   0
VF      0     0     0     0     0     150
Table 4 Performance of the classification based on MAR coefficients
Class        SVT     NSR     APC     PVC     VF      VT
Sensitivity  98.6%   99.3%   98.0%   99.3%   100%    100%
Specificity  100%    98.0%   99.3%   98.6%   100%    99.3%
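To make the performance measures concrete, the sketch below computes per-class ratios from the Table 3 counts. It assumes the flattened Table 3 parses into six rows of 150 cases each, with rows as the true class and columns as the assigned class. Under that reading, the diagonal divided by the row total gives the sensitivity. The diagonal divided by the column total is the ratio that, truncated to one decimal, coincides with the specificity row of Table 4 for this sample; that ratio is more commonly called positive predictivity, and whether it is the definition used in [15] is an assumption.

```python
import numpy as np

classes = ["SVT", "APC", "NSR", "PVC", "VT", "VF"]
# Rows: true class, columns: assigned class (assumed orientation of Table 3).
confusion = np.array([
    [148,   0,   0,   2,   0,   0],
    [  0, 147,   3,   0,   0,   0],
    [  0,   1, 149,   0,   0,   0],
    [  0,   0,   0, 149,   1,   0],
    [  0,   0,   0,   0, 150,   0],
    [  0,   0,   0,   0,   0, 150],
])

correct = np.diag(confusion)
row_ratio = correct / confusion.sum(axis=1)      # sensitivity per class
column_ratio = correct / confusion.sum(axis=0)   # correct share of each assigned class

for name, sens, col in zip(classes, row_ratio, column_ratio):
    print(f"{name}: row ratio {100 * sens:.2f}%, column ratio {100 * col:.2f}%")
```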
The number of eigenvectors was chosen to be 10 according to the eigenvector selection criterion described in Section 2. Thus, 10-dimensional feature vectors based on K-L MAR coefficients were obtained after the K-L transformation. The 10-dimensional feature vectors were trained and tested in the same way as in the classification experiments based on the MAR coefficients. The classification results based on the K-L MAR coefficients on the testing data are given in Table 5.
Table 5 Performance of the classification based on K-L MAR coefficients
Class        SVT     NSR     APC     PVC     VF      VT
Sensitivity  97.3%   99.3%   96.6%   95.3%   98.6%   96.6%
Specificity  96.6%   93.3%   99.3%   98.0%   99.3%   97.3%
3.2.2 Classification results based on scalar AR coefficients
A scalar AR process of order 4 was applied to each ECG lead from the six classes. Thus, the number of scalar AR coefficients representing a two-lead ECG segment was 8. A similar analysis method was employed for the scalar AR coefficient classification. The classification results based on the scalar AR coefficients and two-lead ECG segments are shown in Table 6.
Table 6 Performance of the classification based on scalar AR coefficients and two-lead ECG segments
Class        SVT     NSR     APC     PVC     VF      VT
Sensitivity  96%     99.3%   96.6%   98.0%   98.6%   97.3%
Specificity  99.3%   94.6%   99.3%   96.0%   99.3%   98.0%
The classification results based on scalar AR coefficients and single-lead ECG are given in Table 7. They are included for the purpose of comparing classification based on one-lead ECG signals with classification based on two-lead ECG signals.
Table 7 Performance of the classification based on single-lead ECG signals
Class        SVT     NSR     APC     PVC     VF      VT
Sensitivity  90.0%   98.6%   94.6%   92.6%   99.3%   92.0%
Specificity  95.3%   86.0%   99.3%   92.0%   97.3%   98.0%
4 Discussions
The main objective of this study was to model two-lead ECG signals for extracting features in order to explore the feasibility of classifying more types of cardiac arrhythmias using MAR and AR modeling. The modeling results showed that a MAR order of 4 was sufficient to model the ECG signals for the purpose of classification, and that a scalar AR order of 4 was also sufficient for the same purpose. In [8], a MAR model order of 25 was reported to be sufficient for modeling HR, BP and RESP for the purpose of assessing the interaction between them.
Extra calculation was involved in computing the K-L transform of the MAR coefficients. This representation may not be worth considering for a real-time system. The classification of the scalar AR coefficients extracted from two-lead ECGs produced accuracy similar to that of the K-L MAR coefficients. The classification of the scalar AR coefficients extracted from single-lead ECGs gave the lowest classification accuracy. Thus, the MAR coefficients would be the most efficient ECG signal representation. Using two-lead ECG signals can improve the classification accuracy significantly compared with single-lead ECG signals.
The current study classifies six types of ECG arrhythmias, whereas some previously proposed techniques handle a smaller number of arrhythmias. For example, in [12] two AR coefficients and the mean-square value of the QRS complex segment were utilized as features for classifying PVC and NSR using a fuzzy ARTMAP classifier, and a sensitivity of 97% and a specificity of 99% were achieved. In [5] the total least squares based Prony modeling technique was used for detecting SVT, VT and VF, and the accuracies for SVT, VT and VF were 95.24%, 96% and 97.78%. The classification algorithms based on MAR modeling are easy to implement. In this study, the length of each segment was only 0.9 seconds, whereas it was 3 to 7 seconds for the complexity measure based technique in [3] and 5 to 9 seconds for the Prony modeling technique in [5].
The MAR model might not be suited to ECG signals under all conditions, since the MAR model is a linear modeling technique; nonlinear parametric modeling might improve the results. Future work would involve real-time data collection in order to test our hypothesis and determine the precision of our methodology.
5 Conclusions
MAR coefficients extracted by fusing two ECG leads could be used as features to classify certain cardiac arrhythmias effectively in critically ill patients for the purpose of real-time automatic diagnosis.
References
1 Chen S, Thakor N V, Mover M M. Ventricular fibrillation detection by a regression test on the autocorrelation function. Medical and Biological Engineering and Computing, 1987, 25(3): 241∼249
2 Afonso V X, Tompkins W J. Detecting ventricular fibrillation: Selecting the appropriate time frequency analysis tool for the application. IEEE Engineering in Medicine and Biology Magazine, 1995, 14(2): 152∼159
3 Zhang X S, Zhu Y S, Thakor N V, Wang Z Z. Detecting ventricular tachycardia and fibrillation by complexity measure. IEEE Transactions on Biomedical Engineering, 1999, 46(5): 548∼555
4 Jekova I. Comparison of five algorithms for the detection of ventricular fibrillation from the surface ECG. Physiological Measurement, 2000, 21(4): 429∼439
5 Chen S W. Two stage discrimination of cardiac arrhythmias using a total least squares based prony modeling algorithm. IEEE Transactions on Biomedical Engineering, 2000, 47(10): 1317∼1326
6 Caswell S A, Kluge K S, Chiang C M J, Jenkins J M, Carlo L A. Pattern recognition of cardiac arrhythmias using two intracardiac channels. In: Proceedings of Computers in Cardiology. London, UK, IEEE, 1993. 181∼184
7 Melo S L, Caloba L P, Nadal J. Arrhythmia analysis using artificial neural network and decimated electrocardiographic data. In: Proceedings of Computers in Cardiology. Piscataway, USA, IEEE, 2000. 27: 73∼76
8 Jimenez J C, Biscay R, Montoto O. Modelling the electroencephalogram by means of spatial spline smoothing and temporal auto regression. Biological Cybernetics, 1995, 72(3): 249∼259
9 Keirn Z A, Aunon J I. A new method of communication between man and his surroundings. IEEE Transactions on Biomedical Engineering, 1990, 37(12): 1209∼1214
10 Arnold M, Miltner W H R, Witte H. Adaptive AR modeling of nonstationary time series by means of Kalman filtering. IEEE Transactions on Biomedical Engineering, 1998, 45(5): 553∼562
11 Mainardi L T, Bianchi A M, Balli G, Cerutti S. Pole tracking algorithms for the extraction of time variant heart rate variability spectral parameters. IEEE Transactions on Biomedical Engineering, 1995, 42(3): 250∼258
12 Ham F M, Han S. Classification of cardiac arrhythmias using fuzzy ARTMAP. IEEE Transactions on Biomedical Engineering, 1996, 43(4): 425∼430
13 Tompkins W J. Biomedical Digital Signal Processing. Englewood Cliffs, New Jersey: Prentice Hall, 1993, 246∼261
14 Fukunaga K. Introduction to Statistical Pattern Recognition. New York: Academic Press, 1990, 153∼154 and 400∼409
15 Barro S, Ruiz R, Cabello D, Mira J. Algorithmic sequential decision making in the frequency domain for life threatening ventricular arrhythmias and imitative artefacts: a diagnostic system. Journal of Biomedical Engineering, 1989, 11(4): 320∼328
16 Ge Ding-Fei, Xia Shun-Ren. Application of AR model in telediagnosis of cardiac arrhythmias. Chinese Journal of Biomedical Engineering, 2004, 23(3): 222∼229 (in Chinese)
GE Ding-Fei Received his master's degree from Nanyang Technological University, Singapore in 2003. Now he is an associate professor at Zhejiang University of Science and Technology. His research interest covers pattern recognition and data mining. Corresponding author of this paper. E-mail:
HOU Bei-Ping Received his Ph.D. degree from Zhejiang University in 2005. His research interest covers machine vision, image processing, and pattern recognition.
XIANG Xin-Jian Professor at Zhejiang University of Science and Technology. His research interest covers intelligent control and application of pattern recognition.