Feature Selection
Martin Sewell
2007
1 Definition
Feature selection (also known as subset selection) is a process commonly used in machine learning, wherein a subset of the features available from the data is selected for application of a learning algorithm. The best subset contains the least number of dimensions that most contribute to accuracy; we discard the remaining, unimportant dimensions. This is an important stage of preprocessing and is one of two ways of avoiding the curse of dimensionality (the other is feature extraction).
There are two approaches:
forward selection Start with no variables and add them one by one, at each step adding the one that decreases the error the most, until any further addition does not significantly decrease the error.
backward selection Start with all the variables and remove them one by one, at each step removing the one that decreases the error the most (or increases it only slightly), until any further removal increases the error significantly.
To reduce overfitting, the error referred to above is the error on a validation set that is distinct from the training set; the sketch below illustrates this for forward selection.
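A minimal sketch of forward selection in Python, assuming a held-out validation set and an interchangeable learner (logistic regression here); the names validation_error and forward_selection are ours. Backward selection is the mirror image, starting from the full set and greedily removing.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def validation_error(X_tr, y_tr, X_val, y_val, features):
    """Error on the held-out validation set using only `features`."""
    model = LogisticRegression(max_iter=1000)
    model.fit(X_tr[:, features], y_tr)
    return 1.0 - model.score(X_val[:, features], y_val)

def forward_selection(X_tr, y_tr, X_val, y_val, tol=1e-3):
    """Greedily add the feature that most reduces validation error;
    stop when no addition improves the error by more than `tol`."""
    selected, remaining = [], list(range(X_tr.shape[1]))
    best_err = np.inf
    while remaining:
        err, f = min((validation_error(X_tr, y_tr, X_val, y_val,
                                       selected + [f]), f)
                     for f in remaining)
        if best_err - err < tol:        # no significant improvement: stop
            break
        selected.append(f)
        remaining.remove(f)
        best_err = err
    return selected
```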
2 Chronological Literature Review
Kira and Rendell (1992) described a statistical feature selection algorithm called RELIEF that uses instance-based learning to assign a relevance weight to each feature.
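As a rough illustration (not the authors' exact algorithm), a RELIEF-style weight update for binary classification with numeric features might look as follows; n_samples controls how many instances are sampled.

```python
import numpy as np

def relief_weights(X, y, n_samples=100, seed=0):
    """RELIEF-style relevance weights: each feature's weight is decreased
    by its distance to the nearest hit (same class) and increased by its
    distance to the nearest miss (other class), averaged over samples."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0) + 1e-12   # per-feature scale
    w = np.zeros(d)
    for i in rng.choice(n, size=n_samples):
        dist = np.abs(X - X[i]).sum(axis=1)        # L1 distance to all points
        dist[i] = np.inf                           # exclude the instance itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w -= np.abs(X[i] - X[hit]) / span / n_samples
        w += np.abs(X[i] - X[miss]) / span / n_samples
    return w                                       # larger = more relevant
```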
John, Kohavi and Pfleger (1994) addressed the problem of irrelevant features and the subset selection problem. They presented definitions for irrelevance and for two degrees of relevance (weak and strong). They also stated that the features selected should depend not only on the features and the target concept, but also on the induction algorithm. Further, they claimed that the filter model approach to subset selection should be replaced with the wrapper model.
Pudil, Novovičová and Kittler (1994) presented "floating" search methods in feature selection. These are sequential search methods characterized by a dynamically changing number of features included or eliminated at each step. They were shown to give very good results and to be computationally more effective than the branch and bound method.
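A compact sketch of the forward-floating idea, assuming a hypothetical black-box subset-quality function score to be maximized (our name); the paper also covers the backward-floating variant.

```python
def sffs(score, n_features, target_size):
    """Sequential forward floating selection, in the spirit of Pudil et
    al. (1994): after each forward step, keep removing features while
    that strictly beats the best subset already seen at the smaller size."""
    best = {0: (frozenset(), float('-inf'))}   # size -> (subset, score)
    current = frozenset()
    while len(current) < target_size:
        # Forward step: add the single most helpful feature.
        f = max((g for g in range(n_features) if g not in current),
                key=lambda g: score(current | {g}))
        current = current | {f}
        s = score(current)
        if s > best.get(len(current), (None, float('-inf')))[1]:
            best[len(current)] = (current, s)
        # Floating backward steps: drop a feature only if the result
        # improves on the best subset found so far at that size.
        while len(current) > 2:
            g = max(current, key=lambda h: score(current - {h}))
            s = score(current - {g})
            if s > best.get(len(current) - 1, (None, float('-inf')))[1]:
                current = current - {g}
                best[len(current)] = (current, s)
            else:
                break
    return set(best[target_size][0])
```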
Koller and Sahami (1996) examined a method for feature subset selection based on information theory: they presented a theoretically justified model for optimal feature selection based on using cross-entropy to minimize the amount of predictive information lost during feature elimination.
Jain and Zongker (1997) considered various feature subset selection algorithms and found that the sequential forward floating selection algorithm, proposed by Pudil, Novovičová and Kittler (1994), dominated the other algorithms tested.
Dash and Liu (1997) gave a survey of feature selection methods for classification.
In a comparative study of feature selection methods in statistical learning of text categorization (with a focus on aggressive dimensionality reduction), Yang and Pedersen (1997) evaluated document frequency (DF), information gain (IG), mutual information (MI), a χ²-test (CHI) and term strength (TS), and found IG and CHI to be the most effective.
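Two of these metrics are readily computed with scikit-learn (assuming a recent version exposing chi2, mutual_info_classif and get_feature_names_out); a toy ranking of terms:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2, mutual_info_classif

docs = ["cheap pills online", "meeting agenda attached",
        "cheap offer online", "agenda for the meeting"]
labels = [1, 0, 1, 0]                      # 1 = spam, 0 = ham (made-up data)

vec = CountVectorizer()
X = vec.fit_transform(docs)                # document-term count matrix
terms = vec.get_feature_names_out()

chi2_scores, _ = chi2(X, labels)           # chi-squared statistic per term
mi_scores = mutual_info_classif(X, labels, discrete_features=True)

for name, scores in (("CHI", chi2_scores), ("MI", mi_scores)):
    top = sorted(zip(scores, terms), reverse=True)[:3]
    print(name, [(t, round(float(s), 3)) for s, t in top])
```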
Blum and Langley (1997) focused on two key issues: the problem of selecting relevant features and the problem of selecting relevant examples.
Kohavi and John (1997) introduced wrappers for feature subset selection. Their approach searches for an optimal feature subset tailored to a particular learning algorithm and a particular training set.
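The essence of the wrapper model is that candidate subsets are scored by running the target learner itself, e.g. via cross-validation; a minimal sketch (function name is ours):

```python
from sklearn.model_selection import cross_val_score

def wrapper_score(estimator, X, y, features):
    """Merit of a feature subset as judged by the target learner itself,
    here via 5-fold cross-validated accuracy. X is a NumPy array."""
    return cross_val_score(estimator, X[:, list(features)], y, cv=5).mean()
```

This evaluation step plugs directly into the greedy searches sketched earlier; recent versions of scikit-learn also ship a ready-made SequentialFeatureSelector.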
Yang and Honavar (1998) used a genetic algorithm for feature subset selection.
Liu and Motoda (1998) wrote their book on feature selection, which offers an overview of the methods developed since the 1970s and provides a general framework in order to examine the methods and categorize them.
Weston et al. (2001) introduced a method of feature selection for SVMs which is based upon finding those features which minimize bounds on the leave-one-out error. The method was shown to be superior to some standard feature selection algorithms on the data sets tested.
Xing, Jordan and Karp (2001) successfully applied feature selection methods (using a hybrid of filter and wrapper approaches) to a classification problem in molecular biology involving only 72 data points in a 7130-dimensional space. They also investigated regularization methods as an alternative to feature selection, and showed that feature selection methods were preferable in the problem they tackled.
See Miller (2002) for a book on subset selection in regression.
Forman (2003) presented an empirical comparison of twelve feature selection methods. Results revealed the surprising performance of a new feature selection metric, "Bi-Normal Separation" (BNS).
Guyon and Elisseeff (2003) gave an introduction to variable and feature selection. They recommend using a linear predictor of your choice (e.g. a linear SVM) and selecting variables in two alternate ways: (1) with a variable ranking method using a correlation coefficient or mutual information; (2) with a nested subset selection method performing forward or backward selection or with multiplicative updates.
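The first of the two strategies, variable ranking by a correlation coefficient, is straightforward to implement; a sketch assuming a numeric target and Pearson correlation:

```python
import numpy as np

def rank_by_correlation(X, y):
    """Rank features by absolute Pearson correlation with the target;
    returns feature indices, best first."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12)
    return np.argsort(-np.abs(r))
```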
For a summary of feature selection methods see Figure 1, and for a taxonomy of algorithms see Figure 2.
Figure 1: Summary of feature selection methods (Dash and Liu, 1997).
Figure 2: A taxonomy of feature selection algorithms (Jain and Zongker, 1997).
References
BLUM, Avrim L., and Pat LANGLEY, 1997. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1–2), 245–271.
DASH, M., and H. LIU, 1997. Feature selection for classification. Intelligent Data Analysis, 1(1–4), 131–156.
FORMAN, George, 2003. An extensive empirical study of feature selection metrics for text classification. Journal of Machine Learning Research, 3, 1289–1305.
GUYON, Isabelle, and André ELISSEEFF, 2003. An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
JAIN, Anil, and Douglas ZONGKER, 1997. Feature selection: Evaluation, application, and small sample performance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(2), 153–158.
JOHN, George H., Ron KOHAVI, and Karl PFLEGER, 1994. Irrelevant features and the subset selection problem. In: William W. COHEN and Haym HIRSH, eds. Machine Learning: Proceedings of the Eleventh International Conference. San Francisco, CA: Morgan Kaufmann Publishers, pp. 121–129.
KIRA, Kenji, and Larry A. RENDELL, 1992. A practical approach to feature selection. In: Derek H. SLEEMAN and Peter EDWARDS, eds. ML92: Proceedings of the Ninth International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., pp. 249–256.
KOHAVI, Ron, and George H. JOHN, 1997. Wrappers for feature subset selection. Artificial Intelligence, 97(1–2), 273–324.
KOLLER, Daphne, and Mehran SAHAMI, 1996. Toward optimal feature selection. In: Proceedings of the Thirteenth International Conference on Machine Learning. Morgan Kaufmann, pp. 284–292.
LIU, Huan, and Hiroshi MOTODA, 1998. Feature Selection for Knowledge Discovery and Data Mining. The Kluwer International Series in Engineering and Computer Science. Kluwer Academic Publishers.
MILLER, Alan, 2002. Subset Selection in Regression. Second ed. Chapman & Hall/CRC.
PUDIL, P., J. NOVOVIČOVÁ, and J. KITTLER, 1994. Floating search methods in feature selection. Pattern Recognition Letters, 15(11), 1119–1125.
WESTON, Jason, et al., 2001. Feature selection for SVMs. In: Todd K. LEEN, Thomas G. DIETTERICH, and Volker TRESP, eds. Advances in Neural Information Processing Systems 13. Cambridge, MA: The MIT Press, pp. 668–674.
XING, Eric P., Michael I. JORDAN, and Richard M. KARP, 2001. Feature selection for high-dimensional genomic microarray data. In: ICML '01: Proceedings of the Eighteenth International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann, pp. 601–608.
YANG, Jihoon, and Vasant HONAVAR, 1998. Feature subset selection using a genetic algorithm. IEEE Intelligent Systems, 13(2), 44–49.
YANG, Yiming, and Jan O. PEDERSEN, 1997. A comparative study of feature selection in text categorization. In: ICML '97: Proceedings of the Fourteenth International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., pp. 412–420.