Feature Selection
Martin Sewell
2007
1 Definition
Feature selection (also known as subset selection) is a process commonly used in machine learning, wherein a subset of the features available from the data is selected for application of a learning algorithm. The best subset contains the least number of dimensions that most contribute to accuracy; we discard the remaining, unimportant dimensions. This is an important stage of preprocessing and is one of two ways of avoiding the curse of dimensionality (the other is feature extraction).
There are two approaches:
forward selection Start with no variables and add them one by one, at each step adding the one that decreases the error the most, until any further addition does not significantly decrease the error.
backward selection Start with all the variables and remove them one by one, at each step removing the one that decreases the error the most (or increases it only slightly), until any further removal increases the error significantly.
To reduce overfitting, the error referred to above is the error on a validation set that is distinct from the training set; the sketch below illustrates this for forward selection.
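A minimal sketch of forward selection in Python, assuming a held-out validation set and an interchangeable learner (logistic regression here); the names validation_error and forward_selection are ours. Backward selection is the mirror image, starting from the full set and greedily removing.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def validation_error(X_tr, y_tr, X_val, y_val, features):
    """Error on the held-out validation set using only `features`."""
    model = LogisticRegression(max_iter=1000)
    model.fit(X_tr[:, features], y_tr)
    return 1.0 - model.score(X_val[:, features], y_val)

def forward_selection(X_tr, y_tr, X_val, y_val, tol=1e-3):
    """Greedily add the feature that most reduces validation error;
    stop when no addition improves the error by more than `tol`."""
    selected, remaining = [], list(range(X_tr.shape[1]))
    best_err = np.inf
    while remaining:
        err, f = min((validation_error(X_tr, y_tr, X_val, y_val,
                                       selected + [f]), f)
                     for f in remaining)
        if best_err - err < tol:        # no significant improvement: stop
            break
        selected.append(f)
        remaining.remove(f)
        best_err = err
    return selected
```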
2 Chronological Literature Review
Kira and Rendell (1992) described a statistical feature selection algorithm called RELIEF that uses instance-based learning to assign a relevance weight to each feature.
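As a rough illustration (not the authors' exact algorithm), a RELIEF-style weight update for binary classification with numeric features might look as follows; n_samples controls how many instances are sampled.

```python
import numpy as np

def relief_weights(X, y, n_samples=100, seed=0):
    """RELIEF-style relevance weights: each feature's weight is decreased
    by its distance to the nearest hit (same class) and increased by its
    distance to the nearest miss (other class), averaged over samples."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0) + 1e-12   # per-feature scale
    w = np.zeros(d)
    for i in rng.choice(n, size=n_samples):
        dist = np.abs(X - X[i]).sum(axis=1)        # L1 distance to all points
        dist[i] = np.inf                           # exclude the instance itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w -= np.abs(X[i] - X[hit]) / span / n_samples
        w += np.abs(X[i] - X[miss]) / span / n_samples
    return w                                       # larger = more relevant
```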
John, Kohavi and Pfleger (1994) addressed the problem of irrelevant features and the subset selection problem. They presented definitions for irrelevance and for two degrees of relevance (weak and strong). They also stated that the features selected should depend not only on the features and the target concept, but also on the induction algorithm. Further, they claimed that the filter model approach to subset selection should be replaced with the wrapper model.
Pudil, Novovičová and Kittler (1994) presented "floating" search methods in feature selection. These are sequential search methods characterized by a dynamically changing number of features included or eliminated at each step. They were shown to give very good results and to be computationally more effective than the branch and bound method.
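A compact sketch of the forward-floating idea, assuming a hypothetical black-box subset-quality function score to be maximized (our name); the paper also covers the backward-floating variant.

```python
def sffs(score, n_features, target_size):
    """Sequential forward floating selection, in the spirit of Pudil et
    al. (1994): after each forward step, keep removing features while
    that strictly beats the best subset already seen at the smaller size."""
    best = {0: (frozenset(), float('-inf'))}   # size -> (subset, score)
    current = frozenset()
    while len(current) < target_size:
        # Forward step: add the single most helpful feature.
        f = max((g for g in range(n_features) if g not in current),
                key=lambda g: score(current | {g}))
        current = current | {f}
        s = score(current)
        if s > best.get(len(current), (None, float('-inf')))[1]:
            best[len(current)] = (current, s)
        # Floating backward steps: drop a feature only if the result
        # improves on the best subset found so far at that size.
        while len(current) > 2:
            g = max(current, key=lambda h: score(current - {h}))
            s = score(current - {g})
            if s > best.get(len(current) - 1, (None, float('-inf')))[1]:
                current = current - {g}
                best[len(current)] = (current, s)
            else:
                break
    return set(best[target_size][0])
```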
Koller and Sahami (1996) examined a method for feature subset selection based on information theory: they presented a theoretically justified model for optimal feature selection based on using cross-entropy to minimize the amount of predictive information lost during feature elimination.
Jain and Zongker (1997) considered various feature subset selection algorithms and found that the sequential forward floating selection algorithm, proposed by Pudil, Novovičová and Kittler (1994), dominated the other algorithms tested.
Dash and Liu (1997) gave a survey of feature selection methods for classification.
In a comparative study of feature selection methods in statistical learning of text categorization (with a focus on aggressive dimensionality reduction), Yang and Pedersen (1997) evaluated document frequency (DF), information gain (IG), mutual information (MI), a χ²-test (CHI) and term strength (TS), and found IG and CHI to be the most effective.
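Two of these metrics are readily computed with scikit-learn (assuming a recent version exposing chi2, mutual_info_classif and get_feature_names_out); a toy ranking of terms:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2, mutual_info_classif

docs = ["cheap pills online", "meeting agenda attached",
        "cheap offer online", "agenda for the meeting"]
labels = [1, 0, 1, 0]                      # 1 = spam, 0 = ham (made-up data)

vec = CountVectorizer()
X = vec.fit_transform(docs)                # document-term count matrix
terms = vec.get_feature_names_out()

chi2_scores, _ = chi2(X, labels)           # chi-squared statistic per term
mi_scores = mutual_info_classif(X, labels, discrete_features=True)

for name, scores in (("CHI", chi2_scores), ("MI", mi_scores)):
    top = sorted(zip(scores, terms), reverse=True)[:3]
    print(name, [(t, round(float(s), 3)) for s, t in top])
```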
Blum and Langley (1997) focused on two key issues: the problem of selecting relevant features and the problem of selecting relevant examples.
Kohavi and John (1997) introduced wrappers for feature subset selection. Their approach searches for an optimal feature subset tailored to a particular learning algorithm and a particular training set.
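The essence of the wrapper model is that candidate subsets are scored by running the target learner itself, e.g. via cross-validation; a minimal sketch (function name is ours):

```python
from sklearn.model_selection import cross_val_score

def wrapper_score(estimator, X, y, features):
    """Merit of a feature subset as judged by the target learner itself,
    here via 5-fold cross-validated accuracy. X is a NumPy array."""
    return cross_val_score(estimator, X[:, list(features)], y, cv=5).mean()
```

This evaluation step plugs directly into the greedy searches sketched earlier; recent versions of scikit-learn also ship a ready-made SequentialFeatureSelector.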
Yang and Honavar (1998) used a genetic algorithm for feature subset selection.
Liu and Motoda (1998) wrote their book on feature selection, which offers an overview of the methods developed since the 1970s and provides a general framework in order to examine the methods and categorize them.
Weston et al. (2001) introduced a method of feature selection for SVMs which is based upon finding those features which minimize bounds on the leave-one-out error. The method was shown to be superior to some standard feature selection algorithms on the data sets tested.
Xing, Jordan and Karp (2001) successfully applied feature selection methods (using a hybrid of filter and wrapper approaches) to a classification problem in molecular biology involving only 72 data points in a 7130-dimensional space. They also investigated regularization methods as an alternative to feature selection, and showed that feature selection methods were preferable in the problem they tackled.
See Miller (2002) for a book on subset selection in regression.
Forman (2003) presented an empirical comparison of twelve feature selection methods. Results revealed the surprising performance of a new feature selection metric, "Bi-Normal Separation" (BNS).
Guyon and Elisseeff (2003) gave an introduction to variable and feature selection. They recommend using a linear predictor of your choice (e.g. a linear SVM) and selecting variables in two alternate ways: (1) with a variable ranking method using a correlation coefficient or mutual information; (2) with a nested subset selection method performing forward or backward selection or with multiplicative updates.
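The first of the two strategies, variable ranking by a correlation coefficient, is straightforward to implement; a sketch assuming a numeric target and Pearson correlation:

```python
import numpy as np

def rank_by_correlation(X, y):
    """Rank features by absolute Pearson correlation with the target;
    returns feature indices, best first."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12)
    return np.argsort(-np.abs(r))
```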
For a summary of feature selection methods see Figure 1, and for a taxonomy of algorithms see Figure 2.
Figure 1: Summary of feature selection methods (Dash and Liu, 1997).
Figure 2: A taxonomy of feature selection algorithms (Jain and Zongker, 1997).
References
BLUM, Avrim L., and Pat LANGLEY, 1997. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1–2), 245–271.
DASH, M., and H. LIU, 1997. Feature selection for classification. Intelligent Data Analysis, 1(1–4), 131–156.
FORMAN, George, 2003. An extensive empirical study of feature selection metrics for text classification. Journal of Machine Learning Research, 3, 1289–1305.
GUYON, Isabelle, and André ELISSEEFF, 2003. An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.
JAIN, Anil, and Douglas ZONGKER, 1997. Feature selection: Evaluation, application, and small sample performance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(2), 153–158.
JOHN, George H., Ron KOHAVI, and Karl PFLEGER, 1994. Irrelevant features and the subset selection problem. In: William W. COHEN and Haym HIRSH, eds. Machine Learning: Proceedings of the Eleventh International Conference. San Francisco, CA: Morgan Kaufmann Publishers, pp. 121–129.
KIRA, Kenji, and Larry A. RENDELL, 1992. A practical approach to feature selection. In: Derek H. SLEEMAN and Peter EDWARDS, eds. ML92: Proceedings of the Ninth International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., pp. 249–256.
KOHAVI, Ron, and George H. JOHN, 1997. Wrappers for feature subset selection. Artificial Intelligence, 97(1–2), 273–324.
KOLLER, Daphne, and Mehran SAHAMI, 1996. Toward optimal feature selection. In: Proceedings of the Thirteenth International Conference on Machine Learning. Morgan Kaufmann, pp. 284–292.
LIU, Huan, and Hiroshi MOTODA, 1998. Feature Selection for Knowledge Discovery and Data Mining. The Kluwer International Series in Engineering and Computer Science. Kluwer Academic Publishers.
MILLER, Alan, 2002. Subset Selection in Regression. Second ed. Chapman & Hall/CRC.
PUDIL, P., J. NOVOVIČOVÁ, and J. KITTLER, 1994. Floating search methods in feature selection. Pattern Recognition Letters, 15(11), 1119–1125.
WESTON, Jason, et al., 2001. Feature selection for SVMs. In: Todd K. LEEN, Thomas G. DIETTERICH, and Volker TRESP, eds. Advances in Neural Information Processing Systems 13. Cambridge, MA: The MIT Press, pp. 668–674.
XING, Eric P., Michael I. JORDAN, and Richard M. KARP, 2001. Feature selection for high-dimensional genomic microarray data. In: ICML '01: Proceedings of the Eighteenth International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann, pp. 601–608.
YANG, Jihoon, and Vasant HONAVAR, 1998. Feature subset selection using a genetic algorithm. IEEE Intelligent Systems, 13(2), 44–49.
YANG, Yiming, and Jan O. PEDERSEN, 1997. A comparative study of feature selection in text categorization. In: ICML '97: Proceedings of the Fourteenth International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., pp. 412–420.