An Evaluation of Several Dimensionality Reduction Techniques on Classification Problems
Authors: Zhan Pengwei (詹鹏伟), Xie Xiaojiao (谢小姣)
Source: 《科技创新与应用》 (Technology Innovation and Application), 2018, No. 21
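Abstract: High-dimensional data makes data analysis considerably harder: the sparse data distribution and degraded data organization it causes can severely impair model performance. Dimensionality reduction is one way to address the "curse of dimensionality". This paper examines three common dimensionality reduction methods: PCA, LLE, and Isomap. It first describes how each method works, then combines the methods with KNN, SVM, Random Forest, Naive Bayes, and Logistic Regression models to build a combined cross model for evaluating the three methods. The results show that, on the dataset used in this paper, data reduced by PCA and by Isomap is distributed fairly evenly in the visualizable 2-dimensional space, whereas data reduced by LLE is more concentrated. Classification models trained on PCA-reduced and Isomap-reduced data reach average accuracies of 96.44% and 96.90%, respectively, higher than the 90.74% obtained after LLE, so PCA and Isomap reduce dimensionality more effectively on this dataset. The methods and results of this study offer a useful reference for choosing a dimensionality reduction method.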
Keywords: dimensionality reduction; PCA; LLE; Isomap; performance evaluation
CLC number: TP311.13 Document code: A Article ID: 2095-2945(2018)21-0022-03
Abstract: High-dimensional data brings great difficulties to data analysis, and the sparse distribution of data and the decline in data organization that it causes greatly affect model performance. Dimensionality reduction is one way to address the "curse of dimensionality". Starting with three common dimensionality reduction methods, i.e., PCA, LLE, and Isomap, this paper introduces their implementation principles and then constructs a comprehensive cross model for evaluating the three methods based on KNN, SVM, Random Forest, Naive Bayes, and Logistic Regression models. The results show that, on the data set used in this paper, the data reduced by PCA and by Isomap is distributed fairly uniformly in the visible two-dimensional space, while the data reduced by LLE is more concentrated. The average accuracy of the classification models trained with PCA and Isomap is 96.44% and 96.90%, respectively, which is higher than the 90.74% obtained with LLE. The methods used in this study and the results obtained provide a useful reference for the choice of dimensionality reduction methods.
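The cross-model evaluation summarized in the abstract (each of three reducers paired with each of five classifiers, then averaged per reducer) might be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The abstract does not name the dataset, the target dimension, the neighborhood size, or the train/test protocol, so the digits dataset, `n_components=5`, `n_neighbors=15`, the 75/25 split, the `StandardScaler` step, and the function names `evaluate_cross_model` and `mean_accuracy_per_reducer` are all assumptions made for illustration.

```python
from sklearn.base import clone
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import Isomap, LocallyLinearEmbedding
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def evaluate_cross_model(X, y, n_components=5, n_neighbors=15, seed=0):
    """Return {(reducer, classifier): test accuracy} for every pairing.

    All hyperparameters here are illustrative assumptions, not values
    taken from the paper.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    reducers = {
        "PCA": PCA(n_components=n_components),
        # The dense eigensolver is an assumption chosen for robustness on small data.
        "LLE": LocallyLinearEmbedding(
            n_neighbors=n_neighbors, n_components=n_components, eigen_solver="dense"
        ),
        "Isomap": Isomap(n_neighbors=n_neighbors, n_components=n_components),
    }
    classifiers = {
        "KNN": KNeighborsClassifier(),
        "SVM": SVC(),
        "RandomForest": RandomForestClassifier(random_state=seed),
        "NaiveBayes": GaussianNB(),
        "LogisticRegression": LogisticRegression(max_iter=1000),
    }
    scores = {}
    for r_name, reducer in reducers.items():
        # Fit each reducer once on the training split only, then project both splits.
        fitted = clone(reducer).fit(X_train)
        Z_train, Z_test = fitted.transform(X_train), fitted.transform(X_test)
        for c_name, clf in classifiers.items():
            # Rescale the embedding so scale-sensitive classifiers are not penalized.
            model = make_pipeline(StandardScaler(), clone(clf))
            model.fit(Z_train, y_train)
            scores[(r_name, c_name)] = model.score(Z_test, y_test)
    return scores


def mean_accuracy_per_reducer(scores):
    """Average the accuracies of all classifiers that share a reducer."""
    grouped = {}
    for (r_name, _), acc in scores.items():
        grouped.setdefault(r_name, []).append(acc)
    return {r_name: sum(vals) / len(vals) for r_name, vals in grouped.items()}


if __name__ == "__main__":
    X, y = load_digits(return_X_y=True)
    for name, acc in mean_accuracy_per_reducer(evaluate_cross_model(X, y)).items():
        print(f"{name}: mean accuracy {acc:.4f}")
```

Fitting each reducer on the training split alone keeps test samples out of the embedding. The numbers this sketch prints depend on the assumed dataset and settings and are not expected to reproduce the figures reported in the abstract.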