Hybrid Feature Selection Algorithm Combining Sequential Backward Selection and Support Vector Machine
  • English title (as published): Hybrid Feature Selection Algorithm for Fusion Sequence Backward Selection and Support Vector Machine
  • Authors: WU Qing-Shou (吴清寿); LIU Chang-Yong (刘长勇); LIN Li-Hui (林丽惠)
  • Affiliations: School of Mathematics and Computer Science, Wuyi University; Fujian Provincial Key Laboratory of Cognitive Computing and Intelligent Information Processing
  • Keywords: hybrid feature selection; sequential backward selection (SBS); support vector machine (SVM); dimension reduction
  • Chinese journal title: XTYY
  • English journal title: Computer Systems & Applications
  • Publication date: 2019-07-15
  • Publisher: Computer Systems & Applications (计算机系统应用)
  • Year: 2019
  • Issue: v.28
  • Funding: Natural Science Foundation of Fujian Province (2019J01835, 2017J01651, 2017J01780); Education and Scientific Research Project for Young and Middle-aged Teachers of Fujian Province (JAT170608); Open Project of the Fujian Provincial Key Laboratory of Cognitive Computing and Intelligent Information Processing (KLCCIIP2017104)
  • Language: Chinese
  • Page: XTYY201907028
  • Number of pages: 6
  • CN: 07
  • ISSN: 11-2854/TP
  • Classification number: 178-183
Abstract
The curse of dimensionality is a common problem in machine learning tasks. Feature selection algorithms address it by choosing an optimal feature subset from the original data set, thereby reducing the feature dimension. This paper proposes a hybrid feature selection algorithm. First, the chi-square test and a filter method are used to select a subset of important features, which is then standardized. Next, SBS-SVM, an algorithm that wraps a support vector machine (SVM) inside sequential backward selection (SBS), selects the optimal feature subset, maximizing classification performance while effectively reducing the number of features. In the experiments, the wrapper-stage SBS-SVM is compared with two other algorithms on three classical data sets. The results show that SBS-SVM performs well in both classification performance and generalization ability.
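The sequential backward selection step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sequential_backward_selection`, the `min_features` parameter, the toy evaluator `toy_score`, and the rule of returning the best-scoring subset seen at any size are all assumptions made for illustration. In an SBS-SVM setting, the evaluator would presumably be the validation accuracy of an SVM trained on the candidate subset.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def sequential_backward_selection(
    features: Iterable[int],
    score_fn: Callable[[FrozenSet[int]], float],
    min_features: int = 1,
) -> Tuple[FrozenSet[int], float]:
    """Greedy SBS: repeatedly drop the feature whose removal hurts the score least.

    Returns the best-scoring subset seen at any size (an assumed stopping rule;
    ties go to the smaller subset).
    """
    current = frozenset(features)
    best_subset, best_score = current, score_fn(current)
    while len(current) > min_features:
        # Score every subset obtained by removing exactly one feature.
        candidates = [(score_fn(current - {f}), f) for f in sorted(current)]
        # Keep the removal that yields the highest score.
        score, dropped = max(candidates, key=lambda c: c[0])
        current = current - {dropped}
        # ">=" prefers the smaller subset when scores tie.
        if score >= best_score:
            best_subset, best_score = current, score
    return best_subset, best_score


# Hypothetical toy evaluator: features 0 and 2 are informative,
# and every other feature costs a small penalty.
def toy_score(subset: FrozenSet[int]) -> float:
    informative = {0, 2}
    return len(subset & informative) - 0.1 * len(subset - informative)


selected, score = sequential_backward_selection(range(5), toy_score)
print(sorted(selected), score)
```

With the toy evaluator, the search discards features 1, 3, and 4 and keeps features 0 and 2. To approximate the pipeline in the abstract with scikit-learn, one option (again an assumption, not the paper's code) is to pre-filter with `SelectKBest(chi2)`, scale with `StandardScaler`, and pass a `score_fn` that returns the mean of `cross_val_score` for an `SVC` restricted to the candidate columns.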
