Effective Network Feature Filtering Algorithm for APT Samples (APT样本的有效网络特征筛选算法)


English title: Effective Network Feature Filtering Algorithm for APT Samples
Authors: LI Yihong (李翼宏); DU Zhenyu (杜镇宇); HU Jinsong (胡劲松)
Affiliation: Department of Network, Electronic Countermeasure Institute, National University of Defense Technology (国防科技大学电子对抗学院网络系)
Keywords: APT attack; network features; dimension reduction; k-means++; discrimination
Journal code: JSGG
Journal: Computer Engineering and Applications (计算机工程与应用)
Publication date: 2018-05-19 16:37
Publisher: Computer Engineering and Applications (计算机工程与应用)
Year: 2019
Issue: v.55; No.922
Funding: National Natural Science Foundation of China (No. U1636201)
Language: Chinese
Page: JSGG201903014
Page count: 7
CN: 03
Classification number: 88-94

Abstract

Network features extracted from APT samples during research on APT defense schemes tend to have excessively high dimensionality. To address this, the paper proposes an effective network feature filtering algorithm for APT samples based on k-means++ clustering. The algorithm first uses clustering to partition the extracted original feature set into an APT traffic feature set and a background traffic feature set. It then measures how much clustering performance changes when a given feature dimension is removed, and uses that change to evaluate the feature's degree of discrimination. An effective feature is one whose degree of discrimination exceeds a set threshold. The goal is to filter the effective features out of the original feature set, reducing its dimensionality and thereby lowering the time and space overhead of subsequent threat intelligence generation and detection deployment. Experimental results show that the algorithm is feasible and has some advantages over other filtering algorithms on this problem.
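
The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how clustering performance is measured, so the sketch assumes labelled samples and uses the agreement between two clusters and the APT/background labels as the metric, with a feature's discrimination taken as the accuracy lost when that feature is removed. The function names, the threshold value, and the toy data are all hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp(points, k, rng, iters=50):
    """Cluster points using k-means++ seeding followed by Lloyd iterations."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # k-means++: draw the next seed with probability proportional to the
        # squared distance from each point to its nearest chosen centre.
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(weights) == 0:  # every point coincides with a centre
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_assign == assign:  # converged
            break
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def clustering_accuracy(assign, labels):
    """Assumed metric: agreement of 2 clusters with binary labels,
    taking the better of the two possible cluster-to-label mappings."""
    hits = sum(1 for a, y in zip(assign, labels) if a == y)
    return max(hits, len(labels) - hits) / len(labels)

def discrimination(points, labels, seed=0):
    """Score each feature by the accuracy lost when it is removed."""
    base = clustering_accuracy(kmeans_pp(points, 2, random.Random(seed)), labels)
    scores = []
    for d in range(len(points[0])):
        reduced = [p[:d] + p[d + 1:] for p in points]
        acc = clustering_accuracy(kmeans_pp(reduced, 2, random.Random(seed)), labels)
        scores.append(base - acc)
    return scores

def select_features(points, labels, threshold):
    """Keep the indices of features whose discrimination exceeds the threshold."""
    return [d for d, s in enumerate(discrimination(points, labels)) if s > threshold]

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is
# constant and carries no information. Label 1 = APT, label 0 = background.
points = [(0.0, 1.0)] * 4 + [(10.0, 1.0)] * 4
labels = [0] * 4 + [1] * 4
print(discrimination(points, labels))        # prints [0.5, 0.0]
print(select_features(points, labels, 0.1))  # prints [0]
```

Removing feature 0 collapses the two classes into one cluster, so its score is high and it is kept. Removing feature 1 leaves the clustering unchanged, so it is filtered out. Real traffic features would need scaling and a metric chosen to match the paper's definition of clustering performance.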

