Application of Data Mining in Intrusion Detection
Abstract
With the rapid development of computer networks, people's lives, work, and study have become increasingly dependent on them. At the same time, people have come to realize that the security problems accompanying this development are causing ever more serious harm. One such problem is network intrusion: an attacker uses the network to break into a victim's computer, access unauthorized system resources, modify the victim's data, or operate the victim's machine in order to steal, destroy, or take control. As intrusions grow more frequent, both society and the research community are paying increasing attention to detecting them, and detecting network intrusions quickly and accurately has become a major research topic in network security.
An intrusion detection system (IDS) identifies intrusions in a computer network. This thesis studies the characteristics of intrusion detection data and designs intrusion detection systems based on fuzzy data mining.
The first system uses a fuzzy support vector machine with active learning to obtain a classifier for intrusion detection. Rough sets are introduced into the construction of the support vector machine classifier to reduce redundant data and speed up learning. This method has a low false positive rate but a high false negative rate.
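The abstract does not say which rough-set reduction procedure is used. As a minimal sketch, assuming a positive-region criterion with greedy backward elimination (both are assumptions, as are the toy records, labels, and function names), redundant attributes could be dropped before a classifier is trained roughly like this:

```python
from collections import defaultdict

def positive_region_size(rows, labels, attrs):
    """Count rows whose values on `attrs` determine the label unambiguously."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        groups[key].append(label)
    # A group belongs to the positive region if all its labels agree.
    return sum(len(ls) for ls in groups.values() if len(set(ls)) == 1)

def greedy_reduct(rows, labels):
    """Drop attributes one at a time while the positive region is unchanged."""
    attrs = list(range(len(rows[0])))
    full = positive_region_size(rows, labels, attrs)
    for a in list(attrs):
        trial = [x for x in attrs if x != a]
        if trial and positive_region_size(rows, labels, trial) == full:
            attrs = trial  # attribute `a` is redundant for classification
    return attrs

# Hypothetical connection records: (protocol, service, flag, copy_of_protocol)
rows = [
    ("tcp", "http", "SF", "tcp"),
    ("tcp", "http", "S0", "tcp"),
    ("udp", "dns", "SF", "udp"),
    ("tcp", "ftp", "S0", "tcp"),
]
labels = ["normal", "attack", "normal", "attack"]

kept = greedy_reduct(rows, labels)  # indices of the attributes that remain
```

In this toy table only the flag column (index 2) is needed to separate the two classes, so the other columns are discarded. A greedy pass depends on attribute order and is not guaranteed to find a minimal reduct.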
The second system is based on a fuzzy frequent pattern growth (FP-growth) algorithm. Its core is an association rule classification engine: a set of fuzzy association rules describes each class, the degree to which a sample under test matches each class's rule set is estimated, and the best-matching class is reported as the detection result. Because the Fuzzy Apriori algorithm is not efficient enough, this thesis proposes a fuzzy FP-growth algorithm, with fuzzy versions of FP-tree construction and mining. It also proposes a new method for pruning the fuzzy FP-tree that removes items not contained in any rule, which speeds up mining. In addition, association rules are derived directly from the fuzzy FP-tree, replacing the final database scan otherwise used to derive them and speeding up the whole training process. The technique combines fuzzy theory with frequent-pattern association rule mining; experimental results show that it effectively improves learning efficiency and reduces the false negative rate.
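The classification step described above (match a sample against each class's fuzzy rule set and pick the best match) can be illustrated with a small sketch. It does not implement the fuzzy FP-tree mining itself. The rules are written by hand, and the triangular membership functions, the min operator for combining memberships, the confidence-weighted average, and all feature names and numbers are assumptions made for illustration, not details taken from the thesis.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets for two illustrative features.
FUZZY_SETS = {
    ("duration", "short"): (-1.0, 0.0, 10.0),
    ("duration", "long"): (5.0, 60.0, 1e9),
    ("src_bytes", "small"): (-1.0, 0.0, 500.0),
    ("src_bytes", "large"): (200.0, 5000.0, 1e9),
}

def membership(sample, item):
    """Membership of the sample's feature value in the item's fuzzy set."""
    feature, _term = item
    return triangular(sample[feature], *FUZZY_SETS[item])

def rule_match(sample, antecedent):
    """Degree to which a sample satisfies a rule antecedent (min over items)."""
    return min(membership(sample, item) for item in antecedent)

def class_match(sample, rules):
    """Confidence-weighted average match over one class's rule set."""
    total_conf = sum(conf for _, conf in rules)
    return sum(rule_match(sample, ant) * conf for ant, conf in rules) / total_conf

def classify(sample, rule_sets):
    """Return the class whose rule set matches the sample best."""
    return max(rule_sets, key=lambda label: class_match(sample, rule_sets[label]))

# Hand-written rule sets: (antecedent items, rule confidence) per class.
RULE_SETS = {
    "normal": [([("duration", "short"), ("src_bytes", "small")], 0.9)],
    "attack": [([("duration", "long")], 0.7), ([("src_bytes", "large")], 0.8)],
}

sample = {"duration": 2.0, "src_bytes": 100.0}
predicted = classify(sample, RULE_SETS)
```

For the sample shown, both antecedent items of the "normal" rule have membership 0.8 while both "attack" rules have membership 0, so the sample is labeled "normal". In a full system the rule sets would come from the mining stage rather than being written by hand.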
The main work of this thesis focuses on applying fuzzy data mining to intrusion detection. For the two data mining algorithms above, it explores how fuzzy data mining can be applied to intrusion detection.
