Research on Key Algorithms for Frequent Pattern Mining over Data Streams and Their Applications
Abstract
With the rapid development of computer technology and the wide application of information technology, data streams are now widely used in areas such as performance monitoring in business management, anomaly detection and alarming in network traffic management, and transaction processing in retail. The analysis and mining of data streams have become an active research topic. Frequent pattern mining is one of the most fundamental problems in data stream mining, which makes research on it both important and challenging.
     Current intrusion detection systems (IDS) based on data mining cannot detect new attacks or intrusions with unknown signatures, and neither their real-time performance nor their accuracy meets the needs of practical deployment. Developing efficient, real-time algorithms for mining frequent patterns over data streams and applying them to intrusion detection would help make intrusion detection practical. Research on intrusion detection systems based on data stream mining is therefore significant both in theory and in practice.
     To address the complex compressed storage structures, heavy node maintenance, and high time and space costs of existing maximal frequent itemset mining algorithms, this dissertation proposes MMFI-DS, an algorithm for mining maximal frequent itemsets based on a prefix pattern tree. Its compressed storage structure, the SEFI-tree, is structurally simple, captures the important information in the data stream well, and is cheap to maintain. By searching both top-down and bottom-up, the algorithm prunes supersets of short infrequent itemsets and subsets of long maximal frequent itemsets as early as possible, which reduces the number of support computations and the overall cost of the algorithm.
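To make the notion of a maximal frequent itemset concrete, the following is a minimal brute-force sketch. It does not reproduce the SEFI-tree or the bidirectional search of MMFI-DS; it only illustrates the definition and the downward-closure property that justifies pruning supersets of infrequent itemsets. The function names and the sample transactions are assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Level-wise enumeration of frequent itemsets, then a maximality filter.

    Illustrative only: a static list of transactions stands in for the stream.
    """
    items = sorted(set().union(*transactions))
    frequent = []
    # Level 1: frequent single items.
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_support]
    while level:
        frequent.extend(level)
        # Candidates are built only from frequent itemsets, so every superset
        # of an infrequent itemset is pruned without being counted.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates
                 if support(c, transactions) >= min_support]
    # A frequent itemset is maximal if no proper superset is frequent.
    return [f for f in frequent if not any(f < g for g in frequent)]


# Hypothetical sample data.
sample = [frozenset("abc"), frozenset("ab"), frozenset("ac"),
          frozenset("bc"), frozenset("abc")]
result = maximal_frequent_itemsets(sample, min_support=3)
```

With a minimum support of 3, the itemset {a, b, c} is infrequent (it occurs twice), so the maximal frequent itemsets in this sample are the three pairs.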
     To address the large search space, numerous intermediate results, and poor time efficiency of existing closed frequent itemset mining algorithms, a bit-based algorithm for mining closed frequent itemsets, A-NewMoment, is proposed. Using bitmap techniques, the algorithm defines the BV-DFIlist data structure to record data stream information and closed frequent itemsets. It introduces the "no extension needed" (WSS) and "downward extension" (CSS) search strategies, which quickly mine all closed frequent itemsets other than the maximum-support and longest closed frequent itemsets generated from the frequent 1-itemsets; this avoids generating large numbers of intermediate results and improves time efficiency. A "dynamic infrequent pruning strategy" (DNFIPS) quickly deletes itemsets that are no longer closed frequent itemsets from the HTC table that stores closed itemsets, and a "dynamic no-search strategy" (DNSS) maintains all closed frequent itemsets that have changed; together they lower the maintenance cost of closed frequent itemsets and further improve time efficiency.
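The bitmap idea can be sketched as follows, assuming each item is mapped to an integer whose bit i is set when transaction i of the window contains the item. This is not the BV-DFIlist structure or the A-NewMoment search; the function names and the sample window are hypothetical. It shows only how support reduces to a bitwise AND plus a bit count, and how closedness can be checked from supports.

```python
def build_bit_vectors(window):
    """Map each item to an int whose bit i is 1 when transaction i contains it."""
    vectors = {}
    for i, transaction in enumerate(window):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << i)
    return vectors


def bit_support(itemset, vectors, window_size):
    """Support = number of 1 bits in the AND of the items' bit vectors."""
    bits = (1 << window_size) - 1  # the empty itemset matches every transaction
    for item in itemset:
        bits &= vectors.get(item, 0)
    return bin(bits).count("1")


def is_closed(itemset, vectors, window_size):
    """An itemset is closed if adding any other single item lowers its support.

    Checking single-item extensions is enough, because support never
    increases when an itemset grows.
    """
    base = bit_support(itemset, vectors, window_size)
    return all(
        bit_support(itemset | {item}, vectors, window_size) < base
        for item in vectors if item not in itemset
    )


# Hypothetical window of four transactions.
window = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
vectors = build_bit_vectors(window)
```

In this window {b} is not closed, because {a, b} occurs in exactly the same two transactions, while {a, b} itself is closed.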
     To address the limited accuracy, high node maintenance cost, and low time and space efficiency of existing algorithms, MTKCFI-SW, an algorithm for mining the Top-K closed frequent itemsets, is proposed. The algorithm uses a compact prefix pattern tree, the CFP-tree, to compress and store the useful information within the sliding window of the data stream. Each node stores little information, which lowers node maintenance cost and improves time efficiency. The CFP-tree does not require a fixed sliding window size and captures newly arrived stream data promptly for any window size. Using pointer operations, large numbers of infrequent items are deleted from the CFP-tree without traversing the whole tree, which improves time efficiency. The mining threshold and the pruning threshold are determined dynamically ("DD"), which improves both accuracy and time efficiency.
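As a rough illustration of what "Top-K closed frequent itemsets over a sliding window" means, the sketch below keeps the most recent transactions in a bounded deque and recomputes the answer by brute force. It is not MTKCFI-SW and uses no CFP-tree; the class name, the tie-breaking rule, and the sample stream are assumptions. Enumerating all subsets is exponential in transaction length, so this is only suitable for tiny examples.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowTopK:
    """Brute-force Top-K closed itemsets over the last `window_size` transactions."""

    def __init__(self, window_size, k):
        # A deque with maxlen drops the oldest transaction automatically.
        self.window = deque(maxlen=window_size)
        self.k = k

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def top_k_closed(self):
        # Count the support of every non-empty subset of every transaction.
        counts = Counter()
        for t in self.window:
            items = sorted(t)
            for r in range(1, len(items) + 1):
                for combo in combinations(items, r):
                    counts[frozenset(combo)] += 1
        # Closed: no proper superset has the same support.
        closed = {
            s: c for s, c in counts.items()
            if not any(s < other and c == oc for other, oc in counts.items())
        }
        # Rank by support (descending); break ties by the sorted item list.
        ranked = sorted(closed.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
        return ranked[: self.k]


# Hypothetical stream: the first transaction slides out of a window of size 3.
miner = SlidingWindowTopK(window_size=3, k=2)
for tx in [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b", "c"}]:
    miner.add(tx)
top = miner.top_k_closed()
```

After the fourth transaction arrives, the window holds only the last three, and the two highest-ranked closed itemsets are {c} with support 3 and {a, c} with support 2.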
     To improve the real-time online mining performance and detection accuracy of existing intrusion detection systems, the dissertation proposes MMFIID-DS, an intrusion detection system model based on maximal frequent itemsets over data streams. Using a variety of pruning strategies, the system mines the maximal frequent itemsets of the trained normal data set, the trained abnormal data set, and the data stream currently under inspection, and from them builds the normal behavior patterns, abnormal behavior patterns, and user behavior patterns. It combines misuse detection and anomaly detection to detect intrusions online in real time, with the aim of improving detection accuracy and system response speed.
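One way the combination of misuse detection and anomaly detection could look is sketched below. The abstract does not specify how MMFIID-DS compares patterns, so the subset-based matching rule, the function name, and the feature strings are all assumptions made for illustration.

```python
def classify_patterns(current, normal, attack):
    """Label each pattern mined from the current stream.

    Assumed matching rule (not taken from the dissertation):
    - "misuse":  the pattern contains a known attack pattern;
    - "normal":  the pattern is covered by some normal behavior pattern;
    - "anomaly": neither of the above, i.e. it deviates from the normal profile.
    """
    labels = {}
    for pattern in current:
        if any(known <= pattern for known in attack):
            labels[pattern] = "misuse"
        elif any(pattern <= profile for profile in normal):
            labels[pattern] = "normal"
        else:
            labels[pattern] = "anomaly"
    return labels


# Hypothetical patterns expressed as sets of feature=value strings.
normal_patterns = [frozenset({"service=http", "flag=SF"})]
attack_patterns = [frozenset({"flag=S0", "count=high"})]
current_patterns = [
    frozenset({"service=http", "flag=SF"}),
    frozenset({"service=http", "flag=S0", "count=high"}),
    frozenset({"service=ftp", "flag=REJ"}),
]
labels = classify_patterns(current_patterns, normal_patterns, attack_patterns)
```

Checking the attack patterns first means a known signature is reported as misuse even when the rest of the pattern looks normal; patterns that match neither profile are flagged as anomalies, which is how unknown attacks would surface in this sketch.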
