Research on a Frequent Itemset Mining Algorithm Based on Vectors and Matrices
Abstract
To mine all frequent itemsets from a transaction database quickly and efficiently, this paper proposes VMA, an algorithm based on vectors and matrices. The algorithm scans the database only once, converting the transaction database into Boolean vectors. It sorts the frequent 1-itemsets in non-descending order of support, which greatly reduces the number of k-itemsets (k>2) that have to be extended. It then builds a 2-itemset support matrix and extends frequent k-itemsets (k≥2) to generate frequent (k+1)-itemsets. Extensive experiments show that VMA clearly outperforms the Apriori algorithm and is also suitable for mining frequent itemsets in large transaction databases.
