Frequent Pattern Mining for Uncertain Data Using a Correlation Metric (一种利用相关性度量的不确定数据频繁模式挖掘)
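  • English title: Frequent Patterns Mining for Uncertain Data Using Correlation Metric
  • Authors: 任永功 ; 高鹏 ; 张志鹏
  • Authors (English): REN Yong-gong; GAO Peng; ZHANG Zhi-peng; School of Computer and Information Technology, Liaoning Normal University
  • Keywords (Chinese): 数据挖掘 ; 频繁模式 ; 加权模式 ; 相关模式 ; 不确定数据
  • Keywords (English): data mining; frequent patterns; weighted pattern; correlated pattern; uncertain data
  • Journal title (Chinese): XXWX
  • Journal title (English): Journal of Chinese Computer Systems
  • Institution: School of Computer and Information Technology, Liaoning Normal University
  • Publication date: 2019-03-15
  • Publisher: 小型微型计算机系统 (Journal of Chinese Computer Systems)
  • Year: 2019
  • Issue: v.40
  • Funding: National Natural Science Foundation of China (61373127); Liaoning Province Program for Excellent Talents in Higher Education Institutions (LR2015033); Liaoning Province Doctoral Start-up Fund (20170520207)
  • Language: Chinese
  • Page: XXWX201903030
  • Page count: 5
  • CN: 03
  • ISSN: 21-1106/TP
  • Classification number: 161-165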
Abstract
Most frequent-itemset mining algorithms for uncertain databases prune the combinatorial search space using a support constraint alone. As a result, the frequent itemsets they return are often only weakly correlated, and they perform poorly at mining weighted correlated patterns. For weighted uncertain data, this paper proposes a new strategy: uncertain frequent pattern mining based on a correlation metric (UFPM-CM). First, a new tree structure and a new metric defined on that tree are used to improve mining performance. Second, a new uncertain confidence metric is proposed to mine correlated patterns in uncertain databases. Finally, the UFPM algorithm is used to quickly mine strongly correlated frequent patterns. Experimental results show that the proposed strategy produces fewer but highly valuable patterns and is more efficient than comparable algorithms.
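The abstract does not give the paper's tree structure or its uncertain confidence metric, so the following is only a minimal sketch of the general idea of combining a support threshold with a correlation threshold on uncertain data. It assumes the standard expected-support definition (sum over transactions of the product of item probabilities) and substitutes an all-confidence-style ratio for the paper's metric. The names `TRANSACTIONS`, `expected_support`, `uncertain_all_confidence`, and `mine_correlated` are hypothetical, item weights are omitted, and the search is a plain level-wise enumeration rather than the tree-based UFPM algorithm.

```python
from math import prod

# Hypothetical toy uncertain database: each transaction maps an item
# to its existence probability in that transaction.
TRANSACTIONS = [
    {"a": 0.5, "b": 1.0, "c": 0.5},
    {"a": 1.0, "b": 0.5},
    {"b": 0.5, "c": 1.0},
]


def expected_support(itemset, transactions):
    """Sum over transactions of the product of the item probabilities.

    An item missing from a transaction contributes probability 0.
    """
    return sum(prod(t.get(i, 0.0) for i in itemset) for t in transactions)


def uncertain_all_confidence(itemset, transactions):
    """Illustrative stand-in for a correlation metric (an assumption, not
    the paper's definition): expected support of the itemset divided by
    the largest expected support of any single item in it."""
    largest = max(expected_support((i,), transactions) for i in itemset)
    if largest == 0:
        return 0.0
    return expected_support(itemset, transactions) / largest


def mine_correlated(transactions, min_esup, min_conf):
    """Level-wise search keeping itemsets that pass both thresholds.

    Both measures can only shrink when an itemset grows, so an itemset
    that fails either threshold is not extended.
    """
    items = sorted({i for t in transactions for i in t})
    frontier = [(i,) for i in items]
    found = {}
    while frontier:
        kept = []
        for itemset in frontier:
            esup = expected_support(itemset, transactions)
            conf = uncertain_all_confidence(itemset, transactions)
            if esup >= min_esup and conf >= min_conf:
                found[itemset] = (esup, conf)
                kept.append(itemset)
        # Extend each surviving itemset with lexicographically larger items.
        frontier = [x + (i,) for x in kept for i in items if i > x[-1]]
    return found


if __name__ == "__main__":
    for itemset, (esup, conf) in mine_correlated(TRANSACTIONS, 1.0, 0.5).items():
        print(itemset, esup, conf)
```

Raising `min_conf` in this sketch discards itemsets whose expected support is high only because one of their items is very common, which is the kind of weakly correlated pattern that support-only pruning lets through.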
