Mining Itemset-based Distinguishing Sequential Patterns with Gap Constraint
  • Authors: Hao Yang (17)
    Lei Duan (17) (18)
    Guozhu Dong (19)
    Jyrki Nummenmaa (20)
    Changjie Tang (17)
    Xiaosong Li (18)

    17. School of Computer Science, Sichuan University, Chengdu, China
    18. West China School of Public Health, Sichuan University, Chengdu, China
    19. Department of Computer Science & Engineering, Wright State University, Dayton, USA
    20. School of Information Sciences, University of Tampere, Tampere, Finland
  • Keywords: Itemset; Sequential pattern; Contrast mining
  • Series: Lecture Notes in Computer Science
  • Publication year: 2015
  • Volume: 9049
  • Issue: 1
  • Pages: 39-54
  • Full-text size: 359 KB
  • Book title: Database Systems for Advanced Applications
  • ISBN: 978-3-319-18119-6
  • Subject category: Computer Science
  • Subject topics: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
Abstract
Mining contrast sequential patterns, which are sequential patterns that characterize a given sequence class and distinguish that class from another given sequence class, has a wide range of applications, including medical informatics, computational finance, and consumer behavior analysis. In previous studies on contrast sequential pattern mining, each element in a sequence is a single item or symbol. This paper considers a more general case where each element in a sequence is a set of items. The associated contrast sequential patterns are called itemset-based distinguishing sequential patterns (itemset-DSPs). After discussing the challenges of mining itemset-DSPs, we present iDSP-Miner, a mining method with various pruning techniques, for mining itemset-DSPs that satisfy given support and gap constraints. In this study, we also propose a concise border-like representation (with exclusive bounds) for sets of similar itemset-DSPs and use that representation to improve the efficiency of our proposed algorithm. Our empirical study using both real and synthetic data demonstrates that iDSP-Miner is effective and efficient.
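
The notion of an itemset-DSP under support and gap constraints can be illustrated with a small brute-force check. This is only a sketch under stated assumptions and is not iDSP-Miner: the abstract does not give the formal definitions, so the code assumes that a pattern element matches a sequence element when it is a subset of it, that the gap constraint is an interval `(min_gap, max_gap)` on the number of elements skipped between consecutive matches, and that a pattern is distinguishing when its support is at least `alpha` in the positive class and at most `beta` in the negative class. The names `matches`, `support`, `is_distinguishing`, `alpha`, and `beta` are hypothetical.

```python
from typing import FrozenSet, Sequence, Tuple

Itemset = FrozenSet[str]
ItemsetSequence = Sequence[Itemset]


def matches(pattern: ItemsetSequence, sequence: ItemsetSequence,
            gap: Tuple[int, int]) -> bool:
    """True if `pattern` occurs in `sequence` under the (assumed) gap constraint."""
    min_gap, max_gap = gap

    def extend(p_idx: int, prev_pos: int) -> bool:
        # All pattern elements have been placed: an occurrence exists.
        if p_idx == len(pattern):
            return True
        if prev_pos < 0:
            # The first pattern element may start anywhere.
            candidates = range(len(sequence))
        else:
            # Skip between min_gap and max_gap elements after the previous match.
            lo = prev_pos + 1 + min_gap
            hi = min(len(sequence), prev_pos + max_gap + 2)
            candidates = range(lo, hi)
        # Assumption: a pattern element matches if it is a subset of the element.
        return any(pattern[p_idx] <= sequence[pos] and extend(p_idx + 1, pos)
                   for pos in candidates)

    return extend(0, -1)


def support(pattern: ItemsetSequence, dataset: Sequence[ItemsetSequence],
            gap: Tuple[int, int]) -> float:
    """Fraction of sequences in `dataset` that contain `pattern`."""
    if not dataset:
        return 0.0
    return sum(matches(pattern, s, gap) for s in dataset) / len(dataset)


def is_distinguishing(pattern: ItemsetSequence,
                      positive: Sequence[ItemsetSequence],
                      negative: Sequence[ItemsetSequence],
                      gap: Tuple[int, int], alpha: float, beta: float) -> bool:
    """Assumed criterion: frequent in `positive`, infrequent in `negative`."""
    return (support(pattern, positive, gap) >= alpha
            and support(pattern, negative, gap) <= beta)
```

For instance, with positive sequences `<{a,b},{c},{a,d}>` and `<{a},{b,c},{a,b,d}>`, the pattern `<{a},{d}>` matches both when one element may be skipped (`gap=(0, 1)`) but neither when no skipping is allowed (`gap=(0, 0)`). A real miner would avoid this exhaustive check by pruning the candidate space, which is the role the abstract assigns to iDSP-Miner's pruning techniques.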
