Association Rule Based Mining of Gene Microarray Data and Its Applications
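Abstract
The completion of the Human Genome Draft (HGD) marked the transition of modern life science research from the genomic era to the post-genomic era. The research focus has shifted from structural genomics to functional genomics, and the interactions and mutual influences among genes are receiving growing attention. The gene microarray, a high-throughput detection technology, can measure the expression levels of thousands of genes simultaneously, making it a powerful tool for studying gene-gene interactions. As microarrays generate large volumes of data, data mining has become an important technique for extracting gene-related information from microarray expression data.
     This study addresses problems that arise when association rule mining is applied to microarray expression data, and investigates three aspects in depth: inter-transaction association rule mining for time-series microarray expression data, the loss of gene expression state information in traditional association rules, and the clustering of large numbers of association rules. The main content and contributions are as follows:
     (1) Inter-transaction association rule mining in time-series microarray expression data
     Traditional association rules ignore the temporal information in the data and cannot dynamically predict gene expression states. To address this, the study proposes introducing inter-transaction association rule mining into the analysis of time-series microarray expression data, and describes inter-transaction association rules in detail. The mined rules were analyzed with the help of biological databases, including the Gene Ontology gene annotation database, the iHOP database, and the DAVID bioinformatics resource database. The results show that inter-transaction association rules can effectively mine implicit information in time-series microarray expression data, that the resulting rules are consistent with biological background knowledge, and that they reasonably describe the dynamic expression behavior among genes. Inter-transaction association rules therefore offer a new means of predicting gene function.
     (2) The loss of gene expression state information in traditional association rules
     After analyzing in depth how traditional association rules lose gene expression state information, this study designed a new type of association rule, the Differential Expression Association Rule (DEAR), and gave its basic definition and related concepts. To mine such rules effectively, the dissertation proposes the Differential Expression Association Rules Matrix Algorithm (DEARM algorithm) and describes it in detail. Experimental results show that differential expression association rules outperform traditional association rules both in discovering gene expression patterns and in limiting the generation of redundant rules. As a new rule type, differential expression association rules enrich the scope of association rule mining and will help researchers reveal implicit expression relationships among genes in microarray expression data.
     (3) Clustering of large numbers of association rules
     Association rule mining usually yields a large number of rules, which greatly hinders later analysis and use. To address this practical problem, the study proposes post-processing association rules with cluster analysis. To cluster rules more effectively, the dissertation proposes a new similarity measure for association rules, the content-structure weighted measure, which reflects rule similarity in terms of both structure and content and overcomes the limitation of existing measures that consider content only. The clustering results were analyzed together with the Gene Ontology biological database, showing from a biological perspective that the genes involved in the rules of the same sub-cluster have similar or related biological bases, which demonstrates the value of clustering in the post-analysis of association rules. Cluster analysis therefore provides an important, visual technical means for researchers to discover patterns of interest among association rules.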
The completion of the human genome draft (HGD) marked the entry of modern life science research into the post-genomic era. The research focus has shifted from structural genomics to functional genomics, and strong interest has arisen in elucidating the interactions between genes. The DNA microarray, a high-throughput method, can routinely measure the expression levels of thousands of genes simultaneously, so it is a powerful tool for finding relations among genes. Because microarrays produce large volumes of experimental data, data mining has become an important method for extracting useful information from them.
     To address the problems of association rule mining in microarray gene expression data, this dissertation studies three aspects in depth: the mining of inter-transaction association rules from time-series microarray data, the absence of gene expression status information in traditional association rules, and the clustering of association rules. The main contributions of this dissertation are summarized as follows:
     (1) Mining inter-transaction association rules from time-series microarray data
     Because traditional association rules ignore the temporal information in time-series microarray data, they reflect only the relations among genes at the same time point and fail to capture dynamic relations. We therefore propose mining inter-transaction association rules from such data, and we introduce inter-transaction association rules in detail. Several biological databases, namely Gene Ontology (GO), iHOP (Information Hyperlinked over Proteins), and DAVID (The Database for Annotation, Visualization and Integrated Discovery), were used to help interpret the inter-transaction association rules. The results show that the rules efficiently extract hidden information from time-series microarray data, and that the rules describing the behavior of genes over time are consistent with biological background knowledge. The inter-transaction association rule can therefore serve as a new approach to predicting gene function.
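As a rough illustration of the idea, the sketch below slides a window over discretized time points and tags each item with its time offset, so that a rule such as `(geneA_up, 0) -> (geneB_up, 1)` relates one time point to the next. The gene names, window width, thresholds, and the brute-force enumeration are all assumptions made for illustration; the dissertation's actual mining procedure is not described in this abstract.

```python
from itertools import product


def mega_transactions(series, window):
    """Slide a window over the time points and tag each item with its offset."""
    megas = []
    for start in range(len(series) - window + 1):
        mega = set()
        for offset in range(window):
            for item in series[start + offset]:
                mega.add((item, offset))
        megas.append(mega)
    return megas


def inter_transaction_rules(series, window, min_support, min_confidence):
    """Enumerate single-item rules (a, 0) -> (b, k) with k >= 1 by brute force."""
    megas = mega_transactions(series, window)
    n = len(megas)
    items = sorted({item for transaction in series for item in transaction})
    rules = []
    for a, b in product(items, repeat=2):
        for k in range(1, window):
            antecedent, consequent = (a, 0), (b, k)
            n_ante = sum(1 for m in megas if antecedent in m)
            n_both = sum(1 for m in megas if antecedent in m and consequent in m)
            if n_ante == 0:
                continue
            support = n_both / n
            confidence = n_both / n_ante
            if support >= min_support and confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


# Hypothetical discretized time series: one set of "expressed" items per time point.
series = [
    {"geneA_up"}, {"geneB_up"}, {"geneA_up"},
    {"geneB_up"}, {"geneA_up", "geneC_up"}, {"geneB_up"},
]
for rule in inter_transaction_rules(series, window=2, min_support=0.5, min_confidence=0.8):
    print(rule)
```

A classical (intra-transaction) miner would see no co-occurrence of `geneA_up` and `geneB_up` in this toy data, because they never appear at the same time point; the offset tagging is what exposes the lagged relation.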
     (2) The absence of gene expression status information in traditional association rules
     After analyzing in depth the absence of gene expression status information in traditional association rules, we propose a new type of association rule, the differential expression association rule (DEAR), and introduce its definition and related concepts. To mine DEAR efficiently, we propose the differential expression association rules matrix algorithm (DEARM algorithm) and describe it in detail. Experimental results indicate that DEAR performs better than traditional association rules in extracting gene expression patterns and in controlling redundant rules. As a new type of association rule, DEAR enriches association rule mining and will help researchers reveal hidden interactions among genes from microarray data.
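To make the notion of expression status concrete, the following sketch encodes each gene as an item that carries an explicit state (`"up"` or `"down"`) and then scores a rule by support and confidence. This is not the DEARM algorithm, whose details are not given in this abstract; the log-ratio threshold, gene names, and helper functions are hypothetical.

```python
def discretize(log_ratios, threshold=1.0):
    """Turn one sample's log2 expression ratios into state-carrying items."""
    items = set()
    for gene, value in log_ratios.items():
        if value >= threshold:
            items.add((gene, "up"))
        elif value <= -threshold:
            items.add((gene, "down"))
        # Values strictly between -threshold and threshold count as unchanged
        # and produce no item.
    return items


def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n_ante = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical log2 ratios for four samples.
samples = [
    {"geneA": 1.5, "geneB": -1.2},
    {"geneA": 2.0, "geneB": -1.8},
    {"geneA": 0.1, "geneB": 0.3},
    {"geneA": 1.1, "geneB": 0.2},
]
transactions = [discretize(s) for s in samples]
support, confidence = rule_stats(
    transactions, {("geneA", "up")}, {("geneB", "down")}
)
print(support, confidence)
```

A rule over plain gene names could only say that geneA and geneB tend to appear together; keeping the state in the item lets the rule say that geneA being up-regulated goes with geneB being down-regulated.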
     (3) Clustering of association rules
     A large number of association rules are usually discovered from microarray data, which makes them difficult to analyze and use. To tackle this problem, we propose clustering the association rules. In this dissertation we propose a new similarity metric for clustering association rules effectively; it measures the similarity of two rules in terms of both their structure and their contents, and thus overcomes the drawback of traditional similarity metrics, which focus only on contents. By analyzing the sub-clusters of association rules together with the Gene Ontology (GO) annotation database, we found that the genes involved in the association rules of the same sub-cluster have similar or related biological backgrounds, which indicates the value of clustering for association rules. Accordingly, clustering is an important visual technique for finding hidden, interesting patterns among mined association rules.
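The sketch below shows one way a similarity that weighs both content and structure could look. The abstract does not define the dissertation's metric, so everything here is an assumption: content is taken as the Jaccard overlap of all items in two rules, structure as the mean Jaccard overlap of their antecedents and of their consequents, and the grouping step is a simple single-pass threshold scheme rather than the clustering method actually used.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def rule_similarity(rule1, rule2, weight=0.5):
    """Weighted mix of content similarity and structure similarity."""
    (ante1, cons1), (ante2, cons2) = rule1, rule2
    content = jaccard(ante1 | cons1, ante2 | cons2)
    structure = 0.5 * (jaccard(ante1, ante2) + jaccard(cons1, cons2))
    return weight * content + (1 - weight) * structure


def cluster_rules(rules, threshold=0.7, weight=0.5):
    """Assign each rule to the first cluster whose first member is similar enough."""
    clusters = []
    for rule in rules:
        for cluster in clusters:
            if rule_similarity(rule, cluster[0], weight) >= threshold:
                cluster.append(rule)
                break
        else:
            clusters.append([rule])
    return clusters


# Hypothetical rules written as (antecedent, consequent) pairs of gene sets.
r1 = ({"geneA", "geneB"}, {"geneC"})
r2 = ({"geneA"}, {"geneB", "geneC"})
r3 = ({"geneX"}, {"geneY"})
print(rule_similarity(r1, r2))
print(cluster_rules([r1, r2, r3]))
```

Rules `r1` and `r2` contain exactly the same genes, so a content-only metric would treat them as identical; the structure term lowers their similarity because the genes sit on different sides of the rule.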
