Research on Several Key Issues in siRNA Design
Abstract
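RNA interference (RNAi) is a gene-silencing phenomenon triggered by double-stranded RNA. It is widely used in gene function studies, drug target screening, and disease therapy. siRNA design is an effective route to RNA interference, and the quality of the design directly determines the interference effect.
     Current siRNA design methods have a problem in their design rules: the rules are based on sequence features and do not consider the influence of target structure on siRNA interference efficiency, so the designed siRNA sequences have low interference efficiency.
     Current methods also have a problem in predicting the interference efficiency of candidate siRNAs: they mainly consider features of the siRNA itself, so prediction accuracy is low, with correlation coefficients typically around 0.63. This leaves too many candidate siRNAs and makes biological experiments much harder. Improving the accuracy of siRNA interference efficiency prediction is an urgent problem.
     Because siRNA silencing efficiency is related to the structure of the target mRNA, siRNA design that includes structural features of the target mRNA may greatly improve design accuracy. This thesis proposes an siRNA design algorithm that fuses sequence features and structural features, and applies it to siRNA design for the 2009 H1N1 influenza virus and the 2008 seasonal H1N1 influenza virus. In this multi-feature design process targeting influenza viruses, both sequence features and structural features of the target sequence are considered: a structure coefficient measures the quality of the target structure, the better candidate target sequences are selected according to the value of the structure coefficient, and the corresponding siRNA sequences are then designed from the target sequences.
     Prediction accuracy can improve only if features closely related to siRNA interference efficiency are found. Through qualitative and quantitative analysis, this thesis finds that siRNA interference efficiency in mammals is strongly correlated with the GC content of the mRNA, the GC content near the target site, the stem ratio of the mRNA, and the stem ratio near the target site. Because both global mRNA features and local features near the target site correlate strongly with siRNA interference efficiency, this thesis proposes a random-forest-based model for predicting siRNA interference efficiency that considers global mRNA features and local features near the target site in addition to the siRNA's own features. The correlation coefficient under 10-fold cross-validation rises from 0.63 to 0.7, which confirms that considering global mRNA features and local features near the target site significantly improves prediction accuracy.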
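     In summary, this thesis makes two main contributions:
     1. It proposes a multi-feature fusion algorithm for siRNA design. According to the theory and practice of pattern recognition, multi-feature fusion is an effective means of improving recognition accuracy. Using a fusion model of multiple features (sequence features and structural features) to design siRNAs that target influenza virus genes is one way to improve design accuracy.
     2. It proposes a random-forest-based model for predicting siRNA interference efficiency that considers global mRNA features and local features near the target site in addition to the siRNA's own features. The correlation coefficient under 10-fold cross-validation rises from 0.63 to 0.7, which confirms that considering global mRNA features and local features near the target site significantly improves prediction accuracy.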
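The 10-fold cross-validated correlation coefficient cited above can be computed along the following lines. This is a minimal sketch, not the thesis's implementation: all function names are hypothetical, the one-feature linear fit merely stands in for the random forest, and pooling the held-out predictions before computing a single Pearson correlation is an assumption, since the abstract does not say how the coefficient was aggregated across folds.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (both must vary)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_correlation(X, y, fit, k=10, seed=0):
    """Predict every sample from a model trained without its fold,
    then correlate the pooled predictions with the measured values."""
    predictions = [0.0] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            predictions[i] = model(X[i])
    return pearson(predictions, y)


def fit_line(X, y):
    """Stand-in learner (assumption): least-squares line on the first feature."""
    xs = [row[0] for row in X]
    mx, my = mean(xs), mean(y)
    slope = (sum((x - mx) * (t - my) for x, t in zip(xs, y))
             / sum((x - mx) ** 2 for x in xs))
    return lambda row: my + slope * (row[0] - mx)
```

Any learner can be passed as `fit`, provided it takes training rows and targets and returns a callable that predicts one row. Comparing the coefficient obtained with siRNA features alone against the one obtained with mRNA and neighboring features added corresponds to the 0.63 versus 0.7 comparison described in the abstract.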
RNA interference (RNAi) is the degradation of homologous intracellular mRNA triggered by introducing short double-stranded RNA, which inhibits expression of the target mRNA. Small interfering RNA (siRNA) design is an effective route to RNA interference, and the quality of the siRNA directly influences the interference effect; an effective siRNA design method is therefore crucial. Designing siRNA by biological experiment alone requires substantial manpower and resources, is costly and slow, and has low efficiency, so bioinformatics and computer-aided design have become an effective means of achieving RNA interference.
     Current siRNA design rules have a problem: they are based on sequence features and do not consider the secondary structure of the target, so the efficiency of the designed siRNA is low.
     Prediction of candidate siRNA efficiency also has a problem: current predictions are based on siRNA sequence features, so accuracy is low, with a correlation coefficient of around 0.63. This leads to an excessive number of candidate siRNA sequences and complicates biological experiments. Improving the accuracy of siRNA efficiency prediction is an urgent problem.
     The H1N1 influenza virus is an RNA virus that is highly infectious, spreads quickly, and poses a serious threat to human health. At present, influenza is mainly prevented and treated by vaccination and medication. Vaccines can only prevent influenza, and only for matched strains; when a new strain breaks out, a corresponding vaccine cannot be obtained in time, and its safety cannot be guaranteed. Anti-influenza drugs are mainly M2 ion channel blockers and neuraminidase inhibitors. The former rapidly give rise to drug-resistant strains, so their clinical application is limited. The latter are expensive and unaffordable for many people, and production capacity is limited, so supply would fall short in a large-scale epidemic; moreover, as these drugs become widely used, drug resistance is steadily spreading, and the drugs have some side effects on the central nervous system and the digestive system. Influenza A virus poses a serious threat to human health, and traditional methods cannot control new influenza viruses promptly and effectively. Researchers should therefore consider the various aspects of the influenza virus infection mechanism and look for effective methods of prevention and treatment.
     Analyzing the influenza A (H1N1) virus with bioinformatics methods and using RNA interference to inhibit expression of viral genes can control the spread of the virus; compared with studying the H1N1 influenza virus by traditional experimental methods, this reduces cost and shortens the research cycle. RNA interference has become an effective tool for inhibiting influenza A virus. Researchers have used traditional siRNA design methods to design siRNAs that target the H1N1 influenza virus and inhibit expression of its genes, with some success. However, current siRNA design methods are mainly based on sequence features and do not consider the influence of target structure on siRNA interference efficacy, so the interference efficacy of the designed siRNAs is low.
     The secondary structure of the target mRNA is related to siRNA inhibitory efficacy, so considering structural features of the target mRNA when designing siRNA may improve accuracy. This study proposes an siRNA design algorithm that combines sequence features and structural features, and applies it to siRNA design for the 2009 H1N1 influenza virus and the 2008 seasonal H1N1 influenza virus.
     Every H1N1 influenza virus strain contains 8 gene segments, namely PB2, PB1, PA, HA, NP, NA, MP, and NS. The HA and NA genes are prone to mutation, while the NP, MP, PA, and PB1 genes are relatively conserved, so the target genes for RNA interference are mainly NP, MP, PA, and PB1. The PA segment has polymerase activity, is involved in the entire process of viral transcription and replication, and plays the role of a kinase or helicase. It is therefore a good target for the prevention and treatment of H1N1 influenza: designing efficient siRNA to inhibit expression of the PA gene can control the spread of the H1N1 influenza virus. In this study, the sequences and structures of the PA segments of the 2009 H1N1 influenza virus and the 2008 seasonal influenza virus are compared and analyzed. Significant differences are found between them, not only in sequence features but also in RNA secondary structure, which lead to different biological properties. This paper proposes an siRNA design algorithm that combines sequence features and structural features. When designing siRNA for the H1N1 influenza virus, it considers both kinds of feature: a structure coefficient evaluates the secondary structure of the target, the better candidate targets are selected, and the corresponding siRNA sequences are then designed from those targets. Using the improved siRNA design algorithm, siRNAs are designed for the 2009 H1N1 influenza virus and the 2008 seasonal H1N1 influenza virus respectively, and a target that differs by only one base between the two viruses is found, which lays a foundation for finding a common target.
     If features closely related to siRNA interference efficacy can be found, prediction accuracy can be improved. This study proposes considering global mRNA features and local features near the siRNA binding site, in addition to siRNA features, when predicting siRNA efficacy. The 20 nucleotides on each side of the binding sequence, together with the 21 nt of the siRNA binding region, 61 nt in all, are termed the neighboring nucleotides. Qualitative analysis shows that the higher the siRNA interference efficacy, the lower the mRNA GC content, mRNA stem ratio, neighboring GC content, and neighboring stem ratio. Qualitative analysis shows only the tendency and cannot quantify it, so linear regression analysis is performed; it finds strong correlations between siRNA inhibitory efficacy and the averages of the mRNA GC content, mRNA stem ratio, neighboring GC content, and neighboring stem ratio, with highly significant P-values. The qualitative and quantitative analyses together show that mRNA GC content and mRNA secondary structure features are strongly correlated with RNA interference efficacy, at both the global mRNA level and the neighboring location. Feature selection shows that some mRNA features and neighboring features are important, and that the important mRNA features considerably outnumber the important siRNA features. Global mRNA features and neighboring local features should therefore be considered when predicting siRNA interference efficacy.
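The global and neighboring features described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis's code: the function names are hypothetical, the secondary structure is assumed to be supplied in dot-bracket notation by a separate folding tool, the stem ratio is assumed to mean the fraction of paired bases (the abstract does not define it), and the window is simply truncated when the binding site lies near an end of the mRNA.

```python
def neighboring_window(length, site_start, site_len=21, flank=20):
    """Return (start, end) of the binding site plus `flank` nt on each side.

    With the defaults this spans 20 + 21 + 20 = 61 nt; it is clipped to the
    sequence bounds (an assumption) when the site is close to either end.
    """
    start = max(0, site_start - flank)
    end = min(length, site_start + site_len + flank)
    return start, end


def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def stem_ratio(dot_bracket):
    """Fraction of paired positions, assuming '(' and ')' mark stem bases."""
    if not dot_bracket:
        return 0.0
    return sum(ch in "()" for ch in dot_bracket) / len(dot_bracket)


def site_features(mrna, structure, site_start):
    """Global and neighboring GC content and stem ratio for one binding site.

    `site_start` is the 0-based index of the first binding-site nucleotide.
    """
    if len(mrna) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    start, end = neighboring_window(len(mrna), site_start)
    return {
        "mrna_gc": gc_content(mrna),
        "mrna_stem_ratio": stem_ratio(structure),
        "neighbor_gc": gc_content(mrna[start:end]),
        "neighbor_stem_ratio": stem_ratio(structure[start:end]),
    }
```

For a site well inside the transcript, `neighboring_window` returns a span of exactly 61 positions, matching the neighboring-nucleotide definition in the text.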
     Based on the above analysis, this study proposes a random-forest-based siRNA efficacy prediction model that uses siRNA features, mRNA features, and features near the siRNA binding site. The correlation coefficient under 10-fold cross-validation increased from 0.63 to 0.7, which confirms that considering global mRNA features and neighboring local features improves accuracy. siRNA design should therefore take into account the influence of global mRNA features and local features near the siRNA binding site on interference efficacy, in addition to siRNA features. The study suggests that, when designing effective siRNA for mammalian targets, mRNAs with lower GC content and fewer stem structures (in other words, more loop structures), both globally and in the local flanking regions of the siRNA binding site, are preferred: mRNA GC content and neighboring GC content below 50% are preferred, and mRNA stem ratio and neighboring stem ratio below 0.6 are preferred. The study provides a new idea for siRNA design and guidance for designing effective siRNA. In addition, the results may help in understanding the binding efficacy between microRNA and mRNA, because siRNA binding to mRNA and microRNA binding to mRNA share some similarities.
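The stated preferences (GC content below 50% and stem ratio below 0.6, both globally and near the binding site) could be applied as a simple screen, sketched below. The dictionary keys and function names are hypothetical, and treating the four preferences as a hard all-or-nothing filter is an assumption; the abstract presents them as preferences, not strict cutoffs.

```python
# Upper limits taken from the abstract; values are fractions, not percentages.
PREFERRED_LIMITS = {
    "mrna_gc": 0.5,
    "neighbor_gc": 0.5,
    "mrna_stem_ratio": 0.6,
    "neighbor_stem_ratio": 0.6,
}


def unmet_preferences(features):
    """Return the names of the features that are not below their limit."""
    return [name for name, limit in PREFERRED_LIMITS.items()
            if features[name] >= limit]


def is_preferred_target(features):
    """True when every feature is strictly below its preferred limit."""
    return not unmet_preferences(features)


candidate = {
    "mrna_gc": 0.44,
    "neighbor_gc": 0.52,
    "mrna_stem_ratio": 0.55,
    "neighbor_stem_ratio": 0.40,
}
print(is_preferred_target(candidate), unmet_preferences(candidate))
```

Returning the list of unmet preferences, and not only a boolean, makes it possible to rank candidates by how many preferences they miss instead of discarding them outright.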
     In summary, this paper makes two contributions:
     1. It proposes a multi-feature fusion algorithm for designing siRNA that targets influenza viruses. According to the theory and practice of pattern recognition, multi-feature fusion is an effective means of improving recognition accuracy. Fusing multiple features (sequence features and secondary structure) when designing siRNA that targets influenza viruses is one way to improve accuracy.
     2. It proposes a random-forest-based siRNA efficacy prediction model that, in addition to siRNA features, considers the influence of global mRNA features and local features near the siRNA binding site on siRNA interference efficacy. The correlation coefficient under 10-fold cross-validation increased from 0.63 to 0.7, which confirms that considering global mRNA features and neighboring local features improves accuracy.
     In future research, we will consider other features related to siRNA inhibitory efficacy, mainly protein features. Protein binding can influence siRNA inhibitory efficacy: if proteins are already bound to the target, the siRNA has difficulty binding to it, which affects its inhibitory efficacy.
