Research on Methods for Predicting Protein-Protein Interactions and Hot Spot Residues at Their Binding Interfaces
Abstract
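With the successful completion of sequencing projects for the human genome and the genomes of other species, biological research has moved from the genomic era into the post-genomic era. Proteomics, which has developed around the study of protein-protein interactions, is one of the major research areas of the post-genomic era and has become a hot topic and frontier of today's life sciences. Studying the interactions among all proteins in a cell (the interactome) and analyzing the composition and modes of action of protein complexes are essential to understanding the complex operating mechanisms of living organisms.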
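     Over the past few years, researchers have proposed many bioinformatics methods for studying protein-protein interactions from a computational perspective. Among them, prediction methods based on protein sequences have attracted great attention. These methods require no prior knowledge and can be applied broadly to the study of protein-protein interactions. Moreover, protein sequences are determined far faster than protein structures can be resolved experimentally. Using protein sequence information to predict interactions between proteins is therefore a highly attractive computational approach. Starting from protein sequences, this thesis uses machine learning methods such as support vector machines and ensemble learning to predict protein-protein interactions. In addition, we study hot spot residues, which play a key role in preserving protein function and maintaining the structural stability of protein complexes. The main work of the thesis is summarized as follows: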
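     1. We propose a method for predicting protein-protein interactions based on autocorrelation descriptors of amino acid sequences and rotation forest. The autocorrelation descriptor characterizes the interaction between two residues separated by a given distance along the protein sequence, so this encoding takes the neighborhood environment of each amino acid into account and may reveal patterns across the whole sequence that are related to protein-protein interactions. We first convert the amino acid symbol sequence into numerical sequences of physicochemical properties, and then use the autocorrelation descriptor to convert these numerical sequences of unequal length into a set of vectors of equal length. Finally, we apply rotation forest to predict protein-protein interactions. Rotation forest is a recently developed ensemble learning algorithm that can improve both the accuracy and the diversity of the individual classifiers in an ensemble. Experimental results show that our method predicts protein-protein interactions effectively and achieves good recognition performance on both the yeast and Helicobacter pylori datasets.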
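     2. We propose a method for predicting protein-protein interactions based on segmented local descriptors of amino acid sequences and support vector machines. An important feature of protein-protein interactions is that they often occur in discontinuous regions of the sequence, where residues that are far apart in the sequence are brought close together in space by protein folding. The segmented local descriptor takes into account the interactions between such sequence-distant residues. We first divide the protein sequence into ten local segments of varying length and composition, and then encode each local segment with the local descriptor. The method can therefore capture multiple overlapping continuous and discontinuous binding patterns in the sequence. Experiments with a support vector machine model built on this encoding show that our method effectively improves protein-protein interaction prediction.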
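     3. We construct a meta-learning model to predict protein-protein interactions. Building on the two feature encoding methods proposed above, we selected four further well-performing encoding methods from the published literature. Combining these different feature encodings with support vector machines, we built six sequence-based individual classifiers for interaction prediction. On top of these strong individual classifiers, we constructed a meta-learning-based ensemble system for protein-protein interaction prediction. The results show that the meta-learning model yields a substantial improvement in prediction performance. Our model also performs well on cross-species datasets.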
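     4. We propose a method for predicting hot spot residues at interaction interfaces based on amino acid solvent accessibility and protrusion index. When computational methods are applied to hot spot residues at protein-protein interfaces, choosing effective biological features is the key problem. Starting from protein sequence and structure, we first extracted a range of biological features potentially related to hot spot residues. Through feature selection, we then built nine support vector machine classification models, each based on a single feature. Finally, to further improve the accuracy of hot spot prediction, we used simple majority voting to combine the outputs of these nine models. Our study shows that the solvent accessibility and protrusion index of amino acid residues are the main discriminative features for hot spot prediction. This is the first time the protrusion index of amino acid residues has been used to predict hot spot residues. Experimental results confirm that our method classifies hot spot residues more effectively and significantly improves prediction accuracy.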
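The majority-voting step in item 4 can be illustrated with a minimal sketch. It is not the thesis's implementation: the single-feature SVMs are replaced by hypothetical threshold classifiers, and the feature names (`relative_asa`, `protrusion_index`) and threshold values are assumptions chosen only to make the example runnable.

```python
def majority_vote(votes):
    """Return 1 (hot spot) if more than half of the binary votes are 1, else 0."""
    if not votes:
        raise ValueError("at least one vote is required")
    positive = sum(1 for v in votes if v == 1)
    return 1 if 2 * positive > len(votes) else 0


def make_threshold_classifier(feature_name, threshold):
    """Hypothetical stand-in for a single-feature SVM: thresholds one feature."""
    def classify(features):
        return 1 if features[feature_name] >= threshold else 0
    return classify


def ensemble_predict(classifiers, features):
    """Collect one vote per classifier and combine them by simple majority."""
    votes = [clf(features) for clf in classifiers]
    return majority_vote(votes)


# Three illustrative classifiers; the thesis combines nine single-feature models.
classifiers = [
    make_threshold_classifier("relative_asa", 0.2),      # assumed threshold
    make_threshold_classifier("protrusion_index", 0.5),  # assumed threshold
    make_threshold_classifier("relative_asa", 0.6),      # assumed threshold
]

residue = {"relative_asa": 0.4, "protrusion_index": 0.7}
label = ensemble_predict(classifiers, residue)  # two of three vote 1, so label is 1
```

With an odd number of binary classifiers, as in the nine-model ensemble described above, a strict majority always exists, so no tie-breaking rule is needed.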
With the completion of genome sequencing for human and other species, biological research has gradually moved from the genomic era into the post-genomic era. As one of the most important fields of the post-genomic era, proteomics, which developed around the study of all possible protein-protein interactions (PPIs) in a cell, has become a hot topic and frontier of the life sciences. Studying PPIs helps us understand the essential mechanisms of life processes.
     So far, a number of computational methods have been explored for the large-scale prediction of PPIs. Among these, protein sequence-based prediction methods have attracted much attention. Their accuracy and reliability do not depend on prior information about the protein pairs. Because three-dimensional protein structures are available only in limited numbers while the number of known protein sequences is growing rapidly, approaches that use amino acid sequence information alone to guide the discovery of PPIs are of particular interest. This thesis therefore applies machine learning techniques such as the support vector machine (SVM) and multiple classifier systems to predict PPIs from sequences. In addition, we introduce an ensemble learning method with SVMs to predict hot spot residues, which are crucial for preserving protein function and maintaining the stability of protein association. The main contributions of this thesis are summarized as follows:
     1. A sequence-based approach was proposed to predict PPIs by combining a new feature representation, the autocorrelation descriptor, with rotation forest. The autocorrelation descriptor accounts for the interactions between amino acid residues a certain distance apart in the sequence, so it takes the effect of each amino acid's local environment into account and makes it possible to discover patterns that run through entire sequences. The amino acid sequences were first translated into numerical values representing six physicochemical properties, and these numerical sequences were then converted into a series of fixed-length vectors by the autocorrelation descriptor. Finally, the rotation forest was constructed using these vectors as input. Rotation forest is a recently proposed robust ensemble method that can enhance both the accuracy and the diversity of the individual classifiers in the ensemble. Experimental results on Saccharomyces cerevisiae and Helicobacter pylori datasets show that the proposed approach outperforms previously published methods, which demonstrates its effectiveness and efficiency.
     2. A method based on a novel local protein sequence descriptor and SVM was presented to infer PPIs. One particular feature of protein interactions is that they usually occur in discontinuous regions of the protein sequence, where distant residues are brought into spatial proximity by protein folding. In this study, a novel local protein sequence descriptor was used to incorporate information on interactions between amino acids that are distant in the sequence. A protein sequence was characterized by ten local descriptors of varying length and composition. The method is therefore capable of capturing multiple overlapping continuous and discontinuous binding patterns within a protein sequence. As expected, the experimental results show that our SVM-based predictive model with this encoding scheme is an important complementary method for PPI prediction.
     3. A public meta predictor was constructed to infer PPIs using only protein sequence information. Besides the two feature representation methods described above (i.e. the autocorrelation descriptor and the local descriptor), four additional methods were selected according to their prediction accuracy in previous studies. We then built six sequence-based individual classifiers by combining the different feature representation methods with SVMs. Finally, we adopted another SVM as the meta predictor to integrate the prediction decision values of these component predictors. The results demonstrate that our meta predictor is promising. In addition, we used the final prediction model trained on the S. cerevisiae PPI dataset to predict interactions in other species. The results reveal that the meta model is also capable of cross-species prediction.
     4. A feature-based method that combines protrusion index with solvent accessibility was presented for accurate prediction of hot spots in protein interfaces. The biological properties responsible for hot spots are not yet fully understood, and the features previously identified as correlated with hot spots are still insufficient. We first extracted a wide variety of features from a combination of protein sequence and structure information. We then performed feature selection to remove noisy and irrelevant features, which improved the performance of the classifier. After extensive feature selection, nine individual-feature-based predictors were developed to identify hot spots using SVMs. Finally, we employed an ensemble classifier approach, which further improved the prediction accuracy for hot spots. To demonstrate its effectiveness, the proposed method was applied to two benchmark datasets. Empirical studies show that our method yields significantly better prediction accuracy than previously published methods.
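The encoding step in item 1, turning variable-length sequences into fixed-length vectors, can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: the abstract gives no formula, so the sketch uses a simple auto-covariance form, and it uses the Kyte-Doolittle hydropathy scale as a stand-in for the six physicochemical properties, which the abstract does not list. The function names and the `max_lag` parameter are hypothetical.

```python
from statistics import mean

# Kyte-Doolittle hydropathy scale, used only as a stand-in property here;
# the six properties used in the thesis are not named in the abstract.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocovariance_descriptor(sequence, scale=HYDROPATHY, max_lag=5):
    """Encode a sequence as max_lag auto-covariance values of one property.

    For each lag d, average the product of mean-centered property values of
    residues d positions apart. The output length depends only on max_lag,
    not on the sequence length.
    """
    values = [scale[aa] for aa in sequence]
    n = len(values)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    mu = mean(values)
    centered = [v - mu for v in values]
    features = []
    for lag in range(1, max_lag + 1):
        total = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        features.append(total / (n - lag))
    return features


def encode_pair(seq_a, seq_b, max_lag=5):
    """Concatenate the descriptors of two proteins into one pair vector."""
    return (autocovariance_descriptor(seq_a, max_lag=max_lag)
            + autocovariance_descriptor(seq_b, max_lag=max_lag))


short_vec = autocovariance_descriptor("MKTAYIAKQR")
long_vec = autocovariance_descriptor("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
pair_vec = encode_pair("MKTAYIAKQR", "GSHMLEDPVAGK")
```

Both `short_vec` and `long_vec` have `max_lag` entries even though the input sequences differ in length, which is the property that lets a classifier such as rotation forest take them as input. Repeating the computation for each of several property scales and concatenating the results would extend the sketch to more than one property.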
