Research on Feature Representation Models and Multi-Source Information Fusion Methods in Image Retrieval
Abstract
Multimedia information plays an increasingly important role in people's daily lives. Consequently, how to retrieve the desired images quickly and effectively from vast multimedia databases has become a meaningful and challenging research topic. This thesis studies feature representation models and methods for fusing multiple information sources in image retrieval.
     In general, image retrieval can be divided into two categories: text-based image retrieval (TBIR) and content-based image retrieval (CBIR). Most early systems were text-based: they searched for relevant images by simply matching the keywords entered by users against the textual descriptions of the images in the database. This kind of method requires every image to be annotated with text information beforehand, which remains a difficult open problem, and the quality of the annotations has a direct impact on the accuracy of subsequent retrieval. Content-based image retrieval was developed later; it starts from the images themselves, directly extracting their low-level visual features and then indexing and retrieving images based on these features.
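     To make the contrast concrete, the following minimal sketch (in Python, assuming NumPy and scikit-learn are available; the captions, features, and query are hypothetical toy data) ranks a small collection first by matching a keyword query against textual annotations, as in TBIR, and then by distance between visual feature vectors, as in CBIR.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical annotations and pre-extracted visual features for four images.
captions = ["red rose in a garden", "city skyline at night",
            "rose bouquet on a table", "mountain lake at sunrise"]
visual_feats = np.random.rand(4, 128)   # stand-in for e.g. colour/texture vectors

# --- TBIR: match the query text against the image annotations ---
vec = TfidfVectorizer()
doc_matrix = vec.fit_transform(captions)
query_matrix = vec.transform(["rose"])
tbir_ranking = cosine_similarity(query_matrix, doc_matrix).ravel().argsort()[::-1]

# --- CBIR: rank by distance between visual feature vectors ---
query_feat = visual_feats[0]            # use image 0 as the visual query
dists = np.linalg.norm(visual_feats - query_feat, axis=1)
cbir_ranking = dists.argsort()          # nearest images first

print("TBIR order:", tbir_ranking, "CBIR order:", cbir_ranking)
```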
     This thesis first presents a detailed study of feature representation models for image retrieval. Image feature extraction is a crucial step of image retrieval and strongly affects all subsequent procedures; how to extract features with strong representational power from raw images remains an active research topic. Image features can be categorized into low-level visual features and high-level semantic features. Due to technical limitations, image retrieval usually relies on low-level visual features to approximate high-level semantic concepts. Low-level visual features, in turn, include global features and local features. Compared with global features, most local features are invariant to scale, rotation, translation, affine transformation, and illumination changes, and therefore tend to yield more accurate retrieval results. Among local features, SIFT (Scale Invariant Feature Transform) is widely used for image retrieval owing to its excellent performance and fast detection; in particular, combining SIFT with the TF-IDF (term frequency-inverse document frequency) inverted-file technique to form the classic bag-of-words (BoW) model has become the mainstream approach. However, the basic BoW model ignores the spatial information of visual words and captures only limited semantic relationships among them. To address these limitations, we first explore the spatial and semantic relationships among visual words and propose a novel two-level image representation, the bag-of-phrases (BoP) model, which describes images at both the word level and the phrase level. The BoP model enhances the spatial and semantic discriminative power of image features and also helps suppress background clutter.
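     The paragraph above can be made concrete with a small sketch of a TF-IDF-weighted bag-of-visual-words pipeline. This is only an illustration of the general scheme, assuming OpenCV (with SIFT support) and scikit-learn are installed; the vocabulary size, helper names, and image paths are placeholders, not the thesis's own implementation.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfTransformer

def sift_descriptors(paths):
    """Extract SIFT descriptors for each image (requires OpenCV >= 4.4)."""
    sift = cv2.SIFT_create()
    per_image = []
    for p in paths:
        img = cv2.imread(p, cv2.IMREAD_GRAYSCALE)
        _, desc = sift.detectAndCompute(img, None)
        per_image.append(desc if desc is not None else np.empty((0, 128)))
    return per_image

def bow_tfidf(per_image, vocab_size=1000):
    """Quantize descriptors into visual words and weight the histograms with TF-IDF."""
    vocab = KMeans(n_clusters=vocab_size, n_init=10).fit(np.vstack(per_image))
    hists = np.zeros((len(per_image), vocab_size))
    for i, desc in enumerate(per_image):
        if len(desc):
            words = vocab.predict(desc)
            hists[i] = np.bincount(words, minlength=vocab_size)
    return TfidfTransformer().fit_transform(hists)   # sparse TF-IDF BoW vectors

# Hypothetical usage:
# vectors = bow_tfidf(sift_descriptors(["img1.jpg", "img2.jpg"]), vocab_size=500)
```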
     Another focus of this research is the fusion of multiple information sources in image retrieval, i.e., how to combine other information sources available on the web and related to an image, such as text, video, and audio, to improve retrieval performance. Web image clustering/categorization is a crucial step of image retrieval and has an important impact on the accuracy and performance of the subsequent retrieval. We therefore first provide a comparative experimental study of five classic and widely accepted clustering/classification methods, including two single-modal methods (text-based and image-based) and three multi-view learning methods (feature-level integration, semantic-level integration, and kernel-level integration). From the comparison we observe that the single-modal methods achieve relatively low accuracy because of the limited information available to them; once the text and image sources are integrated by multi-view learning, performance improves dramatically. However, these three kinds of multi-view learning methods process each information source separately and then combine them at the feature, semantic, or kernel level, and all of them ignore the correlation and interaction between the sources. In our research we therefore explore the feasibility of using text information as guidance for image categorization and propose two novel multi-view methods, Dynamic Weighting and Region-based Semantic Concept Integration, which fuse data from multiple sources effectively and outperform the five existing methods above. To enable these two methods to handle large-scale datasets, we further propose a multimedia information fusion framework that integrates them seamlessly by analyzing the special characteristics of different web images. The framework can choose a suitable classification model for each test image according to its characteristics, achieves high classification performance with relatively little computation time on large-scale data, and also addresses the common practical problem that the textual descriptions of some web images are missing.
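     For illustration, the sketch below contrasts the three generic fusion levels discussed above: feature-level fusion (concatenating the views), semantic-level fusion (averaging the per-class posteriors of separate classifiers), and kernel-level fusion (a weighted sum of per-view kernels fed to a precomputed-kernel SVM). It assumes scikit-learn and uses random placeholder matrices `X_text` and `X_img`; it does not implement the thesis's Dynamic Weighting or Region-based Semantic Concept Integration methods.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_text = rng.random((60, 50))          # placeholder text features (e.g. TF-IDF)
X_img = rng.random((60, 128))          # placeholder visual features (e.g. BoW)
y = rng.integers(0, 2, 60)             # placeholder binary labels

# Feature-level fusion: concatenate the two views and train a single classifier.
clf_feat = LogisticRegression(max_iter=1000).fit(np.hstack([X_text, X_img]), y)

# Semantic-level (late) fusion: one classifier per view, then combine posteriors.
p_text = LogisticRegression(max_iter=1000).fit(X_text, y).predict_proba(X_text)
p_img = LogisticRegression(max_iter=1000).fit(X_img, y).predict_proba(X_img)
late_pred = (0.5 * p_text + 0.5 * p_img).argmax(axis=1)

# Kernel-level fusion: weighted sum of per-view kernels for a precomputed-kernel SVM.
K = 0.6 * rbf_kernel(X_text) + 0.4 * rbf_kernel(X_img)
clf_kernel = SVC(kernel="precomputed").fit(K, y)
kernel_pred = clf_kernel.predict(K)    # at test time, use K(test, train) instead
```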
     Beyond the interaction between images and texts, we further investigate the semantic-level correlation between web images and their corresponding textual descriptions, and use this correlation to enrich the feature space in which supervised classification is performed; we refer to this process as transfer learning. The proposed cross-domain transfer learning method can exploit web multimedia objects without true class labels to accomplish supervised classification tasks. Experimental results show that, by transferring such correlation knowledge, the method not only fuses multiple related web information sources effectively to handle large-scale web data, but also copes with the situation in which one information source is missing for a small portion of the multimedia objects.
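     As a loose illustration of enriching an image feature space with text-derived semantics, the sketch below learns a linear mapping from visual features to tag features on auxiliary image-text pairs and appends the predicted text features to target images whose text view is missing. The ridge-regression mapping and all array names are illustrative assumptions, not the correlation-transfer method actually proposed in the thesis.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
# Auxiliary web multimedia objects with both views (no class labels needed).
V_aux = rng.random((200, 128))               # visual features
T_aux = rng.random((200, 40))                # co-occurring text/tag features

# Learn a visual -> text mapping that captures the cross-domain correlation.
mapper = Ridge(alpha=1.0).fit(V_aux, T_aux)

# Target images: labelled for classification, but their text view is missing.
V_tgt = rng.random((80, 128))
y_tgt = rng.integers(0, 2, 80)

# Enrich the feature space with the predicted (transferred) text features.
X_enriched = np.hstack([V_tgt, mapper.predict(V_tgt)])
clf = LinearSVC(max_iter=5000).fit(X_enriched, y_tgt)
```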
