Cross-media Retrieval Method Fusing with Coupled Dictionary Learning and Graph Regularization
  • English Title: Cross-media Retrieval Method Fusing with Coupled Dictionary Learning and Graph Regularization
  • Authors: LIU Yun; YU Zhilou; FU Qiang
  • Affiliations: School of Information Science and Engineering, Shandong Normal University; Inspur Group Co., Ltd.
  • Keywords: cross-media retrieval; feature selection; coupled dictionary learning; graph regularization; feature mapping
  • Journal: Computer Engineering (计算机工程); CNKI journal code: JSJC
  • Publication Date: 2019-06-15
  • Year/Issue: 2019, Vol. 45, No. 501 (Issue 6)
  • Funding: National Natural Science Foundation of China (61373081)
  • Language: Chinese
  • CN: 31-1289/TP
  • Pages: 236-242 (7 pages)
  • CNKI Article ID: JSJC201906037
Abstract
Most cross-media retrieval methods map the original features of two modalities into a common subspace and perform retrieval there, ignoring both the selection of discriminative features and the relationships between modalities. To address this, a cross-modal retrieval method based on coupled dictionary learning and graph regularization is proposed. By coupling and jointly updating the dictionaries of the different modalities, uniform sparse representations are generated for them. These sparse representations are then projected into a common subspace defined by class label information to perform cross-modal matching, while an L2,1-norm penalty is imposed on the projection matrices to select relevant and discriminative features of the feature space. On this basis, a graph regularization term preserves inter-modal and intra-modal similarity relations. Experimental results show that, compared with the Canonical Correlation Analysis (CCA) method, the proposed method achieves higher cross-media retrieval accuracy.
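As a rough, hypothetical sketch (not the authors' implementation), the two penalties named in the abstract can be written down in a few lines of NumPy: the L2,1-norm of a projection matrix, which drives whole rows to zero and thereby drops the corresponding features, and a graph-Laplacian regularizer tr(Fᵀ L F) that keeps similar samples close after projection. All shapes and variable names here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 6, 8, 3                     # samples, sparse-code dim, label-space dim (hypothetical)
S = rng.standard_normal((n, d))       # sparse representations, one row per sample
P = rng.standard_normal((d, c))       # projection into the label-defined common subspace

def l21_norm(P):
    """Sum of the l2 norms of the rows of P. Penalizing it drives entire
    rows to zero, so the corresponding input features are deselected."""
    return np.sqrt((P ** 2).sum(axis=1)).sum()

def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W of a similarity matrix W."""
    return np.diag(W.sum(axis=1)) - W

def graph_reg(F, W):
    """tr(F^T L F): small when samples with large similarity W_ij
    remain close to each other after projection."""
    return np.trace(F.T @ graph_laplacian(W) @ F)

# A symmetric nonnegative similarity graph over the n samples.
W = np.abs(rng.standard_normal((n, n)))
W = (W + W.T) / 2
np.fill_diagonal(W, 0.0)

F = S @ P                             # samples mapped into the common subspace
penalty = l21_norm(P) + graph_reg(F, W)
```

The identity tr(Fᵀ L F) = ½ Σᵢⱼ Wᵢⱼ ‖fᵢ − fⱼ‖² is what makes the Laplacian term a similarity-preserving penalty: a large Wᵢⱼ forces the projected points fᵢ and fⱼ together.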
References
    [1] WANG Kaiye,YIN Qiyue,WANG Wei,et al.A comprehensive survey on cross-modal retrieval[EB/OL].[2018-08-02].https://arxiv.org/pdf/1607.06215.pdf.
    [2] HARDOON D R,SZEDMAK S,SHAWE-TAYLOR J.Canonical correlation analysis:an overview with application to learning methods[J].Neural Computation,2004,16(12):2639-2664.
    [3] ROSIPAL R,KRÄMER N.Overview and recent advances in partial least squares[C]//Proceedings of International Conference on Subspace,Latent Structure and Feature Selection.Berlin,Germany:Springer,2005:34-51.
    [4] TENENBAUM J B,FREEMAN W T.Separating style and content with bilinear models[J].Neural Computation,2000,12(6):1247-1283.
    [5] MAHADEVAN V,PEREIRA J C,VASCONCELOSN,et al.Maximum covariance unfolding:manifold learning for bimodal data[EB/OL].[2018-08-02].http://www.svcl.ucsd.edu/publications/conference/2011/nips2011/mcu.pdf.
    [6] MAO Xiangbo,LIN Binbin,CAI Deng,et al.Parallel field alignment for cross media retrieval[C]//Proceedings of ACM International Conference on Multimedia.New York,USA:ACM Press,2013:897-906.
    [7] LIN Dahua,TANG Xiaoou.Inter-modality face recognition[C]//Proceedings of the 9th European Conference on Computer Vision.Berlin,Germany:Springer,2006:13-26.
    [8] SHARMA A.Generalized multiview analysis:a discriminative latent space[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C.,USA:IEEE Computer Society,2012:2160-2167.
    [9] 陈祥,于治楼.基于不同模态语义匹配的跨媒体检索[J].山东师范大学学报(自然科学版),2017,32(3):9-15.
    [10] ZHAI Xiaohua,PENG Yuxin,XIAO Jianguo.Learning cross-media joint representation with sparse and semisupervised regularization[J].IEEE Transactions on Circuits and Systems for Video Technology,2014,24(6):965-978.
    [11] XU Xing,SHIMADA A,TANIGUCHI R,et al.Coupled dictionary learning and feature mapping for cross-modal retrieval[C]//Proceedings of International Conference on Multimedia and Expo.Washington D.C.,USA:IEEE Press,2015:1-6.
    [12] ZHUANG Yueting,WANG Yanfei,WU Fei,et al.Supervised coupled dictionary learning with group structures for multi-modal retrieval [C]//Proceedings of the 27th AAAI Conference on Artificial Intelligence.[S.l.]:AAAI Press,2013:1070-1076.
    [13] ZHAI Xiaohua,PENG Yuxin,XIAO Jianguo.Heterogeneous metric learning with joint graph regularization for cross-media retrieval [C]//Proceedings of the 27th AAAI Conference on Artificial Intelligence.[S.l.]:AAAI Press,2013:1198-1204.
    [14] NGIAM J,KHOSLA A,KIM M,et al.Multimodal deep learning[EB/OL].[2018-08-02].https://www.mendeley.com/catalogue/multimodal-deep-learning/.
    [15] SRIVASTAVA N,SALAKHUTDINOV R.Multimodal learning with deep Boltzmann machines[J].Journal of Machine Learning Research,2014,15:2949-2980.
    [16] ANDREW G,ARORA R,BILMES J,et al.Deep canonical correlation analysis[C]//Proceedings of the 30th International Conference on International Conference on Machine Learning.[S.l.]:JMLR.org,2013:1247-1255.
    [17] CAI Deng,HE Xiaofei,HAN Jiawei.Spectral regression for efficient regularized subspace learning[C]//Proceedings of the 11th International Conference on Computer Vision.Washington D.C.,USA:IEEE Press,2007:1-8.
    [18] ZHOU Jile,DING Guiguang,GUO Yuchen.Latent semantic sparse hashing for cross-modal similarity search[C]//Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval.New York,USA:ACM Press,2014:415-424.
    [19] YU Zhou,WU Fei,YANG Yi,et al.Discriminative coupled dictionary hashing for fast cross-media retrieval[C]//Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval.New York,USA:ACM Press,2014:395-404.
    [20] HE Ran,TAN Tieniu,WANG Liang,et al.L2,1 regularized correntropy for robust feature selection[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.Washington D.C.,USA:IEEE Computer Society,2012:2504-2511.
    [21] NIKOLOVA M,NG M K.Analysis of half-quadratic minimization methods for signal and image recovery[J].SIAM Journal on Scientific Computing,2005,27(3):937-966.
    [22] HUANG Dean,WANG Y F.Coupled dictionary and feature space learning with applications to cross-domain image synthesis and recognition[C]//Proceedings of IEEE International Conference on Computer Vision.Washington D.C.,USA:IEEE Computer Society,2013:2496-2503.
    [23] LEE H,BATTLE A,RAINA R,et al.Efficient sparse coding algorithms[C]//Proceedings of the 19th International Conference on Neural Information Processing Systems.Cambridge,USA:MIT Press,2006:801-808.
    [24] MAIRAL J,BACH F,PONCE J,et al.Online dictionary learning for sparse coding[C]//Proceedings of the 26th Annual International Conference on Machine Learning.New York,USA:ACM Press,2009:689-696.
    [25] RASIWASIA N,PEREIRA J C,COVIELLO E,et al.A new approach to cross-modal multimedia retrieval[C]//Proceedings of the 18th ACM International Conference on Multimedia.New York,USA:ACM Press,2010:251-260.
    [26] HWANG S J,GRAUMAN K.Reading between the lines:object localization using implicit cues from image tags[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2012,34(6):1145-1158.
    [27] GONG Yunchao,KE Qifa,ISARD M,et al.A multi-view embedding space for modeling internet images,tags,and their semantics[J].International Journal of Computer Vision,2014,106(2):210-233.
    [28] WANG Kaiye,HE Ran,WANG Wei,et al.Learning coupled feature spaces for cross-modal matching[C]//Proceedings of International Conference on Computer Vision.Washington D.C.,USA:IEEE Computer Society,2013:2088-2095.
    [29] LOWE D G.Distinctive image features from scale-invariant keypoints[J].International Journal of Computer Vision,2004,60(2):91-110.
