Visual concept detection of web images based on group sparse ensemble learning

Authors: Yongqing Sun ; Kyoko Sudo ; Yukinobu Taniguchi
Keywords: Ensemble learning ; Visual concept detection ; Semantic indexing ; Web image mining ; Sparse representation ; Dictionary learning
Journal: Multimedia Tools and Applications
Publication year: 2016
Publication date: February 2016
Volume: 75
Issue: 3
Pages: 1409-1425
Full-text size: 1,762 KB
Affiliations: Yongqing Sun (1)
Kyoko Sudo (1)
Yukinobu Taniguchi (1)

1. NTT Media Intelligence Laboratories, 1-1 Hikarinooka Yokosuka-Shi, Kanagawa, 239-0847, Japan
Journal category: Computer Science
Journal subjects: Multimedia Information Systems
Computer Communication Networks
Data Structures, Cryptology and Information Theory
Special Purpose and Application-Based Systems
Publisher: Springer Netherlands
ISSN: 1573-7721

Abstract

Visual concept detection involves large intra-class variation, so concept learning needs large-scale training data that covers as wide a variety of samples as possible. This raises two challenges: how to collect such data and how to train on it. In this paper, we propose a novel web image sampling approach and a novel group sparse ensemble learning approach to address these two problems, respectively. For data collection, to reduce manual labeling effort, we propose a web image sampling approach based on dictionary coherence that selects coherent positive samples from web images. We measure coherence in terms of how dictionary atoms are shared, because shared atoms represent features common to a given concept and are robust to occlusion and corruption. For efficient training on large-scale data, to exploit the hidden group structure of the data, we propose a novel group sparse ensemble learning approach based on Automatic Group Sparse Coding (AutoGSC). After AutoGSC, we present an algorithm that uses the reconstruction errors of data instances to compute the ensemble gating function for ensemble construction and fusion. Experiments show that the proposed methods achieve promising results and outperform existing approaches.
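
The abstract does not give the form of the gating function, so the following is only a rough sketch of how reconstruction errors could drive ensemble fusion. It assumes one dictionary per discovered group, replaces the sparse coding step with ordinary least squares for brevity, and uses a softmax over negative errors as the gate. The names `gating_weights`, `fuse_scores`, and `temperature` are hypothetical and are not taken from the paper.

```python
import numpy as np


def gating_weights(x, dictionaries, temperature=1.0):
    """Return (weights, errors) for one instance x.

    Each dictionary is a (dim, n_atoms) array whose columns are atoms.
    Assumption: least squares stands in for the sparse coding step, and
    the gate is a softmax over negative reconstruction errors.
    """
    errors = []
    for D in dictionaries:
        # Code of x on this group's dictionary (least-squares stand-in).
        code, *_ = np.linalg.lstsq(D, x, rcond=None)
        # Squared reconstruction error of x under this group.
        errors.append(float(np.sum((x - D @ code) ** 2)))
    errors = np.asarray(errors)
    # Subtract the minimum error for numerical stability before exponentiating.
    logits = -(errors - errors.min()) / temperature
    weights = np.exp(logits)
    return weights / weights.sum(), errors


def fuse_scores(weights, scores):
    """Weighted sum of the per-group classifier scores."""
    return float(np.dot(weights, scores))
```

With this choice, a group whose dictionary reconstructs an instance well receives a larger weight, so its classifier dominates the fused score. Lowering `temperature` makes the gate closer to picking the single best-fitting group.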
