Research and Implementation of Content-Based Video Abstraction System

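English title: Research and Implementation of Content-Based Video Abstraction System
Author: 谢林江 (Xie Linjiang)
Thesis level: Master's
Discipline: Electronics and Communication Engineering (professional degree)
Keywords: Video abstraction; Shot boundary detection (SBD); Shot clustering; Face detection; Shot important degree model
Degree year: 2013
Supervisor: 包秀国 (Bao Xiuguo)
Discipline code: 0852
Degree-granting institution: Beijing University of Posts and Telecommunications
Thesis submission date: 2012-12-03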

Abstract

Network technology is advancing rapidly, and multimedia technology is being applied in an ever wider range of fields. How to browse video content quickly, how to express the main content of a video concisely, and how to retrieve material quickly from large video libraries have become active research topics. Video abstraction emerged against this background as an effective way to address these problems. A review of recent Chinese and international literature on video abstraction shows that both the perspective a summary takes and the methods it uses affect the results a video abstraction system produces. This thesis studies and implements a content-based video abstraction system.
To address problems encountered in shot boundary detection (SBD), this thesis proposes an SBD algorithm based on matching local image feature points. The algorithm detects local image features with the SURF detector and performs detection in two stages, coarse and fine. Coarse detection proposes intervals that may contain a boundary; fine detection decides whether a proposed interval actually contains a shot boundary and, if so, locates it. In a series of test experiments the algorithm showed good stability and accuracy. For shot clustering, the thesis applies the affinity propagation clustering algorithm, and experimental results show the method to be effective and reasonable. In designing the shot importance model, the thesis proposes a face detection algorithm based on multi-feature fusion to handle images with complex backgrounds; comparative tests and online detection experiments show that this algorithm is stable and accurate. A video abstract generation method is then designed on the basis of the shot importance model. Finally, drawing on these results, the thesis designs and implements a content-based video abstraction system. The system is implemented in C++ using the OpenCV open-source library and other related libraries, and has some practical application value.
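
The two-stage idea behind the shot boundary detector can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the stage details, so the fixed stride `step`, the single `threshold`, and the `similarity(i, j)` callback are all assumptions made for illustration. In the thesis the frame similarity would come from SURF feature point matching; here it is a hypothetical function supplied by the caller, and the example is in Python rather than the C++ used by the actual system.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical frame similarity: returns a score in [0, 1] for frames i and j.
# In a real system this could be the fraction of matched local feature points.
Similarity = Callable[[int, int], float]


def coarse_candidates(n_frames: int, similarity: Similarity,
                      step: int, threshold: float) -> List[Tuple[int, int]]:
    """Coarse stage: compare frames `step` apart; keep intervals that look dissimilar."""
    candidates = []
    start = 0
    while start < n_frames - 1:
        end = min(start + step, n_frames - 1)
        if similarity(start, end) < threshold:
            candidates.append((start, end))
        start = end
    return candidates


def fine_boundary(interval: Tuple[int, int], similarity: Similarity,
                  threshold: float) -> Optional[int]:
    """Fine stage: compare adjacent frames inside the interval.

    Returns the index of the first frame of the new shot, or None when no
    adjacent pair falls below the threshold.
    """
    start, end = interval
    best_pos, best_sim = None, threshold
    for i in range(start, end):
        s = similarity(i, i + 1)
        if s < best_sim:  # keep the least similar adjacent pair
            best_pos, best_sim = i + 1, s
    return best_pos


def detect_shot_boundaries(n_frames: int, similarity: Similarity,
                           step: int = 5, threshold: float = 0.5) -> List[int]:
    """Run the coarse stage, then confirm and locate boundaries with the fine stage."""
    boundaries = []
    for interval in coarse_candidates(n_frames, similarity, step, threshold):
        pos = fine_boundary(interval, similarity, threshold)
        if pos is not None:
            boundaries.append(pos)
    return boundaries


# Toy input: 20 frames labelled by shot; frames of the same shot are "similar".
shot_of_frame = [0] * 7 + [1] * 6 + [2] * 7
toy_similarity = lambda i, j: 1.0 if shot_of_frame[i] == shot_of_frame[j] else 0.0
print(detect_shot_boundaries(len(shot_of_frame), toy_similarity))  # -> [7, 13]
```

The coarse pass evaluates roughly one frame pair per `step` frames, so the more expensive adjacent-frame comparison only runs inside the few intervals it flags. This sketch handles abrupt cuts only; a stride-based coarse pass like this one can miss a very short shot that starts and ends within a single stride, and gradual transitions would need a different fine-stage test.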
