Research on News Video Retrieval Based on Topic Caption Extraction (基于主题字幕提取的新闻视频检索研究)



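Author: Wang Yan (王艳)
Degree level: Master's
Discipline: Systems Engineering
Chinese keywords: 镜头分段 ; 关键帧 ; 主题字幕提取 ; 新闻故事分段 ; 新闻视频检索
English keywords: Shot Subsection ; Key Frame ; Topic Caption Retrieval ; News Story Segmentation ; News Video Retrieval
Degree year: 2008
Advisor: Wang Jianyu (王建宇)
Discipline code: 081103
Degree-granting institution: Nanjing University of Science and Technology (南京理工大学)
Thesis submission date: 2008-06-01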

Abstract

Content-based multimedia information retrieval has become an active research field in recent years, and news video retrieval, as one part of it, has been studied widely. News video is a major medium through which people obtain information, but finding the desired content quickly and accurately in massive collections of news video remains an urgent problem. This thesis takes news video as its research object and covers five parts: shot segmentation, key frame extraction, topic caption extraction, news story segmentation, and news video retrieval. The specific work is as follows:
     (1) Shot segmentation is the prerequisite for all subsequent video processing and analysis. The thesis improves the twin-comparison (dual-threshold) shot segmentation algorithm by applying adaptive thresholds to it, detecting both abrupt cuts and gradual transitions in news video with higher accuracy. Because topic captions reflect the main content of a news item most fully, the thesis also proposes a key frame extraction algorithm based on topic caption frames; the extracted key frames can largely represent the content of a shot.
     (2) Topic caption extraction consists of three parts. First, based on the characteristics of news topic captions, a caption frame detection algorithm is designed that uses the texture features of the 3/10 spatio-temporal slice; it achieves good detection results. Second, a caption region extraction algorithm based on the wavelet transform and a support vector machine (SVM) locates the topic captions accurately. Third, the topic captions are enlarged by interpolation, binarized, and segmented into characters, and the characters are recognized with the Han Wang OCR software, with good recognition results.
     (3) Based on the structure of news programs, a news story segmentation algorithm built on topic caption extraction is designed. On top of the extracted topic captions, silence detection and anchorperson shot detection are used to split continuous news video into individual news stories, and the experimental results are satisfactory. Using the story segmentation results and the topic caption annotations, keyword-based news video retrieval is implemented.
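
As an illustration of item (1), the following is a minimal Python sketch of twin-comparison shot boundary detection with adaptive thresholds. The abstract does not state how the thresholds are adapted, so the rule used here (mean plus a multiple of the standard deviation of the frame differences), the function names, and the parameters `k_high` and `k_low` are assumptions made for illustration and should not be read as the thesis's actual method.

```python
from statistics import mean, pstdev


def adaptive_thresholds(diffs, k_high=3.0, k_low=0.5):
    """Derive a high and a low threshold from the data itself.

    Assumption: threshold = mean + k * standard deviation of the
    frame-to-frame differences. The thesis may use a different rule.
    """
    mu = mean(diffs)
    sigma = pstdev(diffs)
    return mu + k_high * sigma, mu + k_low * sigma


def twin_comparison(diffs, k_high=3.0, k_low=0.5):
    """Detect shot boundaries from frame-to-frame differences.

    diffs[i] is the difference (for example a histogram distance)
    between frame i and frame i + 1.
    Returns a list of (kind, start_frame, end_frame) tuples.
    """
    if not diffs:
        return []
    t_high, t_low = adaptive_thresholds(diffs, k_high, k_low)
    boundaries = []
    i, n = 0, len(diffs)
    while i < n:
        if diffs[i] >= t_high:
            # A single large jump is an abrupt cut.
            boundaries.append(("cut", i, i + 1))
            i += 1
        elif diffs[i] >= t_low:
            # A run of moderate differences may be a gradual transition:
            # accumulate them and compare the total with the high threshold.
            start, accumulated = i, 0.0
            while i < n and t_low <= diffs[i] < t_high:
                accumulated += diffs[i]
                i += 1
            if accumulated >= t_high:
                boundaries.append(("gradual", start, i))
        else:
            i += 1
    return boundaries
```

Called on a list of per-frame difference values, `twin_comparison` returns the detected cuts and gradual transitions as frame index ranges. Computing the thresholds from the statistics of the sequence, rather than fixing them by hand, is one common way to make the dual-threshold test less sensitive to the overall activity level of a particular broadcast.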

