Research on Scene Classification and Object Detection Based on Multiple Color Spaces and Statistical Histograms
Abstract
With the rapid development of computer and communication technology, multimedia technology is advancing by the day, and the content of online entertainment has gradually shifted from mainly text and images to video. While the Internet offers people a rich variety of video programs, it also facilitates the spread of objectionable videos containing pornography, gore, and violence. Building a wholesome, open learning platform for young people has become a focal concern of society. Current objectionable-content detection technology can filter URLs, images, and text, but detection for video and audio remains immature. Objectionable-video detection is a challenging topic that draws on knowledge from multiple disciplines and fields, and detecting such videos effectively and quickly has become an urgent problem.
     The typical-scene classification and object detection studied in this thesis are foundational tasks in objectionable-video detection. Objectionable videos usually occur in specific scenes and are composed of related shots showing different objects or different viewpoints of the same objects. Scene classification helps in understanding video content and makes content analysis more targeted. Accurate classification makes it easy to determine the type of setting in which an event occurs, which in turn guides adjustment of the video's sensitivity level; indoor scenes in particular require special attention. Detecting objects entering and leaving a scene helps analyze correlations among the statistics of the shots within the same scene. Our research group has already achieved good results in shot segmentation and video style classification. Shot segmentation and scene segmentation are the foundation of video analysis, and shot-segmentation accuracy directly affects typical-scene classification accuracy. Video style classification judges a video's overall color style, which makes targeted adjustment of, for example, the skin-color model convenient.
     This thesis focuses on several foundational problems in objectionable-video detection. The main research contents are as follows:
     1. Improving the multi-color-space comprehensive video analysis platform. The platform can display an opened video and, for any chosen color-space component, display and compute in real time each frame's single-frame histogram, difference histogram, scene-average histogram, and so on. The scene-average histogram is mainly used for scene classification: the scene-classification module extracts its peak parameters as features to classify scenes. The difference histogram is mainly used for object detection: the object-detection module computes the difference between the histograms of adjacent frames (or frames several apart) and applies a difference threshold to detect objects. The platform can also be used for shot-change detection, video style classification, and selection of effective color components.
     2. Classifying typical video scenes on the multi-color-space comprehensive analysis platform. A typical scene often contains multiple shots, and these shots usually cover every aspect of the scene. We therefore propose a new histogram, obtained by accumulating a chosen color histogram over all frames of a video scene; it is very stable and largely reflects the distinctive character of that typical scene, while the same histogram usually differs across scenes. For convenience, the accumulated histogram is averaged, and we call the result the scene-average histogram; it describes a scene simply and effectively. This thesis improves the multi-peak parameter extraction method for histograms and uses the associated classification rules to classify outdoor scenes and to describe the style of indoor scenes, with good results.
     3. Detecting objects using the multi-color-space comprehensive platform and inter-frame difference histograms. Scenes in a video usually change slowly, whereas objects change frequently. In histogram terms, when no object enters or leaves the scene, the histograms of two adjacent frames change little; when an object enters or leaves, they change markedly. Exploiting the superposition relationship between histograms, we detect objects entering or leaving and estimate their number when the background is uniform or changes little. This research is still preliminary and the detection results are not yet stable; future work will analyze the regularities in inter-frame difference histograms in depth to improve detection accuracy.
With the rapid development of computer and communication technology, multimedia technology is also evolving quickly, and the content of online entertainment has shifted from mainly text and images to video. While the network offers people a rich and colorful selection of video programs, it also makes the propagation of objectionable videos easier. Building a harmonious, open platform for young people has therefore become a focal concern. At present, objectionable-information detection technology can filter network addresses, images, text, and so on, but detection for video and audio is not yet mature. Objectionable-video detection is a challenging task involving knowledge from multiple disciplines and fields, so detecting objectionable videos effectively and quickly has become an urgent problem.
     The typical-scene classification and object detection studied in this thesis are foundational tasks in objectionable-video detection. Objectionable videos usually occur in a particular scene and consist of related shots showing different objects or different viewpoints of the same objects. Scene classification helps in understanding video content and makes video-content analysis more targeted. Accurate classification makes it easy to determine the setting in which an event occurs and guides adjustment of the video's sensitivity level; indoor scenes in particular require special attention. Detecting objects entering and leaving a scene helps analyze correlations among the statistics of the shots within the same scene. Our research group has achieved good results in shot segmentation and video style classification. Shot segmentation and scene segmentation are the basis of video analysis, and the accuracy of shot segmentation directly affects the accuracy of typical-scene classification. Video style classification determines a video's overall color style, which makes targeted adjustment of skin-color models convenient.
     This thesis focuses on several foundational issues in objectionable-video detection. The main research contents are as follows:
     1. Improving the comprehensive video analysis platform based on multiple color spaces. The platform can display the video, let the user choose different color-space components, and display and compute in real time each frame's single-frame histogram, difference histogram, and scene-average histogram. The scene-average histogram is mainly used for scene classification: the scene-classification module extracts its peak parameters as features to classify scenes. The difference histogram is mainly used for object detection: the object-detection module computes the difference between the histograms of adjacent frames and detects objects by thresholding that difference. The platform can also be used for shot-change detection, video color-style classification, and selection of effective color components.
     2. Classifying typical video scenes with the multi-color-space comprehensive analysis platform. A typical scene often contains multiple shots, and these shots usually cover every aspect of the scene. We therefore propose a new histogram, formed by summing a chosen color histogram over all frames of the video scene; it has good stability and reflects the distinctive character of the typical scene, while the histograms of different scenes usually differ. For ease of use, we average the cumulative histogram, calling the result the scene-average histogram; it describes a scene simply and effectively. This thesis improves the extraction method for multi-peak histogram parameters and, using the associated classification rules, classifies outdoor scenes and describes the style of indoor scenes, with good results.
     3. Detecting objects with the multi-color-space comprehensive analysis platform and frame-difference histograms. Scenes in a video usually change slowly, whereas objects change frequently. In histogram terms, the histograms of two adjacent frames change little when no object enters or leaves the scene, and change markedly otherwise. Using this relationship between histograms, we detect objects entering or leaving and estimate their number when the background is uniform or changes little. The research is still preliminary and the detection results are not yet stable; in future work we will analyze the regularities of frame-difference histograms in depth to improve detection accuracy.
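The single-frame and difference histograms the platform computes can be sketched in pure Python as follows. The bin count, value range, and choice of color component are illustrative assumptions; the abstract does not specify them.

```python
def channel_histogram(pixels, bins=16, max_val=256):
    """Single-frame histogram of one color-space component.

    `pixels` is a flat list of per-pixel values for the chosen
    component (e.g. the H channel after an RGB-to-HSV conversion);
    16 bins over [0, 256) is an illustrative choice, not the
    thesis's actual setting.
    """
    hist = [0] * bins
    width = max_val // bins
    for v in pixels:
        # Clamp the top value into the last bin.
        hist[min(v // width, bins - 1)] += 1
    return hist


def difference_histogram(hist_a, hist_b):
    """Bin-wise difference between the histograms of two frames."""
    return [a - b for a, b in zip(hist_a, hist_b)]
```

Thresholding the magnitude of such a difference histogram is what the object-detection module builds on.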
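The scene-average histogram described above can be sketched as the mean of the per-frame histograms of one scene. The peak finder below is a deliberately simplified stand-in for the thesis's multi-peak parameter extraction, whose exact rules are not given in the abstract.

```python
def scene_average_histogram(frame_histograms):
    """Average the per-frame histograms of all frames in one scene.

    Summing a color histogram over every frame of the scene and
    dividing by the frame count yields the scene-average histogram,
    a stable descriptor of that scene.
    """
    n = len(frame_histograms)
    total = [0.0] * len(frame_histograms[0])
    for hist in frame_histograms:
        for i, v in enumerate(hist):
            total[i] += v
    return [v / n for v in total]


def histogram_peaks(hist):
    """Indices of interior local maxima.

    A simplified stand-in for the thesis's multi-peak parameter
    extraction; the real method also derives peak parameters such
    as height and width.
    """
    return [i for i in range(1, len(hist) - 1)
            if hist[i] > hist[i - 1] and hist[i] >= hist[i + 1]]
```

Classification rules would then compare the peak parameters of an unknown scene's average histogram against those of known scene types.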
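The frame-difference detection idea can be sketched as follows: an object entering or leaving produces a sharp histogram change between consecutive frames. The sum-of-absolute-differences measure and fixed threshold are illustrative assumptions, not the thesis's exact decision rule.

```python
def detect_object_events(frame_histograms, threshold):
    """Flag frame indices whose histogram differs sharply from the
    previous frame's, suggesting an object entering or leaving the
    scene when the background is uniform or changes little.
    """
    events = []
    for t in range(1, len(frame_histograms)):
        # L1 distance between consecutive frame histograms.
        diff = sum(abs(a - b)
                   for a, b in zip(frame_histograms[t],
                                   frame_histograms[t - 1]))
        if diff > threshold:
            events.append(t)
    return events
```

On a slowly changing background the inter-frame distance stays near zero, so only frames where content genuinely changes exceed the threshold.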
