Research on Image Recognition Techniques Inspired by Biological Vision
Abstract
Image recognition is a current hotspot in vision research. Its fundamental task is to use computers to classify and identify the scenes or objects contained in images, and it has broad application prospects in content-based image retrieval, intelligent environment perception, military target recognition, and other fields. Traditional engineering methods handle visual recognition tasks in structured environments well, but they run into great difficulty with unstructured natural-scene classification and object recognition, where many problems remain open: how to weaken or even eliminate the influence of uncertain factors such as noise, illumination, and occlusion, so as to achieve stable perception of natural images; how to effectively capture the global information of an image for fast scene classification; and how to integrate visual attention mechanisms into the recognition process to improve object recognition performance.
     Taking natural images as its research object, and drawing on human visual perception mechanisms, the physiological structure and function of the visual cortex, and relevant experimental findings from cognitive psychology, this dissertation carries out exploratory research on the above problems in natural image recognition. The main work is as follows:
     Pre-attentive boundary and surface perception: we study how to robustly detect the boundary contours of natural images and how to stably perceive surface lightness under different illumination conditions. We analyze the problems of Grossberg's BCS/FCS neural model when processing natural images and propose corresponding improvements and optimizations, so that contour detection of natural images becomes robust to noise and small occlusions. The improved surface-lightness perception overcomes the original model's lightness-information loss, surface fogging, and edge blurring; it effectively recovers the perceived lightness of object surfaces and is insensitive to illumination changes. In addition, inspired by the neuron-activity diffusion mechanism of the surface-recovery process and by cognitive psychology, we propose an image diffusion algorithm based on the visual masking effect, which removes noise while effectively preserving the important structural features of the image.
     Fast perceptual classification of scenes: we propose a global scene descriptor that captures the global structural properties of a scene, including its rough geometric information. This is consistent with the psychological view that humans judge the semantic content of a scene by quickly acquiring its spatial-layout structure. Compared with the classical SIFT descriptor, it is better suited to scene-image recognition, and it is simple to implement and fast to compute.
     Modeling the visual spatial attention mechanism: we propose a computational model of visual attention grounded in cognitive psychology and physiology. The input image is mapped into a psychovisual space; a fully connected graph is then built on each feature map, and a graph-based random walk simulates information transfer among neurons in the visual cortex. The final saliency map is generated according to the information-maximization principle and feature-integration theory, and the model outperforms existing models in region-of-interest detection and human fixation prediction. Furthermore, since scene information guides selective attention to objects, we combine the proposed global scene descriptor to build a contextual-guidance model of spatial attention, introducing a top-down attentional modulation mechanism that predicts task-driven active visual search well.
     Object recognition with an attention mechanism: we integrate visual attention into the object recognition process and build a fixation-shift-based recognition framework, NIMART. It simulates human gaze shifts during object learning and recognition, uses the saliency map generated by attention to guide eye movements, accounts for inhibition of return during fixation shifts, and uses adaptive resonance theory to learn from and make decisions on the extracted fixation-region features. NIMART accords with the brain's mechanism for learning and recognizing objects, and experiments on general image datasets show that the model has good object recognition performance.
Image recognition is one of the hotspots in the field of vision research. Its fundamental task is to categorize the image scene or identify the objects in the scene by computer. It has wide application prospects in content-based image retrieval, intelligent environment perception, military target recognition, etc. Conventional engineering techniques perform well in most visual recognition tasks in structured scenes. However, for unstructured scene categorization and object recognition, it is very difficult to obtain satisfactory results with these traditional methods. Many problems remain to be solved, such as: how to weaken or eliminate the influence of uncertain environmental factors such as noise, illumination, and occlusion, to achieve stable perception of natural images; how to capture the scene gist effectively, to achieve fast scene perception and classification; and how to incorporate visual attention mechanisms into image recognition, to improve object recognition performance.
     Inspired by the visual perception mechanisms of humans and primates, and in view of the physiological properties of the visual cortex and relevant conclusions of cognitive psychology, we carry out exploratory studies on the issues of natural image recognition mentioned above. The main work of this dissertation is as follows:
     Pre-attentive boundary and surface perception: we investigate how to detect the boundary contours of natural images robustly, and how to generate stable surface-lightness percepts under variable illumination conditions. The problems of the BCS/FCS neural model proposed by Grossberg in processing natural images are analyzed, and corresponding modifications and optimizations are proposed. As a result, our contour detection of natural images is robust to noise and small occlusions. The modified lightness-perception model effectively generates absolute lightness percepts that are insensitive to illumination, overcoming the lightness-information loss, fogging, and blurring of the original model. Besides, a novel image diffusion algorithm based on the visual masking effect is proposed, inspired by the neuron-activity diffusion mechanism of surface recovery and by cognitive psychology. The proposed algorithm effectively smooths out noise while preserving the important structural properties of the image.
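The details of the masking-based diffusion algorithm are not given in this abstract. As a rough illustration of the edge-preserving diffusion family it builds on, here is a minimal Perona-Malik-style sketch; the function name, parameters, and the periodic border handling via `np.roll` are illustrative choices, not the dissertation's actual scheme.

```python
import numpy as np

def anisotropic_diffusion(img, n_iter=20, kappa=0.3, step=0.2):
    """Edge-preserving smoothing: the conduction coefficient g falls off
    with gradient magnitude, so diffusion is suppressed across strong
    boundaries while noise in flat regions is averaged away."""
    u = img.astype(float).copy()
    g = lambda d: np.exp(-(d / kappa) ** 2)   # edge-stopping function
    for _ in range(n_iter):
        # finite differences to the four neighbours (periodic borders)
        dn = np.roll(u, 1, axis=0) - u
        ds = np.roll(u, -1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        u += step * (g(dn) * dn + g(ds) * ds + g(de) * de + g(dw) * dw)
    return u
```

With `step = 0.2` the explicit four-neighbour update stays stable (4 x 0.2 <= 1); a small `kappa` preserves sharper edges at the cost of slower smoothing.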
     Fast perception of scene images: we propose a novel visual descriptor for scene recognition. It is a holistic representation that captures the global structural properties and rough geometric information of the scene. The method is consistent with psychophysical findings suggesting that humans quickly grasp the scene gist and perceive scene categories through the spatial-layout information of the scene. Experimental results show that it outperforms the classical SIFT descriptor in recognizing scene categories. It is very easy to implement and very fast to compute.
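The exact form of the proposed descriptor is not specified here. The sketch below illustrates the general idea of grid-based holistic layout features (in the spirit of gist/spatial-envelope representations): split the image into a coarse grid and let each cell contribute a magnitude-weighted orientation histogram. The function name, grid size, and bin count are assumptions for illustration only.

```python
import numpy as np

def global_layout_descriptor(img, grid=4, n_orient=8):
    """Coarse holistic descriptor: per-cell orientation-energy histograms,
    concatenated in scan order so spatial layout is retained."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # orientation in [0, pi)
    bins = np.minimum((ang / np.pi * n_orient).astype(int), n_orient - 1)
    h, w = img.shape
    feat = []
    for i in range(grid):
        for j in range(grid):
            sl = (slice(i * h // grid, (i + 1) * h // grid),
                  slice(j * w // grid, (j + 1) * w // grid))
            hist = np.bincount(bins[sl].ravel(), weights=mag[sl].ravel(),
                               minlength=n_orient)
            feat.append(hist / (hist.sum() + 1e-8))  # per-cell normalization
    return np.concatenate(feat)                      # length grid*grid*n_orient
```

Such a vector is cheap to compute (one gradient pass plus histogramming) and can be fed directly to any standard classifier for scene categorization.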
     Modeling the visual spatial attention mechanism: we put forward a computational model of visual attention motivated by cognitive psychology and neurophysiology. First, the input image is transformed into a psychovisual space. We then construct a fully connected graph on each feature map, and a random walk is run on each sub-band graph to simulate information transmission among neurons in the visual cortex. From the information-maximization principle we derive an activity map for each feature map, and the final saliency map is obtained by summing all activity maps according to feature-integration theory. The proposed attention model outperforms existing models in detecting regions of interest and predicting human fixations. In addition, since scene-gist information can guide selective attention to objects in a scene, we present a contextual-guidance model of spatial attention that introduces top-down modulating influences; the gist information is obtained through the proposed global image representation mentioned above. The contextual-guidance model predicts well the image regions likely to be fixated by human observers in active visual search tasks.
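To make the graph-and-random-walk step concrete, here is a minimal sketch for a single feature map, in the spirit of graph-based saliency models such as GBVS: edge weights combine feature dissimilarity with spatial proximity, and the stationary distribution of a (lazy) random walk is read out as saliency. All names and parameter choices are illustrative, and the dense pairwise graph is only practical for small maps.

```python
import numpy as np

def random_walk_saliency(feature_map, sigma_dist=2.0, n_iter=200):
    """Saliency via the stationary distribution of a random walk on a
    fully connected pixel graph: the walk accumulates mass at locations
    that differ from their spatial surroundings."""
    h, w = feature_map.shape
    n = h * w
    f = feature_map.ravel().astype(float)
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    dissim = np.abs(f[:, None] - f[None, :])              # feature dissimilarity
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
    W = dissim * np.exp(-d2 / (2.0 * sigma_dist ** 2))    # weight x proximity
    np.fill_diagonal(W, 0.0)
    P = W / (W.sum(axis=1, keepdims=True) + 1e-12)        # row-stochastic
    P = 0.5 * (np.eye(n) + P)      # lazy walk: guarantees convergence
    pi = np.full(n, 1.0 / n)
    for _ in range(n_iter):        # power iteration for pi = pi @ P
        pi = pi @ P
    return pi.reshape(h, w)
```

In a full model, one such map would be computed per feature channel and the channel maps summed into the final saliency map, following feature-integration theory.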
     Object recognition incorporating a visual attention mechanism: we build a framework for object recognition inspired by saccade-based visual memory, named NIMART, which incorporates visual attention into object recognition. NIMART simulates the sequence of fixations on salient locations made when observers learn and recognize objects in a scene. The saliency map derived from the visual attention model is used to guide eye movements, and inhibition of return is applied across the sequential fixations. The fixated regions are analyzed for learning and decision-making with adaptive resonance theory (ART). NIMART accords with the brain's mechanism for learning and recognizing objects, and experiments demonstrate that it performs very well on widely used datasets.
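The saliency-guided scanpath with inhibition of return can be sketched as a simple winner-take-all loop; this is a generic illustration of the mechanism, not the dissertation's NIMART implementation, and the function name, radius, and fixation count are assumptions.

```python
import numpy as np

def fixation_sequence(saliency, n_fix=5, ior_radius=2):
    """Select a scanpath from a saliency map with inhibition of return:
    pick the most salient location, then suppress a disk around it so
    attention moves on to the next-most-salient region."""
    sal = saliency.astype(float).copy()
    h, w = sal.shape
    ys, xs = np.mgrid[0:h, 0:w]
    path = []
    for _ in range(n_fix):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)
        path.append((int(y), int(x)))
        # inhibition of return: blank out the visited neighbourhood
        sal[(ys - y) ** 2 + (xs - x) ** 2 <= ior_radius ** 2] = -np.inf
    return path
```

In a NIMART-like framework, the patch around each fixation would then be featurized and passed to an ART network for learning or category decision.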
