Research on Binarization of Degraded Document Images

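Author: Hu Lina (胡丽娜)
Degree level: Master's
Discipline: Pattern Recognition and Intelligent Systems
Keywords (translated from Chinese): degraded document images; binarization; gradient normalization; visual attention mechanism; saliency map
Keywords (English, as given): degraded document; binarization; gradient standardization; visual attention; saliency map
Degree year: 2012
Advisor: Zhang Chongyang (张重阳)
Discipline code: 081104
Degree-granting institution: Nanjing University of Science and Technology
Submission date: 2012-01-01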

Abstract

Binarization is a key preprocessing step in automatic document processing systems and directly affects overall system performance. Document degradation is caused by complex backgrounds, weak strokes, and many other factors, and binarizing degraded documents remains an active and difficult research topic. This thesis analyzes the main causes of document quality degradation and focuses on binarization methods for degraded document images that exhibit weak strokes, ink bleeding, and uneven background brightness.
We first study the document image binarization method based on local maxima and minima proposed by Su. To address its weakness in handling weak strokes, we propose a new document image binarization method based on gradient normalization. The method first detects the edge points of character strokes from the normalized gradient, then obtains the stroke edge regions by extremum filtering, and finally computes a local threshold from the stroke edge regions and binarizes the image. In comparative experiments against the Otsu, Niblack, and Su methods on the document images used in this thesis, the proposed gradient-normalization method detects character information effectively while producing less noise.
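The three steps described above can be sketched in plain Python. This is only an illustration under stated assumptions, because the abstract gives no formulas: the normalized gradient is assumed to be a local contrast of the form `(local max - local min) / (local max + local min + eps)`, the extremum filter is assumed to be a maximum filter that grows the edge points into an edge region, and the local threshold is assumed to be the mean intensity of nearby edge-region pixels. The function names and the parameters `edge_thresh`, `r`, and `min_edge` are hypothetical and not taken from the thesis.

```python
def neighborhood(img, y, x, r):
    """Coordinates of the (2r+1)x(2r+1) window around (y, x), clipped to the image."""
    h, w = len(img), len(img[0])
    return [(yy, xx)
            for yy in range(max(0, y - r), min(h, y + r + 1))
            for xx in range(max(0, x - r), min(w, x + r + 1))]


def normalized_contrast(img, r=1, eps=1e-6):
    """Assumed normalization: (local max - local min) / (local max + local min + eps)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx] for yy, xx in neighborhood(img, y, x, r)]
            out[y][x] = (max(vals) - min(vals)) / (max(vals) + min(vals) + eps)
    return out


def edge_region(contrast, edge_thresh=0.2, r=1):
    """Step 1: mark edge points. Step 2: grow them with a maximum filter (assumed)."""
    h, w = len(contrast), len(contrast[0])
    points = [[c >= edge_thresh for c in row] for row in contrast]
    return [[any(points[yy][xx] for yy, xx in neighborhood(points, y, x, r))
             for x in range(w)] for y in range(h)]


def binarize(img, region, r=2, min_edge=3):
    """Step 3: compare each pixel with the mean of nearby edge-region pixels."""
    h, w = len(img), len(img[0])
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx] for yy, xx in neighborhood(img, y, x, r)
                    if region[yy][xx]]
            if len(vals) >= min_edge and img[y][x] < sum(vals) / len(vals):
                out[y][x] = 0  # darker than the local edge-region mean: stroke pixel
    return out


def gradient_normalized_binarize(img):
    """Run the three assumed steps in sequence."""
    return binarize(img, edge_region(normalized_contrast(img)))


# Toy image: bright background (200) with one faint vertical stroke (120).
image = [[120 if x == 3 else 200 for x in range(7)] for _ in range(7)]
result = gradient_normalized_binarize(image)
for row in result:
    print(row)
```

On this hand-made image only the stroke column (index 3) is set to 0. Dividing by the local maximum plus minimum makes the contrast measure relative to local brightness, which is one way a faint stroke can still register as an edge.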
Visual attention mechanisms have been widely applied in object detection, image compression, image retrieval, visual interface design, and other fields, but their application to document processing has rarely been reported. From the perspective of visual attention, this thesis analyzes the characteristics of document images, explores the use of visual attention in document image binarization, and proposes two methods based on the saliency map: a regional global threshold method and a local threshold method. The regional global threshold method binarizes the character region with a single, uniform threshold. Because the size of the character region depends on how the characters are distributed, its results are not ideal: experiments show that it outperforms the commonly used Otsu and Niblack methods but is inferior to the Su method. The local threshold method binarizes the character region with local thresholds, and experiments show that it outperforms the Otsu, Niblack, and Su methods.
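The difference between the two saliency-based variants can be illustrated with a small sketch. The abstract does not say which saliency model is used or how the thresholds are computed, so the sketch makes several assumptions: the saliency map is taken as a given input, the character region is every pixel whose saliency is at least `s_thresh`, the regional global threshold is the mean intensity of the whole region, and the local threshold is the mean intensity of region pixels inside a small window. All names, parameters, and the toy data are hypothetical.

```python
def character_region(saliency, s_thresh=0.5):
    """Assumed rule: a pixel belongs to the character region if it is salient enough."""
    return [[s >= s_thresh for s in row] for row in saliency]


def regional_global_binarize(img, region):
    """One threshold (mean intensity of the region) for the whole character region."""
    h, w = len(img), len(img[0])
    vals = [img[y][x] for y in range(h) for x in range(w) if region[y][x]]
    if not vals:
        return [[255] * w for _ in range(h)]
    t = sum(vals) / len(vals)
    return [[0 if region[y][x] and img[y][x] < t else 255 for x in range(w)]
            for y in range(h)]


def local_binarize(img, region, r=1):
    """Per-pixel threshold: mean of region pixels in the (2r+1)x(2r+1) window."""
    h, w = len(img), len(img[0])
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not region[y][x]:
                continue  # pixels outside the character region stay background
            vals = [img[yy][xx]
                    for yy in range(max(0, y - r), min(h, y + r + 1))
                    for xx in range(max(0, x - r), min(w, x + r + 1))
                    if region[yy][xx]]
            if img[y][x] < sum(vals) / len(vals):
                out[y][x] = 0
    return out


# Toy input: a faint stroke (150) on a bright background (220) and a dark
# stroke (60) on a darker background (140). The saliency map is hand-made.
row = [220, 150, 220, 220, 140, 60, 140, 140]
sal = [0.8, 0.8, 0.8, 0.1, 0.8, 0.8, 0.8, 0.1]
image = [row[:] for _ in range(4)]
saliency = [sal[:] for _ in range(4)]

region = character_region(saliency)
global_result = regional_global_binarize(image, region)
local_result = local_binarize(image, region)
print(global_result[0])  # [255, 0, 255, 255, 0, 0, 0, 255]
print(local_result[0])   # [255, 0, 255, 255, 255, 0, 255, 255]
```

On this constructed input the single regional threshold (155) also marks the darker background next to the second stroke as text, while the local thresholds keep only the two stroke columns. The example shows the mechanics only and is not evidence for the experimental results reported in the thesis.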
