Skeleton Extraction for Handwritten Chinese Characters with Optimum Local Correlation Degree (局部关联度最优的手写汉字骨架提取)
  • English title: Skeleton extraction algorithm based on optimum local correlation degree for handwritten Chinese characters
  • Authors: Zhou Zhengyang (周正扬); Zhan Enqi (詹恩奇); Zheng Jianbin (郑建彬); Hu Huacheng (胡华成)
  • Affiliations: School of Information Engineering, Wuhan University of Technology; Key Laboratory of Fiber Optic Sensing Technology and Information Processing (Wuhan University of Technology), Ministry of Education
  • Keywords: handwritten Chinese character; thinning; skeleton distortion; complex area; local correlation degree
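  • Chinese journal title: ZGTB
  • English journal title: Journal of Image and Graphics
  • Institution: School of Information Engineering, Wuhan University of Technology; Key Laboratory of Fiber Optic Sensing Technology and Information Processing (Wuhan University of Technology), Ministry of Education
  • Publication date: 2017-06-16
  • Publisher: Journal of Image and Graphics (中国图象图形学报)
  • Year: 2017
  • Issue: v.22; No.254
  • Funding: National Natural Science Foundation of China (61303028)
  • Language: Chinese
  • Page: ZGTB201706013
  • Number of pages: 9
  • CN: 06
  • ISSN: 11-3758/TB
  • Classification number: 129-137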
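Abstract
Objective: In studies of handwritten Chinese character images, the skeleton is one of the most common starting points. Extracting the skeleton with a traditional thinning algorithm tends to produce distortion in complex regions such as stroke intersections. To address this problem, a skeleton extraction algorithm for handwritten Chinese characters based on local correlation degree is proposed. Method: The handwritten character image is first thinned to obtain the original skeleton, and each skeleton point is labeled as one of three classes: end point, common point, or complex point. An 8-neighborhood window scans mutually connected complex points to detect and extract complex areas. The complex areas are deleted, which splits the original skeleton into several simple stroke segments and removes the distorted parts at the same time. Local sub-segments are extracted, and the local correlation degree is computed from the degree of direction difference and the degree of curvature change between stroke segments. A connection strategy that optimizes the local correlation degree is formulated, and stroke segments that satisfy the connection condition are compensated by interpolation, which corrects the distortion and yields the complete character skeleton. Results: For 600 experimental samples, detecting complex areas directly from the skeleton gives results very close to the ideal case, whereas the contour method yields 2.5 times the theoretical number. After the stroke segments are recombined based on local correlation degree, the vast majority of distortions are corrected, and the recombined skeleton matches the true topological structure. With the standard skeleton as reference, the skeleton extraction accuracy reaches 98.41%. Conclusion: The skeleton extraction algorithm with optimum local correlation degree detects complex areas effectively and corrects distortion well. The extracted skeleton correctly reflects the positional and structural relationships among complex strokes, making this a practical and effective skeleton extraction method.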
Objective: Studies on handwritten Chinese characters, such as signature verification and text recognition, have been conducted for many years. The skeleton is a key starting point in these studies: it reduces redundant information while retaining the complete topological structure. Using a thinning algorithm to extract a skeleton from a handwritten Chinese character image is the traditional approach. However, distortions appear in the extracted skeleton, primarily because complex areas are neither well detected nor well processed. Complex areas are the intersections and junctions of strokes. Because characters are saved as static images, a computer cannot recognize that these areas contain more than one stroke; it treats each area as a single entity, so the thinning algorithm does not perform well there. To correct this distortion, this study proposes a skeleton extraction algorithm based on the optimum local correlation degree for handwritten Chinese characters.

Method: A simple and effective method to extract complex areas is designed. A thinning algorithm is used to obtain the original skeleton. The points on the skeleton are classified as end, common, and complex points. Complex areas are extracted by detecting connected complex points with an eight-neighbor window. The information on complex areas is then used to modify the original skeleton. The modification algorithm follows a split-and-reconstruction strategy. Removing all complex areas splits the skeleton into several stroke segments, and distortions are eliminated in the removal. The reconstruction step focuses on reconnecting stroke segments; it analyzes the relationships among stroke segments to restore the skeleton. The directional relationship is considered. Because stroke segments are not always straight, the slope between the two end points of a segment may not represent its direction accurately. Sub-segments adjacent to a complex area provide the required directional information. In most cases, two stroke segments that were originally connected have similar directions. In several situations, however, direction alone is insufficient to determine whether two stroke segments belong to one natural stroke, so the curvature relationship is also considered. A local correlation degree is defined from the direction and curvature relationships between sub-segments, and it is designed to be sensitive to changes in direction. The correlation degree is calculated for every pair of stroke segments in a complex area. When two stroke segments share the optimal local correlation degree, they are regarded as a pair of continuous segments. The connection step uses interpolation to restore the removed part between continuous segments. Discontinuous segments are given a proper extension to prevent incorrect connections. By connecting the stroke segments, the split skeleton is reconstructed and the distortions are corrected.

Result: Twenty people were asked to write 600 Chinese character samples for the experiment using different pens. All images were denoised and binarized. Using the eight-neighbor window to detect complex areas in the skeleton works well: the number of detected complex areas in the 600 samples is close to the theoretical value, whereas the number obtained with the contour method is 2.5 times the theoretical value. Most distortions are corrected with the local correlation degree, and the reconstructed skeleton approximates the real topology. With the standard skeleton as the criterion, the accuracy of skeleton extraction is 98.41%.

Conclusion: The proposed skeleton extraction algorithm for handwritten Chinese characters uses a split-and-reconstruction strategy, with reconstruction based on the optimum local correlation degree. The proposed method has two main advantages over other methods. First, complex area detection is considerably improved. Other methods detect complex areas mainly by analyzing turning points on the contour, whereas the proposed method detects them directly from the skeleton, which is simple and avoids excessive detection. Second, the algorithm corrects distortion well. Removing complex areas together with their distortions and reconnecting stroke segments through interpolation provides an efficient solution. The extracted skeletons retain good shapes, and the positional relationships among strokes are correct. In conclusion, the proposed skeleton extraction method demonstrates high accuracy and processing speed, and it is an effective and useful method for applications dealing with handwritten Chinese characters.
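The first Method steps (point classification, complex-area detection, and splitting into stroke segments) can be illustrated with a minimal Python sketch. The abstract does not give the exact classification rule, so the neighbor-count thresholds below (at most 1 neighbor for an end point, exactly 2 for a common point, 3 or more for a complex point) are an assumption, and the function names `classify_points` and `connected_components` are hypothetical. The input is assumed to be an already-thinned binary skeleton.

```python
from collections import deque

# Offsets of the eight neighbors of a pixel, as (row, column) deltas.
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def classify_points(skeleton):
    """Label each skeleton pixel by its number of 8-connected skeleton neighbors.

    Assumed rule (not taken from the paper): 0 or 1 neighbor -> "end",
    2 neighbors -> "common", 3 or more neighbors -> "complex".
    """
    pixels = {(r, c) for r, row in enumerate(skeleton)
              for c, value in enumerate(row) if value}
    labels = {}
    for r, c in pixels:
        count = sum((r + dr, c + dc) in pixels for dr, dc in NEIGHBORS)
        if count <= 1:
            labels[(r, c)] = "end"
        elif count == 2:
            labels[(r, c)] = "common"
        else:
            labels[(r, c)] = "complex"
    return labels


def connected_components(points):
    """Group points into 8-connected components with a breadth-first search."""
    remaining = set(points)
    components = []
    while remaining:
        start = remaining.pop()
        component = {start}
        queue = deque([start])
        while queue:
            r, c = queue.popleft()
            for dr, dc in NEIGHBORS:
                nxt = (r + dr, c + dc)
                if nxt in remaining:
                    remaining.remove(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components


# Toy skeleton: a 7x7 plus sign, i.e. two strokes crossing at the center.
grid = [[1 if r == 3 or c == 3 else 0 for c in range(7)] for r in range(7)]

labels = classify_points(grid)
# Connected complex points form the complex areas.
areas = connected_components(p for p, kind in labels.items() if kind == "complex")
# Deleting the complex areas splits the skeleton into simple stroke segments.
segments = connected_components(p for p, kind in labels.items() if kind != "complex")
```

On this toy crossing, the five pixels around the center are all labeled complex and merge into a single complex area, and removing that area leaves four stroke segments. Pairing those segments by direction and curvature, as the abstract describes, would be the next step and is not sketched here because the abstract does not state the correlation formula.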
