Online Uyghur Unicode Processing Techniques and its Implementation
Abstract
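In recent years, with the rapid development of computer hardware, software and Internet technology, computers have reached every corner of human society. Xinjiang is a region where many ethnic minorities live, and minority languages and scripts are needed in many areas. Recent research on multiscript information processing platforms for Windows has enriched the theory and methods of multiscript platform development and has also helped spread the computer processing of minority-language information. On the other hand, the non-standardized character encoding schemes used in minority-language information processing have seriously hindered the sharing of minority-language information resources and the spread of Internet technology. Under these circumstances, in order to promote adoption of the Uyghur encoding that has been incorporated into Unicode and to standardize Uyghur information processing, the development of a high-performance, intelligent online Uyghur Unicode input technique has become a key and active area of Uyghur information processing. In our region, minority-language information processing technology has developed roughly through the following stages: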
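     ● DOS-based word processing and typesetting (1984-1994). This stage was characterized by the initial study and understanding of computer architecture and operating principles. During this period, small text input programs and medium-sized multiscript typesetting systems were developed, mainly for government and media organizations, and the accompanying research laid the foundation for the informatization of minority languages. The main technique was to modify the keyboard driver and call DOS interrupts in order to process minority-language information.
     ● Research on the Windows platform (1994-1999). The first group to study the development of a Uyghur platform on Windows was the 863 project group of the Department of Computer Science at Xinjiang University, which in July 1996 completed two processing platforms for Windows 3.x, one multilingual and one purely Uyghur. The main technique was to adopt the technology of Arabic Windows 3.1 and to modify the Windows GDI output functions, the keyboard layout and the Arabic fonts so that Uyghur and Western text could be processed together.
     ● Application software and networking (1999 to the present). Representative results include Uyghur input methods for Windows 98/2000, various educational software, special-purpose software, and Uyghur websites (in Latin script and in traditional script). The input methods were implemented either by modifying the Arabic input method at the system level or by using hooks to control the keyboard layout and overwriting the Arabic fonts. Because browsers support Arabic character encodings and right-to-left text direction, which greatly eases the online processing and sharing of minority-language information, some organizations and individuals have used these browser capabilities to build purely Uyghur websites, contributing considerably to the informatization of our region. This period has been a flourishing one for minority-language processing, with all kinds of minority-language input methods and other software appearing on the market.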
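     To make it possible to process Uyghur online on any Windows platform and to maximize the sharing of Uyghur information resources on the Internet, this paper proposes a new technique for displaying (traditional-script) Uyghur text and a multi-encoding conversion technique for networked Windows environments, both based on the Unicode encoding scheme, and on that basis promotes the standardization of Uyghur information processing. To realize this system, our overall design principles and goals are: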
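     1. Create a pure Uyghur Unicode standard font that follows the international standard encoding, rather than replacing the existing Arabic fonts as was done before;
     2. Use browser hooking to achieve online mixed input and display of Uyghur, Chinese, English and other scripts;
     3. Use automatic selection of letter forms during input, together with keyboard switching, to achieve intelligent input and increase typing speed, so that Uyghur characters join correctly without affecting the correct display of other scripts;
     4. Design and implement multi-directional script conversion (among non-standard traditional Uyghur script, Unicode Uyghur and Uyghur Latin script), which is convenient for users and preserves data entered with older systems;
     5. To allow the online input method to be shared, take full account of portability, generality and security in the system design.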
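     The automatic form selection named in goal 3 can be illustrated with a minimal sketch. The abstract does not say how the system selects forms, so the approach here is an assumption: each Arabic-script letter is replaced by its isolated, initial, medial or final presentation form according to whether its neighbours can join to it. The function names `presentation_forms` and `select_forms` are hypothetical. Letters for which Unicode defines no presentation form under the same name are treated as non-joining, which is a simplification and is not accurate for every Uyghur letter.

```python
import unicodedata

FORM_NAMES = ("ISOLATED", "FINAL", "INITIAL", "MEDIAL")


def presentation_forms(ch):
    """Map form name -> presentation-form character for one Arabic-script letter.

    Forms are found by name, e.g. "ARABIC LETTER BEH" -> "ARABIC LETTER BEH INITIAL FORM".
    Returns an empty dict for characters with no such named forms.
    """
    name = unicodedata.name(ch, "")
    if not name.startswith("ARABIC LETTER "):
        return {}
    forms = {}
    for form in FORM_NAMES:
        try:
            forms[form] = unicodedata.lookup(f"{name} {form} FORM")
        except KeyError:
            pass  # this letter has no such form (e.g. ALEF has no initial form)
    return forms


def select_forms(text):
    """Replace each letter in logical-order text with its contextual form."""
    table = [presentation_forms(ch) for ch in text]
    out = []
    for i, ch in enumerate(text):
        forms = table[i]
        if not forms:
            out.append(ch)  # other scripts, digits, spaces pass through unchanged
            continue
        prev_forms = table[i - 1] if i > 0 else {}
        next_forms = table[i + 1] if i + 1 < len(text) else {}
        # A letter joins to the previous one if it has a final form and the
        # previous letter can connect forward (has an initial form).
        joins_prev = "FINAL" in forms and "INITIAL" in prev_forms
        # It joins to the next one if it can connect forward and the next
        # letter has a final form.
        joins_next = "INITIAL" in forms and "FINAL" in next_forms
        if joins_prev and joins_next:
            key = "MEDIAL"
        elif joins_prev:
            key = "FINAL"
        elif joins_next:
            key = "INITIAL"
        else:
            key = "ISOLATED"
        out.append(forms.get(key, ch))
    return "".join(out)


# Three dual-joining letters (BEH, NOON, TEH) become initial, medial, final.
shaped = select_forms("\u0628\u0646\u062a")
print([unicodedata.name(c) for c in shaped])
```

     Because characters without presentation forms pass through unchanged, Chinese and English text in the same string is left untouched, which matches the requirement that other scripts still display correctly.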
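     This paper examines the current state of Uyghur information processing and its problems from several angles, and studies and solves a number of key problems. The work done is as follows:
     ● The first pure Uyghur Unicode standard TrueType font was created, following the international standard encoding.
     ● A display scheme for Uyghur Unicode characters, based on online processing, was proposed.
     ● Browser hooking was proposed for the first time, enabling online mixed editing of Uyghur, Chinese, English and other scripts and solving the problem of inserting Uyghur characters correctly.
     ● Automatic selection of letter forms during input, together with keyboard switching, was used to select the correct form of each Uyghur character, and an intelligent input scheme increased typing speed.
     ● Multi-directional script conversion was implemented, which is convenient for users and preserves data entered with non-standard systems.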
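     One direction of the script conversion in the last item, from Unicode Uyghur (Arabic script) to Uyghur Latin script, can be sketched as a table lookup. The abstract does not give the conversion tables, so the mapping below is an assumption based on the commonly used Uyghur Latin Yéziqi correspondences, and the name `to_latin` is hypothetical. The hamza carrier (U+0626) is dropped at the start of a word and written as an apostrophe elsewhere, which is a simplification. The legacy non-standard encodings are not covered because their code tables are not described here.

```python
HAMZA = "\u0626"  # vowel carrier in Arabic-script Uyghur

# Assumed letter correspondences (Arabic-script Uyghur -> Latin).
ARABIC_TO_LATIN = {
    "\u0627": "a", "\u06d5": "e", "\u0628": "b", "\u067e": "p",
    "\u062a": "t", "\u062c": "j", "\u0686": "ch", "\u062e": "x",
    "\u062f": "d", "\u0631": "r", "\u0632": "z", "\u0698": "zh",
    "\u0633": "s", "\u0634": "sh", "\u063a": "gh", "\u0641": "f",
    "\u0642": "q", "\u0643": "k", "\u06af": "g", "\u06ad": "ng",
    "\u0644": "l", "\u0645": "m", "\u0646": "n", "\u06be": "h",
    "\u0648": "o", "\u06c7": "u", "\u06c6": "\u00f6", "\u06c8": "\u00fc",
    "\u06cb": "w", "\u06d0": "\u00e9", "\u0649": "i", "\u064a": "y",
}


def to_latin(text):
    """Convert Arabic-script Uyghur letters to Latin; leave other text as is."""
    out = []
    inside_word = False  # True while the previous character was a Uyghur letter
    for ch in text:
        if ch == HAMZA:
            if inside_word:
                out.append("'")  # word-internal hamza marks a syllable break
            inside_word = True
            continue
        latin = ARABIC_TO_LATIN.get(ch)
        if latin is None:
            out.append(ch)  # spaces, digits, Chinese, English pass through
            inside_word = False
        else:
            out.append(latin)
            inside_word = True
    return "".join(out)


print(to_latin("\u0645\u06d5\u0643\u062a\u06d5\u067e"))  # the word for "school"
```

     The reverse direction needs more care than a plain lookup, because Latin digraphs such as "ng" and "gh" must be told apart from the same two letters occurring in sequence.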
     This research was supported by the national 863 project group.
In recent years, with the rapid development of computer software and hardware, computers have reached every corner of our society. Xinjiang is a multi-ethnic region mainly inhabited by ethnic minorities, and there is a frequent need to use minority scripts. Research on developing a Windows-based multiscript platform not only enriches the theory and methodology of multiscript platform development but also makes the computer processing of minority-language information easier. On the other hand, the non-standardized character encoding schemes used for minority-language information processing cause many problems, hindering the sharing of minority-language information resources and slowing the spread of the Internet. Under such circumstances, developing a highly efficient, intelligent online Uyghur Unicode input technique and implementing the Uyghur encoding accepted by Unicode have become urgent matters. In our region, the development of minority-language information processing techniques has roughly gone through the following phases:
    ● DOS-based character processing and publishing software (1984-1994)
    This stage was characterized by the effort to study and understand the structure and principles of computers. Small text input programs and medium-sized multiscript publishing systems were developed for government offices and the media. This research laid the foundation for minority-language information processing. The main technique was to modify the keyboard driver and use DOS interrupts.
    ● Research on the Windows Platform (1994-1999)
    The 863 Research Group of the Computer Department of Xinjiang University took the first step in research on Windows-based Uyghur platforms, and in July 1996 it successfully developed a multiscript platform and a purely Uyghur platform for Windows 3.x. The main technique was to adopt the technology of Arabic Windows 3.1: the Windows GDI output functions were changed, the keyboard layout was adapted, and Uyghur fonts were substituted for the Arabic fonts in order to process Uyghur and Western-language text together.
    ● Application Software and Networking (1999 to date)
    The characteristic technical achievements are IMEs for Windows 98/2000, educational (CAI) software, special-purpose software and Uyghur websites (Latin-based and traditional Uyghur). The IMEs are implemented either by modifying the Arabic IME at the system level or by controlling the keyboard layout with hook programs and then overwriting the Arabic fonts. Since browsers, with their support for Arabic encodings and right-to-left text, make online processing of minority-language information far more convenient, some government offices and individuals have used these strengths of the new browsers to create purely Uyghur websites, thus contributing to information technology in our region. This is an unprecedented period for minority-language information processing, and all sorts of IMEs and other software are now available on the market.
    This paper puts forward a new implementation scheme for online Uyghur Unicode display and multi-encoding conversion, based on the Windows platform and the network. This will facilitate the establishment and dissemination of a single Uyghur information processing standard. These techniques will be helpful when dealing with Uyghur-language information on any platform, whether for sharing Uyghur-language information on the Internet, creating websites or other such activities. To realize this system, our general design principles and objectives have been:
    1. Creating pure Uyghur Unicode fonts that follow the international standard, rather than overwriting the Arabic fonts with Uyghur fonts as was done previously;
    2. Solving the issue of online multiscript inputting (Uyghur, Chinese, English etc.) and displayin
