A Mechanism for Solving the Unencoded Chinese Character Problem on the Web

Bibliographic Details
Published in: Research and Advanced Technology for Digital Libraries, pp. 419-422
Main Authors: Lin, Te-Jun; Huang, Jyun-Wei; Lin, Christine; Li, Hung-Yi; Wang, Hsiang-An; Chiu, Chih-Yi
Format: Book Chapter
Language: English
Published: Berlin, Heidelberg, Springer Berlin Heidelberg
Series: Lecture Notes in Computer Science
Summary: The unencoded Chinese character problem that occurs when digitizing historical Chinese documents makes digital archiving difficult. Expanding the character coding space, such as by using the Unicode Standard, does not solve the problem completely because new Chinese characters can always be composed. In this paper, we propose a mechanism based on a Chinese glyph structure database, which contains glyph expressions that represent the composition of Chinese characters. Users can search for Chinese characters through our web interface and browse the search results. Each Chinese character can be embedded in a web document using a specific piece of JavaScript code. When the web document is opened, the JavaScript code loads the image of the Chinese character at an appropriate font size for display. Even if a Chinese character is not available in the database, its image can be generated through the dynamic character composition function. As the proposed mechanism is cross-platform, users can easily access unencoded Chinese characters without installing any additional font files on their personal computers. A demonstration system is available at http://char.ndap.org.tw.
ISBN: 9783540875987; 3540875980
ISSN: 0302-9743; 1611-3349
DOI: 10.1007/978-3-540-87599-4_51
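
Illustrative sketch: The summary describes JavaScript code that, when the page opens, loads an image of an unencoded character at a size matching the surrounding text. The following is a minimal sketch of how such an embed might work, not the authors' implementation. The endpoint URL, the `id` and `size` query parameters, the list of available image sizes, and the `unencoded-char` / `data-char-id` markup are all hypothetical names chosen for illustration; the record does not specify the real ones.

```javascript
// Hypothetical endpoint and parameter names; the summary does not give the real ones.
const GLYPH_IMAGE_ENDPOINT = "https://example.org/glyph-image";
// Assumed set of pre-rendered image sizes, in pixels.
const AVAILABLE_SIZES = [16, 24, 32, 48, 64];

// Pick the smallest available image size that is at least the text's font size,
// falling back to the largest size when the text is bigger than every image.
function pickImageSize(fontSizePx, sizes = AVAILABLE_SIZES) {
  const sorted = [...sizes].sort((a, b) => a - b);
  const fit = sorted.find((s) => s >= fontSizePx);
  return fit !== undefined ? fit : sorted[sorted.length - 1];
}

// Build the image URL for a character identified by a (hypothetical) database id.
function buildGlyphImageUrl(charId, sizePx) {
  const params = new URLSearchParams({ id: charId, size: String(sizePx) });
  return `${GLYPH_IMAGE_ENDPOINT}?${params.toString()}`;
}

// Browser only: replace each <span class="unencoded-char" data-char-id="...">
// with an <img> scaled to the height of the surrounding text.
function renderUnencodedChars(doc) {
  for (const span of doc.querySelectorAll("span.unencoded-char")) {
    const fontSizePx = parseFloat(getComputedStyle(span).fontSize);
    const charId = span.dataset.charId;
    const img = doc.createElement("img");
    img.src = buildGlyphImageUrl(charId, pickImageSize(fontSizePx));
    img.alt = span.textContent || charId;
    img.style.height = "1em"; // keep the glyph in line with neighbouring text
    span.replaceWith(img);
  }
}

// Run once the page has loaded; skipped outside a browser (e.g. under Node.js).
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => renderUnencodedChars(document));
}
```

Under these assumptions, an author would mark up a character as `<span class="unencoded-char" data-char-id="A1234">?</span>` and include the script once per page. Because the glyph arrives as an image, the reader needs no extra font installed, which is the cross-platform property the summary emphasizes.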