A Chinese character system modeling and development

The Chinese character system is the most complicated character system in the world. There are over ten thousand general Chinese characters, and each carries a large amount of information and knowledge, including Pinyin, structure, components, words, calligraphy and so forth. Searching this mass of information quickly is therefore both meaningful and challenging. In this thesis, we first analyze and design the system using UML, and then implement it in Java. The system supports efficient searching of a variety of Chinese character information under multiple conditions. A standard XML schema is designed to document the full information of the 6763 general Chinese characters of GB2312-80 (the national standard of the People’s Republic of China), with a focus on structural information. Based on the resulting character information database, a group of searching and browsing functions is implemented in the software system. In addition, a character synthesis algorithm is designed and implemented, so that users can interactively compose a new Chinese character from the components of a given set of Chinese characters. Compared with existing systems, our system is particularly well suited to convenient searching and browsing of the structural information of Chinese characters. The developed system should help people analyze and study Chinese characters.
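To illustrate what searching character information under multiple conditions can look like, the following is a minimal Java sketch and not the thesis implementation. The class names (HanziEntry, HanziSearch), the field layout, the structure labels and the four-character sample database are all assumptions made for this example; the real system stores its data in an XML-based database whose schema is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical record of one character; the fields are illustrative only.
final class HanziEntry {
    final String character;
    final String pinyin;           // toneless Pinyin, e.g. "hao"
    final String structure;        // assumed labels: "left-right", "top-bottom"
    final List<String> components; // components in writing order

    HanziEntry(String character, String pinyin, String structure, String... components) {
        this.character = character;
        this.pinyin = pinyin;
        this.structure = structure;
        this.components = Arrays.asList(components);
    }
}

final class HanziSearch {
    // Each condition is a predicate, so conditions can be combined freely.
    static Predicate<HanziEntry> hasStructure(String structure) {
        return e -> e.structure.equals(structure);
    }

    static Predicate<HanziEntry> hasComponent(String component) {
        return e -> e.components.contains(component);
    }

    static Predicate<HanziEntry> hasPinyin(String pinyin) {
        return e -> e.pinyin.equals(pinyin);
    }

    // Returns the characters that satisfy every given condition, in database order.
    @SafeVarargs
    static List<String> search(List<HanziEntry> db, Predicate<HanziEntry>... conditions) {
        Predicate<HanziEntry> all = e -> true;
        for (Predicate<HanziEntry> condition : conditions) {
            all = all.and(condition);
        }
        List<String> hits = new ArrayList<>();
        for (HanziEntry entry : db) {
            if (all.test(entry)) {
                hits.add(entry.character);
            }
        }
        return hits;
    }

    // Tiny in-memory stand-in for the character information database (assumed data layout).
    static List<HanziEntry> sampleDb() {
        return Arrays.asList(
            new HanziEntry("好", "hao", "left-right", "女", "子"),
            new HanziEntry("妈", "ma", "left-right", "女", "马"),
            new HanziEntry("字", "zi", "top-bottom", "宀", "子"),
            new HanziEntry("李", "li", "top-bottom", "木", "子"));
    }

    public static void main(String[] args) {
        List<HanziEntry> db = sampleDb();
        // Left-right characters that contain the component 女.
        System.out.println(search(db, hasStructure("left-right"), hasComponent("女")));
        // Top-bottom characters that contain the component 子.
        System.out.println(search(db, hasComponent("子"), hasStructure("top-bottom")));
    }
}
```

Modeling each condition as a predicate keeps the search routine independent of how many conditions are supplied, and a new kind of condition (for example, one on calligraphy style) would only need one more small factory method.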

Chen, Xiao Jian


Faculty of Science and Technology


Department of Computer and Information Science




Chinese characters

Chinese language -- Etymology

UML (Computer science)


Li, Xiao Shan

