UM E-Theses Collection (澳門大學電子學位論文庫)
- Title
-
Improving inter-language links between different languages of Wikipedia
- English Abstract
-
Improving Inter-Language Links between Different Languages of Wikipedia
by Kueng-Chon Cheang
Thesis Supervisor: Assistant Professor Robert P. Biuk-Aghai
This thesis describes a method for improving the inter-language links between two language editions of Wikipedia. Inter-language links are used primarily to link a Wikipedia page to the corresponding page in another language edition of Wikipedia [1]. There are two ways to improve inter-language links: increasing the total number of links between two languages, and improving the quality of those links. I chose four sample languages: Chinese, Simple English, Swedish, and Norwegian Nynorsk. In the sample from Simple English Wikipedia to Swedish Wikipedia, my application adds 1178 new inter-language links to 12870 existing categories, meaning that about 9% of existing categories could gain a new inter-language link. I also tried to eliminate low-quality matches to improve the quality of inter-language links. This work included proving the value of the link reciprocity ratio and of the minimum number of links from a candidate. Moreover, I propose a workflow for pushing my suggested inter-language links to Wikipedia. The average precision across 12 sampled pairs of Wikipedia language editions is 77.94%.
- Issue date
-
2015.
- Author
-
Cheang, Kueng Chon
- Faculty
-
Faculty of Science and Technology
- Department
-
Department of Computer and Information Science
- Degree
-
M.Sc.
- Subject
-
Computational linguistics
Multilingual computing
Wikipedia
- Supervisor
-
Biuk-Aghai, Robert P.
- Files In This Item
- Location
- 1/F Zone C
- Library URL
- 991000756599706306