UM E-Theses Collection (澳門大學電子學位論文庫)
- Title
-
Automatic hidden-web database classification
- English Abstract
-
In this thesis, we propose a novel method for the automatic classification of hidden-web databases. In our approach, the classification tree for hidden-web databases is obtained by tailoring an existing classification tree for Web documents. Features for each class are then extracted from randomly selected Web documents in the corresponding category of the classification hierarchy. For each Web database, query terms are selected from the class features according to their weights, and the database is classified by analyzing the results of these class-specific queries. To improve performance further, we also use Web pages that link to the hidden-web database (HW-DB) as another important source for representing the database. Our final classification solution combines link-based evaluation with query-based detection. Our experiments show that the combined method yields much better performance in classifying hidden-web databases.
- Issue date
-
2007.
- Author
-
Zhang, Jing Bai
- Faculty
-
Faculty of Science and Technology
- Department
-
Department of Computer and Information Science
- Degree
-
M.Sc.
- Subject
-
Web databases -- Data processing
Data mining
Electronic data processing
- Supervisor
-
Gong, Zhi Guo
- Location
- 1/F Zone C
- Library URL
- 991000563499706306
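
The pipeline outlined in the English abstract (weighted class features, class-specific probe queries, and a link-based signal) might be sketched roughly as follows. This is only an illustrative sketch and not the thesis's implementation: the function names, the `count_matches` probe callback, the precomputed `link_scores`, the normalization step, and the linear weighting `alpha` are all assumptions, since the abstract does not say how the two signals are scored or combined.

```python
from typing import Callable, Dict, List, Tuple


def select_query_terms(class_features: Dict[str, float], k: int) -> List[str]:
    """Pick the k highest-weighted feature terms of one class as probe queries."""
    # Sort by descending weight; break ties alphabetically for determinism.
    ranked = sorted(class_features.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def query_score(terms: List[str], count_matches: Callable[[str], int]) -> float:
    """Average number of results the database reports for the probe terms."""
    if not terms:
        return 0.0
    return sum(count_matches(t) for t in terms) / len(terms)


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores so they sum to 1 (all zeros if there is no signal)."""
    total = sum(scores.values())
    if total == 0:
        return {c: 0.0 for c in scores}
    return {c: s / total for c, s in scores.items()}


def classify(
    features: Dict[str, Dict[str, float]],
    count_matches: Callable[[str], int],
    link_scores: Dict[str, float],
    k: int = 2,
    alpha: float = 0.5,
) -> Tuple[str, Dict[str, float]]:
    """Combine query-based and link-based evidence (assumed linear mix)."""
    q = normalize({
        c: query_score(select_query_terms(f, k), count_matches)
        for c, f in features.items()
    })
    l = normalize({c: link_scores.get(c, 0.0) for c in features})
    combined = {c: alpha * q[c] + (1 - alpha) * l[c] for c in features}
    # Iterate over sorted class names so ties resolve deterministically.
    best = max(sorted(combined), key=lambda c: combined[c])
    return best, combined
```

In this sketch, `count_matches` stands in for submitting a query to the database's search interface and reading back the reported number of hits, and `link_scores` stands in for whatever per-class evidence is derived from the pages that link to the database.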