UM E-Theses Collection (澳門大學電子學位論文庫)

Title

Automatic hidden-web database classification

English Abstract

This thesis presents a novel method for the automatic classification of hidden-web databases. In our approach, the classification tree for hidden-web databases is obtained by tailoring an existing classification tree for Web documents. The features of each class are then extracted from randomly selected Web documents in the corresponding category of the classification hierarchy. For each Web database, query terms are selected from the class features according to their weights, and the class of the hidden-web database is then detected by analyzing the results of the class-specific queries. To improve performance further, we also use Web pages that link to the hidden-web database (HW-DB) as a second important source for representing the database. Our final classification solution combines link-based evaluation with query-based detection. Our experiments show that the combined method yields much better classification performance for hidden-web databases.
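
The pipeline outlined in the abstract (weighted class features, class-specific probe queries, and evidence from linking pages) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function names, the `probe` callback that returns a match count for a query term, the normalization of scores, and the mixing weight `alpha` are all assumptions made for illustration, and the feature weights and hit counts are made-up toy values.

```python
from collections import Counter

def top_query_terms(class_features, k=3):
    """Pick the k highest-weighted feature terms of each class as probe queries."""
    return {
        cls: [t for t, _ in sorted(w.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for cls, w in class_features.items()
    }

def normalize(raw):
    """Scale non-negative scores so they sum to 1 (all zeros if there is no evidence)."""
    total = sum(raw.values())
    return {cls: (v / total if total else 0.0) for cls, v in raw.items()}

def query_scores(class_features, probe, k=3):
    """Query-based evidence: total matches the database reports for each class's probes."""
    queries = top_query_terms(class_features, k)
    return normalize({cls: sum(probe(t) for t in terms) for cls, terms in queries.items()})

def link_scores(class_features, linking_pages):
    """Link-based evidence: weighted feature-term counts in pages linking to the database."""
    words = Counter(w for page in linking_pages for w in page.lower().split())
    return normalize({
        cls: sum(wt * words[t] for t, wt in w.items())
        for cls, w in class_features.items()
    })

def classify(class_features, probe, linking_pages, alpha=0.5, k=3):
    """Combine both evidence sources; alpha is a hypothetical mixing weight."""
    q = query_scores(class_features, probe, k)
    l = link_scores(class_features, linking_pages)
    combined = {cls: alpha * q[cls] + (1 - alpha) * l[cls] for cls in class_features}
    return max(combined, key=combined.get), combined

# Toy data (invented for this example, not taken from the thesis).
features = {
    "health": {"cancer": 0.9, "diabetes": 0.7, "clinic": 0.4},
    "sports": {"football": 0.8, "league": 0.6, "coach": 0.5},
}
fake_hits = {"cancer": 120, "diabetes": 80, "clinic": 30, "football": 5, "league": 2}
pages = ["Search our clinic and cancer research database"]

label, scores = classify(features, lambda term: fake_hits.get(term, 0), pages)
print(label, scores)
```

In a real setting the `probe` callback would submit the term to the database's search form and read back the reported number of matches. A linear mix of the two scores is only one simple way to combine the evidence; the abstract does not state how the thesis combines them.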

Issue date

2007

Author

Zhang, Jing Bai

Faculty

Faculty of Science and Technology

Department

Department of Computer and Information Science

Degree

M.Sc.

Subject

Web databases -- Data processing

Data mining

Electronic data processing

Supervisor

Gong, Zhi Guo

Location

1/F Zone C

Library URL

991000563499706306