UM E-Theses Collection (澳門大學電子學位論文庫)

Title

BLS-based 3D object recognition approaches for LiDAR point clouds

English Abstract

3D (three-dimensional) object recognition is a widely researched vision task that benefits many industrial applications, including environment perception in autonomous driving and scene understanding in mobile robots. Accurate recognition of surrounding obstacles provides significant information for the local path planning of unmanned ground vehicles (UGVs). Velodyne light detection and ranging (LiDAR) sensors are popular range sensors carried on UGVs to collect high-precision point clouds without being influenced by varying illumination. However, most public 3D point cloud datasets (e.g., ModelNet10/40, ShapeNet) are generated from standard CAD models with uniform point distributions, which differ from the real point clouds collected by LiDAR sensors. Thus, I collect a real LiDAR object dataset named LiDARNet to increase the authenticity of object recognition results in real-world environment perception applications. Because the raw data collected from LiDAR sensors cover whole scenes, an elevation-reference connected component labeling (ER-CCL) algorithm is adopted to quickly cluster individual obstacles from unknown terrains. Based on these clustering results, a semi-automatic labeling tool is developed to store object samples and attach their manual labels. To speed up the labeling of point cloud samples, the ER-CCL algorithm is accelerated with GPU programming. After sizeable samples are stored in the LiDARNet dataset, the labeled object samples are used to train different classifiers and test their recognition and speed performance. LiDAR point clouds have an unstructured distribution, which poses a common challenge for feature extraction. To extract geometric information from point clouds, this dissertation proposes a 3D object recognition method named Dynamic Graph Convolutional Broad Network (DGCB-Net).
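The core of the ER-CCL clustering step described above can be sketched as connected component labeling over a 2D occupancy grid. The sketch below is a minimal illustration, not the dissertation's GPU implementation: it assumes the elevation-reference test (comparing point heights against the local terrain) has already produced the set of occupied cells, and labels 4-connected cells with a breadth-first flood fill.

```python
from collections import deque

def grid_ccl(occupied):
    """4-connected component labeling over an occupancy grid.

    `occupied` is a set of (row, col) cells whose points rise above
    the local terrain elevation (the elevation-reference test in
    ER-CCL); that thresholding step is assumed to have run already.
    Returns a cell -> label mapping and the number of clusters.
    """
    labels = {}
    next_label = 0
    for cell in occupied:
        if cell in labels:
            continue
        # breadth-first flood fill assigns one label per obstacle cluster
        queue = deque([cell])
        labels[cell] = next_label
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nb = (r + dr, c + dc)
                if nb in occupied and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1
    return labels, next_label

# two separate obstacle blobs on a small grid
cells = {(0, 0), (0, 1), (1, 1), (4, 4), (4, 5)}
labels, n = grid_ccl(cells)
# n == 2: the two blobs receive distinct labels
```

Each label then corresponds to one candidate obstacle sample for the semi-automatic labeling tool; the GPU version parallelizes the label propagation across grid cells.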
In DGCB-Net, weight-shared multi-layer perceptrons (MLPs) serve as edge convolutional layers that extract local features based on a graph structure. Because the local features in different edge convolutional layers represent different levels of information, DGCB-Net concatenates them to improve feature utilization efficiency. Unlike popular deep learning backbones that stack many layers in depth, our DGCB-Net adopts a broad architecture structured in a flat way to realize point cloud feature aggregation and object recognition. The broad architecture consists of multiple feature layers and enhancement layers, which are concatenated flatly to enrich and explore the potential information in point cloud features. Features obtained from the multiple edge convolutional layers and extended by the feature and enhancement layers both contribute to the recognition results; thus, the proposed DGCB-Net shows better recognition performance than other popular algorithms on both ModelNet10/40 and our LiDARNet dataset. Considering the large model size and training time of 3D object recognition models, we propose another broad learning system (BLS) variant with multiple unified space autoencoders (USAEs), named USAE-BLS. USAE-BLS is a lightweight model with a small parameter count and fast training time. Its input is raw object point clouds, which are fed into the USAEs directly to form discriminative features, mapping varying input dimensions into a unified dimension. The unified-dimensional features are then extended into a high-dimensional space by multiple flatly arranged enhancement layers, so the features generated by the enhancement layers are more descriptive of object point clouds.
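The edge convolutional layer described above can be sketched as follows. This is an illustrative NumPy sketch of the general EdgeConv idea (a k-NN graph, edge features (x_i, x_j - x_i), a shared weight standing in for the weight-shared MLP, and max-pooling), not the dissertation's exact layer; the array sizes are arbitrary.

```python
import numpy as np

def edge_conv(points, weight, k=4):
    """One EdgeConv-style layer: for each point, build edge features
    (x_i, x_j - x_i) over its k nearest neighbors, apply a shared
    linear map (standing in for the weight-shared MLP) with ReLU,
    and max-pool over the neighborhood.
    Shapes: points (N, 3), weight (6, F) -> output (N, F).
    """
    # pairwise squared distances define the k-NN graph
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]           # skip self
    neighbors = points[idx]                            # (N, k, 3)
    edges = np.concatenate(
        [np.repeat(points[:, None, :], k, axis=1),     # x_i
         neighbors - points[:, None, :]],              # x_j - x_i
        axis=-1)                                       # (N, k, 6)
    return np.maximum(edges @ weight, 0).max(axis=1)   # ReLU + max-pool

rng = np.random.default_rng(0)
pts = rng.normal(size=(10, 3))
w = rng.normal(size=(6, 8))
feat = edge_conv(pts, w, k=4)
# feat has shape (10, 8): one F-dimensional local feature per point
```

Stacking several such layers and concatenating their outputs, as DGCB-Net does, combines local features captured at different neighborhood levels.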
We evaluated the proposed USAE-BLS algorithm on a LiDAR point cloud object dataset and the ModelNet10 dataset; the results show that its recognition performance is comparable to that of other popular 3D object recognition algorithms with deep learning architectures, while its model size is much smaller and its training time much shorter.
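The fast training claimed for the broad-learning variant comes from solving the output weights in closed form rather than by backpropagation. The sketch below is a minimal generic BLS-style classifier under assumed names and sizes (it omits the USAE front end and uses a single random enhancement block), meant only to illustrate the flat architecture and the closed-form fit.

```python
import numpy as np

def bls_fit(X, Y, n_enhance=40, reg=1e-2, seed=0):
    """Minimal broad-learning-style classifier: treat the inputs as
    feature nodes, expand them through one flat block of random
    nonlinear enhancement nodes, and solve the output weights in
    closed form by ridge regression (no backpropagation).
    Names and sizes here are illustrative, not from the dissertation.
    """
    rng = np.random.default_rng(seed)
    We = rng.normal(size=(X.shape[1], n_enhance))
    H = np.tanh(X @ We)                     # enhancement layer
    A = np.hstack([X, H])                   # flat concatenation
    # ridge solution: W = (A^T A + reg I)^-1 A^T Y
    W = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return We, W

def bls_predict(X, We, W):
    A = np.hstack([X, np.tanh(X @ We)])
    return (A @ W).argmax(axis=1)

# toy two-class problem: points around two well-separated centers
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (30, 4)), rng.normal(2, 0.3, (30, 4))])
Y = np.zeros((60, 2)); Y[:30, 0] = 1; Y[30:, 1] = 1
We, W = bls_fit(X, Y)
acc = (bls_predict(X, We, W) == np.array([0] * 30 + [1] * 30)).mean()
# training accuracy on this separable toy set should be high
```

Because the fit is a single linear solve, training cost is dominated by one matrix factorization, which is why BLS variants train orders of magnitude faster than comparably accurate deep networks.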

Issue date

2022.

Author

Tian, Yi Fei

Faculty

Faculty of Science and Technology

Department

Department of Computer and Information Science

Degree

Ph.D.

Subject

Computer vision

Three-dimensional imaging

Supervisor

Chen, Long

Song, Wei

Files In This Item

Full-text (Intranet only)

Location
1/F Zone C
Library URL
991010074922906306