Article Information
Abstract
English
Three-dimensional (3D) point clouds provide detailed geometric understanding of real-world environments but remain challenging to process due to their sparse and unordered nature. Contrastive learning has emerged as a powerful self-supervised approach for learning representations from unlabeled 3D point cloud data. At the core of these methods lie encoder architectures that project raw points into discriminative latent spaces. This brief survey highlights major encoder families used in 3D contrastive learning and analyzes their design principles, strengths, and limitations. We further discuss how encoder choice influences downstream performance and outline research trends toward efficient, multimodal, and real-time contrastive frameworks.
Table of Contents
I. INTRODUCTION
II. ENCODER ARCHITECTURES
A. Point-based Encoders
B. Voxel-based Encoders
C. Graph-based Encoders
D. Transformer-based Encoders
E. Multi-modal Encoders
III. CHALLENGES AND FUTURE DIRECTIONS
IV. CONCLUSION
REFERENCES
