Article Information
Lightweight Design of a Mobile-Based Single-Image 3D Reconstruction Model
Abstract
English
This study proposes a mobile-based lightweight deep learning model (Lite-MCC) capable of reconstructing three-dimensional (3D) spatial structures from a single RGB image. Conventional 3D reconstruction models require multi-view inputs or point cloud data and depend on large-scale computational resources, which limits their real-time applicability in practical environments. To address this limitation, the proposed Lite-MCC model simplifies the existing Multiview Compressive Coding (MCC) architecture, enabling accurate 3D reconstruction using only a single image. The model adopts a parallel structure consisting of a Vision Transformer (ViT-Tiny) and a Geometry Encoder to extract visual and spatial features simultaneously, while a Transformer Decoder generates the corresponding 3D point cloud. Furthermore, depth map–based input transformation and ONNX-based optimization are employed to achieve efficient real-time inference on edge devices. Experimental results on the CO3D dataset demonstrate that Lite-MCC reduces computational cost by 87% and memory usage by 65%, while maintaining a Chamfer Distance of 0.045, comparable to the original MCC model. These results indicate that the proposed method provides a promising direction for lightweight AI models enabling low-cost, real-time 3D recording and visualization.
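The abstract reports reconstruction quality as a Chamfer Distance but does not state which variant is used. As a minimal sketch, the following computes one common convention: the mean squared nearest-neighbor distance from each set to the other, summed over both directions. The function name `chamfer_distance`, the squared-distance choice, and the mean reduction are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def chamfer_distance(p, q):
    """Symmetric Chamfer Distance between point sets p (N, 3) and q (M, 3).

    Assumed convention (not confirmed by the paper): squared Euclidean
    nearest-neighbor distances, averaged per direction, then summed.
    """
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    # Pairwise squared distances via broadcasting: result has shape (N, M).
    diff = p[:, None, :] - q[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # For each point in p, distance to its nearest neighbor in q, and vice versa.
    p_to_q = d2.min(axis=1).mean()
    q_to_p = d2.min(axis=0).mean()
    return p_to_q + q_to_p
```

This brute-force form builds the full N x M distance matrix, so it is suitable only for small point clouds. Values computed under a different convention (unsquared distances, or a different reduction) are not directly comparable to each other.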
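Table of Contents
1. Introduction
2. Related Work
2.1 Existing 3D Reconstruction Methods
2.2 MCC (Multiview Compressive Coding)-Based Models
2.3 Model Compression and Mobile Optimization
3. Proposed Method
3.1 Overall Architecture Overview
3.2 Lightweighting Strategy
3.3 Training Dataset and Loss Function
4. Experiments and Results
5. Discussion and Research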
Acknowledgement
References
