Session Ⅰ: Computer Vision and Image Analysis

Improving Global and Local Feature Extraction with Swin Transformer on Monocular Depth Estimation

Abstract

Global-Local Path Network (GLPN) is a monocular depth estimation network that introduces a method for integrating global features from an encoder with local features from a decoder through a Selective Feature Fusion module. In this paper, we replace the SegFormer encoder of GLPN with the Swin Transformer and call the resulting network Swin Transformer-Global-Local-Path-Network. We train the network on modified NYU Depth V2 datasets. Using the tiny version of the Swin Transformer, our network achieves 0.034 RMSE, 0.075 AbsRel, 0.033 log10, 0.951 Delta 1, 0.994 Delta 2, and 0.999 Delta 3, outperforming the previous GLPN model.
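The abstract reports its results through six metrics without defining them. As a rough illustration only, the sketch below computes them using the definitions that are conventional in the monocular depth estimation literature; it is an assumption that the paper follows these exact definitions. The function name `depth_metrics` and its list-based inputs are hypothetical and not taken from the paper.

```python
import math


def depth_metrics(pred, gt):
    """Compute common depth-estimation metrics for paired positive depths.

    Assumed conventional definitions (not confirmed by the paper):
      RMSE    = sqrt(mean((pred - gt)^2))
      AbsRel  = mean(|pred - gt| / gt)
      log10   = mean(|log10(pred) - log10(gt)|)
      Delta k = fraction of pixels with max(pred/gt, gt/pred) < 1.25^k
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    if any(p <= 0 or g <= 0 for p, g in zip(pred, gt)):
        raise ValueError("depth values must be strictly positive")

    n = len(pred)
    pairs = list(zip(pred, gt))

    # Lower is better for the three error metrics.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    log10 = sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n

    # Higher is better for the threshold accuracies.
    ratios = [max(p / g, g / p) for p, g in pairs]
    deltas = {
        f"delta{k}": sum(1 for r in ratios if r < 1.25 ** k) / n
        for k in (1, 2, 3)
    }

    return {"rmse": rmse, "abs_rel": abs_rel, "log10": log10, **deltas}
```

Calling `depth_metrics(pred, gt)` on two flat lists of valid (positive) depth values returns a dictionary with the keys `rmse`, `abs_rel`, `log10`, `delta1`, `delta2`, and `delta3`. In practice, pixels without valid ground truth would be masked out before this step.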

Table of Contents

Abstract
I. INTRODUCTION
II. RELATED WORKS
A. Monocular Depth Estimation
B. GLPN
C. SegFormer
D. Swin Transformer
III. METHODS
A. Overall Architecture
B. Light and Strong Encoder
IV. EXPERIMENTS
A. Datasets
B. Settings
C. Results
V. CONCLUSION
ACKNOWLEDGMENT
REFERENCES

Author Information

  • Yun-Young Chang, School of Computing, Gachon University
  • Joo-Hee Oh, School of Computing, Gachon University
  • Abrar Alabdulwahab, School of Computing, Gachon University
  • Chan-Young Choi, School of Computing, Gachon University
  • Sang-Woong Lee, School of Computing, Gachon University

