Human-Machine Interaction Technology (HIT)

Paradigm Shifts in Computer Vision over the Last Five Years

Abstract

We present a systematic review of paradigm shifts in computer vision from 2020 to 2025. The survey centers on Vision Transformers (ViT), large-scale self-supervised learning (contrastive methods, MAE/BEiT), multimodal pretraining (CLIP, SAM), diffusion-based generation, and 3D representations via NeRF. Using a literature-synthesis framework, we compare architectures, training regimes, and transfer benefits and limits across major tasks. The evidence shows that transformer families rival or surpass CNNs on dense-prediction tasks (detection, segmentation), while diffusion models enable more stable training and higher-quality generation than GANs. Self-supervised learning reduces labeling cost and improves generalization in low-label regimes. Multimodal models unlock zero-shot and open-vocabulary recognition; foundation models such as SAM demonstrate general-purpose segmentation. Persistent challenges include data bias, substantial compute and energy demand, and limited explainability. We recommend efficiency-oriented compression (distillation, pruning, quantization), green-AI practices, and guidelines for responsible use of foundation models. The outlook highlights real-time vision on edge and embedded devices, 3D and video understanding, and applications in healthcare, remote sensing, and AR/metaverse. Overall, the period is defined by large-scale pretraining, a shift to transformers, multimodal integration, and advances in 3D, pointing to the next goal: responsible and efficient vision AI.

Table of Contents

Abstract
1. Introduction
2. Methods
2.1 Major Research Trends in the Last 5 Years
2.2 Vision Transformer (ViT)
2.3 Rise of Self-supervised Learning
2.4 Multimodal Learning and Vision-Language Models
2.5 Innovation in Generative Models: From GANs to Diffusion Models
2.6 3D Vision and Neural Radiance Fields
2.7 Advanced Object Detection and Image Segmentation
3. Results
4. Discussion
5. Conclusion
References

Author Information

  • Chan-Ho Lee Department of Computer Engineering, Honam University, Korea
  • Dae-Hyeok Jun Department of Computer Engineering, Honam University, Korea
  • Lee Hye-Min JTOMORROWONE CO.,LTD
  • Kyu-Ha Kim Assistant Prof., Department of Computer Engineering, Honam University, Korea
