Convergence of Internet, Broadcasting and Communication

MRO: Multimodal Routing Optimization via Neural Architecture Search of Fusion Paths

Abstract

English

Effective fusion of heterogeneous modalities is critical to performance in multimodal learning. However, existing Cross-Modal Transformer (CMT)-based fusion methods are constrained by fixed attention paths, which reduces their adaptability to diverse inputs and restricts exploration of better fusion strategies. Furthermore, as the number of modalities increases, the computational complexity grows exponentially, creating a scalability bottleneck. To address these limitations, we propose Multimodal Routing Optimization (MRO), a framework that restructures cross-modal attention paths as a Supernet, drawing inspiration from the Once-for-All (OFA) paradigm. Leveraging Neural Architecture Search (NAS), MRO dynamically selects routing paths that balance accuracy against computational cost (FLOPs), enabling scalable and efficient multimodal fusion.
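
As a rough illustration of the idea in the abstract, the sketch below treats each directed modality pair as one candidate cross-modal attention path, enumerates every subset of paths as a sub-network of a supernet, and picks the subset with the best accuracy-minus-FLOPs score. Everything here is an assumption made for illustration: the modality names, the per-path `GAIN` and `FLOPS` numbers, the additive scoring rule, and the `flops_weight` parameter are hypothetical, and the exhaustive enumeration stands in for the paper's NAS procedure, which this page does not describe in detail.

```python
from itertools import combinations, permutations

MODALITIES = ["text", "audio", "video"]  # hypothetical modality set

# Each directed pair (target, source) is one candidate cross-modal attention path.
EDGES = list(permutations(MODALITIES, 2))

# Toy, made-up per-path numbers (NOT results from the paper).
GAIN = {e: 1.0 + 0.1 * i for i, e in enumerate(EDGES)}         # accuracy proxy
FLOPS = {e: 1.0 + 0.5 * (i % 3) for i, e in enumerate(EDGES)}  # relative cost


def candidate_routings(edges):
    """Yield every subset of paths, i.e. every sub-network of the toy supernet."""
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            yield subset


def score(routing, flops_weight):
    """Accuracy proxy minus a weighted FLOPs penalty (assumed additive)."""
    gain = sum(GAIN[e] for e in routing)
    cost = sum(FLOPS[e] for e in routing)
    return gain - flops_weight * cost


def search(edges, flops_weight):
    """Exhaustively pick the routing with the highest score."""
    return max(candidate_routings(edges), key=lambda r: score(r, flops_weight))


if __name__ == "__main__":
    print(len(EDGES), "candidate paths")
    print(sum(1 for _ in candidate_routings(EDGES)), "candidate routings")
    print("no FLOPs penalty:", search(EDGES, flops_weight=0.0))
    print("heavy FLOPs penalty:", search(EDGES, flops_weight=10.0))
```

Even in this toy setting, three modalities give 6 directed paths and 2^6 = 64 possible routings, which suggests why a fixed attention layout or brute-force enumeration becomes impractical as modalities are added and why a learned search over a shared supernet is attractive.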

Table of Contents

Abstract
1. Introduction
2. Related Works
2.1 Neural Architecture Search (NAS)
2.2 Lightweight Supernet-based NAS: Once-for-All (OFA)
2.3 Prior-guided Constrained Search
3. MRO: Multimodal Routing Optimization
3.1 MRO Overview
3.2 Supernet Architecture for Cross-Modal Attention Path Exploration
3.3 Prior-Guided Search Using a Modal Relation Graph
3.4 Prior-Guided Search with Rank Regularization for Consistency Alignment
4. Evaluation
4.1 Learning Dataset
4.2 Model Hyperparameter
4.3 Ablation Study on Rank Regularization Loss Functions
4.4 Performance Results
5. Conclusions
Acknowledgement
References

Author Information

  • Jeong-Hun Kim, Undergraduate Researcher, Division of Computer Engineering, Hansung University
  • Mi-Hwa Song, Associate Professor, Division of Computer Engineering, Hansung University
