Convergence of Internet, Broadcasting and Communication

Optimizing Information Retrieval in Dark Web Academic Literature: A Study Using KeyBERT for Keyword Extraction and Clustering

Abstract

English

The exponential increase in publications and the interconnected nature of sub-domains make traditional methods of information extraction and organization inadequate. This inefficiency can impede scientific progress and innovation. To address these challenges, this research uses KeyBERT, a keyword extraction method built on Bidirectional Encoder Representations from Transformers (BERT), and integrates it with K-Means clustering to organize topics from large datasets effectively. The study analyzes a dataset of 47,627 articles from SCOPUS in the domains of Reinforcement Learning and Computer Vision. An ablation study demonstrates the generalizability of the approach across these fields, with the optimal number of clusters determined to be three using the Elbow Method. The results demonstrate that KeyBERT is effective in extracting and organizing topics within these domains, with a particular focus on applications such as medical imaging, autonomous driving, and real-time detection systems. This methodology offers a scalable solution for organizing vast academic datasets, enabling researchers to extract meaningful insights efficiently and to apply the approach to other domains.
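The cluster-count selection mentioned in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the 2-D toy points stand in for keyword embeddings, the farthest-first initialization is chosen only to keep the run deterministic, and reading the elbow as the largest second difference of the inertia curve is one common heuristic that the abstract does not confirm. All function names here are hypothetical.

```python
# Toy stand-in for keyword embeddings: three tight groups of 2-D points.
# (Assumption: real inputs would be high-dimensional sentence embeddings.)
OFFSETS = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5), (0.0, 0.0)]
POINTS = [(cx + dx, cy + dy)
          for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
          for dx, dy in OFFSETS]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_inertia(points, k, max_iter=100):
    """Run Lloyd's K-Means and return the within-cluster sum of squares."""
    centers = farthest_first_init(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members; keep the old
        # center if a cluster happens to be empty.
        new_centers = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow_k(inertias):
    """Pick k at the sharpest bend of the inertia curve, measured by the
    second difference. inertias[i] holds the inertia for k = i + 1."""
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(inertias) - 1):
        bend = inertias[i - 1] - 2 * inertias[i] + inertias[i + 1]
        if bend > best_bend:
            best_k, best_bend = i + 1, bend
    return best_k


inertias = [kmeans_inertia(POINTS, k) for k in range(1, 7)]
best_k = elbow_k(inertias)
print(best_k)
```

On this toy data the inertia drops sharply up to three clusters and barely changes afterwards, so the heuristic selects three. With real embeddings the curve is usually smoother, and the elbow is often judged visually from a plot instead.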

Table of Contents

Abstract
1. Introduction
2. Literature Review
2.1 Keyword Extraction with BERT
2.2 Topic Clustering
3. Methodology
3.1 Data Collection and Preprocessing
3.2 Keyword Extraction and Topic Clustering
4. Result and Discussion
5. Conclusion
Acknowledgement
References

Author Information

  • Yosua Setyawan Soekamto, PhD Student at Department of Computer Engineering, Dongseo University, Busan, South Korea / Lecturer at Department of Information Systems, Universitas Ciputra Surabaya
  • Leonard Christopher Limanjaya, Master Student at Department of Computer Engineering, Dongseo University, Busan, South Korea
  • Yoshua Kaleb Purwanto, Master Student at Department of Computer Engineering, Dongseo University, Busan, South Korea
  • Bongjun Choi, Professor at Department of Software, Dongseo University, Busan, South Korea
  • Seung-Keun Song, Professor at Department of Visual Contents, Graduate School, Dongseo University, Busan, South Korea
  • Dae-Ki Kang, Professor at Department of Computer Engineering, Dongseo University, Busan, South Korea

