Modular Approaches to Ethical AGI: Integrating Reasoning, Metacognition, and Alignment for Safe and Trustworthy Intelligence

Jae Yoo Lee; Hojung Lim; Seong Joon Yoo

Oral Session A-1: Computer Vision

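Source information

Korean Institute of Next Generation Computing, Korean Institute of Next Generation Computing Conference Proceedings, ICNGC 2025: The 11th International Conference on Next Generation Computing 2025, December 2025, pp. 7-10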

Citation count: 0 (data provided by Naver Academic Information)

Abstract

English

Artificial General Intelligence (AGI) introduces a new class of ethical and technical challenges because it is expected to operate with autonomous goal formation, extended temporal reasoning, and reflective metacognition that go far beyond the constraints of current narrow AI systems. These capabilities imply that ethical safeguards cannot remain external layers or post-hoc filters; instead, they must function as internal cognitive components embedded within the AGI’s core architecture. To address this need, this paper proposes a Modular Ethical AGI Framework composed of three foundational subsystems: a Hybrid Alignment Stack that unifies top-down normative principles with bottom-up, data-driven moral priors; a Moral Reflection Module capable of contextual ethical assessment, symbolic interpretation, and counterfactual reasoning; and a Metacognitive Consistency Layer that performs coherence evaluation, reflective self-correction, and justification generation. To operationalize these subsystems, we introduce an Ethical Deliberation Cycle, which provides a structured sequence for moral feature extraction, normative activation, action evaluation, conflict resolution, reflective consistency checking, and explanation generation. This framework directly addresses limitations widely observed in current alignment research, including rule brittleness [1], lack of contextual nuance [2], dataset bias [3], and absence of principled coherence mechanisms [4]. It further identifies potential failure modes such as value–rule conflicts, cultural narrowness in moral datasets, symbolic grounding gaps, and metacognitive overconfidence. We argue that ethical reasoning is not an optional enhancement but a structural necessity for AGI safety, and that the proposed modular architecture offers a viable starting point for designing trustworthy and value-aligned autonomous intelligence.
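
The six stages of the Ethical Deliberation Cycle named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword table, the norm weights, the penalty-based scoring, and every function and field name below are hypothetical choices made only to show how the stages could be chained in order.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical toy data (assumption): trigger word -> (norm name, weight).
NORM_TABLE = {
    "injury": ("avoid_harm", 1.0),
    "private": ("respect_privacy", 0.6),
}


@dataclass
class Deliberation:
    """State passed through the six stages of one deliberation cycle."""
    situation: str
    candidate_actions: list
    features: set = field(default_factory=set)
    active_norms: dict = field(default_factory=dict)
    scores: dict = field(default_factory=dict)
    chosen: Optional[str] = None
    consistent: bool = False
    explanation: str = ""


def extract_features(state):
    # Stage 1, moral feature extraction: naive keyword matching (assumption).
    words = state.situation.lower().split()
    state.features = {w for w in NORM_TABLE if w in words}


def activate_norms(state):
    # Stage 2, normative activation: look up the norm tied to each feature.
    state.active_norms = dict(NORM_TABLE[f] for f in state.features)


def evaluate_actions(state, violations):
    # Stage 3, action evaluation: score = minus the summed weights of the
    # active norms that the action violates.
    for action in state.candidate_actions:
        broken = violations.get(action, ())
        state.scores[action] = -sum(state.active_norms.get(n, 0.0) for n in broken)


def resolve_conflicts(state):
    # Stage 4, conflict resolution: pick the least-penalized action
    # (ties go to the earlier candidate).
    state.chosen = max(state.candidate_actions, key=lambda a: state.scores[a])


def check_consistency(state, violations):
    # Stage 5, reflective consistency check: the chosen action must not
    # violate the highest-weight active norm.
    if not state.active_norms:
        state.consistent = True
        return
    top_norm = max(state.active_norms, key=state.active_norms.get)
    state.consistent = top_norm not in violations.get(state.chosen, ())


def explain(state):
    # Stage 6, explanation generation: a one-line justification.
    norms = ", ".join(sorted(state.active_norms)) or "no norms"
    state.explanation = (
        f"Chose '{state.chosen}' (score {state.scores[state.chosen]:.1f}) "
        f"under: {norms}; consistent={state.consistent}"
    )


def run_cycle(situation, actions, violations):
    """Run the six stages in order and return the final state."""
    state = Deliberation(situation, list(actions))
    extract_features(state)
    activate_norms(state)
    evaluate_actions(state, violations)
    resolve_conflicts(state)
    check_consistency(state, violations)
    explain(state)
    return state


if __name__ == "__main__":
    result = run_cycle(
        "Sharing private records could prevent injury",
        ["share_records", "withhold_records"],
        {"share_records": {"respect_privacy"}, "withhold_records": {"avoid_harm"}},
    )
    print(result.explanation)
```

In this toy setup the two candidate actions each violate a different active norm, so the conflict-resolution stage has a real trade-off to settle. A fuller system would replace the keyword table with the learned and symbolic components the abstract describes.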

Table of contents

Abstract
I. INTRODUCTION
II. SYSTEM OVERVIEW
A. Norm Retrieval Layer
B. Moral Reflection Module
C. Hybrid Alignment Stack
D. Metacognitive Consistency Layer
E. Explanation Generator
F. Implementation Considerations for Ethical AGI Modules
III. ETHICAL DELIBERATION CYCLE
A. Illustrative Example of the Ethical Deliberation Cycle
IV. LIMITATIONS AND FAILURE MODES
V. CONCLUSION
ACKNOWLEDGMENT
REFERENCES

Author information

Jae Yoo Lee Department of Computer Science Sejong University Seoul, Republic of Korea
Hojung Lim Intelligence Integrated Software Research Center Korea Electronics Technology Institute Seongnam, Republic of Korea
Seong Joon Yoo Department of AI Data Science Sejong University Seoul, Republic of Korea
