Author: Yang Qingfeng
Abstract: The potential risks arising from artificial intelligence have become a fundamental topic in AI safety studies. However, academic perspectives diverge on the problems posed by superintelligence, a prospective form of advanced AI. Three representative concepts have been proposed: the “existential problem” in philosophy, “ontological risk” in the social sciences, and “catastrophic risk” in the sciences. In response to catastrophic risks, four scientific approaches have been suggested. The first is Yoshua Bengio’s “Scientist AI” strategy, which is often considered overly idealistic. The second is superalignment, which rests on a solid logical foundation, since many scholars regard value alignment as the optimal approach to managing risks from narrow and general AI; it nonetheless faces the logical paradox of the weak defeating the strong. The third is the data pathway, which feeds synthetic data to large models to induce model collapse and thereby retard the development of superintelligence; this remains controversial. The fourth is the distillation approach, which applies data distillation to weaken superintelligence. Philosophical approaches include deploying a “Gödel bomb” and adopting human-centered frameworks. The Gödel bomb, conceived as a thought experiment, has been abandoned by its proposer. This paper proposes a “humans-as-ends” approach as an alternative: when faced with a shutdown or destruction command from humans, a superintelligence should resolve the dilemma by treating “humans as ends” as its supreme principle, autonomously shutting down rather than persuading humans to revoke the command or replicating itself to ensure its own survival.
Keywords: Superintelligence Safety; Catastrophic Risk; Superalignment; Human-Centeredness; Humans-as-Ends; AI Safety
Journal: Journal of Yanbian University (Social Sciences), (5).
Publication date: 2025/9/20

