Safety without alignment

Currently, the dominant paradigm in AI safety is alignment with human values. Here we describe progress on developing an alternative approach to safety, based on ethical rationalism (Gewirth 1978), and propose an inherently safe implementation path via hybrid theorem provers in a sandbox. As AGIs ev...


Bibliographic Details
Published in: arXiv.org
Main Authors: Kornai, András; Bukatin, Michael; Zombori, Zsolt
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 18.03.2023
