OpenAI’s Superalignment team will address the core technical challenges of controlling superintelligent AI systems and ensuring their alignment with human values and goals.

To accomplish this, they are developing a ‘human-level automated alignment researcher,’ which itself is an AI. This automated researcher will utilize human feedback and assist in evaluating other AI systems, playing a critical role in advancing alignment research. The ultimate aim is to build AI systems that can conceive, implement, and develop improved alignment techniques.

OpenAI’s hypothesis is that AI systems can make faster and more effective progress in alignment research than humans can. Through collaboration between AI systems and human researchers, continuous improvements will be made to ensure AI alignment with human values.

So, using AI to control other AI: what do you think?

  • @mackwinston
    11 months ago

    Given that human values can be pretty screwed up, what could possibly go wrong?

    • @Tatters
      11 months ago

      A lot of politics seems to be a clash of diametrically opposed value systems, so I wonder which ones the AI will have. Will they by any chance be close to the researchers' own?