Apply Now: $2 Million for Research to Control Artificial Intelligence

By Wayan Vota on January 29, 2024

Artificial intelligence systems that are much smarter than humans could arrive in the next 10 years. To manage potential risks these systems could pose, we need to solve a key technical problem: superhuman AI alignment (superalignment). How can we steer and control AI systems much smarter than us?

Reinforcement learning from human feedback (RLHF) has been very useful for aligning today’s models. But it fundamentally relies on humans’ ability to supervise AI models, and humans won’t be able to reliably supervise AI systems much smarter than us. On complex tasks, we won’t understand what the AI systems are doing, so we won’t be able to reliably evaluate their outputs.

$2 Million Superalignment Fast Grants

OpenAI Superalignment Fast Grants will support technical research towards the alignment and safety of superhuman AI systems, including weak-to-strong generalization, interpretability, scalable oversight, and more.

The $10 million grants program will fund the following research directions:

  • Weak-to-strong generalization: Humans will be weak supervisors relative to superhuman models. Can we understand and control how strong models generalize from weak supervision?
  • Interpretability: How can we understand model internals? And can we use that understanding to build, for example, an AI lie detector?
  • Scalable oversight: How can we use AI systems to assist humans in evaluating the outputs of other AI systems on complex tasks?
  • Many other research directions, including but not limited to: honesty, chain-of-thought faithfulness, adversarial robustness, evals and testbeds, and more.

OpenAI is offering grants of $100,000 to $2 million to academic labs, nonprofits, and individual researchers. Graduate students can receive a one-year $150,000 OpenAI Superalignment Fellowship. No prior experience working on alignment is required.

Apply Now! The deadline is February 18.

