Scott Emmons

I am a research scientist at Google DeepMind focused on AI safety and alignment. I am wrapping up my PhD at UC Berkeley’s Center for Human-Compatible AI, advised by Stuart Russell. I previously cofounded far.ai, a 501(c)(3) research nonprofit that incubates and accelerates beneficial AI research agendas.

I am interested in both the theory and practice of AI alignment. I have helped characterize how RLHF can lead to deception when the AI sees more than the human, develop multimodal attacks and benchmarks for open-ended agents, and use mechanistic interpretability to find evidence of learned look-ahead in a chess-playing neural network.

scott at scottemmons dot com