About
Hello! I am a second-year CS PhD student at Berkeley, where I am fortunate to be advised by Jacob Steinhardt. I am interested in developing safe ML systems, especially sequential decision-making agents. I am grateful to be supported by an FLI PhD fellowship.
I studied mathematics and computer science at Caltech, where I worked with Anima Anandkumar and Yuanyuan Shi.
See here for my CV.
Publications
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark
Alexander Pan*, Chan Jun Shern*, Andy Zou*, Nathaniel Li, Steven Basart, Thomas Woodside, Jonathan Ng, Hanlin Zhang, Scott Emmons, Dan Hendrycks
ICML 2023, Oral
pdf / code / website
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models
Alexander Pan, Kush Bhatia, Jacob Steinhardt
ICLR 2022
pdf / code
Improving Robustness of RL for Power System Control with Adversarial Training
Alexander Pan, Yongkyun (Daniel) Lee, Huan Zhang, Yize Chen, Yuanyuan Shi
ICML RL4RL Workshop 2021
pdf / code
Projects
I was lucky to work on some interesting hackathon projects with Yongkyun (Daniel) Lee and Evan Yeh.
Pagechat
Best social network hack - Stanford Hackathon 2021
chrome extension / code
homES ReInvented
Best use of ESRI technology - Caltech Hackathon 2020
code
Contact
Feel free to email me at aypan (dot) 17 (at) berkeley (dot) edu.