Jiayi Geng

 


jiayig@princeton.edu

I am a first-year PhD student at the Language Technologies Institute at Carnegie Mellon University, advised by Prof. Graham Neubig.

My research explores how to build reliable machine intelligence that can advance toward and beyond human-level capabilities. I primarily focus on: (1) understanding and ensuring reliability in long-horizon interactions, (2) designing agent memory systems that adaptively update with experience while avoiding unintended shifts, (3) enabling multi-agent collaboration through effective coordination and communication mechanisms, and (4) advancing AI scientists that conduct autonomous research through rigorous evaluation and safe deployment.

Before CMU, I received my Master’s degree at Princeton University, advised by Prof. Danqi Chen and Prof. Thomas L. Griffiths, and my Bachelor’s degree at McGill University, advised by Prof. Xue (Steve) Liu and Prof. Eric D. Kolaczyk.

News

2025-05 Graduated from Princeton University and started my PhD at the LTI at CMU!
2025-05 Our paper Mind Your Step (by Step): Chain-of-Thought Can Reduce Performance on Tasks Where Thinking Makes Humans Worse has been accepted by ICML 2025!
2025-01 Our paper Large Language Models Assume People Are More Rational Than We Really Are has been accepted by ICLR 2025!
2024-05 Our paper Language Models as Science Tutors has been accepted by ICML 2024!
2023-09 Started my Master’s study at Princeton University! 🐯

Selected publications

(* indicates equal contribution)

  1. Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems
    Jiayi Geng*, Howard Chen*, Dilip Arumugam, and Thomas L. Griffiths
    arXiv preprint arXiv:2505.17968, 2025
  2. A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence
    Huan-ang Gao, Jiayi Geng, Wenyue Hua, Mengkang Hu, Xinzhe Juan, Hongzhang Liu, Shilong Liu, Jiahao Qiu, Xuan Qi, Yiran Wu, and 1 more author
    arXiv preprint arXiv:2507.21046, 2025
  3. Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis
    Alexander Ku, Declan Campbell, Xuechunzi Bai, Jiayi Geng, 8 authors, and Thomas L. Griffiths
    arXiv preprint arXiv:2503.13401, 2025
  4. Continual Memorization of Factoids in Large Language Models
    Howard Chen*, Jiayi Geng*, Adithya Bhaskar, Dan Friedman, and Danqi Chen
    arXiv preprint arXiv:2411.07175, 2024
  5. Mind Your Step (by Step): Chain-of-Thought Can Reduce Performance on Tasks Where Thinking Makes Humans Worse
    Ryan Liu*, Jiayi Geng*, Addison J. Wu, Ilia Sucholutsky, Tania Lombrozo, and Thomas L. Griffiths
    ICML, 2025
  6. TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
    Jiahao Qiu, Yifu Lu, Yifan Zeng, Jiacheng Guo, Jiayi Geng, Huazheng Wang, Kaixuan Huang, Yue Wu, and Mengdi Wang
    arXiv preprint arXiv:2410.16033, 2024
  7. Large Language Models Assume People Are More Rational Than We Really Are
    Ryan Liu*, Jiayi Geng*, Joshua C. Peterson, Ilia Sucholutsky, and Thomas L. Griffiths
    ICLR, 2025
  8. Language Models as Science Tutors
    Alexis Chevalier, Jiayi Geng, Alexander Wettig, Howard Chen, 16 authors, Sanjeev Arora, and Danqi Chen
    ICML, 2024