Jiayi Geng

jiayig@princeton.edu
I am a first-year PhD student at the Language Technologies Institute (LTI) at Carnegie Mellon University, advised by Prof. Graham Neubig.
My research explores how to build reliable machine intelligence that can advance toward and beyond human-level capabilities. I primarily focus on: (1) understanding and ensuring reliability in long-horizon interactions, (2) designing agent memory systems that adaptively update with experience while avoiding unintended shifts, (3) enabling multi-agent collaboration through effective coordination and communication mechanisms, and (4) advancing AI scientists through rigorous evaluation and safe deployment for conducting autonomous research.
Before CMU, I received my Master's degree from Princeton University, advised by Prof. Danqi Chen and Prof. Thomas L. Griffiths, and my Bachelor's degree from McGill University, advised by Prof. Xue (Steve) Liu and Prof. Eric D. Kolaczyk.
News
| Date | News |
|---|---|
| 2025-05 | Graduated from Princeton University and started my PhD at LTI, CMU! |
| 2025-05 | Our paper Mind Your Step (by Step): Chain-of-Thought Can Reduce Performance on Tasks Where Thinking Makes Humans Worse was accepted at ICML 2025! |
| 2025-01 | Our paper Large Language Models Assume People Are More Rational than We Really Are was accepted at ICLR 2025! |
| 2024-05 | Our paper Language Models as Science Tutors was accepted at ICML 2024! |
| 2023-09 | Started my Master's studies at Princeton University! |
Selected publications
(* indicates equal contribution)
- Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems. arXiv preprint arXiv:2505.17968, 2025.
- A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence. arXiv preprint arXiv:2507.21046, 2025.
- Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis. arXiv preprint arXiv:2503.13401, 2025.
- Mind Your Step (by Step): Chain-of-Thought Can Reduce Performance on Tasks Where Thinking Makes Humans Worse. ICML, 2025.
- TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling. arXiv preprint arXiv:2410.16033, 2024.