publications
(* indicates equal contribution)
2025
- Physics Supernova: AI Agent Matches Elite Gold Medalists at IPhO 2025. arXiv preprint arXiv:2509.01659, 2025
- Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems. arXiv preprint arXiv:2505.17968, 2025
- A survey of self-evolving agents: On path to artificial super intelligence. arXiv preprint arXiv:2507.21046, 2025
- Using the tools of cognitive science to understand large language models at different levels of analysis. arXiv preprint arXiv:2503.13401, 2025
- Mind your step (by Step): Chain-of-thought Can Reduce Performance on Tasks Where Thinking Makes Humans Worse. ICML, 2025
2024
- TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling. arXiv preprint arXiv:2410.16033, 2024
- Dr. GPT in Campus Counseling: Understanding Higher Education Students’ Opinions on LLM-assisted Mental Health Services. arXiv preprint arXiv:2409.17572, 2024
2023
- Corgi-pm: A Chinese Corpus for Gender Bias Probing and Mitigation. arXiv preprint arXiv:2301.00395, 2023