About

I'm a fully-funded master's student in Computer Science at Princeton, advised by Prof. Sanjeev Arora. Prior to that, I graduated from Princeton in the Class of 2024, where I had the good fortune to work with Prof. Danqi Chen and Prof. Kai Li. My research interests lie in large language models; I have published on topics such as retrieval-augmented generation (RAG), LLM post-training, data synthesis, and machine unlearning. I can be reached at jiatongyu [at] princeton.edu.

Publications

  1. On the Impossibility of Retrain Equivalence in Machine Unlearning. Jiatong Yu, Yinghui He, Anirudh Goyal, Sanjeev Arora. Under review.
  2. AI-Assisted Generation of Difficult Math Questions. Vedant Shah, Dingli Yu, Kaifeng Lyu, Simon Park, Jiatong Yu, Yinghui He, Nan Rosemary Ke, Michael Mozer, Yoshua Bengio, Sanjeev Arora, Anirudh Goyal. Math-AI Workshop at NeurIPS 2024.
  3. Evaluating Large Language Models at Evaluating Instruction Following. Zhiyuan Zeng, Jiatong Yu, Tianyu Gao, Yu Meng, Tanya Goyal, Danqi Chen. ICLR 2024.
  4. Enabling Large Language Models to Generate Text with Citations. Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen. EMNLP 2023.

Blogs

Implicit Bias of Gradient Accumulation

In RL-inspired algorithms such as GRPO, we effectively switch between a descent loss (when the reward is positive) and an ascent loss (when the reward is negative). Does combining different loss functions in the same gradient update inject an implicit bias?
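A minimal PyTorch sketch of that sign flip, using a toy 8-way policy and made-up rewards rather than the post's actual setup: the policy-gradient surrogate -reward * log pi(a) is linear in the reward, so flipping the reward's sign exactly flips the update direction.

```python
import torch

def surrogate_grad(reward: float) -> torch.Tensor:
    """Gradient of the surrogate -reward * log pi(a) w.r.t. the logits.
    Toy setup: uniform 8-way policy, token 3 was the sampled action."""
    logits = torch.zeros(1, 8, requires_grad=True)
    log_probs = torch.log_softmax(logits, dim=-1)
    loss = -reward * log_probs[0, 3]
    loss.backward()
    return logits.grad.clone()

g_pos = surrogate_grad(+1.0)  # positive reward: descent on -log pi, raises p(token 3)
g_neg = surrogate_grad(-1.0)  # negative reward: same term ascended, lowers p(token 3)
print(torch.allclose(g_pos, -g_neg))  # True: the update direction flips with the sign
```

Within one accumulated batch, positive-reward and negative-reward samples therefore push the same likelihood term in opposite directions, which is the interaction the post examines.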

Teaching

Algorithms and Data Structures
Fall 2025, Fall 2024 [Syllabus]
Introduction to Machine Learning
Spring 2025 [Syllabus]
Natural Language Processing
Spring 2023 [Syllabus]