I am currently a fourth-year PhD student in the Department of Computer Science at Princeton University.
My advisor is Prof. Danqi Chen, and I am a member
of the Princeton Natural Language Processing Group. Prior to
this, I was a master's student at Carnegie Mellon University, advised by Prof. Graham Neubig. I obtained my bachelor's degree from Fudan
University's School of Data Science in China.
During my PhD, I have interned at Meta AI, Bloomberg AI, and Microsoft Research. I am interested in improving the efficiency of training large language models, leveraging textual data structures to enhance model training, and understanding the effects of model scale.
I'd love to dedicate time to supporting junior students who are passionate about research, especially students from underrepresented groups. Please feel free to contact me via email to set up a chat!
[Google Scholar]
[Semantic Scholar]
[Github]
[Twitter]
[CV]
News
- [03/2024] I received the Apple Scholars in AIML PhD fellowship!
- [01/2024] Three papers were accepted to ICLR 2024! See you in Vienna 🥳
- [11/2023] Gave a talk on Sheared-LLaMA at Tencent AI.
- [10/2023] Released the Sheared LLaMA paper on pruning LLMs and the models in 1.3B and 2.7B sizes (downloaded >30K times monthly 🔥)!
- [10/2023] Released a preprint on jailbreaking open-source LLMs by simply changing generation parameters.
- [10/2023] Released a preprint on detecting pre-training data from language models along with UW folks.
- [09/2023] Served as an Area Chair for the Efficient Methods for NLP track at EMNLP 2023.
- [07/2023] Named a Qualcomm Innovation Fellowship finalist.
- [02/2023] Gave a talk at ML Collective on Training Trajectories of LLMs.
- [02/2023] Gave a lightning talk on Efficient Language Models at the Workshop on Natural Language Understanding at Apple.
- [09/2022] Gave a talk at Google Research on Structured Pruning.
- [07/2022] Awarded a Bloomberg Data Science Ph.D. Fellowship.
Selected Publications and Preprints
- LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia*, Sadhika Malladi*, Suchin Gururangan, Sanjeev Arora, Danqi Chen
Preprint;
[arXiv]
[Code]
- Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
Mengzhou Xia, Tianyu Gao, Zhiyuan Zeng, Danqi Chen
ICLR 2024;
[arXiv]
[Code]
[Blog]
- Detecting Pretraining Data from Large Language Models
Weijia Shi*, Anirudh Ajith*, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer
ICLR 2024;
[arXiv]
[Code]
[Project Page]
- Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, Danqi Chen
ICLR 2024 (Spotlight);
[arXiv]
[Code]
[Project Page]
- Training Trajectories of Language Models Across Scales
Mengzhou Xia, Mikel Artetxe, Chunting Zhou, Xi Victoria Lin, Ramakanth Pasunuru, Danqi Chen, Luke Zettlemoyer, Ves Stoyanov
ACL 2023;
[arXiv]
[Code]