Using LESS Data to Tune Models

Data Selection in the Era of LLMs

How to Scale Hyperparameters as Batch Size Increases

Understanding Optimization using Stochastic Differential Equations