COS568: Systems and Machine Learning (Spring 2026)
Summary
This course explores the fast-evolving intersection
of systems and machine learning. It combines lectures, student presentations,
and discussions to cover both foundational concepts and the latest research
developments.
We study several modern systems for ML with topics including:
· Deep learning pipelines for computer vision and large language models (LLMs)
· Model compression, quantization, distillation, data selection, data sequencing, mixture-of-experts, retrieval-augmented generation
· Parallel and distributed training, as well as inference at scale
· Hardware accelerators including GPUs and TPUs
· Systems behind modern LLMs such as Gemini 2.5, GPT-5, DeepSeek-V3, and the Apple foundation model
· Methods to optimize the serving software stack
· Emerging AI methods and learned data structures for building systems
The course features guest lectures from
industry, student-led presentations, assignments, and a final project. Active
participation in discussions is strongly emphasized.
The course is intended primarily for graduate students and satisfies the systems breadth requirement for PhD students. It is also open to advanced undergraduates with a strong interest in the subject and adequate background, subject to the instructor’s approval. Please note that Pass/D/Fail (P/D/F) grading is not permitted. All students are expected to present papers, engage in discussions, complete assignments, and carry out a project.
Administrative Information
· Class Time: Friday 1:20 – 4:20 pm
· Classroom: Friend Center 006 (FC006)
· Instructor: Professor Kai Li (li@cs.princeton.edu), office hours by appointment
· Teaching assistants:
· Wentao Guo, wg0420@princeton.edu, office hours: Mon-Wed 4:30-5:30pm, FC-010
· Haoran Wan, haoran.w@princeton.edu, office hours: Tue-Thu 4:30-5:30pm, FC-010
· Yulai Zhao, yz6292@princeton.edu, office hours: Friday 7:00-9:00pm, FC-010
Requirements and Grading
Your final grade in this course is based on a 100-point scale, distributed across the following components:
· Presentation (10%): Students will work in teams to present primary research papers on a selected topic. This includes preparing the presentation, posting discussion questions in advance, and leading the subsequent in-class discussion.
· Readings & Summaries (10%): For each topic, you are required to read the primary papers and submit concise reading notes. This requirement also applies to invited guest lectures.
· Participation (10%): Active engagement is essential. This includes consistent attendance and meaningful contributions to our seminar discussions.
· Programming Assignment 1 (20%): Details will be provided on its GitHub page.
· Programming Assignment 2 (20%): Details will be provided on its GitHub page.
· Final Project (30%): You will work to design, execute, and document a system-level solution at the intersection of ML and infrastructure.
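As a worked illustration of the 100-point scale above, here is a minimal sketch of how the component weights combine into a final score. The weights are taken from the list above; the function name `final_score`, the dictionary keys, and the sample scores are hypothetical, and the assumption that each component is scored out of 100 before weighting is ours rather than a stated course policy.

```python
# Component weights (percent of the final grade), taken from the list above.
WEIGHTS = {
    "presentation": 10,
    "readings_summaries": 10,
    "participation": 10,
    "assignment_1": 20,
    "assignment_2": 20,
    "final_project": 30,
}


def final_score(scores):
    """Return the weighted score on a 100-point scale.

    Assumption: every entry in `scores` is a component score in [0, 100].
    """
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the graded components")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical component scores, for illustration only.
example = {
    "presentation": 90,
    "readings_summaries": 100,
    "participation": 80,
    "assignment_1": 85,
    "assignment_2": 95,
    "final_project": 88,
}
print(final_score(example))  # 89.4
```

Under these assumptions, the final project alone moves the total by 0.3 points for every point earned on it, which is three times the effect of a point on the presentation.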
Policies
· You are welcome to discuss conceptual ideas or system architectures with your peers.
· All submitted code, write-ups, and project reports must be your
original work.
· Every submission must strictly adhere to Princeton University’s academic integrity
policies.
· When in doubt, explicitly cite your sources and consult the instructor
or TAs for clarification.
· Extensions are granted only for documented medical illness (physician’s note required) or verified extraordinary personal circumstances.
· External pressure, such as conference submission deadlines, job interviews, or heavy workloads from other concurrent courses, does not qualify for an extension.
· Please plan your project milestones early to account for potential system-level bottlenecks or cluster downtime.
· The permissible use of AI coding tools will be clearly defined within the instructions for each programming assignment and the final project.
· Unless explicitly stated otherwise in the assignment prompt, all logic and implementation should be your own.
· If AI tools are permitted, you must disclose their use as specified in the assignment guidelines.
This warmup is for those who would like to get familiar with ML frameworks and computing resources. Since some students may already have programmed in PyTorch, this assignment is not required and will not be graded. We strongly encourage all students without such experience to complete the MNIST classification warmup exercise. A related paper is "Gradient-Based Learning Applied to Document Recognition" (1998).
The MNIST dataset will be downloaded automatically if you use MNIST classification. We have also set up the same exercise on Google Colab: https://colab.research.google.com/drive/1wUWfzIY81x7R6Xe8WZUkgzFoKmTws7co?usp=sharing
Adroit Cluster: Princeton Research Computing Cluster (for development)
Della Cluster: Princeton Research Computing Cluster (for serial and parallel production jobs)
Tiger Cluster: Princeton Research Computing Cluster (for running large parallel jobs)
Google Colab: Free compute resources
| Event | Date |
| Spring-term classes begin | Mon, Jan 26, 2026 |
| Undergraduate add/drop ends | Fri, Feb 6, 2026 (11:59pm) |
| Spring recess | March 7–15, 2026 |
| Deadline to drop spring-term courses | Fri, Apr 3, 2026 (11:59pm) |
| Last class meeting | Wed, Apr 22, 2026 |
| Dean’s Date | Tue, May 5, 2026 |
Tentative Syllabus (including assignments and project)