COS568: Systems and Machine Learning (Spring 2026)

 

Summary

This course explores the fast-evolving intersection of systems and machine learning. It combines lectures, student presentations, and discussions to cover both foundational concepts and the latest research developments.


We study several modern systems for ML with topics including:

·       Deep learning pipelines for computer vision and large language models (LLMs)

·       Model compression, quantization, distillation, data selection, data sequencing, mixture-of-experts, and retrieval-augmented generation

·       Parallel and distributed training, as well as inference at scale

·       Hardware accelerators including GPUs and TPUs

·       Systems behind modern LLMs such as Gemini 2.5, GPT-5, DeepSeek-V3, and the Apple foundation models

·       Methods to optimize the serving software stack

·       Emerging AI methods and learned data structures for building systems

 

The course features guest lectures from industry, student-led presentations, assignments, and a final project. Active participation in discussions is strongly emphasized.

 

The course is intended primarily for graduate students and satisfies the systems breadth requirement for PhD students. It is also open to advanced undergraduates with a strong interest in the subject and adequate background, subject to the instructor’s approval. Please note that Pass/D/Fail (P/D/F) grading is not permitted. All students are expected to present papers, engage in discussions, complete assignments, and carry out a project.


Administrative Information

·       Class Time: Friday 1:20 – 4:20 pm

·       Classroom: Friend Center 006 (FC006)

·       Instructor: Professor Kai Li (li@cs.princeton.edu), office hours by appointment

·       Teaching assistants:

·       Wentao Guo, wg0420@princeton.edu, office hours: Mon-Wed 4:30-5:30pm, FC-010

·       Haoran Wan, haoran.w@princeton.edu, office hours: Tue-Thu 4:30-5:30pm, FC-010

·       Yulai Zhao, yz6292@princeton.edu, office hours: Friday 7:00-9:00pm, FC-010

 

Requirements and Grading

Your final grade in this course is based on a 100-point scale, distributed across the following components:

·       Presentation (10%): Students will work in teams to present primary research papers on a selected topic. This includes preparing the presentation, posting discussion questions in advance, and leading the subsequent in-class discussions.

·       Readings & Summaries (10%): For each topic, you are required to read the primary papers and submit concise reading notes. This requirement also applies to notes for invited guest lectures.

·       Participation (10%): Active engagement is essential. This includes consistent attendance and meaningful contributions to our seminar discussions.

·       Programming Assignment 1 (20%): Details to be provided on its GitHub repository.

·       Programming Assignment 2 (20%): Details to be provided on its GitHub repository.

·       Final Project (30%): You will work to design, execute, and document a system-level solution at the intersection of ML and infrastructure.
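
As a rough illustration of how the percentages above combine on the 100-point scale, here is a minimal sketch. It assumes each component is scored as a fraction between 0 and 1 and that the components are combined linearly; the syllabus does not specify either detail, and the names `WEIGHTS` and `final_grade` are hypothetical.

```python
# Component weights in points, taken from the grading list above.
WEIGHTS = {
    "presentation": 10,
    "readings_summaries": 10,
    "participation": 10,
    "programming_assignment_1": 20,
    "programming_assignment_2": 20,
    "final_project": 30,
}


def final_grade(scores: dict[str, float]) -> float:
    """Return a grade on the 100-point scale.

    Assumption (not stated in the syllabus): each entry in `scores` is the
    fraction of credit earned for that component, between 0.0 and 1.0, and
    the final grade is the weighted sum of those fractions.
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"score for {name} must be between 0 and 1")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical student: full credit everywhere except half credit
    # on the final project.
    example = {name: 1.0 for name in WEIGHTS}
    example["final_project"] = 0.5
    print(final_grade(example))
```

Under these assumptions the final project carries the most weight, so a weak project costs more points than a weak result on any other single component.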

 

Policies

Collaboration & Academic Integrity. We encourage intellectual exchange; however, it must be conducted responsibly:

·       You are welcome to discuss conceptual ideas or system architectures with your peers.

·       All submitted code, write-ups, and project reports must be your original work.

·       Every submission must strictly adhere to Princeton University’s academic integrity policies.

·       When in doubt, explicitly cite your sources and consult the instructor or TAs for clarification.

Late Work & Extensions. To ensure fairness and maintain the course schedule, our policy is as follows:

·       Extensions are granted only for documented medical illness (physician’s note required) or verified extraordinary personal circumstances.

·       External pressure, such as conference submission deadlines, job interviews, or heavy workloads from other concurrent courses, does not qualify for an extension.

·       Please plan your project milestones early to account for potential system-level bottlenecks or cluster downtime.

Generative AI & AI Coding Assistants.  We recognize the role of these tools while prioritizing learning.

·       The permissible use of AI coding assistants will be clearly defined within the instructions for each programming assignment and the final project.

·       Unless explicitly stated otherwise in the assignment prompt, all logic and implementation should be your own.

·       If AI tools are permitted, you must disclose their use as specified in the assignment guidelines.

Warmup

This warmup is for students who would like to get familiar with ML frameworks and the computing resources. Since some students may have already programmed in PyTorch, this assignment is not required and will not be graded. We strongly encourage all students without such experience to complete the MNIST classification warmup exercise. A related paper is "Gradient-Based Learning Applied to Document Recognition" (1998).

 

The MNIST dataset is downloaded automatically when you run the MNIST classification example. We have also set up the same example on Google Colab: https://colab.research.google.com/drive/1wUWfzIY81x7R6Xe8WZUkgzFoKmTws7co?usp=sharing
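
For students new to PyTorch, the sketch below shows roughly what one training step of a digit classifier looks like. It is not the linked MNIST classification example: the class `SmallConvNet` and the function `train_step` are hypothetical names, the layer sizes are loosely modeled on the LeNet-style networks from the 1998 paper, and a random batch stands in for real MNIST images so that nothing needs to be downloaded.

```python
import torch
from torch import nn


class SmallConvNet(nn.Module):
    """Hypothetical LeNet-style classifier for 28x28 grayscale digits."""

    def __init__(self, num_classes: int = 10) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),  # 1x28x28 -> 6x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                            # -> 6x14x14
            nn.Conv2d(6, 16, kernel_size=5),            # -> 16x10x10
            nn.ReLU(),
            nn.MaxPool2d(2),                            # -> 16x5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


def train_step(model, optimizer, images, labels) -> float:
    """Run one gradient update and return the cross-entropy loss."""
    model.train()
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = SmallConvNet()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    # Random stand-in for one batch of MNIST images and labels.
    images = torch.randn(8, 1, 28, 28)
    labels = torch.randint(0, 10, (8,))
    print(train_step(model, optimizer, images, labels))
```

In the actual warmup, the random batch would be replaced by batches drawn from the MNIST dataset, and the step would be repeated over several epochs.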

Computing Resources

Adroit Cluster: Princeton Research Computing Cluster (for development)

Della Cluster: Princeton Research Computing Cluster (for serial and parallel production jobs)

Tiger Cluster: Princeton Research Computing Cluster (for running large parallel jobs)

Google Colab: Free compute resources

Key University Dates

·       Spring-term classes begin: Mon, Jan 26, 2026

·       Undergraduate add/drop ends: Fri, Feb 6, 2026 (11:59pm)

·       Spring recess: March 7–15, 2026

·       Deadline to drop spring-term courses: Fri, Apr 3, 2026 (11:59pm)

·       Last class meeting: Wed, Apr 22, 2026

·       Dean’s Date: Tue, May 5, 2026

 

 

Tentative Syllabus (including assignments and project)