COS 484: Natural Language Processing

Logistics

Instructors:

TAs:


Willie Chang
Pranay Manocha

Runzhe Yang


Time/Location:

You can check this calendar for updated times of all lectures, sections, office hours, and due dates.
How to contact us: Please use Piazza for all questions related to lectures, homeworks, and projects, and to find announcements. For external queries, emergencies, or personal matters, you can use a private Piazza post visible only to Instructors.
Content

What is this course about?

Recent advances have ushered in exciting developments in natural language processing (NLP), resulting in systems that can translate text, answer questions and even hold spoken conversations with us. This course will introduce students to the basics of NLP, covering standard frameworks for dealing with natural language as well as algorithms and techniques to solve various NLP problems, including recent deep learning approaches. Topics covered include language modeling, representation learning, text classification, sequence tagging, syntactic parsing, machine translation, question answering and others.

Prerequisites:

Reading:

There is no required textbook for this class, and you should be able to learn everything from the lectures and assignments. However, if you would like to pursue more advanced topics or get another perspective on the same material, here are some books (all of them can be read free online):
Schedule

Readings for future lectures are tentative and subject to change.
* All assignments are due at 11:59pm.

Date | Topics | Readings | Assignments
Sep 12 | Introduction & Overview | | A1 out
Sep 17 | Language modeling | J&M Chapter 3 |
Sep 19 | Text classification (Naive Bayes) | J&M Chapter 4 |
Sep 23 | | | A1 due*
Sep 24 | Log-linear models | J&M Chapter 5 | A2 out
Sep 26 | Word embeddings | J&M Chapter 6 |
Oct 1 | Neural network basics | J&M Chapter 7; A Primer on Neural Network Models for Natural Language Processing |
Oct 3 | Sequence modeling (HMMs, Viterbi) | J&M Chapter 8; Notes from Michael Collins [1][2] |
Oct 7 | | | A2 due*
Oct 8 | Expectation Maximization (EM) | Notes from Michael Collins | A3 out
Oct 10 | Recurrent NNs, neural language models | J&M Chapter 9; Understanding LSTM Networks |
Oct 15 | Constituency Parsing | J&M Chapter 13; J&M Chapter 14; Notes from Michael Collins [PCFGs][Lexicalized PCFGs] |
Oct 17 | Dependency Parsing | J&M Chapter 15 |
Oct 22 | Midterm Review | |
Oct 24 | Midterm (in-class) | |
Oct 25 | | | A3 due*; A4 out
Oct 29 | NO CLASS (Fall Recess) | |
Oct 31 | NO CLASS (Fall Recess) | |
Nov 5 | 1) Final project advice; 2) PyTorch tutorial; 3) PyTorch demo | |
Nov 7 | Machine translation | Eisenstein 18.1, 18.2 |
Nov 11 | | | Project proposal due
Nov 12 | Neural machine translation | Eisenstein 18.3; Koehn, 2017 |
Nov 14 | Contextualized word embeddings | ELMo paper; BERT paper; Transformer paper; The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning); The Annotated Transformer |
Nov 18 | | | A4 due*
Nov 19 | Question answering | J&M Chapter 25 |
Nov 21 | Dialogue | J&M Chapter 26 |
Nov 26 | Guest lecture: Jason Weston | |
Nov 28 | NO CLASS (Thanksgiving) | |
Dec 3 | Guest lecture: Tom Kwiatkowski | |
Dec 5 | Coreference resolution | J&M Chapter 22; Eisenstein Chapter 15 |
Dec 10 | Grounding | |
Jan 14 | | | Final project due

Coursework

Grading

Assignments

All assignments are due at 11:59pm on the due date (usually Mondays). There are no late days. Late submissions incur a penalty of 10% for each day, up to a maximum of 4 days, beyond which submissions will not be accepted. The only exception to this rule is if you have a note from your Dean of Studies. In this case, you must notify the instructors via email. If you have a dean's note for homework 1, 2, or 3, the weight of your missed or penalized assignment will be added to the midterm, and your midterm score will be scaled accordingly (e.g., if you are penalized 2 points overall, your midterm will be worth 27 and your score will be multiplied by 27/25). Missing homework 4 (after the midterm) can only be compensated by arranging an oral exam on the pertinent material.
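The arithmetic behind the late penalty and the dean's-note rescaling can be sketched as follows. This is only an illustration, not the staff's grading code: the function and constant names are hypothetical, the midterm weight of 25 is inferred from the "27/25" example, and the sketch assumes the 10% penalty accumulates linearly per late day.

```python
from typing import Tuple

MIDTERM_WEIGHT = 25.0   # assumption: inferred from the "27/25" example in the policy
PENALTY_PER_DAY = 0.10  # 10% for each late day
MAX_LATE_DAYS = 4       # submissions later than this are not accepted


def late_penalty_fraction(days_late: int) -> float:
    """Return the fraction deducted for a submission that is days_late days late."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > MAX_LATE_DAYS:
        raise ValueError("submissions more than 4 days late are not accepted")
    return PENALTY_PER_DAY * days_late


def rescaled_midterm(midterm_points: float, penalized_points: float) -> Tuple[float, float]:
    """Dean's-note case: move the penalized points onto the midterm.

    Returns (new midterm weight, midterm points rescaled to that weight).
    """
    new_weight = MIDTERM_WEIGHT + penalized_points
    return new_weight, midterm_points * new_weight / MIDTERM_WEIGHT
```

For instance, under these assumptions, a midterm score of 20 out of 25 with 2 penalized points gives `rescaled_midterm(20.0, 2.0)`, i.e. a midterm worth 27 and a rescaled score of 20 × 27/25 = 21.6.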
Writeups: Homeworks should be written up clearly and succinctly; you may lose points if your answers are unclear or unnecessarily complicated. Using LaTeX is recommended (here's a template), but not a requirement. Hand-written assignments must be scanned and uploaded as a pdf.
Collaboration policy and honor code: You are free to form study groups and discuss homeworks and projects. However, you must write up homeworks and code from scratch independently, and you must acknowledge in your submission all the students with whom you discussed the work. The following are considered to be honor code violations (in addition to the Princeton honor code): When debugging code together, you are only allowed to look at the input-output behavior of each other's programs (so you should write good test cases!). It is important to remember that even if you didn't copy but just gave another student your solution, you are still violating the honor code, so please be careful. If you feel like you made a mistake (it can happen, especially under time pressure!), please reach out to Danqi/Karthik; the consequences will be much less severe than if we approach you.

Final Project

The final project offers you the chance to apply your newly acquired skills towards an in-depth NLP application. Students are required to complete the final project in teams of 2-3. You can use Piazza to search for teammates.
Deliverables: The final project is worth 35% of your course grade. The deliverables include:
Policy and honor code:

Submission

Electronic Submission: Assignments and project proposal/paper are to be submitted as pdf files through Gradescope (we will send you the signup code for the class through Blackboard). If you need to sign up for a Gradescope account, please use your @princeton.edu email address. You can submit as many times as you'd like until the deadline: we will only grade the last submission. Submit early to make sure your submission uploads/runs properly on the Gradescope servers. If anything goes wrong, please ask a question on Piazza or contact a TA. Do not email us your submission. Partial work is better than not submitting any work.

For assignments with a programming component, we may automatically sanity check your code with some basic test cases, but we will grade your code on additional test cases. Important: just because you pass the basic test cases, you are by no means guaranteed to get full credit on the other, hidden test cases, so you should test the program more thoroughly yourself!

Regrades: If you believe that the course staff made an objective error in grading, then you may submit a regrade request. Remember that even if the grading seems harsh to you, the same rubric was used for everyone for fairness, so this is not sufficient justification for a regrade. It is also helpful to cross-check your answer against the released solutions. If you still choose to submit a regrade request, click the corresponding question on Gradescope, then click the "Request Regrade" button at the bottom. Any requests submitted over email or in person will be ignored. Regrade requests for a particular assignment are due by Sunday 11:59pm, one week after the grades are returned. Note that we may regrade your entire submission, so depending on your submission you may actually lose more points than you gain.