CS662
Advanced Natural Language Processing
Staff
Instructor

Jonathan May
Office Hours: Mondays 12:00–1:00 pm TBD and Wednesdays 9:00–9:50 am TBD, or by appointment (at ISI on other days)
Teaching Assistant

Lectures
- Monday and Wednesday 10:00–11:50 am, DMC 261
- See schedule for select days where class is canceled
Textbook
Optional: Natural Language Processing - Eisenstein (‘E’ in schedule) – or free version
Optional: Speech and Language Processing 3rd edition - Jurafsky, Martin (‘JM’ in schedule) – January 2025 pdf
Required: Selected papers from NLP literature, see (evolving) schedule
Grading
Percentage | Assessment Component |
---|---|
10% | In class participation |
10% | Posted questions before each in-class selected paper presentation and possible quizzes |
10% | In-class selected paper presentation |
30% | Three Homeworks (10% each) |
40% | Project, done in small groups, comprising: Proposal (5%), First version of report (5%), In-class presentation (10%), Final report (20%) |
- Written homeworks and project components except for final project report must be submitted on the date listed in the schedule, by 23:59:59 AoE.
- Final project report is due Monday, December 15, 2025, 10:00 AM PST
- A deduction of 1/5 of the total possible score will be assessed for each late day. After four late days (i.e. on the fifth), you get a 0 on the assignment (and you should come talk to us because your grade will likely suffer!)
- You have four extension days, to be applied as you wish, throughout the entire class, for homeworks and project proposal / first report (NOT final report). No deduction will be assessed if an extension day is used. As an example, suppose an assignment is due November 10, you submit it on November 12, and it earns 90/100. If you have two extension days remaining, you lose both extension days but your grade is not reduced; it remains 90/100. If you have one extension day, you lose it, and your grade is 70/100. If you have no extension days, your grade is 50/100.
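The late-day arithmetic above can be sketched in a few lines of Python. This is only an illustration of the stated policy, not the official grading procedure: the function name `late_adjusted_score` and its parameters are hypothetical, it assumes remaining extension days are applied automatically to late days (as in the example above), and it assumes scores are floored at zero.

```python
def late_adjusted_score(raw_score, days_late, extension_days_left, max_score=100):
    """Return (adjusted score, extension days remaining) for one assignment.

    Assumptions (not stated policy): extension days are applied automatically
    before any deduction, and the adjusted score never goes below zero.
    """
    # Cover as many late days as possible with the remaining extension days.
    used = min(days_late, extension_days_left)
    penalized_days = days_late - used
    # Each uncovered late day costs 1/5 of the total possible score,
    # so five uncovered late days always yield 0.
    deduction = penalized_days * max_score / 5
    score = max(0.0, raw_score - deduction)
    return score, extension_days_left - used


# The three cases from the example: 90/100 earned, submitted two days late.
print(late_adjusted_score(90, 2, 2))  # (90.0, 0)
print(late_adjusted_score(90, 2, 1))  # (70.0, 0)
print(late_adjusted_score(90, 2, 0))  # (50.0, 0)
```

Note that the deduction is a fraction of the total possible score, not of the earned score, which is why one uncovered late day takes 90/100 to 70/100 rather than to 72/100.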
Contact us
On Slack, or in class/office hours. Please do not email (unless notified otherwise).
Topics
- (subject to change per instructor/class whim) (will not necessarily be presented in this order):
- Fundamentals
- Linguistic Stack (graphemes/phones - words - syntax - semantics - pragmatics - discourse)
- Corpora, Corpus statistics, Data cleaning, munging, and annotation
- Evaluation
- Linear and Nonlinear Models
- Dense Representations and neural architectures (feed-forward, RNN, Transformer)
- Language Models
- Pre-training, Fine-tuning, Prompting, Reward Alignment
- Ethics
- Effective written and oral communication
- Applications
- Multilingualism and Translation
- Syntax
- Information Retrieval/Question Answering
- Dialogue
- Information Extraction
- Multimodality
- Speech Recognition and Generation
- Agent Interaction
- Discourse
Week 1
- Aug 25
- Introduction, Applications
- E 1, Probabilities (refresher only)
- project assignment out (due 9/22)
- paper selection out (due 9/8)
- Aug 27
Week 2
- Sep 1
- LABOR DAY NO CLASS
- HW1 out (due 9/19)
- Sep 3
- Linear Classifiers
- E 2.2, 2.3, 2.4; JM 4, 5; Thumbs up? Sentiment Classification using Machine Learning Techniques; Goldwater probability tutorial; The Perceptron (Rosenblatt 1958) (optional)
Week 3
- Sep 8
- Non-linear Classifiers, Backprop, Gradient Descent
- E 3. JM 7.2–7.5
- Sep 10
- Sep 12
- Early Drop (no W, refund)
Week 4
- Sep 15
- N-Gram Language Models, Feed Forward and Recurrent Language Models (RNNs)
- E 6.1–2, 6.4, 7.5, 7.7; JM 3; Exploring the limits of language modeling; LM notebook
- Sep 17
- Attention, Transformer Language Models
- E 6.3, JM 9, 10. Attention is all you need
- Sep 19
- HW1 due
Week 5
- Sep 22
- Pretrained language models (ELMo, BERT, and sentence similarity)
- JM 11.1–11.3; ELMo paper; BERT paper; Zoph Fine-Tuning paper; Fine-Tuning demo
- project proposal due
- Sep 24
- NO CLASS
Week 6
- Sep 29
- HW2 out (due 10/18)
- Oct 1
- MEGA (Guest Lecture by Xuezhe Ma)
- Mega paper; Megalodon
Week 7
- Oct 6
- Oct 8
- Ethics (Guest Lecture by Katy Felkner)
- The Social Impact of Natural Language Processing, Energy and Policy Considerations for Deep Learning in NLP, Model Cards for Model Reporting
- Oct 10
- Mid Drop (No W, No refund)
Week 8
Week 9
- Oct 22
- Dialogue
- JM 15
Week 10
- Oct 27
- HW3 out (due 11/21)
- JM 17.3, 20
- Oct 29
- Agents (Guest Lecture by Tenghao Huang)
- WebArena, ToolLLM, Narrative Discourse, ReAct
Week 11
- Nov 3
- Multimodal NLP (Guest Lecture by Xuezhe Ma)
- Nov 5
- Spoken Language Processing (SLP) (Guest Lecture by Sudarsana Reddy Kadiri)
- JM 16
Week 12
- Nov 10
- TBD
- Nov 12
- TBD
- Nov 14
- Late Drop (W, No refund)
Week 13
Week 14
- Nov 24
- TBD
- Nov 26
- THANKSGIVING BREAK; NO CLASS
Week 15
- Dec 1
- Project Presentations
- (10:00) TBD. Questions by: TBD
- (10:22) TBD. Questions by: TBD
- (10:44) TBD. Questions by: TBD
- (11:06) TBD. Questions by: TBD
- (11:28) TBD. Questions by: TBD
- Dec 3
- Project presentations
- (10:00) TBD. Questions by: TBD
- (10:22) TBD. Questions by: TBD
- (10:44) TBD. Questions by: TBD
- (11:06) TBD. Questions by: TBD
- (11:28) TBD. Questions by: TBD