CS662

Advanced Natural Language Processing

Staff

Instructor

Jonathan May

Office Hours: Mondays and Wednesdays 3:00-4:00 pm GCS SB10 (LL2) or by appointment

Teaching Assistant

Katy Felkner

felkner@usc.edu

Office Hours: 1-3pm Wednesdays, GCS LL2, room SB3, or by appointment on Calendly

Lectures

Monday and Wednesday 10:00–11:50 am, DMC 261
See schedule for select days where class is canceled

Textbook

Optional: Natural Language Processing - Eisenstein (‘E’ in schedule) – or free version
Optional: Speech and Language Processing 3rd edition - Jurafsky, Martin (‘JM’ in schedule) – August 2025 pdf
Required: Selected papers from NLP literature, see (evolving) schedule

Grading

Percentage	Assessment Component
10%	In-class participation
10%	Posted questions before each in-class selected paper presentation and possible quizzes
10%	In-class selected paper presentation
30%	Three Homeworks (10% each)
40%	Project, done in small groups, comprising:
	- Proposal (5%)
	- First version of report (5%)
	- In-class presentation (10%)
	- Final report (20%).
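
As a quick check on how the components above combine, here is a minimal sketch that sums them into a course percentage. The weights come from the table; the component names, the `course_percentage` function, and the convention of passing each score as a fraction between 0 and 1 are illustrative assumptions, not an official grade calculator.

```python
# Weights (in percentage points) taken from the grading table.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "participation": 10,
    "posted_questions_and_quizzes": 10,
    "paper_presentation": 10,
    "homework_1": 10,
    "homework_2": 10,
    "homework_3": 10,
    "project_proposal": 5,
    "project_first_report": 5,
    "project_presentation": 10,
    "project_final_report": 20,
}

# Sanity check: the components should account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def course_percentage(fractions):
    """Combine per-component fractions (0.0 to 1.0) into a 0-100 course score.

    `fractions` maps every key in WEIGHTS to the fraction of that
    component's points that was earned.
    """
    return sum(WEIGHTS[name] * fractions[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Example: full marks everywhere except half credit on the final report.
    example = {name: 1.0 for name in WEIGHTS}
    example["project_final_report"] = 0.5
    print(course_percentage(example))
```

Note that the four project parts (5 + 5 + 10 + 20) add up to the 40% listed for the project as a whole.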

Written homeworks and all project components except the final project report must be submitted on the date listed in the schedule, by 23:59:59 AoE (Anywhere on Earth).
The final project report is due Monday, December 15, 2025, at 10:00 AM PST.
A deduction of 1/5 of the total possible score will be assessed for each late day. After four late days (i.e., on the fifth), you get a 0 on the assignment (and you should come talk to us, because your grade will likely suffer!).
You have four extension days, to be applied as you wish throughout the entire class, for homeworks and the project proposal / first report (NOT the final report). No deduction is assessed for a late day covered by an extension day. As an example, suppose an assignment is due November 10, you submit it on November 12, and your score before any late deduction is 90/100. If you have two extension days remaining, you lose both extension days but your grade is not reduced; it remains 90/100. If you have one extension day, you lose it, and your grade is 70/100. If you have no extension days, your grade is 50/100.
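
The late-day arithmetic above can be sketched in a few lines of code. This is only an illustration: the name `apply_late_policy` and its parameters are hypothetical, and two behaviors are assumptions rather than stated policy, namely that available extension days are applied automatically before any deduction and that a score never drops below 0.

```python
from dataclasses import dataclass


@dataclass
class LateResult:
    final_score: float        # score after any late deduction
    extension_days_left: int  # extension days remaining afterwards


def apply_late_policy(raw_score, total_possible, days_late, extension_days_left):
    """Apply the late policy to one assignment.

    Assumptions (not stated in the policy text): extension days are used
    automatically to cover late days, and the result is floored at 0.
    For the final report, which is not eligible for extension days,
    a caller would pass extension_days_left=0.
    """
    # Cover as many late days as possible with extension days.
    used = min(days_late, extension_days_left)
    penalized_days = days_late - used

    # Each uncovered late day costs 1/5 of the total possible score;
    # on the fifth such day the assignment is worth 0.
    deduction = min(penalized_days, 5) * total_possible / 5
    final_score = max(0.0, raw_score - deduction)

    return LateResult(final_score, extension_days_left - used)


if __name__ == "__main__":
    # The example above: due November 10, submitted November 12, raw score 90/100.
    for extensions in (2, 1, 0):
        result = apply_late_policy(90, 100, days_late=2, extension_days_left=extensions)
        print(extensions, result.final_score, result.extension_days_left)
```

With two, one, and zero extension days remaining, the three calls yield 90, 70, and 50 out of 100, matching the example in the policy.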

Contact us

On Slack, or in class/office hours. Please do not email (unless notified otherwise).

Topics

(subject to change per instructor/class whim) (will not necessarily be presented in this order):

Fundamentals: Linguistic Stack (graphemes/phones - words - syntax - semantics - pragmatics - discourse); Corpora, Corpus statistics, Data cleaning, munging, and annotation; Evaluation; Linear and Nonlinear Models; Dense Representations and neural architectures (feed-forward, RNN, Transformer); Language Models; Pre-training, Fine-tuning, Prompting, Reward Alignment; Ethics; Effective written and oral communication

Applications: Multilingualism and Translation; Syntax; Information Retrieval/Question Answering; Dialogue; Information Extraction; Multimodality; Speech Recognition and Generation; Agent Interaction; Discourse

Week 1

Aug 25

Introduction, Applications: E 1, Probabilities (refresher only)

HW0 out (due 8/29)

paper selection out (due 9/8)

project assignment out (proposal due 9/22)

Aug 27

Data Processing, Data Resources, Evaluation, Annotation: E 4.4-4.5, JM 2, 4.7-4.9, Nathan Schneider’s unix notes, Unix for poets, sculpting text, Suleyman Reading 1, Suleyman Reading 2, Hank Green Video, Berg-Kirkpatrick on statistical significance tests

Aug 29

HW0 due

Week 2

Sep 1

LABOR DAY NO CLASS

HW1 out (due 9/19)

Sep 3

Linear Classifiers: E 2.2, 2.3, 2.4. JM 4, app. B, Thumbs up? Sentiment Classification using Machine Learning Techniques, Goldwater probability tutorial. The Perceptron (Rosenblatt 1958) (optional)

Week 3

Sep 8

Non-linear Classifiers, Backprop, Gradient Descent: E 3. JM 6

Sep 10

Distributional Feature Representations: PPMI, LSA, word2vec

E 14.3, 14.5–6. JM 5, LSA via SVD, Linguistic regularities in continuous space word representations, Efficient Estimation of Word Representations in Vector Space, Distributed Representations of Words and Phrases and their Compositionality

Sep 12

Early Drop (no W, refund)

Week 4

Sep 15

N-Gram Language Models, Feed Forward and Recurrent Language Models (RNNs): E 6.1–2, 6.4, 7.5, 7.7. JM 3, 13, Exploring the limits of language modeling, LM notebook, Fast and Robust Neural Network Joint Models for Statistical Machine Translation

Sep 17

Slides, Transformer Language Models: E 6.3, JM 8. Attention is all you need, Neural Machine Translation of Rare Words with Subword Units

Sep 19

HW1 due

Week 5

Sep 22

Pretrained language models (ELMo, BERT, and sentence similarity); took 1.5 lectures: JM 10, ELMo paper, BERT paper, Zoph Fine-Tuning paper, Fine-Tuning demo

project proposal due

Sep 24

NO CLASS

Week 6

Sep 29

Prompting and Large Language Models; took 1.5 lectures: JM 7, T5, LoRA, Prefix Tuning, T0; Jinyi Ye - What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective; Questions by: Narges Ghasemi Ghaleh Bahmani; Saba Hashemi Safaei - Byte Latent Transformer: Patches Scale Better Than Tokens; Questions by: Yuxin Yang

HW2 out (due 10/17)

Oct 1

Reinforcement Learning with Human Feedback: Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO); started but did not finish: JM 6, Ziegler RLHF Paper, DPO Paper; Sadra Sabouri Halestani - HUMT DUMT: Measuring and controlling human-like language in LLMs; Questions by: Nikunj Gupta; Feiyu Zhu - Self-Instructed Derived Prompt Generation Meets In-Context Learning: Unlocking New Potential of Black-Box LLMs; Questions by: Sichang (Stephen) He

Week 7

Oct 6

MEGA/Megalodon/Gecko (Guest Lecture by Xuezhe Ma): Mega Paper, Megalodon; Narges Ghasemi Ghaleh Bahmani - LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts; Questions by: Tianming Guo; Saeed Hedayatian - TreeRL: LLM Reinforcement Learning with On-Policy Tree Search; Questions by: Zhiyuan Gao

Oct 8

Reinforcement Learning with Human Feedback: Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO); almost finished: Daniel Ruiz - TokAlign: Efficient Vocabulary Adaptation via Token Alignment; Questions by: Abhinav Vadhera; Ardysatrio Haroen - Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling; Questions by: Chufan Shi

Oct 10

Mid Drop (No W, No refund)

Week 8

Oct 13

Agents (Guest Lecture by Tenghao Huang): WebArena, Proactive Info Gathering, ToolLLM, Narrative Discourse, ReAct; Kiarash Vaziri Goodarzi - TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters; Questions by: Matthew Finlayson

Oct 15

Ethics (Guest Lecture by Katy Felkner): The Social Impact of Natural Language Processing, Energy and Policy Considerations for Deep Learning in NLP, Model Cards for Model Reporting; Kaicheng Wang - MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation; Questions by: Ardysatrio Haroen; Zhiyuan Gao - OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use; Questions by: Naga Vamsi Ramana Dinavahi

Oct 17

HW2 due

Week 9

Oct 20

Efficient Inference: LM-Infinite; Speculative Decoding; Keep the Cost Down: A Review on Methods to Optimize LLM’s KV Cache Consumption, Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding; Faith Baca - Large Language Models Are Biased Because They Are Large Language Models; Questions by: Sajjad Shahabi; Ruth-Ann Armstrong - Toward Automatic Discovery of a Canine Phonetic Alphabet; Questions by: Saba Hashemi Safaei

Oct 22

Information Retrieval (IR) and Question Answering (QA): JM 11; Tianwen Fu - Improving Factuality with Explicit Working Memory; Questions by: Kaicheng Wang; Nikunj Gupta - Reinforced IR: A Self-Boosting Framework For Domain-Adapted Information Retrieval; Questions by: Faith Baca

Week 10

Oct 27

Machine Translation (MT)/Multilinguality (slides1, slides2): JM 12, Weaver, Translation (1952); Shixuan Li - Re-ranking Using Large Language Models for Mitigating Exposure to Harmful Content on Social Media Platforms; Questions by: Jinyi Ye

HW3 out (due 11/21)

Oct 29

Dialogue: JM 25, Appendix K; Sajjad Shahabi - Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization on Multi-party Conversation; Questions by: Daniel Ruiz; Tianming Guo - HotelMatch-LLM: Joint Multi-Task Training of Small and Large Language Models for Efficient Multimodal Hotel Retrieval; Questions by: Shixuan Li

Week 11

Nov 3

Multimodal NLP (Guest Lecture by Xuezhe Ma): Anzhe Cheng - SHuBERT: Self-Supervised Sign Language Representation Learning via Multi-Stream Cluster Prediction; Questions by: Feiyu Zhu; Gonglin Chen - SpaRE: Enhancing Spatial Reasoning in Vision-Language Models with Synthetic Data; Questions by: Wenbin Teng

Nov 5

Spoken Language Processing (SLP) (Guest Lecture by Sudarsana Reddy Kadiri) speech, asr, e2e asr, tts, wav2vec2 tutorial, tacotron tutorial: JM 15; Wenbin Teng - Improve Vision Language Model Chain-of-thought Reasoning; Questions by: Kiarash Vaziri Goodarzi; Chufan Shi - ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation; Questions by: Tianwen Fu

Nov 7

Project Report Version 1 due

Week 12

Nov 10

Mind Reading (Guest Lecture by Sam Nastase): Lydia Ignatova - Dehumanizing Machines: Mitigating Anthropomorphic Behaviors in Text Generation Systems; Questions by: Gonglin Chen; Sichang (Stephen) He - Learning to Rewrite: Generalized LLM-Generated Text Detection; Questions by: Anzhe Cheng

Nov 12

Information Extraction: JM 17.3, 20; Abhinav Vadhera - JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs; Questions by: Ruth-Ann Armstrong; Yuxin Yang - A Troublemaker with Contagious Jailbreak Makes Chaos in Honest Towns; Questions by: Sadra Sabouri Halestani

Nov 14

Late Drop (W, No refund)

Week 13

Nov 17

Discourse Slides: Danny Deng - LocAgent: Graph-Guided LLM Agents for Code Localization; Questions by: Saeed Hedayatian; Matthew Finlayson - Geometric Signatures of Compositionality Across a Language Model’s Lifetime; Questions by: Lydia Ignatova

Nov 19

Auditing, Dissecting, and Evaluating Large Language Models (Guest Lecture by Robin Jia): Naga Vamsi Ramana Dinavahi - Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models; Questions by: Danny Deng

Nov 21

HW3 due

Week 14

Nov 24

Project Presentations:

(10:00) Danny Deng, Feiyu Zhu - Adaption-of-Thought: Learning Question Difficulty Improves Large Language Models for Reasoning; Questions by: Ruth-Ann Armstrong, Yuxin Yang

(10:18) Kiarash Vaziri Goodarzi, Saba Hashemi Safaei, Saeed Hedayatian - Unveiling Multi-level and Multi-modal Semantic Representations in the Human Brain using Large Language Models; Questions by: Wenbin Teng, Naga Vamsi Ramana Dinavahi, Gonglin Chen

(10:36) Narges Bahmani, Sajjad Shahabi - Evaluating the Prompt Steerability of Large Language Models; Questions by: Nikunj Gupta, Chufan Shi, Ardysatrio Haroen

(10:54) Jinyi Ye, Sichang (Steven) He - Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness; Questions by: Faith Baca, Abhinav Vadhera

(11:12) Matthew Finlayson, Daniel Ruiz, Tianming Guo - Learning from Natural Language Explanations for Generalizable Entity Matching; Questions by: Lydia Ignatova, Zhiyuan Gao

(11:30) Anzhe Cheng, Kaicheng Wang, Shixuan Li - Prompts have evil twins; Questions by: Matthew Finlayson, Tianwen Fu, Sadra Sabouri Halestani

Nov 26

THANKSGIVING BREAK; NO CLASS

Week 15

Dec 1

Project Presentations:

(10:00) Gonglin Chen, Zhiyuan Gao - Hello Again! LLM-powered Personalized Agent for Long-term Dialogue; Questions by: Danny Deng, Saeed Hedayatian, Feiyu Zhu

(10:18) Chufan Shi, Tianwen Fu, Wenbin Teng - Vision-Language Models Can Self-Improve Reasoning via Reflection; Questions by: Daniel Ruiz, Kaicheng Wang, Saba Hashemi Safaei

(10:36) Nikunj Gupta, Yuxin Yang - FIZZ: Factual Inconsistency Detection by Zoom-in Summary and Zoom-out Document; Questions by: Anzhe Cheng, Kiarash Vaziri Goodarzi

(10:54) Faith Baca, Lydia Ignatova, Ruth-Ann Armstrong - BiasWipe: Mitigating Unintended Bias in Text Classifiers through Model Interpretability; Questions by: Sichang (Steven) He, Narges Bahmani

(11:12) Abhinav Vadhera, Ardysatrio Haroen - Ranking Manipulation for Conversational Search Engines; Questions by: Jinyi Ye, Shixuan Li

(11:30) Naga Vamsi Ramana Dinavahi, Sadra Sabouri Halestani - Sneaking Syntax into Transformer Language Models with Tree Regularization; Questions by: Sajjad Shahabi, Tianming Guo

Dec 3

NO CLASS