CS662
Advanced Natural Language Processing
Staff
Instructor
Jonathan May
Office Hours: Mondays 12:00–1:00 pm (location TBD) and Wednesdays 9:00–9:50 am (location TBD), or by appointment (at ISI on other days)
Teaching Assistants
Lectures
Monday and Wednesday 10:00–11:50 am, DMC 261
Textbook
Required: Natural Language Processing – Eisenstein (‘E’ in schedule) – or free version
Required: Speech and Language Processing, 3rd edition – Jurafsky and Martin (‘JM’ in schedule) – January 2023 PDF
Required: Selected papers from the NLP literature; see the (evolving) schedule
Grading
Percentage | Assessment Component |
---|---|
10% | In-class participation |
10% | Posted questions before each in-class selected paper presentation, and possible quizzes |
10% | In-class selected paper presentation |
30% | Three homeworks (10% each) |
40% | Project, done in small groups, comprising: |
5% | – Proposal |
5% | – First version of report |
10% | – In-class presentation |
20% | – Final report |
- Written homeworks and project components, except for the final project report, must be submitted on the date listed in the schedule, by 23:59:59 AoE.
- The final project report is due Monday, December 11, 2023, at 10:00 AM PST.
- A deduction of 1/5 of the total possible score will be assessed for each late day. After four late days (i.e., on the fifth), you get a 0 on the assignment (and you should come talk to us, because your grade will likely suffer!).
- You have four extension days for the entire class, to be applied as you wish to the homeworks and the project proposal / first report (NOT the final report). No deduction is assessed for a late day covered by an extension day. As an example, suppose an assignment is due November 10, you submit it on November 12 (two days late), and your raw score is 90/100. If you have two extension days remaining, you use them both and your grade stays 90/100. If you have only one extension day, you use it, one late day remains uncovered, and your grade is 70/100. If you have no extension days, two late days are uncovered and your grade is 50/100. (See the sketch below for this arithmetic.)
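A minimal Python sketch of the late-day arithmetic above (the `adjusted_score` helper and its signature are illustrative only, not part of any course tooling):

```python
def adjusted_score(raw_score, days_late, extension_days, max_score=100):
    """Hypothetical helper illustrating the late policy described above."""
    # Extension days cover late days one-for-one, with no deduction.
    uncovered = max(0, days_late - extension_days)
    # After four uncovered late days (i.e., on the fifth), the score is 0.
    if uncovered >= 5:
        return 0
    # Each uncovered late day costs 1/5 of the total possible score.
    return max(0, raw_score - uncovered * max_score / 5)

# The worked example: due Nov 10, submitted Nov 12 (two days late), raw score 90/100.
print(adjusted_score(90, days_late=2, extension_days=2))  # 90.0
print(adjusted_score(90, days_late=2, extension_days=1))  # 70.0
print(adjusted_score(90, days_late=2, extension_days=0))  # 50.0
```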
Contact us
On Piazza, Slack, or in class/office hours. Please do not email (unless notified otherwise).
Topics
- (Subject to change per instructor/class whim; topics will not necessarily be presented in this order.)
- Linguistic Stack (graphemes/phones - words - syntax - semantics - pragmatics - discourse)
- Tools:
- Corpora, Corpus statistics, Data cleaning and munging
- Annotation and crowdwork
- Evaluation
- Models/approaches: rule-based, automata/grammars, perceptron, logistic regression, neural network models
- Effective written and oral communication
- Components/Tasks/Subtasks:
- Language Models
- Syntax: POS tags, constituency tree, dependency tree, parsing
- Semantics: lexical, formal, inference tasks
- Information Extraction: Named Entities, Relations, Events
- Generation: Machine Translation, Summarization, Dialogue, Creative Generation
Week 1
- Aug 21
- intro, applications
- E 1
- project assignment out (due 9/18)
- paper selection out (due 9/4)
- Aug 23
- data processing, data resources, evaluation, annotation
- E 4.4–4.5, JM 2, 4.7–4.9; Nathan Schneider’s unix notes; Unix for poets; sculpting text
Week 2
- Aug 28
- linear classifiers
- E 2.2–2.4; JM 4, 5; Thumbs up? Sentiment Classification using Machine Learning Techniques; Goldwater probability tutorial; The Perceptron (Rosenblatt 1958) (optional)
- HW1 out (due 9/15)
- Aug 30
- nonlinear classifiers, backprop, gradient descent
- E 3. JM 7.2–7.4, 7.6.
Week 3
- Sep 4
- LABOR DAY NO CLASS
- Paper selection due
- Sep 6
- Sep 8
- Drop deadline (for refund, without W)
Week 4
- Sep 11
- n-gram language models; feed-forward and recurrent language models
- E 6.1–6.2, 6.4, 7.5, 7.7; JM 3; Exploring the limits of language modeling
- Sep 13
- NO CLASS
- Sep 15
- HW1 due
Week 5
- Sep 18
- transformer language models
- E 6.3; JM 9, 10; Attention is all you need; LM notebook
- project proposal due
- Sep 20
Week 6
- Sep 25
- YOM KIPPUR NO CLASS
- HW2 out (due 10/13)
- Sep 27
- ethics (Guest Lecture by Katy Felkner)
Week 7
- Oct 2
- POS tags, HMMs, treebanks
- E 7.1–7.4, 8.1, JM 8.1–8.5, 17.3
- Katie - Affective Knowledge Enhanced Multiple-Graph Fusion Networks for Aspect-based Sentiment Analysis
- Questions by: Yavuz
- Sean - Reproducibility in Computational Linguistics: Is Source Code Enough?
- Questions by: Tina
- Oct 4
- constituencies, CKY
- E 10.1–10.4, JM 17 (the rest)
- Ian - Finding Skill Neurons in Pre-trained Transformer-based Language Models
- Questions by: Deuksin
- Darshan - On the Transformation of Latent Space in Fine-Tuned NLP Models
- Questions by: Shauryasikt
- Oct 6
- Drop deadline (no refund, without W)
Week 8
- Oct 9
- dependencies
- E 11, JM 18
- Sina - Perturbation Augmentation for Fairer NLP
- Questions by: Kian
- Tina - Balancing out Bias: Achieving Fairness Through Balanced Training
- Questions by: Ajay
- Oct 11
- semantics: logical/compositional, frames and roles, AMR, distributional
- E 12.1, 12.2, 13.1, 13.3, 14.1–14.3, 14.6–14.8; JM 19, 23, 24
- Nuan - Don’t Prompt, Search! Mining-based Zero-Shot Learning with Language Models
- Questions by: Daniel P
- Yavuz - The Geometry of Multilingual Language Model Representations
- Questions by: Weike
- Oct 13
- HW 2 due
Week 9
- Oct 16
- RLHF (Guest Lecture by Justin Cho)
- RLHF - Chip Huyen
- Oct 18
- machine translation: history, evaluation, data
- “Oh, yes, everything’s right on schedule, Fred”
- Ajay - Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
- Questions by: Gabriela
- Kate - RANKGEN: Improving Text Generation with Large Ranking Models
- Questions by: Abid
Week 10
- Oct 23
- MEGA (Guest Lecture by Xuezhe Ma)
- Mega Paper; Annotated S4
- Tianyi Y. - Active Example Selection for In-Context Learning
- Questions by: Xinyue
- Daniel Y. - mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections
- Questions by: Duygu
- HW3 out (due 11/17)
- Oct 25
- machine translation: statistical, neural
- E 18, JM 10, 13
- Abid - Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates
- Questions by: Sean
- Eray - Zero-Shot Text Classification with Self-Training
- Questions by: Ian
Week 11
- Oct 30
- dialogue: task-oriented and chatbots
- E 19.3; JM 15; The original ELIZA
- Patrick - Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement
- Questions by: Kate Hsu
- David - Stanceosaurus: Classifying Stance Towards Multicultural Misinformation
- Questions by: Shicheng
- Nov 1
- dialogue 2
- -n/a-
- Kian - Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
- Questions by: Priyanka
- Duygu - Learning Robust Representations for Continual Relation Extraction via Adversarial Class Augmentation
- Questions by: Darshan
- Nov 3
- Project Report Version 1 due
Week 12
- Nov 6
- information extraction
- JM 21; E 17; 25 years of IE
- Priyanka - Improving Multi-turn Emotional Support Dialogue Generation with Lookahead Strategy Planning
- Questions by: Zhejian
- Shambhavi - Reflect, Not Reflex: Inference-Based Common Ground Improves Dialogue Response Quality
- Questions by: Eray
- Nov 8
- information extraction 2
- -n/a-
- Brian - ProsocialDialog: A Prosocial Backbone for Conversational Agents
- Questions by: Michael
- Gabriela - Directions for NLP Practices Applied to Online Hate Speech Detection
- Questions by: David
- Nov 10
- Drop deadline with W
Week 13
- Nov 13
- question answering, information retrieval
- JM 14, RAG
- Michael - Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models
- Questions by: Nuan
- Weike - Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization
- Questions by: Shambhavi
- Nov 15
- natural language inference and common sense tasks
- Modeling Semantic Containment and Exclusion in Natural Language Inference; Commonsense Reasoning and Commonsense Knowledge in Artificial Intelligence
- Shauryasikt - Exploring Dual Encoder Architectures for Question Answering
- Questions by: Katie
- Shicheng - Learning to Generate Question by Asking Question: A Primal-Dual Approach with Uncommon Word Generation
- Questions by: Suhaib
- Nov 17
- HW 3 due
Week 14
- Nov 20
- prompts, multi-task large language models (Guest Lecture by Qinyuan Ye)
- InstructGPT paper (pp. 1–20); Large Language Models are Human-Level Prompt Engineers
- Zhejian - Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
- Questions by: Patrick
- Yiming - Reasoning Like Program Executors
- Questions by: Daniel Y.
- Nov 22
- THANKSGIVING BREAK; NO CLASS
Week 15
- Nov 27
- Project presentations
- Nov 29
- Project presentations