CS662

Advanced Natural Language Processing

Staff

Jonathan May

Office Hours: Mondays and Wednesdays 9:00–9:50 am, THH 110, or by appointment (at ISI on other days)

Elan Markowitz

esmarkow@usc.edu

Office Hours: Mondays 1:00pm–2:00pm and Wednesdays 12:00pm–1:00pm, on the 4th floor of RTH by the whiteboards

Lectures

Monday and Wednesday 10:00–11:50 am, THH 110

Textbook

Required: Natural Language Processing - Eisenstein (‘E’ in schedule) – or free version

Required: Speech and Language Processing 3rd edition - Jurafsky, Martin (‘JM’ in schedule) – January 2022 pdf

Required: Selected papers from NLP literature, see (evolving) schedule

Grading

10% - In-class participation
10% - Posted questions before each in-class selected paper presentation and possible quizzes
10% - In-class selected paper presentation
30% - Three Homeworks (10% each)
40% - Project, comprising proposal (5%), first version of report (5%), in-class presentation (10%), and final report (20%). Done in small groups.
Written homeworks and project components except for final project report must be submitted on the date listed in the schedule, by 23:59:59 AoE.
Final project report is due Monday, December 12, 2022, 10:00 AM PST
A deduction of 1/5 of the total possible score will be assessed for each late day. After five late days, you get a 0 on the assignment (and you should come talk to us because your grade will likely suffer!)
You have four late days, to be applied as you wish, throughout the entire class, for homeworks and project proposal / first report (NOT final report). No deduction will be assessed if a late day is used.
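The late policy above can be sketched as a small calculation. This is a minimal illustration, not the staff's grading code: the function name `late_adjusted_score` and its parameters are hypothetical, and it assumes that each free late day cancels exactly one day of lateness, with only the remaining days penalized. Free late days do not apply to the final report.

```python
from fractions import Fraction

LATE_DAY_BUDGET = 4               # free late days for the whole class, per the policy
PENALTY_PER_DAY = Fraction(1, 5)  # each penalized day costs 1/5 of the possible score


def late_adjusted_score(raw_score, max_score, days_late, free_days_used=0):
    """Return the score after the late penalty (hypothetical helper).

    Assumption: a free late day cancels one day of lateness; only the
    remaining days are penalized.
    """
    if not 0 <= free_days_used <= LATE_DAY_BUDGET:
        raise ValueError("free_days_used must be between 0 and 4")
    if free_days_used > days_late:
        raise ValueError("cannot use more free late days than days late")
    penalized_days = days_late - free_days_used
    deduction = PENALTY_PER_DAY * penalized_days * max_score
    # Five penalized days deduct the full possible score, so the result floors at 0.
    return max(Fraction(0), Fraction(raw_score) - deduction)


# A 90/100 homework turned in two days late, with no free late days applied:
print(late_adjusted_score(90, 100, days_late=2))                    # 50
# The same homework with two free late days applied:
print(late_adjusted_score(90, 100, days_late=2, free_days_used=2))  # 90
```

Under these assumptions, spending free late days on the assignments turned in latest avoids the largest deductions.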

Contact us

On Piazza, Slack, or in class/office hours. Please do not email (unless notified otherwise).

Topics

(subject to change per instructor/class whim) (will not necessarily be presented in this order):

Linguistic Stack (graphemes/phones - words - syntax - semantics - pragmatics - discourse)

Tools: Corpora, corpus statistics, data cleaning and munging; annotation and crowdwork; evaluation; models/approaches (rule-based, automata/grammars, perceptron, logistic regression, neural network models); effective written and oral communication

Components/Tasks/Subtasks: Language Models; Syntax (POS tags, constituency tree, dependency tree, parsing); Semantics (lexical, formal, inference tasks); Information Extraction (Named Entities, Relations, Events); Generation (Machine Translation, Summarization, Dialogue, Creative Generation)

Schedule of Classes

Aug 22

intro, applications: E 1

project assignment out (due 9/19)

Aug 24

data processing, data resources, evaluation, annotation: E 4.5, JM 2, 4.9, Nathan Schneider’s unix notes, Unix for poets, sculpting text

Aug 29

linear classifiers: E 2.2, 2.3, 2.4. JM 4, 5, Thumbs up? Sentiment Classification using Machine Learning Techniques, Goldwater probability tutorial. The Perceptron (Rosenblatt 1958) (optional)

HW1 out (due 9/14)

Aug 31

nonlinear classifiers, backprop, gradient descent

E 3. JM 7.2–7.4, 7.6.

Sep 5

LABOR DAY NO CLASS

Sep 7

distributional feature representations: PPMI, LSA, word2vec, bilingual dictionary induction

E 14.3, 14.5–6. JM 6.

(Sep 9: Drop deadline for refund, without W)

Sep 12

ngram language models: E 6.1–2, 6.4, 7.5, 7.7. JM 3, Exploring the limits of language modeling

Sep 14

recurrent and transformer language models, ELMo, BERT

E 6.3, JM 9. Attention is all you need

HW1 due

Sep 19

Transformers, actually

project proposal due

Sep 21

ethics

The Social Impact of Natural Language Processing, Energy and Policy Considerations for Deep Learning in NLP, Model Cards for Model Reporting

Sep 26

ROSH HASHANAH NO CLASS

HW2 out (due 10/12)

Sep 28

POS tags, HMMs, treebanks

E 7.1–7.4, JM 8.1–8.5, 12 (through 12.4.2)

Oct 3

constituencies, cky, dependencies, shift-reduce

E 10.1–10.4, JM 13.1–13.4, 14–14.4.4

Oct 5

YOM KIPPUR NO CLASS

Oct 10

Omey - TweetSpin - Propaganda detection in Social Media using Multi-View Representations; Alireza - Hate Speech and Counter Speech Detection: Conversational Context Does Matter

Oct 12

shift-reduce

Smit - Aligning to Social Norms and Values in Interactive Narratives; Syeda - How Gender Debiasing Affects Internal Model Representations, and Why It Matters

HW2 due

Oct 17

machine translation: history, evaluation, data

“Oh, yes, everything’s right on schedule, Fred”: Mahak - Time Waits for No One! Analysis and Challenges of Temporal Misalignment; Amin - Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora

Oct 19

machine translation: statistical, recurrent, transformer, transfer learning, unsupervised, nonautoregressive

E 18, JM 10: Zhuoyu - Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection; Hirona - Recognition of They/Them as Singular Personal Pronouns in Coreference Resolution

Oct 24

MEGA (Guest Lecture by Xuezhe Ma)

Mega Paper; Annotated S4: Robert - Non-Autoregressive Machine Translation: It’s Not as Fast as it Seems; Yixiang - Ask Me Anything in Your Native Language

HW3 out (due 11/16)

Oct 26

semantics: logical/compositional, frames and roles, amr, distributional

E 12.1, 12.2, 13.1, 13.3, 14.1–3, 14.6–8, JM 15.1–3, 6: Charles - Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models; Mina - How Conservative are Language Models? Adapting to the Introduction of Gender-Neutral Pronouns

Oct 31

Vision-and-language Models (Guest lecture: Xuezhe Ma)

Denoising Diffusion Probabilistic Models, Score-Based Generative Modeling through Stochastic Differential Equations, Variational Diffusion Models, Understanding Diffusion Models: A Unified Perspective: Basem - Explaining Dialogue Evaluation Metrics using Adversarial Behavioral Analysis; Leticia - Learning Dialogue Representations from Consecutive Utterances

Nov 2

dialogue: task-oriented and chatbots

E 19.3, JM 24, The original ELIZA: Zhivar - Semantic Diversity in Dialogue with Natural Language Inference; Fazle - On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?

Project Report Version 1 due

Nov 7

question answering, information retrieval

JM 23: James - Learning as Conversation: Dialogue Systems Reinforced for Information Acquisition; Yanze - Meet Your Favorite Character: Open-domain Chatbot Mimicking Fictional Characters with only a Few Utterances

Nov 9

natural language inference and common sense tasks

Modeling Semantic Containment and Exclusion in Natural Language Inference, Commonsense Reasoning and Commonsense Knowledge in Artificial Intelligence: Sophie - Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation; Jiarui - JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering

Nov 14

prompts, multi-task large language models: InstructGPT paper (pp. 1–20), Large Language Models are Human-Level Prompt Engineers; Darpan - Robust Conversational Agents against Imperceptible Toxicity Triggers

Nov 16

adapters, prefix tuning, few-parameter fine-tuning

TBD: Sahana - Learning to Transfer Prompts for Text Generation

HW3 due

Nov 21

information extraction: JM 17, E 17, 25 years of IE

Nov 23

THANKSGIVING BREAK; NO CLASS

Nov 28

Project presentations

Nov 30

Project presentations