CSCI 544 Fall 2018 course page

Jonathan May and Nanyun (Violet) Peng

Website: https://www.isi.edu/~jonmay/cs544_fa18_web/
Lectures: SAL 101, Wednesdays and Fridays 8:00-9:50 am
Instructor office hours: Jonathan May or Nanyun (Violet) Peng, RTH 512, Wednesdays and Fridays 10-11 am or 11 am-12 pm by appointment
Textbook: None required. Speech and Language Processing, 2nd Edition [1] is optional but badly out of date. The new version (called "jm" in the reading selection notes below) is incomplete but preferred [2]. All required readings from jm and elsewhere will be listed in the syllabus.
TA office hours: Xusen Yin, 10:30am-12:30pm Mon., SAL Lab; Ramesh Manuvinakurike, 11:30am-1:30pm Wed., SAL Lab; Sarik Ghazarian, 3pm-5pm Wed., SAL Lab
Quizzes (3% of grade): Approximately 25 x 0.12% of grade each. Multiple-choice online tests taken mid-class to break up the monotony. Open note, open internet, open discussion.
Homeworks (48.6% of grade): 6 x 8.1% of grade each. A mix of programming and written assignments, submitted electronically and due at 11:59 PM Pacific time on the dates in the schedule below.
Project (8.1% of grade): Can be done in groups of up to 4. Preliminary part due midway through the class; the project itself is due on the last class day. (This may be replaced with a seventh homework.)
Late days: Four (4) cumulative, with no more than two (2) per assignment. 50% penalty for the first day; no credit thereafter.
Midterm (15% or 25% of grade, whichever is more advantageous): Friday, October 5, 8:00-9:45 am. Covers lectures, readings, and homeworks from 8/22-10/3.
Final exam (15% or 25% of grade, whichever is more advantageous): Wednesday, December 5, 8:00-10:00 am. Covers lectures, readings, and homeworks from 8/22-11/30 (i.e. the entire class).
Contact us: On Piazza or in class/office hours. Please do not email (unless notified otherwise).

A note about the midterm/final grading: whichever you do better on will count as 25% of your grade and the other one will count as 15% of your grade. Together the tests will count as 40% of your grade.
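
The exam weighting in the note above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an official grade calculator: the function name `exam_contribution` is hypothetical, and it assumes each exam score is expressed as a fraction between 0 and 1 with no curving or other adjustment.

```python
def exam_contribution(midterm: float, final: float) -> float:
    """Combined exam contribution to the course grade, in percentage points.

    Assumes (for illustration) that both scores are fractions in [0, 1].
    The better score is weighted 25% and the other 15%, so the two exams
    together are worth at most 40 points.
    """
    better = max(midterm, final)
    worse = min(midterm, final)
    return 25.0 * better + 15.0 * worse


if __name__ == "__main__":
    # Example: 80% on the midterm, 90% on the final.
    print(round(exam_contribution(0.80, 0.90), 2))
```

In this example the final is the stronger exam, so it receives the 25% weight: 0.25 x 90 + 0.15 x 80 = 34.5 of the 40 points available from the two tests. Swapping the two scores gives the same result, since only the ordering of the scores matters.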

Syllabus/schedule - subject to change - latest version is always on the website!

instructor | date | material | reading | HW out | HW due
JM | 8/22 | intro, applications | jm v2 c1 [3], hirschberg & manning [4] | |
JM | 8/24 | corpora, text processing, words, regular expressions (lec 2 starting code [5]) | NLTK ch. 2 [6], jm v3 c2 [7], nathan schneider's notes [8], Unix for poets [9], sculpting text [10] | HW 1 (due 9/7) |
JM | 8/29 | Morphology, Finite-State Automata, FSA relationship to regex, finite state transducers | jm v2 c3 [11] | |
JM | 8/31 | probability | Goldwater probability tutorial [12] | HW 2 (due 9/14) |
JM | 9/5 | Classifiers, features, naive Bayes, perceptron, logistic regression | jm v3 c6 [13], Eisenstein notes pp. 21-48 [14] | |
JM | 9/7 | pos tagging, hmm, search | jm v3 c9 [15], blunsom notes [16], collins notes (optional, more detailed) [17] | | HW 1
NP | 9/12 | parsing and syntax 1: treebanks, evaluation, cky, grammar induction | penn treebank [18], jm v3 c11 [19], jm v3 c12 [20], jm v3 c13 [21] | HW 3 = project proposal (due 9/26) |
NP | 9/14 | parsing and syntax 2: pcfgs, restructuring, lexicalization, smoothing, beaming | chiang notes [22] | | HW 2
NP | 9/19 | parsing and syntax 3: dependencies, mst and shift reduce algorithms | jm v3 c14 [23] | HW 4 (due 10/10) |
NP | 9/21 | ngram LMs | jm v3 c4 [24] | |
NP | 9/26 | smoothing, interpolation | (optional) chen and goodman [25] | | HW 3 (pp)
NP | 9/28 | feed forward lm, rnn lm | jm v3 c8 [26] | |
JM | 10/3 | REVIEW | | |
 | 10/5 | MIDTERM | | |
NP | 10/10 | classical lexical semantics | jm v3 c17 [27] | | HW 4
NP | 10/12 | distributional lexical semantics | jm v3 c15 [28], jm v3 c16 [29] | HW 5 (due 10/26) |
NP | 10/17 | RNNs, LM and semantics | jm v3 c7 [30], c9 [31] | |
NP | 10/19 | Information Extraction | jm v3 c21 [32] | HW 6 (due 11/9) |
NP | 10/24 | Information Extraction: CRF | Collins's Note [33], Sutton and McCallum's Note [34] | | project rewrite (moved to 10/31)
NP | 10/26 | Neural CRF | Lample et al. [35] | | HW 5
JM | 10/31 | MT evaluation and alignment | arcturan and centauri [36], Bleu [37], koehn slides [38] | | project rewrite
JM | 11/2 | Statistical MT | Brown et al. [39] (mathy bits of models 3+ can be skimmed), Knight workbook [40], mert [41] | |
JM | 11/7 | Neural MT | neubig tutorial [42] or Koehn Chapter (section 13.5) [43] | |
JM | 11/9 | Summarization | jm v2 c23.3-23.7 (see blackboard), abi see blog post [44] | | HW 6
JM | 11/14 | Task-Oriented Dialogue | jm v3 c24 [45], jm v3 c25 [46] | |
JM | 11/16 | Chit-Chat Dialogue | ritter et al. [47] | |
 | 11/21 | NO CLASS | | |
 | 11/23 | NO CLASS | | |
JM | 11/28 | Leftovers: annotation, common sense, Entailment, generation | overview of kappa [48], bowman snli paper [49], csr: davis and marcus [50] | |
JM/NP | 11/30 | Review | | | project
 | 12/5 | FINAL | | |



Footnotes

[1] https://www.amazon.com/Speech-Language-Processing-Daniel-Jurafsky/dp/0131873210/
[2] https://web.stanford.edu/~jurafsky/slp3
[3] http://www.cs.colorado.edu/~martin/SLP/Updates/1.pdf
[4] https://cs224d.stanford.edu/papers/advances.pdf
[5] https://www.isi.edu/~jonmay/cs544_fa18_web/lec2.startercode.zip
[6] http://www.nltk.org/book/ch02.html
[7] https://web.stanford.edu/~jurafsky/slp3/2.pdf
[8] https://github.com/nschneid/unix-text-commands
[9] https://www.cs.upc.edu/~padro/Unixforpoets.pdf
[10] http://matt.might.net/articles/sculpting-text/
[11] http://ling.umd.edu/~idsardi/620/Jurafsky/jurafsky2000-3.pdf
[12] http://homepages.inf.ed.ac.uk/sgwater/teaching/general/probability.pdf
[13] https://web.stanford.edu/~jurafsky/slp3/6.pdf
[14] https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf
[15] https://web.stanford.edu/~jurafsky/slp3/9.pdf
[16] http://digital.cs.usu.edu/~cyan/CS7960/hmm-tutorial.pdf
[17] http://www.cs.columbia.edu/~mcollins/courses/nlp2011/notes/hmms.pdf
[18] http://aclweb.org/anthology/J93-2004
[19] https://web.stanford.edu/~jurafsky/slp3/11.pdf
[20] https://web.stanford.edu/~jurafsky/slp3/12.pdf
[21] https://web.stanford.edu/~jurafsky/slp3/13.pdf
[22] https://www3.nd.edu/~dchiang/teaching/nlp/2016/notes/chapter13v2.pdf
[23] https://web.stanford.edu/~jurafsky/slp3/14.pdf
[24] https://web.stanford.edu/~jurafsky/slp3/4.pdf
[25] https://people.eecs.berkeley.edu/~klein/cs294-5/chen_goodman.pdf
[26] https://web.stanford.edu/~jurafsky/slp3/8.pdf
[27] https://web.stanford.edu/~jurafsky/slp3/17.pdf
[28] https://web.stanford.edu/~jurafsky/slp3/15.pdf
[29] https://web.stanford.edu/~jurafsky/slp3/16.pdf
[30] https://web.stanford.edu/~jurafsky/slp3/7.pdf
[31] https://web.stanford.edu/~jurafsky/slp3/9.pdf
[32] https://web.stanford.edu/~jurafsky/slp3/21.pdf
[33] http://www.cs.columbia.edu/~mcollins/crf.pdf
[34] https://people.cs.umass.edu/~mccallum/papers/crf-tutorial.pdf
[35] https://www.aclweb.org/anthology/N16-1030
[36] https://www.isi.edu/natural-language/mt/aimag97.pdf
[37] http://www.aclweb.org/anthology/P02-1040.pdf
[38] http://www.statmt.org/book/slides/04-word-based-models.pdf
[39] http://www.aclweb.org/anthology/J93-2003
[40] https://www.isi.edu/natural-language/mt/wkbk-rw.pdf
[41] http://www.aclweb.org/anthology/P03-1021
[42] https://arxiv.org/pdf/1703.01619.pdf
[43] https://arxiv.org/abs/1709.07809
[44] http://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html
[45] https://web.stanford.edu/~jurafsky/slp3/24.pdf
[46] https://web.stanford.edu/~jurafsky/slp3/25.pdf
[47] http://aclweb.org/anthology/D11-1054
[48] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3900052/
[49] https://nlp.stanford.edu/pubs/snli_paper.pdf
[50] https://cs.nyu.edu/davise/papers/CommonsenseFinal.pdf


jonmay@isi.edu