Schedule

Schedule: Reading and homework assignments

  • THIS SCHEDULE IS SUBJECT TO CHANGE
  • Check Canvas for specific due dates and times.


Week 1: NLP and Basic Text Processing. Chaps 1 and 2

January 8: The NLP Pipeline

  • General applications of NLP
  • The Holy Grail: Deep understanding of extended discourses such as stories or dialogues.
  • Installing NLTK, examples of how to use it
  • Reading in language data from files, splitting sentences, etc.
  • Counting words and modeling their frequency (Ch. 1)
  • Collocations and Bigrams
  • Reading: Chapters 1 and 2 (reading in natural language data from files)
  • Homework 1 assigned.
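The counting topics above can be sketched in plain Python (the toy sentence is made up; in class you would use NLTK's `FreqDist` and `nltk.bigrams`, but the idea is the same):

```python
from collections import Counter

# A toy corpus stands in for text read from a file.
text = "the quick brown fox jumps over the lazy dog and the quick cat"
tokens = text.split()  # naive whitespace tokenization

# Unigram frequencies (what nltk.FreqDist computes).
unigrams = Counter(tokens)

# Bigram frequencies (what feeding nltk.bigrams into a Counter gives you).
bigrams = Counter(zip(tokens, tokens[1:]))

print(unigrams.most_common(2))    # 'the' appears 3 times, 'quick' twice
print(bigrams[("the", "quick")])  # the bigram 'the quick' appears twice
```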

January 10: Basic Text Processing. Unigrams, Words, POS

  • Review NLTK Lexical resources (Read Chapter 2)
  • Tokenization (READ ch. 3.1.1)
  • POS Word categorization & generalization
  • Stemming
  • Collocations
  • Lexical Meaning: WordNet (READ ch. 2.5)
  • Synonyms and Synsets.
  • Word Senses
  • Read Chapter 3 for next week.
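To illustrate the idea behind stemming, here is a deliberately naive suffix-stripper (the suffix list and length cutoff are invented for illustration; NLTK's `PorterStemmer` applies a much more careful rule cascade):

```python
# A simplified sketch of what a stemmer does: strip common suffixes.
def naive_stem(word):
    for suffix in ("ing", "ed", "es", "s"):  # order matters: longest first
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

print(naive_stem("jumping"))  # jump
print(naive_stem("dogs"))     # dog
print(naive_stem("was"))      # was (too short to strip)
```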

Week 2: Words, sentences, frequencies, POS, ngrams. Chaps 3 and 5

Homework 1 DUE

January 15: Lexical Resources. Moving beyond Words and POS

  • PYTHON: Review chapter 4 if you are learning Python as you go along.
  • Review HW1.
  • Homework 2 assigned.
  • Regular Expressions (Read ch 3.4, 3.5, 3.6, 3.7)
  • POS tagging patterns with Regexp 
  • What is an ontology?
  • Wordnet API, how it works (READ ch. 2.5)
  • Synonyms and Synsets.
  • Semantic Relatedness
  • Brief introduction to Distributional Semantics and Word Embeddings (Word2Vec)
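Regexp-based POS tagging patterns can be sketched with the standard `re` module (the handful of suffix patterns here are illustrative, not a serious tagger; NLTK's `RegexpTagger` works on the same principle):

```python
import re

# Suffix-based tag guesses in the spirit of nltk.RegexpTagger.
patterns = [
    (r".*ing$", "VBG"),  # gerunds
    (r".*ly$",  "RB"),   # adverbs
    (r".*ed$",  "VBD"),  # past-tense verbs
    (r".*s$",   "NNS"),  # plural nouns
    (r".*",     "NN"),   # default: noun
]

def regexp_tag(word):
    for pattern, tag in patterns:
        if re.match(pattern, word):
            return tag
    return "NN"

print([(w, regexp_tag(w)) for w in ["running", "quickly", "walked", "dogs", "table"]])
```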

January 17: Processing Bits of Language above the Word

  • POS tagging & applications (READ ch. 5.1 & 5.2)
  • Review of Probability and Conditional Probability
  • N-Gram language models (READ Chapter 5.4 and 5.5)
  • Introduction to Text Classification
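The conditional-probability core of a bigram language model can be sketched as follows (toy corpus invented; in NLTK one would build this with `ConditionalFreqDist`):

```python
from collections import Counter, defaultdict

# Toy corpus; in class this would come from an NLTK corpus reader.
tokens = "i like cheese i like bread i eat cheese".split()

# Count how often each word follows each other word.
bigram_counts = defaultdict(Counter)
for w1, w2 in zip(tokens, tokens[1:]):
    bigram_counts[w1][w2] += 1

def p(w2, w1):
    """Maximum-likelihood estimate of P(w2 | w1) = count(w1 w2) / count(w1 *)."""
    total = sum(bigram_counts[w1].values())
    return bigram_counts[w1][w2] / total if total else 0.0

print(p("like", "i"))       # 2 of the 3 bigrams starting with 'i'
print(p("cheese", "like"))  # 1 of the 2 bigrams starting with 'like'
```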

Week 3: Natural Language Understanding I 

Homework 2 DUE

January 22: Text Classification

  • READING: Chapter 5, section 3 (Python dictionaries); Chapter 6, sections 6.1 to 6.4
  • Homework 3 assigned.
  • Text Classification Using Sentiment Lexicons, Lexical Resources
  • Classifying Texts or Utterances into Categories.
  • Defining an Experiment.
  • Example: Restaurant Reviews (used in the homework).
  • Constructing Feature Representations of Texts. 
  • Features for POS, features for words (unigrams), Bigram Features.
  • Sentiment Lexicons: LIWC (Linguistic Inquiry and Word Count) and LIWC features
  • How to do error analysis on your classifier's predicted output.
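Feature representations of the kind listed above can be sketched as a function returning the dict shape NLTK classifiers expect (the word lists here are a tiny made-up sentiment lexicon, not LIWC):

```python
# Minimal feature extractor: a dict mapping feature names to values.
POSITIVE = {"good", "great", "tasty"}
NEGATIVE = {"bad", "awful", "bland"}

def review_features(text):
    tokens = text.lower().split()
    features = {}
    for tok in tokens:
        features["contains({})".format(tok)] = True  # unigram features
    # Lexicon-count features, in the spirit of LIWC category counts.
    features["pos_count"] = sum(t in POSITIVE for t in tokens)
    features["neg_count"] = sum(t in NEGATIVE for t in tokens)
    return features

feats = review_features("Great food but bland service")
print(feats["pos_count"], feats["neg_count"])  # 1 1
```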

January 24:  

  • How many NLU problems can be cast as classification problems?
  • How to figure out what features are useful.
  • Example: Movie Reviews: Thumbs up or Thumbs Down?
  • Examining the most important features.
  • The scikit-learn package for classifiers, usable from NLTK via its SklearnClassifier wrapper.
  • Decision Tree learners
  • Differences between types of classifiers
  • How Naive Bayes works.
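The Naive Bayes computation itself can be worked through on a tiny invented corpus (this is the same arithmetic `nltk.NaiveBayesClassifier` performs under the hood, here with add-one smoothing):

```python
import math
from collections import Counter, defaultdict

# Tiny labeled corpus (invented) of restaurant-review snippets.
train = [
    ("great tasty food", "pos"),
    ("great service", "pos"),
    ("awful bland food", "neg"),
    ("bad service", "neg"),
]

class_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)
vocab = set()
for text, label in train:
    for w in text.split():
        word_counts[label][w] += 1
        vocab.add(w)

def log_posterior(text, label):
    # log P(label) + sum over words of log P(word | label), add-one smoothed.
    logp = math.log(class_counts[label] / len(train))
    total = sum(word_counts[label].values())
    for w in text.split():
        logp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
    return logp

def classify(text):
    return max(class_counts, key=lambda label: log_posterior(text, label))

print(classify("great food"))  # pos
```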

Week 4: More classification

Homework 3 DUE

January 29: Classifiers, feature analysis, word embeddings. 

  • HW4: Extend HW3 with more features, learners, analysis.
    • Try word embeddings, decision tree learners, and SVMs in scikit-learn
  • Feature Selection
  • Different types of classifiers
  • Distributional Semantics: Background to Word Embeddings
  • Word Embeddings: how they are built, how you can use in classification 
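How embeddings get used comes down to vector similarity; here is a cosine-similarity sketch over toy 3-dimensional vectors (the numbers are invented; real Word2Vec vectors have hundreds of dimensions, but the computation is the same):

```python
import math

# Toy "embeddings" (made-up numbers) for three words.
vectors = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.90],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine(vectors["king"], vectors["queen"]))  # close to 1
print(cosine(vectors["king"], vectors["apple"]))  # much smaller
```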

January 31:  The Lexicon, Verbs and their subcategorization.

  • Homework 5 provides sample problems to help you review for the midterm.
  • More on Lexical Subcategorization (Re-read Chapter 2)
  • Lexical Meaning: Wordnet and Verbnet
  • Review Wordnet
  • Verbs and their dependents

Week 5:

Homework 4 DUE

February 5

  • Midterm review
  • How Parsers Work, Part 1.

February 7: Midterm

Bring a PINK SCANTRON.

  • Midterm topics: Probability, Conditional Probability, N-Gram Language Models, POS Tagging, Stemming, Collocations, Text Classification, Naive Bayes, WordNet, VerbNet
  • Multiple choice.

Week 6:

February 12: Starting on the projects!! Natural Language Understanding II

Homework 6 assigned. 

  • Factoid QA
  • Baseline QA system. String operations, sentence selection/ranking.
  • Types of Questions in baseline QA: Who, What, When, Where
  • Identifying likely phrases and sentences, REGEXP patterns.
  • Sample Code Stubs for baseline system
  • Evaluation metrics: Precision, Recall, F-measure
  • Maximizing Recall at the expense of Precision
  • Setup of QA task and demo of scoring
  • The evolution of parsing algorithms.
  • How Parsers work. Probabilistic Parsing.
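The evaluation metrics can be computed directly from prediction sets (the sets here are made up). Note the recall/precision trade-off: returning every sentence drives recall to 1 while precision collapses.

```python
# Precision, recall, and F-measure for a sentence-selection QA setting.
predicted = {"s1", "s2", "s3", "s4"}  # sentences the system returned
relevant  = {"s2", "s4", "s5"}        # sentences that actually answer the question

true_positives = len(predicted & relevant)
precision = true_positives / len(predicted)  # 2/4 = 0.5
recall    = true_positives / len(relevant)   # 2/3
f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean

print(precision, recall, round(f_measure, 3))
```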

February 14:

Week 7: 

February 19: Chunking, Sentence Structure and Parsing I

February 21:  MORE Natural Language Understanding for QA

  • Chunking (Shallow Parsing vs. Parsing). Read Ch. 7.  
  • Sentence Structure. READ ch. 8.1-8.3, 8.5
  • QA pipeline
  • Baseline QA system using string operations and sentence ranking
  • Next Steps: Using Syntax
  • Stanford Dependency parse structure vs. constituent structure
  • Dependency Structures and Relations
  • Constituent Structures and CFG rules
  • Where can we use text classification? How many NLU problems can be cast as classification problems?
  • POS tagging as a classification problem
  • Revisit N-Gram language models (READ Chapter 5.4 and 5.5)
  • Introduction to Parsing
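Shallow parsing can be sketched as a hand-rolled NP chunker over a POS-tagged sentence (the tagged input is supplied by hand; this mimics `nltk.RegexpParser` with a grammar like `NP: {<DT>?<JJ>*<NN.*>+}`, without the tree output):

```python
# Group runs of DT/JJ/NN* tags into noun-phrase chunks.
tagged = [("the", "DT"), ("quick", "JJ"), ("fox", "NN"),
          ("jumped", "VBD"), ("over", "IN"),
          ("the", "DT"), ("lazy", "JJ"), ("dogs", "NNS")]

def np_chunks(tagged):
    chunks, current = [], []
    for word, tag in tagged:
        if tag in ("DT", "JJ") or tag.startswith("NN"):
            current.append((word, tag))        # extend the candidate chunk
        else:
            if any(t.startswith("NN") for _, t in current):
                chunks.append([w for w, _ in current])  # keep noun-headed chunks
            current = []
    if any(t.startswith("NN") for _, t in current):
        chunks.append([w for w, _ in current])
    return chunks

print(np_chunks(tagged))  # [['the', 'quick', 'fox'], ['the', 'lazy', 'dogs']]
```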

Week 8:  Question Answering  II

HOMEWORK 6 is DUE

February 26: Question Answering II

  • Question Answering using Syntax; working with NLU representations for Question Answering
  • Homework 7 assigned.
  • IMPROVING PRECISION.
  • Grammars and Parsing
  • Constituency and Dependency Tree Readers
  • Chunking and Parsing: How to search trees
  • Pattern Matching on Dependency Relations
  • Ranking possible responses

February 28:

  • Working with NLU representations.
  • Syntactic Structure and Coordination
  • Prepositional Phrase Attachments
  • Dependency vs. Constituent Structures II
  • Answering Questions from NLU/parsing representations

Week 9: Question Answering III: Lexicons & Lexical Semantics 

Homework 7 DUE.

March 5:  Using VerbNet and WordNet API in QA

  • HW8 (final homework) assigned: the final QA competition.
  • Using WordNet and Verbnet. New kinds of questions.
  • Increasing Precision of Answers
  • Verbs and their dependents, VerbNet semantic role types
  • Verbnet and Wordnet API, how it works
  • Word Sense Disambiguation; .DICT files provided with HW8.
  • Constituent and Dependency Trees, finding subjects, etc.
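Word sense disambiguation can be sketched in the simplified-Lesk style: pick the sense whose gloss overlaps most with the context. The two glosses for "bank" below are paraphrased by hand, not pulled from WordNet (where you would use `wn.synsets("bank")` and each synset's definition):

```python
# Hand-written glosses standing in for WordNet sense definitions.
senses = {
    "bank.n.01": "sloping land beside a body of water such as a river",
    "bank.n.02": "a financial institution that accepts deposits and lends money",
}

def lesk(context, senses):
    """Return the sense whose gloss shares the most words with the context."""
    context_words = set(context.lower().split())
    def overlap(gloss):
        return len(context_words & set(gloss.split()))
    return max(senses, key=lambda s: overlap(senses[s]))

print(lesk("she deposited money at the bank", senses))  # bank.n.02
print(lesk("they sat on the river bank", senses))       # bank.n.01
```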

March 7: No class. Work on your projects

  • Special section during class time

Week 10: Question Answering Competition & Poster presentations 

March 12: Poster session 

March 14: Poster session


FINALS WEEK

Homework 8 DUE Monday.

FINAL SLOT: Wednesday, March 20, 8:00–11:00 AM.

Bring a scantron.