Carl Sable

Professor of Computer Engineering

ECE 467: Natural Language Processing

Catalog Data:

This course focuses on computational applications involving the processing of written or spoken human languages. Content may vary from year to year. Theoretical subtopics will likely include word statistics, formal and natural language grammars, computational linguistics, hidden Markov models, and various machine learning methods. Applications covered will likely include information retrieval, information extraction, text categorization, question answering, summarization, machine translation, and speech recognition. Course work includes programming projects and tests.

Topics:

  1. Course Introduction.
  2. Tokenization, Words, and Morphology.
  3. N-grams and Conventional Language Models.
  4. Part-of-Speech Tagging.
  5. Information Retrieval and Text Categorization.
  6. Phrase Structure Grammars and Dependency Grammars.
  7. Natural Languages and Psycholinguistics.
  8. Parsing.
  9. Semantics.
  10. Feedforward Neural Networks.
  11. Word Embeddings.
  12. Recurrent Neural Networks, LSTMs, and GRUs.
  13. Encoder-Decoder Models, Attention, and Machine Translation.
  14. Advanced Topics (Character and Subword Embeddings, Question Answering Systems, Transformers, Contextual Word Embeddings, Ethics).

Course Outcomes:

  1. Knowledge of various subtopics of natural language processing, some in depth.
  2. Familiarity with many important NLP algorithms and methodologies.
  3. Experience developing three large NLP programming projects.

Assessment Methods:

The three large programming projects test in-depth understanding of important NLP algorithms. Quizzes or problem sets evaluate knowledge of other NLP topics at various levels of depth.