IS&T Research Computing Boot Camp - Natural Language Processing: With and without Neural Networks

  • Starts: 10:00 am on Wednesday, January 19, 2022
  • Ends: 4:00 pm on Wednesday, January 19, 2022
Prerequisites: No prior MATLAB experience is assumed (although it will be helpful). No experience with Machine Learning, AI, or Deep Learning is required.

Description: The vast majority of human communication and knowledge is encoded in “natural language”. This Boot Camp will focus on two techniques for “Natural Language Processing” (NLP) using computers. The first does not require explicit use of neural networks (NN), but the second does. We will first look at “word2vec”, a way of encoding words such that the encoding carries semantic meaning. We will then explore how this encoded meaning lets us turn mathematical operations into linguistic ones; for example, using simple addition and subtraction, we will see how to reconstruct analogies. In the next part we will explore a breakthrough NN-based NLP model called “GPT”. GPT first gained fame with the release of GPT-2, which is the version we will use. GPT-2 simply tries to predict the next word in a sequence; however, from this simple mechanism it is able to translate text, answer questions, summarize passages, and generate text passages that can be indistinguishable from human-created ones. We will explore how to use GPT-2-based inference to do a variety of tasks. Throughout the Boot Camp, attendees will work in small groups, applying all these principles and techniques firsthand. After the first and second sections of the Boot Camp, each group will come up with a small project that applies the techniques learned in a novel way.
675 Commonwealth Avenue, SCI B39

