IS&T RCS Summer 2022 Trainings and Open House

Trainings: May 31 – June 29, 2022
Open House: June 14, 11am – 2pm

Registration is open for the RCS Summer 2022 Trainings.

This summer’s trainings include the usual two-hour tutorials as well as longer (4+ hour), in-person, project-oriented Boot Camp sessions focused on C, R, Git, Image Processing, and Natural Language Processing. In addition to the trainings, in honor of our return to in-person training, RCS is hosting an Open House and invites you to join us at 2 Cummington Mall for light refreshments and open office hours. RCS staff will be happy to answer your questions, give you feedback on your programming project, or simply say “Hello”.

  • All of the Boot Camp sessions are new as is the “Things to Know about Machine Learning” tutorial.
  • For hands-on sessions where you wish to use your own computer, please have the appropriate software installed on your computer before the tutorial starts.
  • Tutorials are tagged based on experience required (Beginner, Intermediate, or Advanced), location (details below), and if they are new.
  • Most of our summer trainings are offered only live in-person. There are four trainings being offered over Zoom; these have special considerations:
    • Please register at least three days in advance in order to be emailed the Zoom link.
    • Sessions will be recorded; keep your camera off if you do not want your image recorded. The recorded sessions may be made available to the BU community.
  • Videos and slides for past tutorials by RCS staff and vendors are available. Access to that page is restricted to the BU community and you must agree not to share the materials.

The IS&T Research Computing Services (RCS) group offers a tutorial series on programming, data analysis, high performance computing, and visualization three times each year. These tutorials are free and open to all members of the Boston University community.

The RCS tutorials cover concepts, techniques, and tools which researchers can use in their own computing environments. Many are designed to help you make effective use of the Boston University Shared Computing Cluster (SCC). The RCS staff can also deliver extra, or customized, tutorial sessions to your course, group, or lab. Please contact us at help@scc.bu.edu if you are interested.

Also in June, XSEDE is offering a free HPC Workshop. Currently, the agenda is available but the dates are not.

Register

Boot Camp Schedule

Tutorials Schedule

You may register for as many tutorials as you like. Registration is required and is accessed with your BU Kerberos password.

If you don’t have a Kerberos password, or if you find that a tutorial is full, or have any other questions, please send email to rcs-tutorial@bu.edu.

Tutorial Locations

2CM: 2 Cummington Mall, Room 107
SCI: Metcalf Science Center (SCI), 590 Commonwealth Avenue, Room B39
Zoom: Online over Zoom. Registered attendees will be sent the Zoom link for each tutorial by email 2-3 days before the tutorial starts, at which point registration for the tutorial will close.


Tutorial Descriptions and Times

RCS Boot Camp Sessions

Beginner Gentle Introduction to Programming in C, Part One (Hands‐on)

Instructor: Katia Bulekova (ktrn@bu.edu)

2CM Tuesday, May 31, 9:30am – 3:00pm

The goal of this immersive, hands-on Boot Camp is to build the foundation of programming in C. The many hands-on exercises will teach you key programming skills and make you feel confident about writing and compiling your C programs.

During the Boot Camp we will use the Shared Computing Cluster, so you do not need to install anything prior to the Boot Camp. Please feel free to bring your own laptop or use the computers available in our tutorial room.

This workshop aims to cover the following topics:

  • Basic C program: types and variables
  • Mathematical Operations
  • Conditional Statements
  • Loops
  • Functions in C
  • Pointers and Memory Management
  • Handling strings in C
  • Advanced Data Types
  • Handling Input and Output in C
  • Using Makefile for code compilation

Prerequisites: No prior programming experience is required, but it is helpful.

Beginner Gentle Introduction to Programming in C, Part Two (Hands‐on)

Instructor: Katia Bulekova (ktrn@bu.edu)

2CM Wednesday, June 1, 9:30am – 2:00pm

This tutorial is a continuation of Gentle Introduction to Programming in C, Part One. Please register for both parts.

Beginner MATLAB Image Processing using the SVD (Hands‐on)

Instructor: Josh Bevan (jbevan@bu.edu)

SCI Tuesday, May 31, 1:00pm – 5:00pm

In this Boot Camp we will explore the power of the Singular Value Decomposition (SVD) and its usefulness as a tool in Image Processing and Analysis. We will first explore the ideas underpinning the SVD and how it decomposes data into singular values and singular vectors. We will then apply this knowledge to examine how it can be used for image compression, component analysis, facial recognition/classification with clustering, and feature detection.
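As a rough illustration of the compression idea (the Boot Camp itself uses MATLAB; this is only a standard-library Python sketch), the code below estimates the largest singular value and its singular vectors by power iteration and rebuilds a rank-one approximation. The function names and the toy 4x4 "image" are hypothetical, and the image is deliberately chosen to be exactly rank one so that a single term reproduces it; real images need many more terms.

```python
import math


def top_singular_triplet(a, iterations=50):
    """Estimate the largest singular value of `a` (a list of rows) and its
    left/right singular vectors using power iteration on A^T A."""
    rows, cols = len(a), len(a[0])
    v = [1.0] * cols  # arbitrary nonzero starting vector
    for _ in range(iterations):
        # u = A v
        u = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = A^T u, then normalize to get the next estimate of v
        w = [sum(a[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(a[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    u = [x / sigma for x in u]
    return sigma, u, v


def rank_one(sigma, u, v):
    """Rebuild the matrix sigma * u * v^T from one singular triplet."""
    return [[sigma * ui * vj for vj in v] for ui in u]


# Hypothetical toy "image": the outer product of [1, 2, 3, 4] with itself,
# so it has exactly one nonzero singular value.
image = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [3.0, 6.0, 9.0, 12.0],
    [4.0, 8.0, 12.0, 16.0],
]

sigma, u, v = top_singular_triplet(image)
approximation = rank_one(sigma, u, v)
# Storage: 16 pixel values versus 4 + 4 + 1 numbers for (u, v, sigma).
```

Keeping only the largest few singular triplets is what makes the SVD useful for compression: an m-by-n image approximated with k terms needs k(m + n + 1) numbers instead of m times n.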

Throughout the Boot Camp attendees will work in small groups applying all these principles/techniques firsthand. At the end of the Boot Camp, each group will come up with a small project that applies the techniques learned in a novel way.

Beginner Git and GitHub for Version Control & Collaboration (Hands‐on)

Instructor: Katia Bulekova (ktrn@bu.edu)

2CM Monday, June 6, 9:30am – 3:00pm

In this interactive, hands-on Git workshop, attendees will learn how to use Git for version control and collaboration with other code developers.

We will go over the fundamentals of version control using Git and GitHub and practice the essential git commands.

Participants will learn how to systematically store different versions of their code, recover previous versions, and safely integrate changes. We will also explore the connection with GitHub, productive collaboration with others, working with branches, and resolving conflicts.

We recommend that you bring your own computer with a recent version of Git installed. You should also have a GitHub account.

Prerequisites: No prior programming experience is required.

Beginner Building R Packages, Part One (Hands‐on)

Instructor: Katia Bulekova (ktrn@bu.edu)

2CM Tuesday, June 7, 9:30am – 2:00pm

In this two-day workshop, you will learn the process of creating an R package from scratch. In addition, we will go over the best practices in building R packages, including how to test that functions execute properly, create informative documentation and examples, and use unit tests to ensure that the various components of the package behave as intended. Finally, we will go over the workflow of using Git for publishing the new package.

Prerequisites: Basic knowledge of the R language; ability to write and use R functions.

We recommend that you bring your own computer with a recent version of R and RStudio installed.

Beginner Building R Packages, Part Two (Hands‐on)

Instructor: Katia Bulekova (ktrn@bu.edu)

2CM Thursday, June 9, 9:30am – 2:00pm

This tutorial is a continuation of Building R Packages, Part One. Please register for both parts.

Beginner MATLAB Natural Language Processing (Hands‐on)

Instructor: Josh Bevan (jbevan@bu.edu)

SCI Tuesday, June 7, 1:00pm – 5:00pm

Human communication and knowledge are encoded in “natural language”; working with that encoded information computationally is known as “Natural Language Processing”. We will first look at “word2vec”, a way of encoding words that carries semantic meaning with it. We will then explore how this encoded semantic meaning lets us turn mathematical operations into linguistic ones. For example, it is possible to compute analogies using only addition and subtraction.
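The analogy arithmetic can be sketched in a few lines of standard-library Python (the Boot Camp itself uses MATLAB). The three-dimensional vectors below are made up by hand purely for illustration; real word2vec embeddings are learned from text and typically have hundreds of dimensions. The `analogy` helper is a hypothetical name, not part of any library.

```python
import math

# Hypothetical, hand-made embeddings: roughly (royalty, male, female).
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def analogy(a, b, c):
    """Answer 'a is to b as ? is to c' by computing a - b + c and returning
    the closest remaining word by cosine similarity."""
    target = [x - y + z for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


print(analogy("king", "man", "woman"))  # prints "queen" for these toy vectors
```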

In the next part we will explore a breakthrough neural-network based NLP model called “GPT”. GPT first gained fame with the release of GPT-2. It simply tries to predict the next word in a sequence, yet from this simple mechanism it is able to translate text, answer questions, summarize passages, and generate text passages that can be indistinguishable from human-created ones. We will explore how to use GPT-2 based inference to do a variety of tasks.
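To make "predict the next word" concrete, here is a deliberately tiny stand-in written in standard-library Python: a bigram counter that predicts the word most often seen after the current one. This is not GPT and not a neural network; it only illustrates the prediction task. The corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical training text.
corpus = "the cluster runs jobs and the cluster stores data and the cluster runs code"
model = train_bigrams(corpus)

print(predict_next(model, "cluster"))  # prints "runs" (seen twice, "stores" once)
```

A model like GPT-2 replaces the lookup table with a neural network that conditions on a long context rather than a single preceding word.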

Throughout the Boot Camp attendees will work in small groups applying all these principles/techniques firsthand. At the end of the Boot Camp, each group will come up with a small project that applies the techniques learned in a novel way.

Research Computing Basics Tutorials

Beginner Introduction to Linux (Hands‐on)

Instructor: Augustine Abaris (augustin@bu.edu)

Zoom Thursday, June 2, 10:00am – 12:00pm
2CM Friday, June 3, 10:00am – 12:00pm

This tutorial will give attendees a hands-on introduction to Linux. Topics covered will include a short history of Linux, logging in with ssh, the Bash shell and shell scripts, I/O redirection (pipes), file system navigation, and job control. Time permitting, attendees will edit, compile, and run a simple C program.

If you have not connected to the SCC from your home machine before, please read and follow these instructions prior to attending the tutorial.

Beginner Introduction to BU’s Shared Computing Cluster (Hands‐on)

Instructor: Aaron Fuegi (aarondf@bu.edu)

Zoom Thursday, June 2, 1:00pm – 3:00pm
2CM Friday, June 3, 1:00pm – 3:00pm

This tutorial will introduce Boston University’s Shared Computing Cluster (SCC) in Holyoke, MA. This Linux cluster has more than 21000 processors and over 9 petabytes of storage available for Research Computing by students and faculty on the Charles River and BUMC campuses. A very large number of software packages for programming, mathematics, data analysis, plotting, statistics, visualization, and domain-specific disciplines are available as well on the SCC. You will get a general overview of the SCC and the facility that houses it and then a hands-on introduction covering connecting to and using the SCC for new users. This tutorial will cover a few basic Linux commands but we strongly encourage people to also take our more extensive “Introduction to Linux” tutorial.

There will also be ample time for questions of all types about the SCC.

For those in the BU community interested in using a particular package on the SCC, after taking this tutorial we also recommend viewing one of our short videos on that package if one is available. After that, if you have questions, you can sign up for one of our Office Hours sessions.

Please read and follow these instructions prior to attending the tutorial.

Beginner Research Computing Office Hours – R, SAS, and Stata

Instructor: Katia Bulekova (ktrn@bu.edu)

Zoom Wednesday, June 8, 1:00pm – 2:00pm

During Research Computing Office Hours, our staff will be happy to answer any question you might have related to using R, SAS, and/or Stata on the Shared Computing Cluster. If you are new to the SCC, we highly recommend you first attend one or more of our introductory SCC tutorials (“Introduction to BU’s Shared Computing Cluster” or “Introduction to Linux”) or watch some of our introductory videos.

During our Office Hours we will be happy to answer questions, assist with use of the cluster, and help you get started with using the SCC. For more complex questions, emailing help@scc.bu.edu to receive tailored assistance is probably a better option.

Intermediate Intermediate Usage of the SCC (Lecture)

Instructor: Katia Bulekova (ktrn@bu.edu)

2CM Friday, June 10, 10:00am – 12:00pm
Zoom Tuesday, June 14, 10:00am – 12:00pm

This tutorial will provide some more advanced techniques and common strategies used for interacting with the Shared Computing Cluster and its resources.

The topics discussed during the tutorial include:

  • Customizing your environment
  • Parallel computing on the SCC
  • Job monitoring (CPU and memory usage)
  • Profiling programs for performance optimization
  • General optimization strategies

Prerequisites: some prior experience with high performance computing or attendance at our “Introduction to BU’s Shared Computing Cluster” tutorial.

Beginner RCS Open House

Various RCS staff members will be available to help

2CM Tuesday, June 14, 11:00am – 2:00pm

In honor of returning to in-person training, RCS is hosting an Open House and invites you to join us at 2 Cummington Mall for light refreshments and open office hours. RCS staff will be happy to answer your questions, give you feedback on your programming project, or simply say “Hello”. Registration is appreciated but not required.

Computer Programming Tutorials

Introduction to C++, Part One (Hands-on)

Instructor: Brian Gregor (bgregor@bu.edu)

2CM Thursday, June 16, 1:00pm – 3:00pm

C++ is an enduringly popular compiled language that is well-suited for the implementation of high performance and complex programs. This tutorial assumes familiarity with compiled languages (e.g. C, C#, Java, or Fortran). The topics that will be covered over the four parts are: C++ syntax and data types, the C++ Standard Template Library, object-oriented programming principles, writing C++ classes, and class inheritance. Each part builds on the material from the preceding part and attendees are encouraged to sign up for all four parts.

Introduction to C++, Part Two (Hands-on)

Instructor: Brian Gregor (bgregor@bu.edu)

2CM Tuesday, June 21, 1:00pm – 3:00pm

This tutorial is the second part of a four-part series that begins with Introduction to C++, Part One.

Introduction to C++, Part Three (Hands-on)

Instructor: Brian Gregor (bgregor@bu.edu)

2CM Thursday, June 23, 1:00pm – 3:00pm

This tutorial is the third part of a four-part series that begins with Introduction to C++, Part One.

Introduction to C++, Part Four (Hands-on)

Instructor: Brian Gregor (bgregor@bu.edu)

2CM Tuesday, June 28, 1:00pm – 3:00pm

This tutorial is the fourth and final part of a four-part series that begins with Introduction to C++, Part One.

Data Analysis Tutorials

Intermediate Analysis of Large Datasets Using R’s Data.table (Hands-on)

Instructor: Katia Bulekova (ktrn@bu.edu)

2CM Thursday, June 16, 10:00am – 12:00pm

The R data.table package is known for its efficient handling of large datasets and low memory usage compared to similar functions in other packages. In many cases it might significantly decrease the time your code needs to execute. This tutorial will introduce you to the utilities in this package, including:

  • data query
  • basic aggregate/update operations
  • data subsetting, data updating by reference
  • set() utilities
  • indexing and fast joins

Advanced R Code Optimization (Lecture)

Instructor: Katia Bulekova (ktrn@bu.edu)

2CM Tuesday, June 21, 10:00am – 12:00pm

This tutorial is primarily aimed at those who have some experience working in a Linux environment and programming in R. The topics covered in this tutorial:

  • debugging and profiling R code
  • choosing the right functions to speed up your code
  • parallelization techniques
  • tuning your code for faster performance on the SCC

Advanced R Code Parallelization (Lecture)

Instructor: Katia Bulekova (ktrn@bu.edu)

2CM Thursday, June 23, 10:00am – 12:00pm

This tutorial will discuss how parallel libraries work in R and when parallel computing may be useful. We will first review sequential loops and the apply family of functions. We will then explore a few popular parallel R packages, such as parallel, foreach, and snowfall. We will also discuss other topics such as memory management and random number handling in parallel computing.

High Performance Computing Tutorials

Intermediate Introduction to OpenMP (Hands‐on)

Instructor: Josh Bevan (jbevan@bu.edu)

2CM Thursday, June 9, 2:30pm – 4:30pm

Many programs can be sped up by using additional CPU cores. To do this the execution needs to be parallelized and distributed across multiple cores. OpenMP provides a relatively straightforward way to do this for single machines (desktop/laptop) or a single computational node on a cluster. By adding directives within the code to modify the behavior of the compiler, you can generate programs that will use multiple cores. This tutorial will take a hands-on look at several example serial (single-core) programs and show how to use OpenMP to modify them to run in parallel.

Experience in Fortran and parallel programming will be helpful, but is not required. Attendees are expected to have previous programming experience in at least one language, preferably a compiled one.

Intermediate Things to Know About Machine Learning (Hands‐on)

Instructors: Josh Bevan (jbevan@bu.edu) and Brian Gregor (bgregor@bu.edu)

2CM Monday, June 13, 10:00am – 12:00pm

Over the past several years machine learning has become a popular tool in a wide variety of disciplines. The effective use of machine learning requires knowledge of how to best use available computational and software resources. This tutorial focuses on best practices for solving machine learning problems on the SCC and is also relevant for other HPC/cloud environments.

Introduction to Parallel Programming Concepts (Hands‐on)

Instructor: Brian Gregor (bgregor@bu.edu)

2CM Wednesday, June 15, 1:00pm – 3:00pm

This “Introduction to Parallel Programming Concepts” tutorial is recommended for anyone interested in learning more about the topic or who plans on taking our language-specific tutorials on parallel programming. This tutorial is not oriented towards any programming language in particular and is intended for anyone with programming experience. This tutorial covers basic topics such as the use of processes and threads, types of computer hardware for parallel computing, and the limits of parallelization as a strategy. Additionally, several common data and algorithm patterns in software will be discussed along with effective strategies on how to parallelize them.
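One common way to quantify the limits of parallelization is Amdahl's law, which the description above does not name explicitly; it is offered here only as a plausible example of the kind of limit discussed. If a fraction p of a program's run time can be parallelized across N workers, the best possible speedup is 1 / ((1 - p) + p / N). A minimal Python sketch, with a hypothetical function name:

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when `parallel_fraction` of the run time
    is spread perfectly across `workers` and the rest stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# With 90% of the work parallelizable, the speedup stays below 10x
# no matter how many workers are added.
for n in (1, 4, 16, 256):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

The serial remainder dominates quickly, which is why reducing the serial fraction often matters more than adding cores.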

Python Parallelization (Hands‐on)

Instructor: Brian Gregor (bgregor@bu.edu)

2CM Wednesday, June 22, 1:00pm – 3:00pm

This tutorial is an introduction to the variety of ways that parallel computations can be performed in Python. Ways of identifying code that can benefit from parallelization will be discussed. Several parallelization methods using the Python language and external libraries will be covered with examples. This tutorial assumes an intermediate understanding of the Python language and parallel computing concepts. It is strongly recommended that the “Introduction to Parallel Programming Concepts” tutorial be taken first for those new to parallel software development.
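As one small example of what parallelizing Python code can look like, the sketch below splits a CPU-bound sum into chunks and hands them to a process pool from the standard library's `concurrent.futures` module. It is not taken from the tutorial materials; the function names and the problem size are assumptions made for illustration.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def sum_of_squares(bounds):
    """Serial work unit: sum i*i for i in [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split_range(n, parts):
    """Split [0, n) into at most `parts` contiguous (start, stop) chunks."""
    if n <= 0:
        return []
    step = -(-n // parts)  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def parallel_sum_of_squares(n, workers=4, executor_cls=ProcessPoolExecutor):
    """Compute the sum over [0, n) by mapping chunks onto a pool of workers.
    ThreadPoolExecutor has the same interface and can be passed instead,
    although threads rarely speed up CPU-bound pure-Python code."""
    chunks = split_range(n, workers)
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))


if __name__ == "__main__":
    # The guard is needed on platforms that start worker processes by
    # re-importing this module.
    n = 200_000
    print(parallel_sum_of_squares(n) == sum_of_squares((0, n)))
```

Each chunk is independent of the others, which is the property that makes this loop a good candidate for parallelization.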

If you do not have Python installed on your home machine, please read and follow these instructions prior to attending the tutorial.

Python Optimization (Hands‐on)

Instructor: Brian Gregor (bgregor@bu.edu)

2CM Wednesday, June 29, 1:00pm – 3:00pm

This tutorial is for those with intermediate Python experience who are interested in optimizing their code to maximize performance. The topics covered are profiling and timing Python code, selecting data structures, avoiding common pitfalls, using external libraries, and tuning Python code.
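To give a flavor of timing code and selecting data structures, here is a small standard-library sketch using `timeit` to compare membership tests on a list and on a set. The function names and sizes are hypothetical and the measured times will vary by machine; the example is an illustration, not material from the tutorial.

```python
import timeit


def count_hits_list(haystack, needles):
    """Membership test against a list: each lookup scans the list."""
    return sum(1 for x in needles if x in haystack)


def count_hits_set(haystack, needles):
    """Same result, but lookups use a set built once from the list."""
    lookup = set(haystack)
    return sum(1 for x in needles if x in lookup)


def compare(n=2000, repeats=5):
    """Return the best-of-`repeats` wall time in seconds for each version."""
    haystack = list(range(n))
    needles = list(range(0, 2 * n, 2))  # half of these are in haystack
    list_time = min(timeit.repeat(lambda: count_hits_list(haystack, needles),
                                  number=1, repeat=repeats))
    set_time = min(timeit.repeat(lambda: count_hits_set(haystack, needles),
                                 number=1, repeat=repeats))
    return list_time, set_time


if __name__ == "__main__":
    list_time, set_time = compare()
    print(f"list: {list_time:.4f} s, set: {set_time:.4f} s")
```

Taking the minimum of several repeats is a common way to reduce noise from other activity on the machine.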

If you do not have Python installed on your home machine, please read and follow these instructions prior to attending the tutorial.
