Research Computing Services (RCS) offers training on a wide range of topics, including data analysis and visualization, Linux basics, programming, and high performance computing.
Each semester the RCS staff offer a series of tutorials consisting of one to three hours of classroom instruction. Most of the sessions are hands-on and are designed to help you make effective use of the Boston University Shared Computing Cluster and its related scientific visualization resources. All the tutorials are free and open to all members of the Boston University community.
Below is a list of topics which may be of particular interest for data science. For a full list of the RCS tutorials and a schedule of upcoming sessions, please visit the RCS Tutorials page. You may also access the slides for the most recent version of each tutorial.
In addition to the regularly scheduled tutorials, Research Computing staff can offer extra sessions and/or customize tutorials for a particular course, seminar, lab or research group. These customized tutorials can be combinations of our regular materials or other similar content of specific interest to your group. Please contact us for more details by email to help@rcs.bu.edu.
Mathematics and Data Analysis Tutorials
- Introduction to MATLAB
- Introduction to SAS
- Introduction to SPSS
- Introduction to Mathematica
- Introduction to R
- Graphics in R
- Programming in R
- R Code Optimization
- Tuning MATLAB Codes For Better Performance
- MATLAB Parallel Computing Toolbox
Visualization Tutorials
- Scientific Visualization Using MATLAB
- Introduction to Maya
Mathematics and Data Analysis Tutorials
Introduction to MATLAB
MATLAB (for MATrix LABoratory) is a numerical computing environment developed by MathWorks, Inc. It is essentially an interpreted high-level language that does not require data type declarations or compilation. It can be used to implement mathematical computations, such as matrix manipulations, with existing linear algebra packages. Many plotting and visualization tools are an integral part of MATLAB. MATLAB is intuitive and user-friendly, and it is used primarily in an interactive environment that enables fast prototyping of research ideas and efficient software development. Many highly specialized application areas, such as Mathematical Finance, Bioinformatics, and Image Processing, are also supported through toolboxes.
In this tutorial, many of the basic MATLAB operations, including basic 2D and 3D graphics, will be introduced. You will learn many of these operations hands-on.
No prior programming experience in any language is required to attend this course. However, basic knowledge of linear algebra, such as matrix operations, is required.
Introduction to SAS
SAS (Statistical Analysis System) is one of the most powerful statistical packages available on any computer platform. This tutorial will introduce you to SAS on acs-linux. You will learn how to:
- access SAS on ACS on campus and remotely via Windows
- create, edit, and save program files containing SAS commands
- obtain printed output
- create, run, and modify your own programs
Introduction to SPSS
SPSS (Statistical Package for the Social Sciences) is a widely used program for analyzing data. SPSS uses windows and dialog boxes to manipulate data and perform statistical analyses. This hands-on tutorial will introduce you to the basics of SPSS and will give you one hour of practice using SPSS on Microsoft Windows. You will learn how to:
- enter data into SPSS
- use SPSS to transform data
- use SPSS to perform basic statistical analyses
Introduction to Mathematica
This Mathematica tutorial provides a starting point for using this powerful and well-developed tool. You will learn how to:
- create data vectors and arrays
- read in data from files
- plot basic functions and data
- solve simple equations
No prior programming experience in any language is required to attend this course.
Introduction to R
R is a powerful, rapidly developing, open source language for statistical computing. It is widely used among statisticians for the development of statistical software and for data analysis, and new features appear every few months. This tutorial covers the following topics:
- operators and arithmetic operations
- atomic types, variable rules and built-in constants
- scalar and vector function overview
- working with data (workspace setup as well as reading, creating, exploring, and saving data)
- working with R data types (vectors, matrices, lists, data frames)
- working with script files
- installing and loading R extension packages and getting help
- overview of functions for data analysis
By the end of the tutorial, you should:
- know the basics of the R environment.
- have a solid understanding of the various data types and objects used in R.
- be able to create, load, and analyze data.
- be able to find appropriate functions, along with the necessary help and examples for them.
Graphics in R
R provides extensive and powerful graphics options that allow for the production of publication-ready, high-quality diagrams and plots. This tutorial introduces R graphics libraries and functions. By the end of the tutorial, you should:
- understand what to expect from R’s graphics capabilities.
- be able to create, modify, and customize graphs and plots used in statistical analysis.
- be able to find, download, and use appropriate libraries for your visualization needs.
Prerequisite: If you are new to the R environment we strongly recommend that you also register for the “Introduction to R” tutorial.
Programming in R
This tutorial is the third in a series of R tutorials. It introduces basic R programming, debugging and optimization techniques and develops practices of proper and efficient R coding. It covers the following topics:
- if-else and switch statements
- types of loops (for, while, repeat) and loop control statements (next, break)
- user functions and argument definitions
- local and global variables
- apply function family
- sourcing, timing, compilation and debugging
- code profiling and optimization
Prerequisite: We strongly recommend that you also register for the “Introduction to R” tutorial if you are new to the R environment.
R Code Optimization
This tutorial is primarily aimed at those who have some experience working in a Linux environment and programming in R. It covers the following topics:
- debugging and profiling R code
- choosing the right functions to speed up your code
- parallelization techniques
- tuning your code for faster performance on the SCC
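The general workflow behind these topics, measuring first and then comparing alternative implementations, is not specific to R. The sketch below illustrates it in Python purely as a neutral stand-in; it is not taken from the tutorial materials, and the function names (`sum_squares_loop`, `sum_squares_builtin`, `time_call`) are hypothetical. It checks that two implementations agree and then times each one, without assuming in advance which is faster.

```python
import timeit


def sum_squares_loop(values):
    """Accumulate the sum of squares with an explicit loop."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_builtin(values):
    """Same computation, delegated to the built-in sum()."""
    return sum(v * v for v in values)


def time_call(func, values, repeats=5):
    """Return the best wall-clock time (in seconds) over several single runs."""
    return min(timeit.repeat(lambda: func(values), number=1, repeat=repeats))


if __name__ == "__main__":
    data = list(range(100_000))
    # Confirm both versions give the same answer before comparing speed.
    assert sum_squares_loop(data) == sum_squares_builtin(data)
    for func in (sum_squares_loop, sum_squares_builtin):
        print(f"{func.__name__}: {time_call(func, data):.4f} s")
```

Taking the minimum over several runs reduces the influence of unrelated system activity, which matters on a shared machine.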
Tuning MATLAB Codes For Better Performance
As an interpreted language, MATLAB provides many features that make interactive use easier. However, these features can degrade computational performance, especially for jobs that require long run times and large amounts of memory. This tutorial identifies these pitfalls and demonstrates ways to improve the performance of user code.
The prerequisite for this course is a basic knowledge of MATLAB, either developed on your own or from our Introduction to MATLAB tutorial.
MATLAB Parallel Computing Toolbox
MATLAB Parallel Computing Toolbox is now available to Boston University’s MATLAB users. This toolbox enables users to solve computationally intensive and data-intensive problems on multicore personal computers, laptops, and especially the Shared Computing Cluster (SCC) managed by the Scientific Computing and Visualization group of Information Services & Technology.
Parallel processing operations such as parallel for-loops, parallel numerical algorithms, and message-passing functions let you implement task-parallel and data-parallel algorithms in MATLAB. Converting serial MATLAB applications to parallel MATLAB applications usually requires few code modifications and no programming in a low-level language.
The prerequisite for this course is a basic knowledge of MATLAB, either developed on your own or from our Introduction to MATLAB tutorial.
Visualization Tutorials
Scientific Visualization Using MATLAB
MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in common mathematical notation. MATLAB has facilities for producing a wide variety of plots, graphs, surfaces, volumes, and specialized visualizations for scientific data.
This tutorial will present a hands-on introduction to producing scientific graphics with MATLAB. We will begin with examples of various plotting methods, including surface plots, slices, contours, isosurfaces, oriented glyphs, and streaklines. In addition, we will cover annotation, color mapping, MATLAB’s underlying graphics model, and the production of high-resolution images.
Introduction to Maya
Autodesk Maya 2013 is a powerful state-of-the-art 3D modeling and animation software package. It has a wide variety of modeling, animation, special effects, and rendering tools. It has a customizable graphical user interface as well as a scripting language for optimal flexibility in problem solving and production.
In this tutorial we will show you how to get started using Maya. We will teach you the basic workflow for modeling, creating and applying materials, animation, and rendering. We will also cover the basics of importing scientific geometric data and creating high quality renderings and animations from it.
Maya is generally considered to have a steep learning curve, but this tutorial presents a workflow that provides a sound foundation for pursuing more complex projects.