Python for parallel scientific computing

by Dr Lisandro Dalcín

Centro Internacional de Métodos Computacionales en Ingeniería


The Python programming language has attracted the attention of many end-users and developers in the scientific community.  It is a powerful language with a clean, simple syntax and efficient high-level data structures.

Sophisticated yet easy-to-use, well-integrated packages are available for interactive command-line work (IPython), efficient multi-dimensional array processing (NumPy), symbolic computing (SymPy), and 2D and 3D visualization (matplotlib and Mayavi).

Python allows skilled users to build their own computing environment, tailored to their specific needs and based on their favorite high-performance Fortran, C or C++ codes.  These tasks are facilitated by tools like Cython, SWIG, F2PY and fwrap.

This course will cover concepts of parallel computing on distributed-memory architectures using MPI and PETSc within a Python programming environment.  In particular, we will discuss two open-source tools that provide Python access to MPI and PETSc functionality.  We will cover the following topics:

  • MPI for Python (mpi4py)
    • high-level but slow communication of general Python objects
    • low-level but fast communication of NumPy array data
    • point-to-point and collective communications
    • dynamic process management
    • parallel input/output
  • PETSc for Python (petsc4py)
    • assemble distributed vectors and matrices in parallel
    • solve linear and nonlinear systems (including matrix-free methods)
    • profiling and logging

We will write short Python scripts exercising this functionality in extensive laboratory sessions. These simple codes will serve as building blocks to be reused in larger simulations. By bringing in C/C++ and Fortran code for selected, performance-sensitive parts, we will show that Python codes can achieve high performance while remaining high-level and flexible.

Supplementary Material