This post introduces the Open MPI library (see its official website and its GitHub repository).
Open MPI is a popular open source MPI implementation.
Open MPI is a Message Passing Interface (MPI) library project combining technologies and resources from several other projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI). It is used by many TOP500 supercomputers, including Roadrunner, which was the world’s fastest supercomputer from June 2008 to November 2009, and the K computer, the fastest supercomputer from June 2011 to June 2012.
The Open MPI developers selected these MPI implementations as excelling in one or more areas. Open MPI aims to use the best ideas and technologies from the individual projects and create one world-class open-source MPI implementation that excels in all areas. The Open MPI project specifies several top-level goals:
- to create a free, open-source, peer-reviewed, production-quality, complete MPI-3.0 implementation
- to provide extremely high, competitive performance (low latency or high bandwidth)
- to involve the high-performance computing community directly with external development and feedback (vendors, 3rd party researchers, users, etc.)
- to provide a stable platform for 3rd-party research and commercial development
- to help prevent the “forking problem” common to other MPI projects
- to support a wide variety of high-performance computing platforms and environments
This post lists resources for using MPI with Python.
An application built with the hybrid model of parallel programming can run on a computer cluster using both OpenMP and Message Passing Interface (MPI), such that OpenMP is used for parallelism within a (multi-core) node while MPI is used for parallelism between nodes. There have also been efforts to run OpenMP on software distributed shared memory systems, to translate OpenMP into MPI and to extend OpenMP for non-shared memory systems.
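The two-level structure of the hybrid model can be sketched in plain Python. This is only an illustration of the decomposition, not real hybrid code: OpenMP is not available from pure Python, so a standard-library thread pool stands in for the OpenMP threads (and, because of the GIL in CPython, it will not actually speed up CPU-bound work), and the rank and size are passed in by hand instead of coming from MPI. The helper names split_range and node_partial_sum are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, parts, index):
    """Return the half-open range [start, stop) of chunk `index` when
    n items are divided as evenly as possible into `parts` chunks."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return start, stop


def node_partial_sum(n, rank, size, threads=4):
    """Sum the squares of the integers owned by one 'node' (MPI rank).

    Outer level: the rank picks its share of range(n), as MPI would
    between nodes. Inner level: that share is split again among
    threads, as OpenMP would within a node.
    """
    lo, hi = split_range(n, size, rank)

    def work(t):
        a, b = split_range(hi - lo, threads, t)
        return sum(i * i for i in range(lo + a, lo + b))

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(work, range(threads)))


# Simulate 3 ranks inside one process; a real run would combine the
# partial sums with an MPI reduction instead of this loop.
total = sum(node_partial_sum(100, rank, 3) for rank in range(3))
print(total)
```

In a real hybrid application the outer split would be driven by the MPI rank of each process and the inner loop would be an OpenMP parallel region in C, C++ or Fortran (or a cython.parallel loop).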
cython.parallel is built on top of OpenMP (see "Using Parallelism" in the Cython documentation).
Please read Laurent Duchesne’s excellent step-by-step guide for parallelizing your Python code using multiple processors and MPI.
On our cluster, mpi4py has been compiled against Open MPI 1.10.1, so to run MPI Python programs we need to load that module in addition to Python:
module load python/3.4.3 mpi/openmpi/1.10.1-gcc
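After loading the modules, it can be worth checking which MPI library mpi4py actually picks up. The following is a minimal sketch, assuming an mpi4py release recent enough to provide MPI.Get_library_version(); the function name describe_mpi is hypothetical, and the exact version string depends on the installation.

```python
def describe_mpi():
    """Return a one-line description of the MPI library behind mpi4py."""
    try:
        # Importing MPI loads and initializes the MPI library, so this
        # fails early if the MPI module has not been loaded.
        from mpi4py import MPI
    except ImportError as exc:
        return "mpi4py is not usable here: {}".format(exc)
    # Get_library_version() returns the implementation's own version string
    return "mpi4py is linked against: {}".format(
        MPI.Get_library_version().strip())


print(describe_mpi())
```

If the message reports an import failure, the most likely cause is that the MPI module was not loaded in the current shell.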
Create the test MPI example file as described in Laurent’s guide above, using the same name:
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print("I am rank", rank, "of", size)
Create the SLURM submission script:
#!/bin/bash
#SBATCH -n 4
mpirun python mpi.py
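You should get output similar to the following (the order of the lines can differ between runs, because each rank prints independently):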
I am rank 3 of 4
I am rank 0 of 4
I am rank 1 of 4
I am rank 2 of 4
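As a slightly larger exercise than the hello-world above, here is a sketch in which each rank computes part of a result and rank 0 collects the total. It estimates pi with the midpoint rule and combines the partial sums with comm.reduce. The function names partial_pi and main, and the default of 100000 intervals, are choices made for this illustration only.

```python
def partial_pi(rank, size, n=100000):
    """Midpoint-rule contribution of one rank to the integral of
    4 / (1 + x^2) over [0, 1], which equals pi."""
    h = 1.0 / n
    # Cyclic distribution: rank r handles intervals r, r + size, r + 2*size, ...
    return h * sum(4.0 / (1.0 + ((i + 0.5) * h) ** 2)
                   for i in range(rank, n, size))


def main():
    # Imported here so that partial_pi can be tried without MPI installed
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()
    # reduce() sums the partial results; only rank 0 receives the total
    pi = comm.reduce(partial_pi(rank, size), op=MPI.SUM, root=0)
    if rank == 0:
        print("pi is approximately", pi, "using", size, "ranks")
```

To try it on the cluster, add a call to main() at the end of the file and point the mpirun line of the submission script at it. Without MPI, summing partial_pi(r, 4) for r in range(4) gives the same estimate in a single process.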
Craig Finch has a more practical example for high throughput MPI on GitHub.