MPI left-right communication kernel for performance testing on Stampede. Three binaries are used for testing:
threadWorkComm
Only one MPI task is run, and the number of workers corresponds to the number of pthreads that are created. Data is exchanged between pthreads. This code uses MPI_THREAD_MULTIPLE.
mpiWork
Here the workers are MPI tasks. Data is exchanged between MPI tasks. The code uses MPI_INIT.
mpiWorkThreadMult
Here the workers are MPI tasks. Data is exchanged between MPI tasks. The code uses MPI_THREAD_MULTIPLE.
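As a rough illustration of what a "left-right" exchange means for the workers described above, the sketch below computes each worker's two neighbours. It assumes the workers form a periodic ring, which this README does not state. The names `left_neighbor` and `right_neighbor` are hypothetical and are not taken from kernelComm.c.

```c
#include <assert.h>

/* Hypothetical helper (not from kernelComm.c): index of the worker to the
 * left of `id`, assuming `nworkers` workers arranged in a periodic ring. */
int left_neighbor(int id, int nworkers)
{
    assert(nworkers > 0 && id >= 0 && id < nworkers);
    /* Add nworkers before the modulo so worker 0 wraps to nworkers - 1. */
    return (id - 1 + nworkers) % nworkers;
}

/* Hypothetical helper: index of the worker to the right of `id`,
 * under the same periodic-ring assumption. */
int right_neighbor(int id, int nworkers)
{
    assert(nworkers > 0 && id >= 0 && id < nworkers);
    /* The last worker wraps around to worker 0. */
    return (id + 1) % nworkers;
}
```

In an MPI build, `id` and `nworkers` would typically come from `MPI_Comm_rank` and `MPI_Comm_size`, and the two neighbours would serve as the destination and source of an exchange such as `MPI_Sendrecv`. In the threaded variant, the same indices could identify pthreads instead of ranks.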
MPI_THREAD_MULTIPLE performance with pthreads is poor on the Xeon Phi.
The following presentation and paper discuss a library that implements the proposed MPI endpoints API to increase bandwidth at the cost of increased latency. The results show a net application speedup despite the latency penalty.
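To see how a higher-latency, higher-bandwidth path can still win overall, the following sketch uses the common latency-plus-bandwidth cost model, `time = latency + bytes / bandwidth`. This model and the function names `transfer_time` and `crossover_bytes` are assumptions made for illustration; they are not taken from the presentation or the paper.

```c
#include <assert.h>

/* Illustrative cost model (an assumption, not from the cited paper):
 * seconds needed to move `bytes` bytes over a path with the given
 * latency (seconds) and bandwidth (bytes per second). */
double transfer_time(double latency_s, double bandwidth_Bps, double bytes)
{
    assert(bandwidth_Bps > 0.0 && bytes >= 0.0);
    return latency_s + bytes / bandwidth_Bps;
}

/* Message size (bytes) at which path B, with higher latency and higher
 * bandwidth, takes the same time as path A. Larger messages favour B. */
double crossover_bytes(double lat_a, double bw_a, double lat_b, double bw_b)
{
    assert(bw_a > 0.0 && bw_b > bw_a && lat_b >= lat_a);
    /* Solve lat_a + n / bw_a == lat_b + n / bw_b for n. */
    return (lat_b - lat_a) / (1.0 / bw_a - 1.0 / bw_b);
}
```

With made-up numbers (path A: 1 µs and 1 GB/s, path B: 3 µs and 2 GB/s), `crossover_bytes(1e-6, 1e9, 3e-6, 2e9)` gives a break-even size of about 4000 bytes. Under this model, an application dominated by larger messages would see a net gain from path B despite the latency penalty.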
http://meetings.mpi-forum.org/secretary/2014/12/slides/Hybrid-Plenary--EP-Lib.pdf
http://pcl.intel-research.net/publications/sridharan-sc14.pdf
- envImpi.sh - environment file for Intel MPI 4
- envImpi5.sh - environment file for Intel MPI 5
- envImpi51.sh - environment file for Intel 16 and Intel MPI 5.1 (beta)
- envMvapich.sh - environment file for Mvapich2-mic
- envMvapich2.sh - environment file for Mvapich2-mic v2.0 - edit the install path
- envMpich3.sh - environment file for MPICH 3.2b2 - edit the install path
- build.sh - build script
- build.x86.sh - build script for Stampede host processors
- getTimeImpi.sh - run script for Intel MPI 4 and 5
- getTimeImpi.x86.sh - run script for Intel MPI 4 and 5 on Stampede host processors
- getTimeImpi51.sh - run script for Intel MPI 5.1
- getTimeMvapich.sh - run script for Mvapich2-mic
- getTimeMvapich2.sh - run script for Mvapich2-mic v2.0
- getTimeMpich3.sh - run script for MPICH 3.2b2
- kernelComm.c - communication kernel
- kernelComm.x86.c - communication kernel with affinity on Stampede host processors
- kernel.h - kernel header
- mpiWork.c - MPI driver
- README.md - this file
- threadWork.c - threaded MPI driver
source env<Impi|Impi51|Mvapich|Mvapich2|Mpich3>.sh
for i in 1 2 4; do ./build[.x86].sh $i; done
From an interactive idev session:
./getTime<Impi|Impi51|Mvapich|Mvapich2|Mpich3>[.x86].sh
The tests used MPSS 3.3 as installed on Stampede. See performance.csv for the results.
Mvapich2-MIC v2.0 is not installed on Stampede. To install it, follow the instructions below:
wget http://mvapich.cse.ohio-state.edu/download/mvapich/mic/mvapich2-mic-2.0_mpss-3.3.run
./mvapich2-mic-2.0_mpss-3.3.run
#set the install path to something in your home/work directory
#accept defaults for library paths
edit the 'ins' variable in the environment file 'envMvapich2.sh' to point at the chosen install path
source envMvapich2.sh
create file 'config'
echo "-n 2 : $MV2_MIC_INSTALL_PATH/libexec/mvapich2/osu_latency" > config
create file 'hosts'
echo 'mic0:2' >> hosts
run
mpirun_rsh -config ./config -hostfile ./hosts
MPICH 3.2b2 is not installed on Stampede. To install it, follow the instructions below:
wget http://www.mpich.org/static/downloads/3.2b2/mpich-3.2b2.tar.gz
tar xzf mpich-3.2b2.tar.gz
cd mpich-3.2b2
mkdir buildIntel15
cd buildIntel15
./doConfigIntel15.sh <install prefix absolute path>
make
make install