Homework 9, 11/17
Latency hiding in three-point average
Look at the example Fortran and C++ code in the hw9
directory of your assignments repo (be sure to pull from the work
repo!). These codes implement a blocking ghost zone exchange for a 1D vector assuming periodic boundary conditions. The operation performed in parallel on the vector is a simple three-point rolling averaging.
- With your group, come up with a design plan for how to adapt this code to implement “latency hiding.” By this we mean to overlap communication and calculation by posting non-blocking communication calls, then instead of waiting for those calls to complete, going ahead and perform all calculations that do not depend on the data being communicated.
- Now, go ahead with implementing a latency-hiding version of the three point averaging function. Check that your code works and gives the correct result on 1, 2, and 4 processors.
- Compare the performance of the non-blocking version of the code to the blocking version of the code for up to 200 ranks keeping the array size per-rank fixed.
Latency hiding with one-sided MPI
- Watch the lecture by Bill Gropp on One-sided MPI.
- Modify your three-point averaging code to use one-sided MPI communication to achieve the halo exchanges.
- Implement overlapping communication and calculation then be sure to correctly handle computing the “boundary” values that depend on halo data.
- Verify that you get sensible results for arbitrary number of processors and global array sizes.
- Compare the relative performance of the non-blocking and one-sided versions of your three-point averaging code for up to 200 ranks keeping the array size per-rank fixed.
What to turn-in
To your git repo, in the hw9
directory, commit your final, working code for both the non-blocking and one-sided three-point averaging. Also commit a PDF describing the relative performance of the codes including plots. Your final code and write-up is due in one week, on 11/17.