WeeklyTelcon_20210112
- Dialup Info: (Do not post to public mailing list or public wiki)
- Akshay Venkatesh (NVIDIA)
- Aurelien Bouteiller (UTK)
- Brendan Cunningham (Cornelis Networks)
- Christoph Niethammer (HLRS)
- Edgar Gabriel (UH)
- Geoffrey Paulsen (IBM)
- George Bosilca (UTK)
- Harumi Kuno (HPE)
- Hessam Mirsadeghi (UCX/nVidia)
- Howard Pritchard (LANL)
- Jeff Squyres (Cisco)
- Joseph Schuchart
- Matthew Dosanjh (Sandia)
- Michael Heinz (Cornelis Networks)
- Naughton III, Thomas (ORNL)
- Raghu Raja (AWS)
- Ralph Castain (Intel)
- Todd Kordenbrock (Sandia)
- William Zhang (AWS)
- Artem Polyakov (nVidia/Mellanox)
- Austen Lauria (IBM)
- Barrett, Brian (AWS)
- Brandon Yates (Intel)
- Charles Shereda (LLNL)
- David Bernholdt (ORNL)
- Erik Zeiske
- Geoffroy Vallee (ARM)
- Josh Hursey (IBM)
- Joshua Ladd (nVidia/Mellanox)
- Mark Allen (IBM)
- Matias Cabral (Intel)
- Nathan Hjelm (Google)
- Noah Evans (Sandia)
- Scott Breyer (Sandia?)
- Shintaro Iwasaki
- Tomislav Janjusic
- Xin Zhao (nVidia/Mellanox)
- mohan (AWS)
- link has changed for 2021. Please see email from Jeff Squyres to [email protected] on 12/15/2020 for the new link.
- v4.0.6rc1 - built, please test.
- Discussed https://github.com/open-mpi/ompi/issues/8299 - srun issue in v4.0.x, mpirun works.
- SRUN might not give us enough info, so might need a fix.
- Curious what version of hwloc their slurm is built with.
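One way to check the hwloc question (a sketch; paths and output are system-specific, and `srun` may link hwloc statically):

```shell
# Which hwloc shared library is Slurm's srun linked against?
ldd "$(command -v srun)" | grep -i hwloc

# Which hwloc did this Open MPI build pick up?
ompi_info --all | grep -i hwloc
```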
- Discussed https://github.com/open-mpi/ompi/issues/8321
- UCX in VM possible silent error.
- Added blocker label.
- in v4.0.x and master, though might be down in UCX.
- SLURM_WHOLE issue, want to stay in sync with OMPI v4.1.x.
- Howard wants to get Lustre testing before v4.0.6rc2.
- Geoff pinged Mark to post his branch of ROMIO fixes for Lustre.
- Merged a number of PRs yesterday.
- Issue 8334 - a performance regression with AVX. Still digging into it.
- AVX Perf issue.
- Raghu tested AVX512 seems to make it slower.
- Papers show that anything after AVX2 throttles down cores and has this effect.
- Need to look into root cause.
- Probably not ready for default.
- Many apps just do one rank per node, which might WANT AVX on, but fully subscribed may want AVX off.
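If the AVX op component does need to be off for fully subscribed runs, it can already be excluded per-run with the standard MCA component-exclusion syntax (a sketch; `./app` is a placeholder):

```shell
# Exclude the AVX-accelerated reduction op component for this run
mpirun --mca op ^avx -np 16 ./app
```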
- Issue 8335 - Trying to run with external PMIx.
- Resolved.
- Michael Heinz is looking at a new PSM2(?) issue from yesterday. Possibly for v4.1.1.
- Fix PRed: CQ entry data size field.
- Josh Hursey is working on Issue 8304 (verified in v4.1, v4.0, and v3.1)
- Resolved.
- Does the community want this ULFM PR 7740 for OMPI v5.0? If so, we need a PRRTE v3.0
- Aurelien will rebase.
- Works with the PRRTE referred to by the ompi master submodule pointer.
- Currently used in a bunch of places.
- Run normal regression tests. Should not see any performance regressions.
- When this works, can provide other tests.
- Is a configure flag. Default is to configure in, but disabled at runtime.
- A number of things to set to enable.
- Aurelien is working to get it down to a single parameter.
- Let's get some code reviews done.
- Look at intersections of the core, and ensure that the NOT-ULFM paths are "clean".
- Also, there is a downstream effect on PMIx and PRRTE.
- Let's put a deadline on reviews; say in 4 weeks, we'll push the merge button.
- Jan 26th we'll merge if no issues
- Modified ABI - removed one callback/member function from some components (BTLs/PMLs) used for FT event.
- All these structures for these components.
- Pending for this discussion.
- Going to version the frameworks that are affected.
- Not this simple in practice, because usually we just return a pointer to a static object.
- But this isn't possible anymore.
- We don't support multiple versions
- Do we think we should allow Open-MPI v5.0 to run with mcas from past versions?
- Maybe good to protect against it?
- Unless we know of someone we need to support like this, we shouldn't bend over for this.
- Josh thinks the Container community is experimenting with this.
- Josh has advised that Open-MPI doesn't guarantee ABI compatibility across versions.
- v5.0 is advertised as an ABI break.
- In this case, the framework doesn't exist anymore.
- George will do a check to ensure we're not loading MCA components from an earlier version.
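As a sketch of the configure-in/enable-at-runtime model discussed above (flag spellings are assumptions; check PR 7740 for the final form):

```shell
# Build with ULFM compiled in, then opt in at launch time
./configure --with-ft=ulfm
mpirun --with-ft ulfm -np 4 ./fault_tolerant_app   # ./fault_tolerant_app is a placeholder
```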
- Still need to coordinate on this. He'd like this done this week.
- PMIx v4.0 working on Tools, hopefully done soon.
- PMIx tools go through Python bindings.
- A new Shmem component to replace
- Still working on.
- Dave Wooten pushed up some PRRTE patches, and is making progress there.
- Slow but steady progress.
- Once tool work is more stabilized on PMIx v4.0, will add some tool tests to CI.
- Probably won't start until first of the year.
- How are the submodule reference updates on Open-MPI master?
- Probably be switching OMPI master to master PMIx in next few weeks.
- PR 8319 - this failed. Should it be closed and a new one created?
- Josh was still looking to see about adding some cross checking CI
- When making a PRTE PR, could add some comment to the PR and it'll trigger Open-MPI CI with that PR.
- v4.0 PMIx and PRRTE master.
- When PRRTE branches a v2.0 branch, we can switch to that then, but that'll
- Two different drivers:
- OFI MTL
- HFI support
- Interest in PRRTE in a release, and a few other things that are already in v4.1.x
- HAN and ADAPT as default.
- Amazon helping testing and other resources
- Amazon also investing to contract Ralph to help get PRRTE up to speed.
- Other features in PMIx
- can set GPU affinities, can query GPU info
- New web-ex for January
- Took the latest ROMIO and it failed on both.
- But then he took LAST week's 3.4 BETA ROMIO and it passed. But it's a little too new.
- He gave a bit more info about the stuff he integrates, and stuff he moves forward.
- ROMIO modernization (don't use MPI-1 based things)
- ROMIO integration items.
- We're hesitant to put this into 4.1.0 because it's NOT yet released from MPICH.
- hesitant to even update ROMIO in v4.0.6 since it's a big change.
- If we delay and pickup newer ROMIO in the next minor, would there be backwards compatibility issues?
- Need to ask about compatibility between ROMIO 3.2.2 and 3.4
- If fully compatible, then only one ROMIO is needed.
- We could ship multiple ROMIOs, but that has a lot of problems.
- Just got resources to test, and root caused the issue in OMPIO
- So, given some more time Edgar will get a fix, and OMPIO can be default
- What do we want to do about ROMIO in general.
- OMPIO is the default everywhere.
- Gilles is saying the changes we made are integration changes.
- There have been some OMPI specific changes put into ROMIO, meaning upstream maintainers refuse to help us with it.
- We may be able to work with upstream to make a clear API between the two.
- As a 3rd party package, should we move it up to the 3rd-party packaging area, to be clear that we shouldn't make changes to this area?
- Need to look at this treematch thing. Upstream package that is now inside of Open-MPI.
- Might want a CI bot to watch a set of files, and flag PRs that violate principles like this.
- PR 8329 - convert README, HACKING, and possibly manpages to reStructuredText.
- Uses https://www.sphinx-doc.org/en/master/ (Python tool, can pip install)
- Has a build from this PR, so we can see what it looks like.
- Have a look. It's a different approach to have one document that's the whole thing.
- FAQ, README, HACKING.
- Do people even use manpages anymore? Do we need/want them in our tarballs?
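For anyone who wants to try the tooling locally, a minimal sketch (directory names are assumptions, not the PR's layout):

```shell
pip install sphinx
# Render reStructuredText sources in docs/ to HTML under docs/_build/html
sphinx-build -b html docs/ docs/_build/html
```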
- https://github.com/openpmix/prrte/pull/711
- Please review and give an opinion.
- Will commit next week if there are no objections.
- How's the state of https://github.com/open-mpi/ompi-tests-public/
- Putting new tests there.
- Very little there so far, but working on adding some more.
- Should have some new Sessions tests.
- What's going to be the state of the SM Cuda BTL and CUDA support in v5.0?
- What's the general state? Any known issues?
- AWS would like to get.
- Josh Ladd - Will take internally to see what they have to say.
- From nVidia/Mellanox, Cuda Support is through UCX, SM Cuda isn't tested that much.
- Hessam Mirsadeghi - All Cuda awareness through UCX
- May ask George Bosilca about this.
- Don't want to remove a BTL if someone is interested in it.
- UCX also supports TCP via CUDA
- PRRTE CLI on v5.0 will have some GPU functionality that Ralph is working on
- Update 11/17/2020
- UTK is interested in this BTL, and maybe others.
- Still gap in the MTL use-case.
- nVidia is not maintaining SMCuda anymore. All CUDA support will be through UCX
- What's the state of the shared memory in the BTL?
- This is the really old generation Shared Memory. Older than Vader.
- Was told after a certain point, no more development in SM Cuda.
- One option might be to
- Another option might be to bring that SM in SMCuda to Vader(now SM)
- reStructuredText doc (more features than Markdown, including cross-references)
- Jeff had a first stab at this, but take a look. Sent it out to devel-list.
- All work for master / v5.0
- Might just be useful to do README for v4.1.? (don't block v4.1.0 for this)
- Sphinx is the tool to generate docs from reStructuredText.
- can handle current markdown manpages together with new docs.
- readthedocs.io encourages "restructured text" format over markdown.
- They also support a hybrid for projects that have both.
- Thomas Naughton has done the restructured text, and it allows
- LICENSE question - what license would the docs be available under? Open-MPI BSD license, or
- Ralph tried the Instant-On at scale:
- 10,000 nodes x 32PPN
- Ralph verified Open-MPI could do all of that in < 5 seconds, Instant-On.
- Through MPI_Init() (if using Instant-On)
- TCP and Slingshot (OFI provider private now)
- PRRTE with PMIx v4.0 support
- SLURM has some of the integration, but hasn't taken this patch yet.
- Discussion on:
- Draft request to make static the default: https://github.com/open-mpi/ompi/pull/8132
- One con is that many providers hard link against libraries, which would then make libmpi dependent on this.
- Non-homogeneous clusters (GPUs on some nodes, and no GPUs on others)
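For reference, a hedged sketch of how a fully static build is requested with today's configure options (standard flags; the PR proposes changing the defaults):

```shell
./configure --enable-static --disable-shared --disable-dlopen
make -j install
```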
- New George and Jeff are leading
- One for Open-MPI and one for PMIx
- In a month and a half or so. George will send date to Jeff