MPI test failures #98

Open
bkmgit opened this issue Nov 14, 2023 · 0 comments

When testing Extrae 4.0.6 on Fedora 39 with MPICH 4.1.2, the following tests fail:

==============================================================
   Extrae 4.0.6: tests/functional/tracer/MPI/test-suite.log
==============================================================

# TOTAL: 21
# PASS:  18
# SKIP:  0
# XFAIL: 0
# FAIL:  3
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: mpi_sendirecv_c.sh
========================

Welcome to Extrae 4.0.6
Extrae: Parsing the configuration file (extrae.xml) begins
Extrae: Tracing package is located on /home/harald/aplic/extrae/3.3.0rc
Extrae: Generating intermediate files for Paraver traces.
Extrae: MPI routines will NOT collect HW counters information.
Extrae: Dynamic memory instrumentation is disabled.
Extrae: Basic I/O memory instrumentation is disabled.
Extrae: System calls instrumentation is disabled.
Extrae: Parsing the configuration file (extrae.xml) has ended
Extrae: Intermediate traces will be stored in /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI
Extrae: Tracing mode is set to: Detail.
Extrae: Successfully initiated with 1 tasks and 1 threads

Extrae: Successfully initiated with 1 tasks and 1 threads

Assertion failed in file src/mpi/datatype/typerep/src/typerep_yaksa_pack.c at line 315: FALSE
memcpy argument memory ranges overlap, dst_=0xffffca743540 src_=0xffffca743540 len_=4

Abort(1) on node 0: Internal error
Extrae: Intermediate raw trace file created : /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI/set-0/[email protected]
Extrae: Intermediate raw sym file created : /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI/set-0/[email protected]
Extrae: Deallocating memory.
Extrae: Application has ended. Tracing has been terminated.
merger: Output trace format is: Paraver
merger: Extrae 4.0.6
mpi2prv: Assigned nodes < ip-172-31-79-137.ec2.internal >
mpi2prv: Assigned size per processor < <1 Mbyte >
mpi2prv: File /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI/set-0/[email protected] is object 1.1.1 on node ip-172-31-79-137.ec2.internal assigned to processor 0
mpi2prv: Time synchronization has been turned off
mpi2prv: Checking for target directory existence... exists, ok!
mpi2prv: Selected output trace format is Paraver
mpi2prv: Stored trace format is Paraver
mpi2prv: Enabling Time Synchronization (Node).
mpi2prv: Circular buffer enabled at tracing time? NO
mpi2prv: Parsing intermediate files
mpi2prv: Progress 1 of 2 ... 5% 11% 17% 23% 29% 35% 41% 47% 52% 58% 64% 70% 76% 82% 88% 94% 100% done
mpi2prv: Processor 0 succeeded to translate its assigned files
mpi2prv: Elapsed time translating files: 0 hours 0 minutes 0 seconds
mpi2prv: Elapsed time sorting addresses: 0 hours 0 minutes 0 seconds
mpi2prv: Generating tracefile (intermediate buffers of 6710784 events)
         This process can take a while. Please, be patient.
mpi2prv: Progress 2 of 2 ... 12% 15% 21% 33% 36% 42% 54% 57% 60% 66% 72% 75% 81% 87% 90% 100% done
mpi2prv: Elapsed time merge step: 0 hours 0 minutes 0 seconds
mpi2prv: Resulting tracefile occupies 828 bytes
mpi2prv: Removing temporal files... done
mpi2prv: Elapsed time removing temporal files: 0 hours 0 minutes 0 seconds
mpi2prv: Congratulations! ./mpi_sendirecv_c.prv has been generated.
Error! Could not find 'MPI_Wait'.
FAIL mpi_sendirecv_c.sh (exit status: 1)

FAIL: mpi_isendirecv_c.sh
=========================

Welcome to Extrae 4.0.6
Extrae: Parsing the configuration file (extrae.xml) begins
Extrae: Tracing package is located on /home/harald/aplic/extrae/3.3.0rc
Extrae: Generating intermediate files for Paraver traces.
Extrae: MPI routines will NOT collect HW counters information.
Extrae: Dynamic memory instrumentation is disabled.
Extrae: Basic I/O memory instrumentation is disabled.
Extrae: System calls instrumentation is disabled.
Extrae: Parsing the configuration file (extrae.xml) has ended
Extrae: Intermediate traces will be stored in /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI
Extrae: Tracing mode is set to: Detail.
Extrae: Successfully initiated with 1 tasks and 1 threads

Extrae: Successfully initiated with 1 tasks and 1 threads

Assertion failed in file src/mpi/datatype/typerep/src/typerep_yaksa_pack.c at line 315: FALSE
memcpy argument memory ranges overlap, dst_=0xffffd2915ac4 src_=0xffffd2915ac4 len_=4

Abort(1) on node 0: Internal error
Extrae: Intermediate raw trace file created : /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI/set-0/[email protected]
Extrae: Intermediate raw sym file created : /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI/set-0/[email protected]
Extrae: Deallocating memory.
Extrae: Application has ended. Tracing has been terminated.
merger: Output trace format is: Paraver
merger: Extrae 4.0.6
mpi2prv: Assigned nodes < ip-172-31-79-137.ec2.internal >
mpi2prv: Assigned size per processor < <1 Mbyte >
mpi2prv: File /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI/set-0/[email protected] is object 1.1.1 on node ip-172-31-79-137.ec2.internal assigned to processor 0
mpi2prv: Time synchronization has been turned off
mpi2prv: Checking for target directory existence... exists, ok!
mpi2prv: Selected output trace format is Paraver
mpi2prv: Stored trace format is Paraver
mpi2prv: Enabling Time Synchronization (Node).
mpi2prv: Circular buffer enabled at tracing time? NO
mpi2prv: Parsing intermediate files
mpi2prv: Progress 1 of 2 ... 5% 11% 17% 23% 29% 35% 41% 47% 52% 58% 64% 70% 76% 82% 88% 94% 100% done
mpi2prv: Processor 0 succeeded to translate its assigned files
mpi2prv: Elapsed time translating files: 0 hours 0 minutes 0 seconds
mpi2prv: Elapsed time sorting addresses: 0 hours 0 minutes 0 seconds
mpi2prv: Generating tracefile (intermediate buffers of 6710784 events)
         This process can take a while. Please, be patient.
mpi2prv: Progress 2 of 2 ... 11% 17% 20% 32% 35% 41% 52% 55% 61% 67% 70% mpi2prv: Error! Found unmatched communication! Continuing...
76% 82% 85% 91% 100% done
mpi2prv: Error! Found 1 unmatched communications. Resulting tracefile may be inconsistent.
mpi2prv: Elapsed time merge step: 0 hours 0 minutes 0 seconds
mpi2prv: Resulting tracefile occupies 830 bytes
mpi2prv: Removing temporal files... done
mpi2prv: Elapsed time removing temporal files: 0 hours 0 minutes 0 seconds
mpi2prv: Congratulations! ./mpi_isendirecv_c.prv has been generated.
Error! Could not find 'MPI_Wait'.
FAIL mpi_isendirecv_c.sh (exit status: 1)

FAIL: mpi_isendirecvwaitall_c.sh
================================

Welcome to Extrae 4.0.6
Extrae: Parsing the configuration file (extrae.xml) begins
Extrae: Tracing package is located on /home/harald/aplic/extrae/3.3.0rc
Extrae: Generating intermediate files for Paraver traces.
Extrae: MPI routines will NOT collect HW counters information.
Extrae: Dynamic memory instrumentation is disabled.
Extrae: Basic I/O memory instrumentation is disabled.
Extrae: System calls instrumentation is disabled.
Extrae: Parsing the configuration file (extrae.xml) has ended
Extrae: Intermediate traces will be stored in /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI
Extrae: Tracing mode is set to: Detail.
Extrae: Successfully initiated with 1 tasks and 1 threads

Extrae: Successfully initiated with 1 tasks and 1 threads

Assertion failed in file src/mpi/datatype/typerep/src/typerep_yaksa_pack.c at line 315: FALSE
memcpy argument memory ranges overlap, dst_=0xffffe214f37c src_=0xffffe214f37c len_=4

Abort(1) on node 0: Internal error
Extrae: Intermediate raw trace file created : /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI/set-0/[email protected]
Extrae: Intermediate raw sym file created : /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI/set-0/[email protected]
Extrae: Deallocating memory.
Extrae: Application has ended. Tracing has been terminated.
merger: Output trace format is: Paraver
merger: Extrae 4.0.6
mpi2prv: Assigned nodes < ip-172-31-79-137.ec2.internal >
mpi2prv: Assigned size per processor < <1 Mbyte >
mpi2prv: File /home/fedora/extrae-4.0.6-mpich-self-install/tests/functional/tracer/MPI/set-0/[email protected] is object 1.1.1 on node ip-172-31-79-137.ec2.internal assigned to processor 0
mpi2prv: Time synchronization has been turned off
mpi2prv: Checking for target directory existence... exists, ok!
mpi2prv: Selected output trace format is Paraver
mpi2prv: Stored trace format is Paraver
mpi2prv: Enabling Time Synchronization (Node).
mpi2prv: Circular buffer enabled at tracing time? NO
mpi2prv: Parsing intermediate files
mpi2prv: Progress 1 of 2 ... 5% 11% 16% 22% 27% 33% 38% 44% 50% 55% 61% 66% 72% 77% 83% 88% 94% 100% done
mpi2prv: Processor 0 succeeded to translate its assigned files
mpi2prv: Elapsed time translating files: 0 hours 0 minutes 0 seconds
mpi2prv: Elapsed time sorting addresses: 0 hours 0 minutes 0 seconds
mpi2prv: Generating tracefile (intermediate buffers of 6710784 events)
         This process can take a while. Please, be patient.
mpi2prv: Progress 2 of 2 ... 14% 17% 22% 34% 37% 42% 54% 57% 62% 65% 71% mpi2prv: Error! Found unmatched communication! Continuing...
77% 82% 85% 91% 100% done
mpi2prv: Error! Found 1 unmatched communications. Resulting tracefile may be inconsistent.
mpi2prv: Elapsed time merge step: 0 hours 0 minutes 0 seconds
mpi2prv: Resulting tracefile occupies 841 bytes
mpi2prv: Removing temporal files... done
mpi2prv: Elapsed time removing temporal files: 0 hours 0 minutes 0 seconds
mpi2prv: Congratulations! ./mpi_isendirecvwaitall_c.prv has been generated.
Error! Could not find 'MPI_Waitall'.
FAIL mpi_isendirecvwaitall_c.sh (exit status: 1)

When testing Extrae 4.0.6 on Fedora 39 with Open MPI 5.0.0, the following test fails:

==============================================================
   Extrae 4.0.6: tests/functional/tracer/MPI/test-suite.log
==============================================================

# TOTAL: 21
# PASS:  20
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: mpi_commranksize_f_1proc.sh
=================================

Welcome to Extrae 4.0.6
Extrae: Parsing the configuration file (extrae.xml) begins
Extrae: Tracing package is located on /home/harald/aplic/extrae/3.3.0rc
Extrae: Generating intermediate files for Paraver traces.
Extrae: MPI routines will NOT collect HW counters information.
Extrae: Dynamic memory instrumentation is disabled.
Extrae: Basic I/O memory instrumentation is disabled.
Extrae: System calls instrumentation is disabled.
Extrae: Parsing the configuration file (extrae.xml) has ended
Extrae: Intermediate traces will be stored in /home/fedora/extrae-4.0.6-openmpi-self-install/tests/functional/tracer/MPI
Extrae: Tracing mode is set to: Detail.
Extrae: Successfully initiated with 1 tasks and 1 threads

Extrae: Intermediate raw trace file created : /home/fedora/extrae-4.0.6-openmpi-self-install/tests/functional/tracer/MPI/set-0/[email protected]
Extrae: Intermediate raw sym file created : /home/fedora/extrae-4.0.6-openmpi-self-install/tests/functional/tracer/MPI/set-0/[email protected]
Extrae: Deallocating memory.
Extrae: Application has ended. Tracing has been terminated.
*** The MPI_Allreduce() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[ip-172-31-79-137.ec2.internal:521208] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
merger: Output trace format is: Paraver
merger: Extrae 4.0.6
mpi2prv: Assigned nodes < ip-172-31-79-137.ec2.internal >
mpi2prv: Assigned size per processor < <1 Mbyte >
mpi2prv: File /home/fedora/extrae-4.0.6-openmpi-self-install/tests/functional/tracer/MPI/set-0/[email protected] is object 1.1.1 on node ip-172-31-79-137.ec2.internal assigned to processor 0
mpi2prv: Time synchronization has been turned off
mpi2prv: Checking for target directory existence... exists, ok!
mpi2prv: Selected output trace format is Paraver
mpi2prv: Stored trace format is Paraver
mpi2prv: Enabling Time Synchronization (Node).
mpi2prv: Circular buffer enabled at tracing time? NO
mpi2prv: Parsing intermediate files
mpi2prv: Progress 1 of 2 ... 12% 25% 37% 50% 62% 75% 87% 100% done
mpi2prv: Processor 0 succeeded to translate its assigned files
mpi2prv: Elapsed time translating files: 0 hours 0 minutes 0 seconds
mpi2prv: Elapsed time sorting addresses: 0 hours 0 minutes 0 seconds
mpi2prv: Generating tracefile (intermediate buffers of 6710784 events)
         This process can take a while. Please, be patient.
mpi2prv: Progress 2 of 2 ... 5% 21% 26% 31% 36% 57% 63% 68% 73% 78% 84% 89% 100% done
mpi2prv: Elapsed time merge step: 0 hours 0 minutes 0 seconds
mpi2prv: Resulting tracefile occupies 477 bytes
mpi2prv: Removing temporal files... done
mpi2prv: Elapsed time removing temporal files: 0 hours 0 minutes 0 seconds
mpi2prv: Congratulations! ./mpi_commranksize_f_1proc.prv has been generated.
Error! Could not find 'MPI_Init'.
FAIL mpi_commranksize_f_1proc.sh (exit status: 1)