Unable to capture MPI Functions when instrumenting fortran code #230

Open
deanchester opened this issue Nov 26, 2019 · 9 comments

@deanchester

I have instrumented a Fortran code using Caliper, but when I run the application it doesn't capture MPI communication.

I have built Caliper with the following configuration:

cmake -DCMAKE_INSTALL_PREFIX=$HOME/local/caliper-gcc -DCMAKE_C_COMPILER=/csc/tinis/software/Core/GCCcore/7.3.0/bin/gcc -DCMAKE_CXX_COMPILER=/csc/tinis/software/Core/GCCcore/7.3.0/bin/g++ -DWITH_FORTRAN=On -DWITH_TOOLS=On -DWITH_MPI=On -DMPI_C_COMPILER=/csc/tinis/software/Compiler/GCC/7.3.0-2.30/OpenMPI/3.1.1/bin/mpicc -DCMAKE_Fortran_COMPILER=/csc/tinis/software/Core/GCCcore/7.3.0/bin/gfortran ..

When I run my application I set the following in my script:

export CALI_SERVICES_ENABLE=trace,event,mpi,timestamp,recorder
export CALI_TIMER_SNAPSHOT_DURATION=true
export CALI_TIMER_INCLUSIVE_DURATION=true
export CALI_MPI_WHITELIST=all
export CALI_RECORDER_FILENAME="./caliper-$SLURM_JOB_ID/caliper-%mpi.rank%.cali"

The Caliper output files contain only the regions of the code that I instrumented with the begin and end routines:

  call cali_begin_byname('sweep')
  ! CODE...
  call cali_end_byname('sweep')

Any ideas what's going wrong?

@daboehme
Member

Hi @deanchester ,

By default Caliper relies on library constructors to initialize its MPI component, but sometimes that fails. In that case, you can explicitly initialize it with the cali_mpi_init() function. I've added a Fortran binding for that function in the latest commit to master (#232). With that, adding call cali_mpi_init() somewhere at program start should help. You can set CALI_LOG_VERBOSITY=1 and see if the mpi service gets initialized.
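The verbosity check suggested above could look roughly like the sketch below. The service list is the one from the original report; the launcher line, the binary name `./my_app`, and the assumption that Caliper's log messages go to stderr are illustrative and not confirmed in this thread.

```shell
# Raise Caliper's log level so service initialization messages are printed.
export CALI_LOG_VERBOSITY=1
export CALI_SERVICES_ENABLE=trace,event,mpi,timestamp,recorder

# Hypothetical launch line (launcher and binary name are assumptions).
# Caliper log lines start with "== CALIPER:"; filter them for "mpi" to see
# whether the mpi service is mentioned during startup.
# srun -n 2 ./my_app 2>&1 | grep -i 'mpi'
```

If no line mentions the mpi service, that would be consistent with the constructor-based initialization not having run.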

@andrewreisner

I am having an issue with this as well. Using cali_mpi_init() results in undefined reference to cali_mpi_init_. If I remove the preprocessor guard from https://github.com/LLNL/Caliper/blob/master/src/interface/c_fortran/wrapfcaliper.F#L457 everything works as expected and I get caliper output. Is CALIPER_HAVE_MPI from caliper-config.h supposed to propagate to this wrapper?
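One way to check whether the installed configuration header defines CALIPER_HAVE_MPI at all is to grep for it. This is a minimal sketch: the helper name `check_cali_mpi` is hypothetical, and the header location in the usage line is an assumption based on the install prefix in the original report.

```shell
# Report whether a caliper-config.h header defines CALIPER_HAVE_MPI.
# Matches "#define CALIPER_HAVE_MPI" but not longer names that merely
# share the prefix, and not commented-out "#undef" lines.
check_cali_mpi() {
    header="$1"
    if grep -Eq '^[[:space:]]*#[[:space:]]*define[[:space:]]+CALIPER_HAVE_MPI([[:space:]]|$)' "$header"; then
        echo "CALIPER_HAVE_MPI defined"
    else
        echo "CALIPER_HAVE_MPI not defined"
    fi
}

# Hypothetical usage (the path below is an assumption):
# check_cali_mpi "$HOME/local/caliper-gcc/include/caliper/caliper-config.h"
```

If the header defines the macro but the Fortran wrapper still lacks `cali_mpi_init`, the definition is probably not reaching the wrapper when it is preprocessed, which is the question raised above.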

@daboehme
Member

Hi @andrewreisner ,

The CALIPER_HAVE_MPI flag is supposed to propagate to the wrapper, so that might be a bug. I'll take a look. Calling cali_mpi_init() is no longer necessary in newer Caliper versions (as of v2.5.0 at least) though, so you can safely remove it. Capturing MPI functions in Fortran codes is unrelated to that: we simply don't have the wrapper functions for the Fortran MPI functions in Caliper, which are different from the C MPI functions. I hope I can add them at some point, but it's a significant effort.

@andrewreisner

Thanks for the information. I updated my Caliper version and removed cali_mpi_init() and everything works as expected.

@Jiang-Weibo

> Hi @andrewreisner ,
>
> The CALIPER_HAVE_MPI flag is supposed to propagate to the wrapper, so that might be a bug. I'll take a look. Calling cali_mpi_init() is no longer necessary in newer Caliper versions (as of v2.5.0 at least) though, so you can safely remove it. Capturing MPI functions in Fortran codes is unrelated to that: we simply don't have the wrapper functions for the Fortran MPI functions in Caliper, which are different from the C MPI functions. I hope I can add them at some point, but it's a significant effort.

Hi, I have the same problem. I want to intercept MPI functions in Fortran codes. Could you tell me how to write wrapper functions in Caliper for the Fortran MPI functions? Thank you so much.

@andrewreisner

@Jiang-Weibo I do not have experience with this, but I suspect Caliper already wraps the MPI calls using gotcha and wrapping them yourself is unnecessary.

@Jiang-Weibo

@andrewreisner Thank you so much for your reply. I tested a simple MPI program in Fortran, and I am sure I installed and enabled the MPI services in Caliper, but I still did not get the correct output. This is the configuration:

CALI_SERVICES_ENABLE=aggregate,event,mpi,mpireport,timestamp srun -n 2 ./simple_program

This is what I got:

MPI process 0 sends value 12345.
MPI process 1 received value: 12345.
== CALIPER: (1): default: mpireport: MPI is already finalized. Cannot aggregate output.
== CALIPER: (0): default: mpireport: MPI is already finalized. Cannot aggregate output.

Another issue, "Question about MPI_Finalize" (#535), reports the same problem. Inspired by that, I simply removed the call to MPI_Finalize in my program, and somehow it works. I got the records below.

Path            Min time/rank Max time/rank Avg time/rank Time %    Allocated MB 
mpi-simple-test      0.002451      0.002887      0.002669 17.140302     0.005371 
  mainloop           0.001412      0.001414      0.001413  9.073287     0.001173 
MPI_Comm_dup         0.002300      0.002671      0.002485 15.960977     0.005340 
MPI_Send             0.000084      0.000084      0.000084  0.269365     0.005340 
MPI_Recv             0.000288      0.000288      0.000288  0.926402     0.005340 
MPI_Comm_free        0.000024      0.000028      0.000026  0.165153     0.005340 
MPI_Probe            0.000047      0.000047      0.000047  0.151672     0.005340 
MPI_Get_count        0.000040      0.000040      0.000040  0.126926     0.005340 

I suspect this is because the way I instrument the Fortran code or my version of Caliper is not correct. Could you tell me which version of Caliper you are using, how you instrument the Fortran code with Caliper, and which Caliper configuration you use? Thank you for your help.

@andrewreisner

@Jiang-Weibo Try adding mpiflush to your CALI_SERVICES_ENABLE. Otherwise, I would just use the config manager Fortran API and call flush before calling MPI_Finalize: https://software.llnl.gov/Caliper/FortranSupport.html#caliper-fortran-api. It has been a couple of years since I last used Caliper with Fortran, so I do not recall the configuration.
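The first suggestion, adding mpiflush to the service list used earlier in this thread, might be scripted as in the sketch below. The helper `add_cali_service` is hypothetical and not part of Caliper, and the follow-up comment reports that this change alone did not resolve the problem in that setup.

```shell
# Append a service name to CALI_SERVICES_ENABLE unless it is already listed.
add_cali_service() {
    svc="$1"
    case ",${CALI_SERVICES_ENABLE:-}," in
        *",$svc,"*) ;;  # already present, nothing to do
        *) CALI_SERVICES_ENABLE="${CALI_SERVICES_ENABLE:+$CALI_SERVICES_ENABLE,}$svc" ;;
    esac
    export CALI_SERVICES_ENABLE
}

# Service list from the earlier comment, plus the suggested mpiflush service.
export CALI_SERVICES_ENABLE=aggregate,event,mpi,mpireport,timestamp
add_cali_service mpiflush
echo "$CALI_SERVICES_ENABLE"

# Launch line as used earlier in the thread:
# srun -n 2 ./simple_program
```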

@Jiang-Weibo

@andrewreisner I have tried both, but neither worked. I also tried Caliper versions 2.7.0, 2.8.0 and 2.9.0, but they all reported the same error. I guess some internal mechanism in Caliper determines how MPI_Finalize affects the flush. For now I will just remove the MPI_Finalize calls. Thanks again for your help.
