GFDL-ESM2M piControl does not run #377
Comments
@Jete90 This bug originates from using an old netCDF version, as documented in NOAA-GFDL/CM4#11 and NOAA-GFDL/icebergs#44. You'll need to update to 4.7.3 or later.
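A quick way to confirm which netCDF version an executable is actually linked against is to print the library version string; nf90_inq_libvers() is part of the standard netCDF Fortran API. A minimal sketch:

```fortran
program check_netcdf_version
  ! Print the version string of the netCDF library this binary is linked
  ! against; per the comment above, 4.7.3 or later is required.
  use netcdf, only: nf90_inq_libvers
  implicit none
  print *, 'Linked netCDF library: ', trim(nf90_inq_libvers())
end program check_netcdf_version
```

Compile it against the same modules used for the model build (e.g. with the flags from `nf-config --flibs`) so it reports the library the model actually sees.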
As a follow-up to Jens' question: does this mean that many of the .res.nc files included in the provided ESM2M piControl test setup are corrupt? I have netCDF v4.7.4, and regardless of whether I compile with the netCDF4 flag on or off, I still run into the same error that Jens reports.
@wienkers The bug was specific to the iceberg restarts, as far as I remember. It's quite possible there are other problems with non-ocean restarts.
Thank you for the quick reply @russfiedler.
The error points back to flux_exchange_init, where kd is set from the size of Ice%ice_mask. At run time, kd = 6 on the Ice/Atm processes (as it should be, for num_part = 6 in the input.nml), but kd = 0 on the Ocean processes, each of which then throws the error. The block of code is evaluated on all processes; however, it seems the call to subroutine ice_model_init in coupler_init, which allocates Ice%ice_mask, only occurs on the Ice processes, so the size information about Ice%ice_mask needed in that block just becomes 0.
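Schematically, the failing pattern looks like the following (names taken from the comment above; the exact statements in flux_exchange_init may differ):

```fortran
! Sketch only -- not the verbatim flux_exchange_init code.
kd = size(Ice%ice_mask, 3)  ! Ice%ice_mask is allocated in ice_model_init,
                            ! which only runs on the Ice PEs; on the Ocean
                            ! PEs the array was never allocated, so kd
                            ! comes back as 0 here.
! kd is then used as the upper k-bound when spawning the exchange coupler
! fields, which is what produces the "Disordered k-dimension index bound
! list 1 0" FATAL from CT_spawn_1d_3d on the Ocean PEs.
```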
@wienkers Ah, yes, I vaguely remember that was a possibility and that it should only be evaluated on the Ice processors. I can't remember if it's sufficient to encase the code in an if block.
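A sketch of the kind of guard being suggested, assuming Ice%ice_mask is an allocatable (if it is a pointer, associated() would be the test instead; the coupler may equally use a PE flag such as Ice%pe):

```fortran
! Hypothetical guard -- only evaluate the block where ice_model_init has
! actually allocated Ice%ice_mask, i.e. on the Ice processors.
if (allocated(Ice%ice_mask)) then
  kd = size(Ice%ice_mask, 3)
  ! ... spawn the exchange-grid coupler fields with k-bounds (/1, kd/) ...
end if
```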
Original issue report:
Hello,
I downloaded the MOM5 code onto the WHOI supercomputer. After compiling GFDL-ESM2M, I tried to run it but quickly ran into segmentation faults. I have attached the error message below. It might be due to the module/compiler versions that I am using.
This is what my environment looks like:
```csh
source $MODULESHOME/init/csh
module load intel
module load netcdf/intel/4.6.1
module load openmpi/intel
setenv mpirunCommand "mpirun -np"
```
Kind regards
Jens
ERROR MESSAGE
```
[...]
LND(ATMOCNLND)= 0.153673308874230 0.153673308874230 0.153673308871445
NOTE from PE 0: xgrid_mod: reading exchange grid information from mosaic grid file
NOTE from load_xgrid(xgrid_mod): field 'scale' exist in the file INPUT/land_mosaicXocean_mosaic.nc, this field will be read and the exchange grid cell area will be multiplied by scale
Checked data is array of constant 1
LND(LNDOCN)= 0.703873657789463 0.703873657789466 0.703873657789463
OCN(LNDOCN)= 0.703873657789467 0.703873657789463 0.703873657789466
FATAL from PE 31: ==>Error from coupler_types_mod (CT_spawn_1d_3d): Disordered k-dimension index bound list 1 0
FATAL from PE 32: ==>Error from coupler_types_mod (CT_spawn_1d_3d): Disordered k-dimension index bound list 1 0
[.....]
fms_ESM2M.x 0000000000452D04 Unknown Unknown Unknown
fms_ESM2M.x 000000000045BD03 Unknown Unknown Unknown
fms_ESM2M.x 00000000004556BF Unknown Unknown Unknown
fms_ESM2M.x 000000000040E19E Unknown Unknown Unknown
libc-2.17.so 00002AAAAC544555 __libc_start_main Unknown Unknown
fms_ESM2M.x 000000000040E0A9 Unknown Unknown Unknown
MPI_ABORT was invoked on rank 30 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
fms_ESM2M.x 0000000002A8FDEE for__signal_handl Unknown Unknown
libpthread-2.17.s 00002AAAAC315630 Unknown Unknown Unknown
libpthread-2.17.s 00002AAAAC312573 pthread_spin_lock Unknown Unknown
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
fms_ESM2M.x 0000000002A8FDEE for__signal_handl Unknown Unknown
libpthread-2.17.s 00002AAAAC315630 Unknown Unknown Unknown
libpthread-2.17.s 00002AAAAC312573 pthread_spin_lock Unknown Unknown
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
fms_ESM2M.x 0000000002A8FDEE for__signal_handl Unknown Unknown
libpthread-2.17.s 00002AAAAC315630 Unknown Unknown Unknown
libpthread-2.17.s 00002AAAAC312573 pthread_spin_lock Unknown Unknown
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
fms_ESM2M.x 0000000002A8FDEE for__signal_handl Unknown Unknown
libpthread-2.17.s 00002AAAAC315630 Unknown Unknown Unknown
libpthread-2.17.s 00002AAAAC312573 pthread_spin_lock Unknown Unknown
[pn030:263631] *** Process received signal ***
[pn030:263631] Signal: Segmentation fault (11)
[pn030:263631] Signal code: Address not mapped (1)
[pn030:263631] Failing at address: 0x28
[pn030:263631] [ 0] /lib64/libpthread.so.0(+0xf630)[0x2aaaabe1d630]
[pn030:263631] [ 1] /vortexfs1/apps/openmpi-3.0.1-intel/lib/openmpi/mca_pmix_pmix2x.so(+0xb2723)[0x2aaab86c1723]
[pn030:263631] [ 2] /vortexfs1/apps/openmpi-3.0.1-intel/lib/openmpi/mca_pmix_pmix2x.so(pmix_ptl_base_recv_handler+0x579)[0x2aaab86c24a9]
[pn030:263631] [ 3] /vortexfs1/apps/openmpi-3.0.1-intel/lib/libopen-pal.so.40(opal_libevent2022_event_base_loop+0xa09)[0x2aaaab021829]
[pn030:263631] [ 4] /vortexfs1/apps/openmpi-3.0.1-intel/lib/openmpi/mca_pmix_pmix2x.so(+0x9d0f2)[0x2aaab86ac0f2]
[pn030:263631] [ 5] /lib64/libpthread.so.0(+0x7ea5)[0x2aaaabe15ea5]
[pn030:263631] [ 6] /lib64/libc.so.6(clone+0x6d)[0x2aaaac128b0d]
[pn030:263631] *** End of error message ***
Segmentation fault
ERROR: Model failed to run to completion
```