Convert all global (uncoupled) and regional tests to use quilting restart #1946

Merged (29 commits) on Nov 14, 2023

@DusanJovic-NOAA
Collaborator

DusanJovic-NOAA commented Oct 17, 2023

PR Author Checklist:

  • I have linked PR's from all sub-components involved in section below.
  • I am confirming reviews are completed in ALL sub-component PR's.
  • I have run the full RT suite on either Hera/Cheyenne AND have attached the log to this PR below this line:
  • I have added the list of all failed regression tests to "Anticipated changes" section.
  • I have filled out all sections of the template.

Description

All global (uncoupled) and regional tests are converted to use quilting restart. Two regression tests, one global (control_p8) and one regional (hrrr_control), also run with FMS I/O restart in addition to quilting restart. All other tests that currently run with quilting restart (tests with '_qr' in the name) are removed.

Includes and Closes #1972
Includes and Closes #1974

Linked Issues and Pull Requests

Associated UFSWM Issue to close

Closes #1627
Closes #1973 (from #1974)
Closes #1971 (from #1972)

Subcomponent Pull Requests

NOAA-EMC/fv3atm/pull/713
NOAA-EMC/fv3atm/pull/717 (from #1974)
NOAA-EMC/MOM6/pull/122 (from #1972)

Blocking Dependencies

Subcomponents involved:

  • AQM
  • CDEPS
  • CICE
  • CMEPS
  • CMakeModules
  • FV3
  • GOCART
  • HYCOM
  • MOM6
  • NOAHMP
  • WW3
  • stochastic_physics
  • none

Anticipated Changes

Input data

  • No changes are expected to input data.
  • Changes are expected to input data:
    • New input data.
    • Updated input data.

Regression Tests:

  • No changes are expected to any regression test.
  • Changes are expected to the following tests:
Tests affected by changes in this PR:

Libraries

  • Not Needed
  • Needed
    • Create separate issue in JCSDA/spack-stack asking for update to library. Include library name, library version.
    • Add issue link from JCSDA/spack-stack following this item
Code Managers Log
  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR.
  • Move new/updated input data on RDHPCS Hera and propagate input data changes to all supported systems.
    • N/A

Testing Log:

  • RDHPCS
    • Hera
    • Orion
    • Hercules
    • Jet
    • Gaea
    • Cheyenne
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
    • Completed
  • opnReqTest
    • N/A
    • Log attached to comment

@DusanJovic-NOAA
Collaborator Author

Regression tests passed on Hera:
RegressionTests_hera.log

@SamuelTrahanNOAA
Collaborator

I'm curious. Why aren't you making this change to the coupled tests?

@DusanJovic-NOAA
Collaborator Author

@junwang-noaa suggested that at this point we should only change uncoupled tests.

@SamuelTrahanNOAA
Collaborator

I trust her judgement, but I'm still curious about the reason. If she's willing to share it, I'd be grateful.

@junwang-noaa
Collaborator

junwang-noaa commented Oct 17, 2023

@SamuelTrahanNOAA The purpose of moving restart writing from the forecast grid component to the write grid component is to save forecast time. In the coupled configuration, only fv3 uses the write grid component, while the other components still do their I/O on the forecast grid component. So at restart time, even though fv3 could move on with the forecast, it has to wait for the other components to finish writing their restart files, some of which may take longer than fv3's. Also, for resolutions higher than C96, writing restart files may require increasing the write groups, since the write grid component now takes longer to finish. So we would get no speedup in the coupled tests, while we might need to increase resources. For these reasons, I suggested not adding the feature to the coupled tests at this time.
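
A toy sketch of the timing argument above, not taken from the UFS code: it assumes the forecast can resume only once every component that still writes its own restart files has finished, and that a component with quilting hands its write off at no cost. The function name, component names, and timings are all hypothetical.

```python
# Toy model of how long the forecast stalls at a restart step.
# All names and numbers are hypothetical, for illustration only.

def forecast_stall(write_times, offloaded=()):
    """Return the seconds the forecast waits at a restart step.

    write_times: mapping of component name -> seconds to write its restart files.
    offloaded:   components whose restart writing is handed to separate write
                 tasks (quilting), so they do not block the forecast.
    The forecast resumes only when the slowest blocking writer is done.
    """
    blocking = [t for name, t in write_times.items() if name not in offloaded]
    return max(blocking, default=0.0)

# Hypothetical restart write times in seconds.
uncoupled = {"fv3": 30.0}
coupled = {"fv3": 30.0, "ocean": 45.0, "ice": 10.0}

uncoupled_before = forecast_stall(uncoupled)
uncoupled_after = forecast_stall(uncoupled, offloaded={"fv3"})
coupled_before = forecast_stall(coupled)
coupled_after = forecast_stall(coupled, offloaded={"fv3"})

print(uncoupled_before, uncoupled_after)  # 30.0 0.0
print(coupled_before, coupled_after)      # 45.0 45.0
```

With these made-up numbers the uncoupled run stops stalling, while the coupled run still waits for its slowest remaining writer. If fv3 happened to be the slowest writer in a coupled run, this model would show a partial gain, so the result depends on the relative write times.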

@SamuelTrahanNOAA
Collaborator

Ah, well, perhaps one day the other coupled components will have quilt servers, too.

@jkbk2004
Collaborator

jkbk2004 commented Nov 8, 2023

@DusanJovic-NOAA we are considering combining #1972 (mom6 update) and #1974 (upp hash update) into this PR. Neither changes the baselines. Can you combine those PRs in?

@DusanJovic-NOAA
Collaborator Author

@DusanJovic-NOAA we are considering combining #1972 (mom6 update) and #1974 (upp hash update) into this PR. Neither changes the baselines. Can you combine those PRs in?

Done.

@zach1221
Collaborator

zach1221 commented Nov 9, 2023

I added a code change so that issue 1440 can be resolved when this PR is merged.

@FernandoAndrade-NOAA
Collaborator

@FernandoAndrade-NOAA @jkbk2004 - Could you please test Jet again, now that Dusan has merged my fixes?

Looks good after those changes, thanks!

@FernandoAndrade-NOAA
Collaborator

My mistake on the review requests, sub prs still need to get merged in.

@SamuelTrahanNOAA
Collaborator

My mistake on the review requests, sub prs still need to get merged in.

The FV3 PR is already approved. Someone needs to press the Merge button.

@jkbk2004
Collaborator

@jiandewang All tests are done. Can you merge the MOM6 PR?

@jkbk2004
Collaborator

NOAA-EMC/fv3atm#713 was merged

@SamuelTrahanNOAA mentioned this pull request Nov 13, 2023
new global_nest_v1 suite and #1965 #1941

@jiandewang
Collaborator

just merged
commit a36cb73d6924f6cf56a72b5799bef3d75fe4dd61 (HEAD -> dev/emc, origin/dev/emc, origin/HEAD)
Merge: 02d4dc455 80a93f633
Author: jiandewang
Date: Tue Nov 14 08:12:33 2023 -0500

Merge pull request #122 from jiandewang/feature/update-NCAR-GMAO-20231031

update MOM6 to its main repo 20231025 (NCAR candidate) and 20231031(GMAO FMS_cap) updating

@jkbk2004
Collaborator

MOM6 PR was merged as well: NOAA-EMC/MOM6@a36cb73

Labels

  • jenkins-ci: Jenkins CI: ORT build/test on docker container
  • No Baseline Change: No Baseline Change
  • Ready for Commit Queue: The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.
9 participants