Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create and populate ensemble directory for UFS-based ATM DA #1801

Merged
merged 8 commits into from
Aug 18, 2023

Conversation

RussTreadon-NOAA
Copy link
Contributor

@RussTreadon-NOAA RussTreadon-NOAA commented Aug 16, 2023

Description
This PR adds scripting to stage ensemble files for use in hybrid variational and ensemble UFS-based atmospheric DA. Three files are modified:

  • ush/python/pygfs/task/analysis.py - add method get_ens_dict to construct dictionary of ensemble member RESTART files to copy
  • ush/python/pygfs/task/atm_analysis.py - invoke get_ens_dict to stage ensemble members in UFS-based ATM variational DA runtime directory
  • ush/python/pygfs/task/atmens_analysis.py - invoke get_ens_dict to stage ensemble members in UFS-based ATM ensemble DA runtime directory

Fixes #1799

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Clone and Build tests on Hera
  • Cycled test on Hera

Clone and install g-w feature/ufsda_hybvar on Hera. Configure EXPDIR to run UFS-based atmospheric DA using hybrid background error. Run gdasatmanlinit, gdasatmanlrun, and gdasatmanlfinal jobs for 2021081418. Confirm that init job populated run directory with correct ensemble member restarts. Subsequent run job successfully ran to completion as did final job.

Repeat the above steps for enkfgdasatmensanlinit, enkfgdasatmensanlrun, and enkfgdasatmensanlfinal. Confirm that modified scripts work as intended.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing tests pass with my changes

@RussTreadon-NOAA RussTreadon-NOAA self-assigned this Aug 16, 2023
@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1801.org.readthedocs.build/en/1801

@@ -394,6 +400,67 @@ def _get_berror_dict_gsibec(config: Dict[str, Any]) -> Dict[str, List[str]]:
}
return berror_dict

@logit(logger)
def get_berror_ens_dict(self, config: Dict[str, Any]) -> Dict[str, List[str]]:
Copy link
Contributor

@aerorahul aerorahul Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to find a way to not repeat large sections of the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand your comment. A new method was added to atm_analysis.py. It uses templates to create and populate the ens directory if DOHYBVAR=YES. It does not repeat other sections of atm_analysis.py. If you have a preferred way to implement the new method, I'm willing to learn.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new method is a duplicate of the method in the atmens_analysis.py.
Is there a reason why this method from atmens_analysis.py cannot be used by making it available to atm_analysis.py?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. The ensemble directory pieces are 99.9% identical. atmens_analysis.py populates bkg/ whereas atm_analysis.py populates ens/. I always start from the simplest implementation, get it to work, and generalize from there.

A possible refactor would be to

  • remove get_berror_ens_dict from atm_analysis.py
  • remove get_bkg_dict from atmens_analysis.py
  • add generic get_ens_dict to analysis.py. Pass bkg or ens into method to create and populate correct directory. Alternatively, figure out another way to distinguish between deterministic (ens) and ensemble (bkg) jobs.
  • add get_ens_dict to atm_analysis.py and atmens_analysis.py

Is this acceptable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @RussTreadon-NOAA for iterating with me.
Elevating to analysis.py and calling it get_fv3_ens_dict would be acceptable.

The trouble you are having with generate_com is because you are trying to create a template from a template.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the recommended fix, then, for the shellcheck error? What is currently scripted in JGLOBAL_ATM_ANALYSIS_INITIALIZE works ... but it fails shellcheck.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use COM_ATMOS_RESTART_TMPL directly instead of running it through generate_com(). All generate_com does is fill in the template and assign the result to a variable. Just be sure to supply the correct variable assignments when you fill it in in python.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attempt to address reviewer comments committed to feature/ufsda_hybvar at 1f9b889

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1801.org.readthedocs.build/en/1801

…thod to analysis.py; invoke get_ens_dict from atm_analysis.py and atmens_analysis.py (NOAA-EMC#1799)
@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1801.org.readthedocs.build/en/1801

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1801.org.readthedocs.build/en/1801

@RussTreadon-NOAA RussTreadon-NOAA changed the title Create and populate ensemble directory for hybrid UFS-based ATM DA Create and populate ensemble directory for UFS-based ATM DA Aug 17, 2023
Copy link
Contributor

@aerorahul aerorahul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this PR @RussTreadon-NOAA
I have a few comments.
If there is anything that is confusing or unclear, please let me know.
If you wish, I can do some of the work that I describe; staticmethod next week.

Also, it would be great to get a test for JEDI Atm 3dvar, EnKF and now 4DEnVar when you are ready to ensure these capabilities are retained.

@@ -200,6 +201,74 @@ def link_jediexe(self) -> None:

return

@logit(logger)
def get_ens_dict(self, task_config: Dict[str, Any]) -> Dict[str, List[str]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion would be to name this method get_fv3ens_dict or get_atmens_dict since that is what this method is doing.
I can anticipate similar requirements for ocean, ice, aerosols in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename as get_fv3ens_dict. Done at 5e88371.

@@ -108,7 +108,8 @@ def initialize(self: Analysis) -> None:
FileHandler(jedi_fix_list).sync()

# stage backgrounds
FileHandler(self.get_bkg_dict()).sync()
logger.debug(f"Stage ensemble member background files")
FileHandler(Analysis.get_ens_dict(self, self.task_config)).sync()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
FileHandler(Analysis.get_ens_dict(self, self.task_config)).sync()
FileHandler(self.get_ens_dict(self.task_config)).sync()

atmens_analysis.py inherits from analysis.py.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accept suggestion. Done at 5e88371.

# stage ensemble files for use in hybrid background error
if self.task_config.DOHYBVAR:
logger.debug(f"Stage ensemble files for DOHYBVAR {self.task_config.DOHYBVAR}")
FileHandler(Analysis.get_ens_dict(self, self.task_config)).sync()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
FileHandler(Analysis.get_ens_dict(self, self.task_config)).sync()
FileHandler(sel.f.get_ens_dict(self.task_config)).sync()

atm_analysis.py inherits from analysis.py, so this method is already in scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accept suggestion. Done at 5e88371.

@@ -200,6 +201,74 @@ def link_jediexe(self) -> None:

return

@logit(logger)
def get_ens_dict(self, task_config: Dict[str, Any]) -> Dict[str, List[str]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation is calling this config and not task_config.
It should be config as task_config is an attribute of self and we do not want it to confused with this "config".

Suggested change
def get_ens_dict(self, task_config: Dict[str, Any]) -> Dict[str, List[str]]:
def get_ens_dict(self, config: Dict[str, Any]) -> Dict[str, List[str]]:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that scripting does not agree with documentation. Both atm_analysis.py and atmens_analysis.py pass self.config_task to get_fv3ens_dict so it seems documentation should be updated. That is, replace config with task_config in the documentation

        Parameters
        ----------
        task_config: Dict
            a dictionary containing all of the configuration needed for the task

This change was committed at 5e88371. Not sure if this is acceptable.

Comment on lines 224 to 239
template_res = self.task_config.COM_ATMOS_RESTART_TMPL
prev_cycle = self.task_config.previous_cycle
tmpl_res_dict = {
'ROTDIR': self.task_config.ROTDIR,
'RUN': self.task_config.RUN,
'YMD': to_YMD(prev_cycle),
'HH': prev_cycle.strftime('%H'),
'MEMDIR': None
}

# set directory type based on RUN
if self.task_config.RUN in ['enkfgdas', 'enkfgfs']:
dirtype = 'bkg'
else:
tmpl_res_dict['RUN'] = 'enkf' + self.task_config.RUN
dirtype = 'ens'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I have had a chance to look at this, I would suggest turning this method in to a staticmethod and pass everything that this method needs through config, rather than relying on the global variable self.task_config.
This would also eliminate this if-else and raise it to the class that defines this specific need.

Copy link
Contributor Author

@RussTreadon-NOAA RussTreadon-NOAA Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see what you are saying. Let me see if I can correctly implement this suggestion. I'll update the method documentation accordingly once I get the suggested change implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_fv3ens_dict has been changed to a staticmethod. The if-else in get_fv3ens_dict has been removed and necessary additions made in the classes which invoke the method. Done at 51200a0.

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1801.org.readthedocs.build/en/1801

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1801.org.readthedocs.build/en/1801

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1801.org.readthedocs.build/en/1801

Comment on lines 92 to 93
self.task_config.RUN = 'enkf' + self.task_config.RUN
self.task_config.dirname ='ens'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are dangerous.
After these lines, one would have altered the self.task_config This is forbidden.
What you should do here instead, is make a deepcopy of self.task_config, extract the attributes that you need to send to self.get_fv3ens_dict, instead of sending in the entire task configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess having this in get_fv3ens_dict was safer. I could move it back but I don't think you want this. I don't know how to do what you outline. Is there an example somewhere in g-w that I can follow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see land_analysis.py.
You could follow something along these lines (not what this method needs), but an example of creating a local dict and pulling relevant info from self.task_config.

        localconf = AttrDict()
        keys = ['DATA', 'current_cycle', 'COM_OBS', 'COM_ATMOS_RESTART_PREV',
                'OPREFIX', 'CASE', 'ntiles']
        for key in keys:
            localconf[key] = self.task_config[key]
        self.get_fv3ens_dict(localconf)

Having them in self.get_fv3ens_dict is not really a good solution as it makes that method bring in the entire global environment. One should know exactly what a method requires and pass that explicitly. One could say it is the same argument that people frown on using global variables in Fortran.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the guidance. Back to refactoring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create, populate, and pass localconf to get_fv3ens_dict. Change made in both atm_analysis.py and atmens_analysis.py. Done at a2ab183

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect @RussTreadon-NOAA
This way, when the method runs, the logger will print out exactly what is going in and coming out.
Thank you~

@github-actions
Copy link

Link to ReadTheDocs sample build for this PR can be found at:
https://global-workflow--1801.org.readthedocs.build/en/1801

@aerorahul aerorahul self-requested a review August 18, 2023 18:47
Copy link
Contributor

@aerorahul aerorahul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
Two comments:

  1. please update the description of the PR to reflect the actual changes
  2. can you provide test cases and configurations so we can run the UFS-based atmosphere DA as part of our CI tests?

@RussTreadon-NOAA
Copy link
Contributor Author

PR description updated to note that this PR impacts the staging of ensemble files for both hybrid variational and ensemble UFS-DA atmospheric analyses. Is this acceptable?

My preference is to provide test cases and configurations so we can run the UFS-based atmosphere DA as part of our CI tests via a new g-w issue and subsequent PR. Is this acceptable?

Where are the files for current g-w CI tests? I'd like to see if I can leverage or augment what we already have instead of adding new cases to the g-w CI database.

@aerorahul
Copy link
Contributor

PR description updated to note that this PR impacts the staging of ensemble files for both hybrid variational and ensemble UFS-DA atmospheric analyses. Is this acceptable?

sure

My preference is to provide test cases and configurations so we can run the UFS-based atmosphere DA as part of our CI tests via a new g-w issue and subsequent PR. Is this acceptable?

yes

Where are the files for current g-w CI tests? I'd like to see if I can leverage or augment what we already have instead of adding new cases to the g-w CI database.

On Hera: /scratch1/NCEPDEV/global/glopara/data/ICSDIR
On Orion: /work/noaa/global/glopara/data/ICSDIR
These are the cases that are currently part of the automated testing:
https://github.com/NOAA-EMC/global-workflow/tree/develop/ci/cases

@aerorahul aerorahul merged commit df5f941 into NOAA-EMC:develop Aug 18, 2023
4 checks passed
@RussTreadon-NOAA
Copy link
Contributor Author

OK. We can work cycle ICSDIR/C96C48. This directory is for cold-starts. Nothing wrong with this but we have not yet added to g-w the ability to run UFS-DA ATM parallels from the existing GDA.

Initial steps toward this goal are in GDASApp PR #575. Once UFS-DA ATM parallels can directly use GDA bufr files we can cold start from 20211220/18 ICSDIR/C96C48 and cycle through 20211221/00.

@RussTreadon-NOAA RussTreadon-NOAA deleted the feature/ufsda_hybvar branch August 23, 2023 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add hybrid option to UFS-DA ATM variational analysis
3 participants