-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create and populate ensemble directory for UFS-based ATM DA #1801
Create and populate ensemble directory for UFS-based ATM DA #1801
Conversation
…ional analysis with hybrid B (NOAA-EMC#1799)
Link to ReadTheDocs sample build for this PR can be found at: |
@@ -394,6 +400,67 @@ def _get_berror_dict_gsibec(config: Dict[str, Any]) -> Dict[str, List[str]]: | |||
} | |||
return berror_dict | |||
|
|||
@logit(logger) | |||
def get_berror_ens_dict(self, config: Dict[str, Any]) -> Dict[str, List[str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to find a way to not repeat large sections of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand your comment. A new method was added to atm_analysis.py
. It uses templates to create and populate the ens
directory if DOHYBVAR=YES
. It does not repeat other sections of atm_analysis.py
. If you have a preferred way to implement the new method, I'm willing to learn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new method is a duplicate of the method in the atmens_analysis.py
.
Is there a reason why this method from atmens_analysis.py
cannot be used by making it available to atm_analysis.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. The ensemble directory pieces are 99.9% identical. atmens_analysis.py
populates bkg/
whereas atm_analysis.py
populates ens/
. I always start from the simplest implementation, get it to work, and generalize from there.
A possible refactor would be to
- remove
get_berror_ens_dict
fromatm_analysis.py
- remove
get_bkg_dict
fromatmens_analysis.py
- add generic
get_ens_dict
toanalysis.py
. Passbkg
orens
into method to create and populate correct directory. Alternatively, figure out another way to distinguish between deterministic (ens
) and ensemble (bkg
) jobs. - add
get_ens_dict
toatm_analysis.py
andatmens_analysis.py
Is this acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @RussTreadon-NOAA for iterating with me.
Elevating to analysis.py
and calling it get_fv3_ens_dict
would be acceptable.
The trouble you are having with generate_com
is because you are trying to create a template from a template.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the recommended fix, then, for the shellcheck error? What is currently scripted in JGLOBAL_ATM_ANALYSIS_INITIALIZE
works ... but it fails shellcheck.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to use COM_ATMOS_RESTART_TMPL
directly instead of running it through generate_com()
. All generate_com does is fill in the template and assign the result to a variable. Just be sure to supply the correct variable assignments when you fill it in in python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Attempt to address reviewer comments committed to feature/ufsda_hybvar
at 1f9b889
Link to ReadTheDocs sample build for this PR can be found at: |
…thod to analysis.py; invoke get_ens_dict from atm_analysis.py and atmens_analysis.py (NOAA-EMC#1799)
Link to ReadTheDocs sample build for this PR can be found at: |
Link to ReadTheDocs sample build for this PR can be found at: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR @RussTreadon-NOAA
I have a few comments.
If there is anything that is confusing or unclear, please let me know.
If you wish, I can do some of the work that I describe; staticmethod
next week.
Also, it would be great to get a test for JEDI Atm 3dvar, EnKF and now 4DEnVar when you are ready to ensure these capabilities are retained.
ush/python/pygfs/task/analysis.py
Outdated
@@ -200,6 +201,74 @@ def link_jediexe(self) -> None: | |||
|
|||
return | |||
|
|||
@logit(logger) | |||
def get_ens_dict(self, task_config: Dict[str, Any]) -> Dict[str, List[str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion would be to name this method get_fv3ens_dict
or get_atmens_dict
since that is what this method is doing.
I can anticipate similar requirements for ocean, ice, aerosols in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename as get_fv3ens_dict
. Done at 5e88371.
@@ -108,7 +108,8 @@ def initialize(self: Analysis) -> None: | |||
FileHandler(jedi_fix_list).sync() | |||
|
|||
# stage backgrounds | |||
FileHandler(self.get_bkg_dict()).sync() | |||
logger.debug(f"Stage ensemble member background files") | |||
FileHandler(Analysis.get_ens_dict(self, self.task_config)).sync() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileHandler(Analysis.get_ens_dict(self, self.task_config)).sync() | |
FileHandler(self.get_ens_dict(self.task_config)).sync() |
atmens_analysis.py
inherits from analysis.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accept suggestion. Done at 5e88371.
# stage ensemble files for use in hybrid background error | ||
if self.task_config.DOHYBVAR: | ||
logger.debug(f"Stage ensemble files for DOHYBVAR {self.task_config.DOHYBVAR}") | ||
FileHandler(Analysis.get_ens_dict(self, self.task_config)).sync() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileHandler(Analysis.get_ens_dict(self, self.task_config)).sync() | |
FileHandler(sel.f.get_ens_dict(self.task_config)).sync() |
atm_analysis.py
inherits from analysis.py
, so this method is already in scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accept suggestion. Done at 5e88371.
ush/python/pygfs/task/analysis.py
Outdated
@@ -200,6 +201,74 @@ def link_jediexe(self) -> None: | |||
|
|||
return | |||
|
|||
@logit(logger) | |||
def get_ens_dict(self, task_config: Dict[str, Any]) -> Dict[str, List[str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation is calling this config
and not task_config
.
It should be config
as task_config
is an attribute of self
and we do not want it to confused with this "config"
.
def get_ens_dict(self, task_config: Dict[str, Any]) -> Dict[str, List[str]]: | |
def get_ens_dict(self, config: Dict[str, Any]) -> Dict[str, List[str]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that scripting does not agree with documentation. Both atm_analysis.py
and atmens_analysis.py
pass self.config_task
to get_fv3ens_dict
so it seems documentation should be updated. That is, replace config
with task_config
in the documentation
Parameters
----------
task_config: Dict
a dictionary containing all of the configuration needed for the task
This change was committed at 5e88371. Not sure if this is acceptable.
ush/python/pygfs/task/analysis.py
Outdated
template_res = self.task_config.COM_ATMOS_RESTART_TMPL | ||
prev_cycle = self.task_config.previous_cycle | ||
tmpl_res_dict = { | ||
'ROTDIR': self.task_config.ROTDIR, | ||
'RUN': self.task_config.RUN, | ||
'YMD': to_YMD(prev_cycle), | ||
'HH': prev_cycle.strftime('%H'), | ||
'MEMDIR': None | ||
} | ||
|
||
# set directory type based on RUN | ||
if self.task_config.RUN in ['enkfgdas', 'enkfgfs']: | ||
dirtype = 'bkg' | ||
else: | ||
tmpl_res_dict['RUN'] = 'enkf' + self.task_config.RUN | ||
dirtype = 'ens' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I have had a chance to look at this, I would suggest turning this method in to a staticmethod
and pass everything that this method needs through config
, rather than relying on the global variable self.task_config
.
This would also eliminate this if-else
and raise it to the class that defines this specific need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I see what you are saying. Let me see if I can correctly implement this suggestion. I'll update the method documentation accordingly once I get the suggested change implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_fv3ens_dict
has been changed to a staticmethod
. The if-else
in get_fv3ens_dict
has been removed and necessary additions made in the classes which invoke the method. Done at 51200a0
.
Link to ReadTheDocs sample build for this PR can be found at: |
…s_analysis accordingly (NOAA-EMC#1799)
Link to ReadTheDocs sample build for this PR can be found at: |
Link to ReadTheDocs sample build for this PR can be found at: |
self.task_config.RUN = 'enkf' + self.task_config.RUN | ||
self.task_config.dirname ='ens' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are dangerous.
After these lines, one would have altered the self.task_config
This is forbidden.
What you should do here instead, is make a deepcopy of self.task_config
, extract the attributes that you need to send to self.get_fv3ens_dict
, instead of sending in the entire task configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess having this in get_fv3ens_dict
was safer. I could move it back but I don't think you want this. I don't know how to do what you outline. Is there an example somewhere in g-w that I can follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see land_analysis.py
.
You could follow something along these lines (not what this method needs), but an example of creating a local dict and pulling relevant info from self.task_config
.
localconf = AttrDict()
keys = ['DATA', 'current_cycle', 'COM_OBS', 'COM_ATMOS_RESTART_PREV',
'OPREFIX', 'CASE', 'ntiles']
for key in keys:
localconf[key] = self.task_config[key]
self.get_fv3ens_dict(localconf)
Having them in self.get_fv3ens_dict
is not really a good solution as it makes that method bring in the entire global environment. One should know exactly what a method requires and pass that explicitly. One could say it is the same argument that people frown on using global variables in Fortran.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the guidance. Back to refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create, populate, and pass localconf
to get_fv3ens_dict
. Change made in both atm_analysis.py
and atmens_analysis.py
. Done at a2ab183
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perfect @RussTreadon-NOAA
This way, when the method runs, the logger will print out exactly what is going in and coming out.
Thank you~
Link to ReadTheDocs sample build for this PR can be found at: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
Two comments:
- please update the description of the PR to reflect the actual changes
- can you provide test cases and configurations so we can run the UFS-based atmosphere DA as part of our CI tests?
PR description updated to note that this PR impacts the staging of ensemble files for both hybrid variational and ensemble UFS-DA atmospheric analyses. Is this acceptable? My preference is to provide test cases and configurations so we can run the UFS-based atmosphere DA as part of our CI tests via a new g-w issue and subsequent PR. Is this acceptable? Where are the files for current g-w CI tests? I'd like to see if I can leverage or augment what we already have instead of adding new cases to the g-w CI database. |
sure
yes
On Hera: |
OK. We can work cycle Initial steps toward this goal are in GDASApp PR #575. Once UFS-DA ATM parallels can directly use GDA bufr files we can cold start from 20211220/18 |
Description
This PR adds scripting to stage ensemble files for use in hybrid variational and ensemble UFS-based atmospheric DA. Three files are modified:
ush/python/pygfs/task/analysis.py
- add methodget_ens_dict
to construct dictionary of ensemble member RESTART files to copyush/python/pygfs/task/atm_analysis.py
- invokeget_ens_dict
to stage ensemble members in UFS-based ATM variational DA runtime directoryush/python/pygfs/task/atmens_analysis.py
- invokeget_ens_dict
to stage ensemble members in UFS-based ATM ensemble DA runtime directoryFixes #1799
Type of change
How Has This Been Tested?
Clone and install g-w
feature/ufsda_hybvar
on Hera. ConfigureEXPDIR
to run UFS-based atmospheric DA using hybrid background error. Rungdasatmanlinit
,gdasatmanlrun
, andgdasatmanlfinal
jobs for 2021081418. Confirm thatinit
job populated run directory with correct ensemble member restarts. Subsequentrun
job successfully ran to completion as didfinal
job.Repeat the above steps for
enkfgdasatmensanlinit
,enkfgdasatmensanlrun
, andenkfgdasatmensanlfinal
. Confirm that modified scripts work as intended.Checklist