Bnb/training with obs #252

Draft: wants to merge 28 commits into main
Commits (28)
aff87b6  Refactor parallelization of bias calculations (bnb32, Jan 4, 2025)
e726f12  dual sampler, queue, and batch handler with obs. modifying Sup3rDatas… (bnb32, Dec 20, 2024)
4548a18  training with obs test (bnb32, Dec 20, 2024)
d2fe03d  split up interface and abstact model (bnb32, Dec 21, 2024)
9dc0143  made dual batch queue flexible enough to account for additional obs m… (bnb32, Dec 22, 2024)
7e79caa  tensorboard mixin moved to model utilities. dual queue completely abs… (bnb32, Dec 22, 2024)
6191b28  integrated dual sampler with obs into base dual sampler. (bnb32, Dec 23, 2024)
4fdff8c  examples added to DataHandler doc string. Some instructions on sup3rw… (bnb32, Dec 23, 2024)
e836057  removed namedtuple from Sup3rDataset to make Sup3rDataset picklable. (bnb32, Dec 26, 2024)
1e0fb29  parallel batch queue test added. (bnb32, Dec 27, 2024)
8c15444  namedtuple -> DsetTuple missing attr fix (bnb32, Dec 27, 2024)
373cb68  gust added to era download variables. len dunder added to ``Container… (bnb32, Dec 27, 2024)
4c8d77f  computing before reshaping is 2x faster. (bnb32, Dec 28, 2024)
e1d1ac5  obs_index fix - sampler needs to use hr_out_features for the obs member. (bnb32, Dec 28, 2024)
536beb1  split up ``calc_loss`` and ``calc_loss_obs`` (bnb32, Dec 29, 2024)
196becf  Optional run_qa flag in ``DualRasterizer``. Queue shape fix for queue… (bnb32, Dec 29, 2024)
2a99d94  ``run_qa=True`` default for ``DualRasterizer`` (bnb32, Dec 29, 2024)
c0d4d0a  better tracking of batch counting. (this can be tricky for parallel q… (bnb32, Dec 29, 2024)
586130d  missed compute call for slow batching. this was hidden by queueing an… (bnb32, Dec 29, 2024)
1e29b68  Included convert to tensor in ``sample_batch``. Test for training wit… (bnb32, Dec 30, 2024)
fde57b6  cc batch handler test fix (bnb32, Dec 31, 2024)
9b718d7  added test for new disc with "valid" padding (bnb32, Dec 31, 2024)
ed51133  parallel sampling batch sampling test. (bnb32, Jan 1, 2025)
39a28bb  removed workers tests. max_workers > 1 still not consistently faster.… (bnb32, Jan 2, 2025)
0c3eb44  ``Sup3rGanWithObs`` model subclass. Other misc model refactoring. (bnb32, Jan 3, 2025)
7b62d65  bias test fixes (bnb32, Jan 4, 2025)
a8eea08  additional bias refact: ``_run`` base method and ``_get_run_kwargs`` … (bnb32, Jan 4, 2025)
ee8e237  moved ``_run`` method to bias correction interface ``AbstractBiasCorr… (bnb32, Jan 5, 2025)
2 changes: 1 addition & 1 deletion README.rst
@@ -78,4 +78,4 @@ Brandon Benton, Grant Buster, Guilherme Pimenta Castelao, Malik Hassanaly, Pavlo
Acknowledgments
===============

This work was authored by the National Renewable Energy Laboratory, operated by Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. This research was supported by the Grid Modernization Initiative of the U.S. Department of Energy (DOE) as part of its Grid Modernization Laboratory Consortium, a strategic partnership between DOE and the national laboratories to bring together leading experts, technologies, and resources to collaborate on the goal of modernizing the nation’s grid. Funding provided by the DOE Office of Energy Efficiency and Renewable Energy (EERE), the DOE Office of Electricity (OE), DOE Grid Deployment Office (GDO), the DOE Office of Fossil Energy and Carbon Management (FECM), and the DOE Office of Cybersecurity, Energy Security, and Emergency Response (CESER), the DOE Advanced Scientific Computing Research (ASCR) program, the DOE Solar Energy Technologies Office (SETO), the DOE Wind Energy Technologies Office (WETO), the United States Agency for International Development (USAID), and the Laboratory Directed Research and Development (LDRD) program at the National Renewable Energy Laboratory. The research was performed using computational resources sponsored by the Department of Energy's Office of Energy Efficiency and Renewable Energy and located at the National Renewable Energy Laboratory. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes.
39 changes: 36 additions & 3 deletions examples/sup3rwind/README.rst
@@ -2,7 +2,7 @@
Sup3rWind Examples
###################

Super-Resolution for Renewable Energy Resource Data with Wind from Reanalysis Data (Sup3rWind) is one application of the sup3r software. In this work, we train generative models to create high-resolution (2km 5-minute) wind data based on coarse (30km hourly) ERA5 data. The generative models and high-resolution output data is publicly available via the `Open Energy Data Initiative (OEDI) <https://data.openei.org/s3_viewer?bucket=nrel-pds-wtk&prefix=sup3rwind%2F>`__ and via HSDS at the bucket ``nrel-pds-hsds`` and path ``/nrel/wtk/sup3rwind``. This data covers recent historical time periods for an expanding selection of countries.
Super-Resolution for Renewable Energy Resource Data with Wind from Reanalysis Data (Sup3rWind) is one application of the sup3r software. In this work, we train generative models to create high-resolution (2km 5-minute) wind data based on coarse (30km hourly) ERA5 data. The generative models, high-resolution output data, and training data are publicly available via the `Open Energy Data Initiative (OEDI) <https://data.openei.org/s3_viewer?bucket=nrel-pds-wtk&prefix=sup3rwind%2F>`__ and via HSDS at the bucket ``nrel-pds-hsds`` and path ``/nrel/wtk/sup3rwind``. This data covers recent historical time periods for an expanding selection of countries.

Sup3rWind Data Access
----------------------
@@ -11,8 +11,8 @@ The Sup3rWind data and models are publicly available in an AWS S3 bucket.

The Sup3rWind data is also loaded into `HSDS <https://www.hdfgroup.org/solutions/highly-scalable-data-service-hsds/>`__ so that you may stream the data via the `NREL developer API <https://developer.nrel.gov/signup/>`__ or your own HSDS server. This is the best option if you don't need a full annual dataset. See these `rex instructions <https://nrel.github.io/rex/misc/examples.hsds.html>`__ for more details on how to access this data with HSDS and rex.

Example Sup3rWind Data Usage
-----------------------------
Sup3rWind Data Usage
---------------------

Sup3rWind data can be used in much the same way as `Sup3rCC <https://nrel.github.io/sup3r/examples/sup3rcc.html>`__ data, with the caveat that Sup3rWind includes only wind data and ancillary variables for modeling wind energy generation. Refer to the Sup3rCC `example notebook <https://github.com/NREL/sup3r/tree/main/examples/sup3rcc/using_the_data.ipynb>`__ for usage patterns, and to the minimal data access sketch below.
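
As a quick start, the snippet below is a minimal sketch of streaming a single Sup3rWind variable through HSDS with ``rex``. The file path and dataset name are placeholders rather than confirmed bucket contents; check the ``/nrel/wtk/sup3rwind`` directory listing for the available country and year files::

    from rex import WindX

    # hypothetical file path; replace with an actual file under
    # /nrel/wtk/sup3rwind (see the HSDS directory listing)
    fp = '/nrel/wtk/sup3rwind/<country>/<sup3rwind_file>.h5'

    with WindX(fp, hsds=True) as res:
        meta = res.meta              # site coordinates and metadata
        time_index = res.time_index  # 5-minute time index
        # full time series of one dataset at site gid 0
        ws = res['windspeed_100m', :, 0]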

@@ -32,6 +32,39 @@ The process for running the Sup3rWind models is much the same as for `Sup3rCC <h
#. If you're running on a slurm cluster, this will kick off a number of jobs that you can see with the ``squeue`` command. If you're running locally, your terminal should now be running the Sup3rWind models. The software will create a ``./logs/`` directory in which you can monitor the progress of your jobs.
#. The ``sup3r-pipeline`` is designed to run several modules in serial, with each module running multiple chunks in parallel. Once the first module (forward-pass) finishes, you'll want to run ``python -m sup3r.cli -c config_pipeline.json pipeline`` again. This will clean up status files and kick off the next step in the pipeline (if the current step was successful).

Training from scratch
---------------------

To train Sup3rWind models from scratch use the public training `data <https://data.openei.org/s3_viewer?bucket=nrel-pds-wtk&prefix=sup3rwind%2Ftraining_data%2F>`__. This data is for training the spatial enhancement models only. The 2024-01 `models <https://data.openei.org/s3_viewer?bucket=nrel-pds-wtk&prefix=sup3rwind%2Fmodels%2Fsup3rwind_models_202401%2F>`__ perform spatial enhancement in two steps, 3x from ERA5 to coarsened WTK and 5x from coarsened WTK to uncoarsened WTK. The currently used approach performs spatial enhancement in a single 15x step.

For a given year and training domain, initialize low-resolution and high-resolution data handlers and wrap these in a dual rasterizer object. Do this for as many years and training regions as desired, and use these containers to initialize a batch handler. To train models for 3x spatial enhancement use ``hr_spatial_coarsen=5`` in the ``hr_dh``. To train models for 15x (the currently used approach) use ``hr_spatial_coarsen=1``. (Refer to tests and docs for information on additional arguments, denoted by the ellipses)::

    from sup3r.preprocessing import DataHandler, DualBatchHandler, DualRasterizer

    containers = []
    for tdir in training_dirs:
        lr_dh = DataHandler(f"{tdir}/lr_*.h5", ...)
        hr_dh = DataHandler(f"{tdir}/hr_*.h5", hr_spatial_coarsen=...)
        container = DualRasterizer({'low_res': lr_dh, 'high_res': hr_dh}, ...)
        containers.append(container)
    bh = DualBatchHandler(train_containers=containers, ...)

To train a 5x model use the ``hr_*.h5`` files for both the ``lr_dh`` and the ``hr_dh``. Use ``hr_spatial_coarsen=3`` in the ``lr_dh`` and ``hr_spatial_coarsen=1`` in the ``hr_dh``::

    containers = []
    for tdir in training_dirs:
        lr_dh = DataHandler(f"{tdir}/hr_*.h5", hr_spatial_coarsen=3, ...)
        hr_dh = DataHandler(f"{tdir}/hr_*.h5", hr_spatial_coarsen=1, ...)
        container = DualRasterizer({'low_res': lr_dh, 'high_res': hr_dh}, ...)
        containers.append(container)
    bh = DualBatchHandler(train_containers=containers, ...)


Initialize a 3x, 5x, or 15x spatial enhancement model with 14 output channels and train for the desired number of epochs. (The 3x and 5x generator configs can be copied from the ``model_params.json`` files in each OEDI model `directory <https://data.openei.org/s3_viewer?bucket=nrel-pds-wtk&prefix=sup3rwind%2Fmodels%2Fsup3rwind_models_202401%2F>`__. The 15x generator config can be created from the OEDI model configs by changing the spatial enhancement factor, or from the configs in the repo by changing the enhancement factor and the number of output channels)::

    from sup3r.models import Sup3rGan

    model = Sup3rGan(gen_layers="./gen_config.json", disc_layers="./disc_config.json", ...)
    model.train(bh, ...)
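
After training, the model can be saved so that the forward-pass steps described above can reference it. A minimal sketch, with the output directory name chosen here purely for illustration::

    # write the model weights and model_params.json to an output directory
    model.save('./sup3rwind_spatial_model')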


Sup3rWind Versions
-------------------

15 changes: 13 additions & 2 deletions pyproject.toml
@@ -42,10 +42,21 @@ dependencies = [
"pytest>=5.2",
"scipy>=1.0.0",
"sphinx>=7.0",
"tensorflow>2.4,<2.16",
"xarray>=2023.0"
]

# If used, cause glibc conflict
# [tool.pixi.target.linux-64.dependencies]
# cuda = ">=11.8"
# cudnn = {version = ">=8.6.0", channel = "conda-forge"}
# # 8.9.7

[tool.pixi.target.linux-64.pypi-dependencies]
tensorflow = {version = "~=2.15.1", extras = ["and-cuda"] }

[tool.pixi.target.osx-arm64.dependencies]
tensorflow = {version = "~=2.15.0", channel = "conda-forge"}

[project.optional-dependencies]
dev = [
"build>=0.5",
@@ -272,7 +283,6 @@ matplotlib = ">=3.1"
numpy = "~=1.7"
pandas = ">=2.0"
scipy = ">=1.0.0"
tensorflow = ">2.4,<2.16"
xarray = ">=2023.0"

[tool.pixi.pypi-dependencies]
@@ -284,6 +294,7 @@ NREL-farms = { version = ">=1.0.4" }

[tool.pixi.environments]
default = { solve-group = "default" }
kestrel = { features = ["kestrel"], solve-group = "default" }
dev = { features = ["dev", "doc", "test"], solve-group = "default" }
doc = { features = ["doc"], solve-group = "default" }
test = { features = ["test"], solve-group = "default" }
167 changes: 167 additions & 0 deletions sup3r/bias/abstract.py
@@ -0,0 +1,167 @@
"""Bias correction class interface."""

import logging
from abc import ABC, abstractmethod
from concurrent.futures import ProcessPoolExecutor, as_completed

import numpy as np

from sup3r.preprocessing import DataHandler

logger = logging.getLogger(__name__)


class AbstractBiasCorrection(ABC):
"""Minimal interface for bias correction classes"""

@abstractmethod
def _get_run_kwargs(self, **kwargs_extras):
"""Get dictionary of kwarg dictionaries to use for calls to
``_run_single``. Each key-value pair is a bias_gid with the associated
``_run_single`` arguments for that gid"""

def _run_in_parallel(self, task_kwargs, max_workers=None):
"""
Execute a list of tasks in parallel using ``ProcessPoolExecutor``.

Parameters
----------
task_kwargs : dictionary
A dictionary of keyword argument dictionaries for a single call to
``task_function``.
max_workers : int, optional
The maximum number of workers to use. If None, it uses all
available.

Returns
-------
results : dictionary
A dictionary of results from the executed tasks with the same keys
as ``task_kwargs``.
"""

results = {}
with ProcessPoolExecutor(max_workers=max_workers) as exe:
futures = {
exe.submit(self._run_single, **kwargs): bias_gid
for bias_gid, kwargs in task_kwargs.items()
}
for future in as_completed(futures):
bias_gid = futures[future]
results[bias_gid] = future.result()
return results

def _run(
self,
out,
max_workers=None,
fill_extend=True,
smooth_extend=0,
smooth_interior=0,
**kwargs_extras,
):
"""Run correction factor calculations for every site in the bias
dataset

Parameters
----------
out : dict
Dictionary of arrays to fill with bias correction factors.
max_workers : int
Number of workers to run in parallel. 1 is serial and None is all
available.
daily_reduction : None | str
Option to do a reduction of the hourly+ source base data to daily
data. Can be None (no reduction, keep source time frequency), "avg"
(daily average), "max" (daily max), "min" (daily min),
"sum" (daily sum/total)
fill_extend : bool
Flag to fill data past distance_upper_bound using spatial nearest
neighbor. If False, the extended domain will be left as NaN.
smooth_extend : float
Option to smooth the scalar/adder data outside of the spatial
domain set by the distance_upper_bound input. This alleviates the
weird seams far from the domain of interest. This value is the
standard deviation for the gaussian_filter kernel
smooth_interior : float
Option to smooth the scalar/adder data within the valid spatial
domain. This can reduce the affect of extreme values within
aggregations over large number of pixels.
kwargs_extras: dict
Additional kwargs that get sent to ``_run_single`` e.g.
daily_reduction='avg', zero_rate_threshold=1.157e-7

Returns
-------
out : dict
Dictionary of values defining the mean/std of the bias + base data
and correction factors to correct the biased data like: bias_data *
scalar + adder. Each value is of shape (lat, lon, time).
"""
self.bad_bias_gids = []

task_kwargs = self._get_run_kwargs(**kwargs_extras)
# sup3r DataHandler opening base files will load all data in parallel
# during the init and should not be passed in parallel to workers
if isinstance(self.base_dh, DataHandler):
max_workers = 1

if max_workers == 1:
logger.debug('Running serial calculation.')
results = {
bias_gid: self._run_single(**kwargs, base_dh_inst=self.base_dh)
for bias_gid, kwargs in task_kwargs.items()
}
else:
logger.info(
'Running parallel calculation with %s workers.', max_workers
)
results = self._run_in_parallel(
task_kwargs, max_workers=max_workers
)
for i, (bias_gid, single_out) in enumerate(results.items()):
raster_loc = np.where(self.bias_gid_raster == bias_gid)
for key, arr in single_out.items():
out[key][raster_loc] = arr
logger.info(
'Completed bias calculations for %s out of %s sites',
i + 1,
len(results),
)

logger.info('Finished calculating bias correction factors.')

return self.fill_and_smooth(
out, fill_extend, smooth_extend, smooth_interior
)

@abstractmethod
def run(
self,
fp_out=None,
max_workers=None,
daily_reduction='avg',
fill_extend=True,
smooth_extend=0,
smooth_interior=0,
):
"""Run correction factor calculations for every site in the bias
dataset"""

@classmethod
@abstractmethod
def _run_single(
cls,
bias_data,
base_fps,
bias_feature,
base_dset,
base_gid,
base_handler,
daily_reduction,
bias_ti,
decimals,
base_dh_inst=None,
match_zero_rate=False,
):
"""Find the bias correction factors at a single site"""
4 changes: 2 additions & 2 deletions sup3r/bias/base.py
Original file line number Diff line number Diff line change
@@ -43,7 +43,7 @@ def __init__(
bias_handler_kwargs=None,
decimals=None,
match_zero_rate=False,
pre_load=True
pre_load=True,
):
"""
Parameters
@@ -178,7 +178,7 @@ class is used, all data will be loaded in this class'

self.nn_dist, self.nn_ind = self.bias_tree.query(
self.base_meta[['latitude', 'longitude']],
distance_upper_bound=self.distance_upper_bound
distance_upper_bound=self.distance_upper_bound,
)

if pre_load: