Commit
<!--Please ensure the PR fulfills the following requirements! -->
<!-- If this is your first PR, make sure to add your details to the AUTHORS.rst! -->
### Pull Request Checklist:
- [x] This PR addresses an already opened issue (for bug fixes / features)
    - This PR fixes #xyz
- [x] Tests for the changes have been added (for bug fixes / features)
    - [x] (If applicable) Documentation has been added / updated (for bug fixes / features)
- [x] CHANGES.rst has been updated (with summary of main changes)
    - [x] Link to issue (:issue:`number`) and pull request (:pull:`number`) has been added

### What kind of change does this PR introduce?

New `MBCn` TrainAdjust class. The train part finds the adjustment factors for the npdf transform. The adjust part does the rest.

* A single numpy function performs all rotations of the npdf_transform, which makes the process faster.
* Grouping is handled using the same logic as in numpy_groupies. I initially tried to stop using map_blocks by using what I call the Big Dataset (BD) solution: a dataset that included the group windowed blocks. This worked well but sometimes caused dask workers to die. Better chunking might have solved this problem. Instead of constructing a BD, we now simply loop over blocks and specify the time indices of each block (à la groupies) in the original datasets. The resulting code is a bit messier, but it seems to perform well.

  The function also changes how windowed group blocks are handled throughout the computation. A block now keeps its form from the start to the end of the MBCn computation.
    * This is in contrast to the current approach, which groups and ungroups blocks between each iteration of the NpdfTransform.
    * The standardization is performed on a block.
    * The univariate bias correction is maintained as blocks and reordered, and only *then* are the blocks ungrouped.
* The sdba notebook suggested passing the univariate bias-corrected datasets to the npdf transform. Following (Cannon, 2018), however, the raw datasets should be passed to the npdf transform. This change should not matter much, but it is necessary to perform the MBCn exactly as presented by Cannon.

All these changes will result in a different output for `window>1`, and our implementation should now match that of Cannon.

### Does this PR introduce a breaking change?

No

### Other information

* It might be worthwhile to retest `map_blocks` to see whether, with the rest of the changes, it can offer good performance. It would make for cleaner code.
* Using the BD would also simplify many things. It is worth re-exploring if it can maintain the performance.
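
As a rough illustration of the npdf transform mentioned above, here is a minimal NumPy sketch of the iteration: rotate both datasets with a random orthogonal matrix, quantile-map each rotated variable, then rotate back. This is not the code added by this PR. The function names (`random_rotation`, `quantile_map`, `npdf_transform`), the plain empirical quantile mapping, and the fixed iteration count are all assumptions made for illustration, and the sketch leaves out grouping, standardization, the train/adjust split and the final reordering step.

```python
import numpy as np


def random_rotation(n_dims, rng):
    """Draw a random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n_dims, n_dims)))
    # Flip column signs so the result does not depend on the QR sign convention.
    return q * np.sign(np.diag(r))


def quantile_map(ref, hist):
    """Empirical quantile mapping of the 1-D sample `hist` onto the 1-D sample `ref`."""
    # Rank of each value of `hist`, converted to a probability in (0, 1).
    ranks = np.argsort(np.argsort(hist))
    probs = (ranks + 0.5) / hist.size
    # Replace each value by the quantile of `ref` at the same probability.
    return np.quantile(ref, probs)


def npdf_transform(ref, hist, n_iter=20, seed=0):
    """Hypothetical sketch: iteratively pull the joint distribution of `hist` toward `ref`.

    Both inputs are arrays of shape (n_samples, n_variables).
    """
    rng = np.random.default_rng(seed)
    out = hist.copy()
    for _ in range(n_iter):
        rot = random_rotation(ref.shape[1], rng)
        # Rotate both datasets into the same random basis.
        ref_r = ref @ rot
        out_r = out @ rot
        # Univariate correction of every rotated variable.
        for j in range(out_r.shape[1]):
            out_r[:, j] = quantile_map(ref_r[:, j], out_r[:, j])
        # Rotate back to the original variables (the inverse of `rot` is its transpose).
        out = out_r @ rot.T
    return out
```

Calling `npdf_transform(ref, hist)` on two `(n_samples, n_variables)` arrays returns an array with the shape of `hist` and leaves both inputs unmodified. In line with the description above, the inputs here would be the raw (standardized) datasets rather than univariate bias-corrected ones.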
Showing 8 changed files with 975 additions and 224 deletions.