Skip to content

Commit

Permalink
Dimensionless units as "1" - adapt to numpy 2 (#1814)
Browse files Browse the repository at this point in the history
<!--Please ensure the PR fulfills the following requirements! -->
<!-- If this is your first PR, make sure to add your details to the
AUTHORS.rst! -->
### Pull Request Checklist:
- [x] This PR addresses an already opened issue (for bug fixes /
features)
    - This PR fixes #1785 
- [x] Tests for the changes have been added (for bug fixes / features)
- [x] (If applicable) Documentation has been added / updated (for bug
fixes / features)
- [x] CHANGES.rst has been updated (with summary of main changes)
- [x] Link to issue (:issue:`number`) and pull request (:pull:`number`)
has been added

### What kind of change does this PR introduce?

* Changes NaN and NAN to nan, Inf to inf.
* Changes a test so the new data type promotion of numpy 2 fits our
expected values
* Relaxes a test for the same reason
* Change expected unit order in some cases (new cf_xarray + pint)
* Dimensionless units are now printed as "1". 
* Simplify `pint2cfunits`.
* Add `ensure_absolute_tempetature` to its module's `__all__` and move
`ensure_delta` up in the same module so both functions are near another
in the file.

### Does this PR introduce a breaking change?
Yes it does. Fixing numpy 2 issues made me fix pint 0.24.1 issues that
made me fix cf_xarray issues which have solution that is not
backwards-compatible and now pint and cf_xarray have updated pinned that
imply a numpy >=2 pin.

### Other information:
~We will require 3 new pins :~
- Most problems with `create_ensemble` come from a np2 bug in xarray,
which was fixed here pydata/xarray#9182. ~We are thus waiting for a
release~. Xarray 2024.07.0 out on the 30th.
- All units problem are solved with xarray-contrib/cf-xarray#523, which
was released in cf-xarray 0.9.3.
- The dimensionless unit thing requires pint 0.24.1 ~which requires
numpy 2, so pinning this as well.~

UPDATE: No pins were added, but the behaviour of xclim will be different
for dimensionless indicators depending on the cf_xarray/pint versions
installed.
  • Loading branch information
aulemahal authored Jul 31, 2024
2 parents 1148771 + 3ded3ed commit d13cb6c
Show file tree
Hide file tree
Showing 45 changed files with 392 additions and 223 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/main.yml
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,7 @@ jobs:
markers: -m 'not slow'
os: windows-latest
# macOS builds
- tox-env: py310-coverage-extras-lmoments
- tox-env: py310-coverage-extras-lmoments-numpy
python-version: "3.10"
markers: -m 'not slow'
os: macos-latest
Expand All @@ -135,7 +135,7 @@ jobs:
- tox-env: py310-coverage-lmoments # No markers -- includes slow tests
python-version: "3.10"
os: ubuntu-latest
- tox-env: py311-coverage-extras-sbck
- tox-env: py311-coverage-extras-sbck-numpy
python-version: "3.11"
markers: -m 'not slow'
os: ubuntu-latest
Expand Down
7 changes: 6 additions & 1 deletion CHANGELOG.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,18 +4,23 @@ Changelog

v0.52.0 (unreleased)
--------------------
Contributors to this version: David Huard (:user:`huard`), Trevor James Smith (:user:`Zeitsperre`), Hui-Min Wang (:user:`Hem-W`), Éric Dupuis (:user:`coxipi`), Sarah Gammon (:user:`SarahG-579462`).
Contributors to this version: David Huard (:user:`huard`), Trevor James Smith (:user:`Zeitsperre`), Hui-Min Wang (:user:`Hem-W`), Éric Dupuis (:user:`coxipi`), Sarah Gammon (:user:`SarahG-579462`), Pascal Bourgault (:user:`aulemahal`).

New features and enhancements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* ``xclim.sdba.nbutils.quantile`` and its child functions are now faster. If the module `fastnanquantile` is installed, it is used as the backend for the computation of quantiles and yields even faster results. This dependency is now listed in the `xclim[extras]` recipe. (:issue:`1255`, :pull:`1513`).
* New multivariate bias adjustment class `MBCn`, giving a faster and more accurate implementation of the 'MBCn' algorithm (:issue:`1551`, :pull:`1580`).
* `xclim` is now compatible with `pytest` versions `>=8.0.0`. (:pull:`1632`).

Breaking changes
^^^^^^^^^^^^^^^^
* As updated in ``cf_xarray>=0.9.3``, dimensionless quantities now use the "1" units attribute as specified by the CF conventions, previously an empty string was returned. (:pull:`1814`).

Bug fixes
^^^^^^^^^
* Fixed the indexer bug in the ``xclim.indices.standardized_index_fit_params`` when multiple or non-array indexers are specified and fitted parameters are reloaded from netCDF. (:issue:`1842`, :pull:`1843`).
* Addressed a bug found in ``wet_spell_*`` indicators that was contributing to erroneous results. A new generic spell length statistic function ``xclim.indices.generic.spell_length_statistics`` is now used in wet and dry spells indicators. (:issue:`1834`, :pull:`1838`).
* Syntax for ``nan`` and ``inf`` was adapted to support ``numpy>=2.0.0``. (:pull:`1814`, :issue:`1785`).

Internal changes
^^^^^^^^^^^^^^^^
Expand Down
8 changes: 5 additions & 3 deletions docs/notebooks/sdba-advanced.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -271,7 +271,9 @@
"metadata": {},
"outputs": [],
"source": [
"QDM.ds.to_netcdf(\"QDM_training.nc\")\n",
"# The engine keyword is only needed if netCDF4 is not available\n",
"# The training dataset contains some attributes with types not supported by scipy/netCDF3\n",
"QDM.ds.to_netcdf(\"QDM_training.nc\", engine=\"h5netcdf\")\n",
"ds = xr.open_dataset(\"QDM_training.nc\")\n",
"QDM2 = QuantileDeltaMapping.from_dataset(ds)\n",
"QDM2"
Expand All @@ -292,7 +294,7 @@
"metadata": {},
"outputs": [],
"source": [
"QDM.ds.to_netcdf(\"QDM_training2.nc\")\n",
"QDM.ds.to_netcdf(\"QDM_training2.nc\", engine=\"h5netcdf\")\n",
"ds = xr.open_dataset(\"QDM_training2.nc\")\n",
"ds.attrs[\"title\"] = \"This is the dataset, but read from disk.\"\n",
"QDM.set_dataset(ds)\n",
Expand Down Expand Up @@ -878,7 +880,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.12.4"
},
"toc": {
"base_numbering": 1,
Expand Down
4 changes: 2 additions & 2 deletions docs/notebooks/sdba.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -360,7 +360,7 @@
"outputs": [],
"source": [
"# We are using xarray's \"air_temperature\" dataset\n",
"ds = xr.tutorial.open_dataset(\"air_temperature\")"
"ds = xr.tutorial.load_dataset(\"air_temperature\")"
]
},
{
Expand Down Expand Up @@ -598,7 +598,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
"version": "3.12.4"
},
"toc": {
"base_numbering": 1,
Expand Down
6 changes: 2 additions & 4 deletions environment.yml
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
name: xclim
channels:
- numba # Added to gain access to Python3.12-compatible numba release candidates.
- conda-forge
- defaults
dependencies:
Expand All @@ -13,9 +12,9 @@ dependencies:
- dask >=2.6.0
- jsonpickle
- numba
- numpy >=1.20.0,<2.0.0
- numpy >=1.20.0
- pandas >=2.2.0
- pint >=0.10,<0.24
- pint >=0.18.0
- poppler >=0.67
- pyarrow # Strongly encouraged for Pandas v2.2.0+
- pyyaml
Expand Down Expand Up @@ -53,7 +52,6 @@ dependencies:
- nbsphinx
- nbval >=0.11.0
- nc-time-axis
- netCDF4 >=1.4,<1.7
- notebook
- pandas-stubs
- platformdirs
Expand Down
7 changes: 3 additions & 4 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -41,10 +41,10 @@ dependencies = [
"filelock",
"jsonpickle",
"numba",
"numpy>=1.20.0,<2.0.0",
"numpy>=1.20.0",
"packaging",
"pandas>=2.2",
"pint>=0.10,<0.24",
"pint>=0.18",
"platformdirs >=3.2",
"pyarrow", # Strongly encouraged for pandas v2.2.0+
"pyyaml",
Expand Down Expand Up @@ -74,7 +74,6 @@ dev = [
"nbconvert <7.14", # Pinned due to directive errors in sphinx. See: https://github.com/jupyter/nbconvert/issues/2092
"nbqa >=1.8.2",
"nbval >=0.11.0",
"netCDF4 >=1.4,<1.7",
"pandas-stubs >=2.2",
"platformdirs >=3.2",
"pooch",
Expand Down Expand Up @@ -181,7 +180,7 @@ pep621_dev_dependency_groups = ["all", "dev", "docs"]

[tool.deptry.per_rule_ignores]
DEP001 = ["SBCK"]
DEP002 = ["bottleneck", "pyarrow"]
DEP002 = ["bottleneck", "h5netcdf", "pyarrow"]
DEP004 = ["matplotlib", "pytest", "pytest_socket"]

[tool.flit.sdist]
Expand Down
8 changes: 7 additions & 1 deletion tests/test_atmos.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@

import numpy as np
import xarray as xr
from cf_xarray import __version__ as __cfxr_version__
from packaging.version import Version

from xclim import atmos, set_options

Expand Down Expand Up @@ -260,7 +262,11 @@ def test_wind_profile(atmosds):

def test_wind_power_potential(atmosds):
out = atmos.wind_power_potential(wind_speed=atmosds.sfcWind)
assert out.attrs["units"] == ""

if Version(__cfxr_version__) < Version("0.9.3"):
assert out.attrs["units"] == ""
else:
assert out.attrs["units"] == "1"
assert (out >= 0).all()
assert (out <= 1).all()

Expand Down
6 changes: 3 additions & 3 deletions tests/test_calendar.py
Original file line number Diff line number Diff line change
Expand Up @@ -281,7 +281,7 @@ def test_convert_calendar_and_doy():
out = convert_doy(doy, target_cal="360_day", align_on="date").convert_calendar(
"360_day", align_on="date"
)
np.testing.assert_array_equal(out, [np.NaN, 31, 332, 360.23, np.NaN])
np.testing.assert_array_equal(out, [np.nan, 31, 332, 360.23, np.nan])
assert out.time.dt.calendar == "360_day"


Expand Down Expand Up @@ -456,7 +456,7 @@ def test_convert_doy():
)

out = convert_doy(doy, "360_day", align_on="date")
np.testing.assert_array_equal(out, [np.NaN, 31, 332, 360.23, np.NaN])
np.testing.assert_array_equal(out, [np.nan, 31, 332, 360.23, np.nan])
assert out.time.dt.calendar == "noleap"
out = convert_doy(doy, "360_day", align_on="year")
np.testing.assert_allclose(
Expand All @@ -475,7 +475,7 @@ def test_convert_doy():
).expand_dims(lat=[10, 20, 30])

out = convert_doy(doy, "noleap", align_on="date")
np.testing.assert_array_equal(out.isel(lat=0), [31, 200.48, 190, np.NaN, 299.54])
np.testing.assert_array_equal(out.isel(lat=0), [31, 200.48, 190, np.nan, 299.54])
out = convert_doy(doy, "noleap", align_on="year")
np.testing.assert_allclose(
out.isel(lat=0), [31.0, 200.48, 190.0, 59.83607, 299.71885]
Expand Down
12 changes: 6 additions & 6 deletions tests/test_cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,7 @@ def test_normal_computation(
input_file = tmp_path / "in.nc"
output_file = tmp_path / "out.nc"

ds.to_netcdf(input_file)
ds.to_netcdf(input_file, engine="h5netcdf")

args = ["-i", str(input_file), "-o", str(output_file), "-v", indicator]
runner = CliRunner()
Expand Down Expand Up @@ -136,7 +136,7 @@ def test_multi_output(tmp_path, open_dataset):
ds = open_dataset("ERA5/daily_surface_cancities_1990-1993.nc")
input_file = tmp_path / "ws_in.nc"
output_file = tmp_path / "out.nc"
ds.to_netcdf(input_file)
ds.to_netcdf(input_file, engine="h5netcdf")

runner = CliRunner()
results = runner.invoke(
Expand Down Expand Up @@ -216,7 +216,7 @@ def test_missing_variable(tas_series, tmp_path):
input_file = tmp_path / "tas.nc"
output_file = tmp_path / "out.nc"

tas.to_netcdf(input_file)
tas.to_netcdf(input_file, engine="h5netcdf")

runner = CliRunner()
results = runner.invoke(
Expand All @@ -243,7 +243,7 @@ def test_global_options(tas_series, tmp_path, options, output):
input_file = tmp_path / "tas.nc"
output_file = tmp_path / "out.nc"

tas.to_netcdf(input_file)
tas.to_netcdf(input_file, engine="h5netcdf")

runner = CliRunner()
results = runner.invoke(
Expand Down Expand Up @@ -283,7 +283,7 @@ def test_dataflags_output(tmp_path, tas_series, tasmax_series, tasmin_series):
arr = series(vals, start="1971-01-01")
ds = xr.merge([ds, arr])
input_file = tmp_path / "ws_in.nc"
ds.to_netcdf(input_file)
ds.to_netcdf(input_file, engine="h5netcdf")

runner = CliRunner()
results = runner.invoke(
Expand All @@ -303,7 +303,7 @@ def test_bad_usage(tas_series, tmp_path):
input_file = tmp_path / "tas.nc"
output_file = tmp_path / "out.nc"

tas.to_netcdf(input_file)
tas.to_netcdf(input_file, engine="h5netcdf")

runner = CliRunner()

Expand Down
12 changes: 9 additions & 3 deletions tests/test_ensembles.py
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ def test_no_time(self, tmp_path, ensemble_dataset_objects, open_dataset):
ds["time"] = xr.decode_cf(ds).time
ds_all.append(ds.groupby(ds.time.dt.month).mean("time", keep_attrs=True))
ds.groupby(ds.time.dt.month).mean("time", keep_attrs=True).to_netcdf(
f1.joinpath(Path(n).name)
f1.joinpath(Path(n).name), engine="h5netcdf"
)
ens = ensembles.create_ensemble(ds_all)

Expand Down Expand Up @@ -288,8 +288,14 @@ def test_calc_mean_std_min_max(self, ensemble_dataset_objects, open_dataset):
)
out2 = ensembles.ensemble_mean_std_max_min(ens, weights=weights)
values = ens["tg_mean"][:, 0, 5, 5]
# Explicit float64 so numpy does the expected datatype promotion (change in numpy 2)
np.testing.assert_array_equal(
(values[0] * 1 + values[1] * 0.1 + values[2] * 3.5 + values[3] * 5)
(
values[0] * np.float64(1)
+ values[1] * np.float64(0.1)
+ values[2] * np.float64(3.5)
+ values[3] * np.float64(5)
)
/ np.sum(weights),
out2.tg_mean_mean[0, 5, 5],
)
Expand All @@ -311,7 +317,7 @@ def test_stats_min_members(self, ensemble_dataset_objects, open_dataset, aggfunc
ds_all = [open_dataset(n) for n in ensemble_dataset_objects["nc_files_simple"]]
ens = ensembles.create_ensemble(ds_all).isel(lat=0, lon=0)
ens = ens.where(ens.realization > 0)
ens = xr.where((ens.realization == 1) & (ens.time.dt.year == 1950), np.NaN, ens)
ens = xr.where((ens.realization == 1) & (ens.time.dt.year == 1950), np.nan, ens)

def first(ds):
return ds[list(ds.data_vars.keys())[0]]
Expand Down
16 changes: 11 additions & 5 deletions tests/test_generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,8 @@
import numpy as np
import pytest
import xarray as xr
from cf_xarray import __version__ as __cfxr_version__
from packaging.version import Version

from xclim.core.calendar import doy_to_days_since, select_time
from xclim.indices import generic
Expand Down Expand Up @@ -101,7 +103,11 @@ def test_doyminmax(self, q_series):
for da in [dmx, dmn]:
for attr in ["units", "is_dayofyear", "calendar"]:
assert attr in da.attrs.keys()
assert da.attrs["units"] == ""

if Version(__cfxr_version__) < Version("0.9.3"):
assert da.attrs["units"] == ""
else:
assert da.attrs["units"] == "1"
assert da.attrs["is_dayofyear"] == 1


Expand Down Expand Up @@ -393,13 +399,13 @@ def test_generic_forbidden_op(self, pr_series):

class TestGetDailyEvents:
def test_simple(self, tas_series):
arr = xr.DataArray(np.array([-10, 15, 20, np.NaN, 10]), name="Stuff")
arr = xr.DataArray(np.array([-10, 15, 20, np.nan, 10]), name="Stuff")

out = generic.get_daily_events(arr, threshold=10, op=">=")

assert out.name == "events"
assert out.sum() == 3
np.testing.assert_array_equal(out, [0, 1, 1, np.NaN, 1])
np.testing.assert_array_equal(out, [0, 1, 1, np.nan, 1])


class TestGenericCountingIndices:
Expand Down Expand Up @@ -470,7 +476,7 @@ def test_count_occurrences(self, tas_series, op, constrain, expected, should_fai
@pytest.mark.parametrize(
"op, constrain, expected, should_fail",
[
("<", None, np.NaN, False),
("<", None, np.nan, False),
("<=", None, 3, False),
("!=", ("!=",), 1, False),
("==", ("==", "!="), 3, False),
Expand All @@ -497,7 +503,7 @@ def test_first_occurrence(self, tas_series, op, constrain, expected, should_fail
@pytest.mark.parametrize(
"op, constrain, expected, should_fail",
[
("<", None, np.NaN, False),
("<", None, np.nan, False),
("<=", None, 8, False),
("!=", ("!=",), 9, False),
("==", ("==", "!="), 8, False),
Expand Down
8 changes: 7 additions & 1 deletion tests/test_hydrology.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
from __future__ import annotations

import numpy as np
from cf_xarray import __version__ as __cfxr_version__
from packaging.version import Version

from xclim import indices as xci

Expand Down Expand Up @@ -40,7 +42,11 @@ def test_simple(self, snw_series):
snw = snw_series(a, start="1999-01-01")
out = xci.snw_max_doy(snw, freq="YS")
np.testing.assert_array_equal(out, [11, np.nan])
assert out.units == ""

if Version(__cfxr_version__) < Version("0.9.3"):
assert out.attrs["units"] == ""
else:
assert out.attrs["units"] == "1"


class TestSnowMeltWEMax:
Expand Down
2 changes: 1 addition & 1 deletion tests/test_indicators.py
Original file line number Diff line number Diff line change
Expand Up @@ -848,7 +848,7 @@ def test_resampling_indicator_with_indexing(tas_series):
out = xclim.atmos.tx_days_above(
tas, thresh="0 degC", freq="YS-JUL", doy_bounds=(1, 50)
)
np.testing.assert_allclose(out, [50, 50, np.NaN])
np.testing.assert_allclose(out, [50, 50, np.nan])

out = xclim.atmos.tx_days_above(
tas, thresh="0 degC", freq="YS", date_bounds=("02-29", "04-01")
Expand Down
Loading

0 comments on commit d13cb6c

Please sign in to comment.