-
I have an input xarray dataset I want to regrid:

<xarray.Dataset>
Dimensions: (longitude: 1440, latitude: 721, time: 132)
Coordinates:
* longitude (longitude) float32 0.0 0.25 0.5 0.75 ... 359.0 359.2 359.5 359.8
* latitude (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
* time (time) datetime64[ns] 2000-01-01 2000-02-01 ... 2010-12-01
Data variables:
d2m (time, latitude, longitude) float32 dask.array<chunksize=(12, 721, 1440), meta=np.ndarray>

First I create the grid and then the regridder:

ds_out = xe.util.grid_global(2.5, 2.5)
regridder = xe.Regridder(
ds,
ds_out,
"bilinear",
periodic=True,
parallel=True,
)

If I have parallel set to True, I get an error message:

---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File <timed exec>:1
File ~/python_file.py:41, in regrid_to_lat_lon(ds, target_res_lat, target_res_lon)
23 """
24 Regrid any grid to a res_lat x res_lon degree lat lon grid.
25
(...)
38 Regridded dataset.
39 """
40 ds_out = xe.util.grid_global(target_res_lat, target_res_lon)
---> 41 regridder = xe.Regridder(
42 ds,
43 ds_out,
44 "bilinear",
45 periodic=True,
46 parallel=True,
47 )
48 dr_out = regridder(ds)
File /[...]/python3.9/site-packages/xesmf/frontend.py:955, in Regridder.__init__(self, ds_in, ds_out, method, locstream_in, locstream_out, periodic, parallel, **kwargs)
952 self.out_coords = {lat_out.name: lat_out, lon_out.name: lon_out}
954 if parallel:
--> 955 self._init_para_regrid(ds_in, ds_out, kwargs)
File /[...]/python3.9/site-packages/xesmf/frontend.py:973, in Regridder._init_para_regrid(self, ds_in, ds_out, kwargs)
971 ds_out['mask'] = mask
972 else:
--> 973 ds_out_chunks = tuple([ds_out.chunksizes[i] for i in self.out_horiz_dims])
974 ds_out = ds_out.coords.to_dataset()
975 mask = da.ones(self.shape_out, dtype=bool, chunks=ds_out_chunks)
File /[...]/python3.9/site-packages/xesmf/frontend.py:973, in <listcomp>(.0)
971 ds_out['mask'] = mask
972 else:
--> 973 ds_out_chunks = tuple([ds_out.chunksizes[i] for i in self.out_horiz_dims])
974 ds_out = ds_out.coords.to_dataset()
975 mask = da.ones(self.shape_out, dtype=bool, chunks=ds_out_chunks)
File /[...]/python3.9/site-packages/xarray/core/utils.py:455, in Frozen.__getitem__(self, key)
454 def __getitem__(self, key: K) -> V:
--> 455 return self.mapping[key]
KeyError: 'y'

When I set parallel to False, everything works and the regridder is created as expected.
I am using Python 3.9.18, xarray 2023.8.0, and xESMF 0.8.1.
Replies: 8 comments
-
Thanks for the detailed report. This is a brand-new feature; we'll look into this and release a fix.
-
This is a confusing caveat of parallel=True, and we should raise a better error message. As the documentation of parallel says, the parallelization of the weight generation follows the chunks of the output grid. ds_out, the output of grid_global, has no data variable and thus no chunks (the error is that the dimension name isn't a key of the chunksizes dictionary), so xESMF doesn't know how to parallelize the weight generation.

An easy way out is to assign a chunked variable to ds_out, where you choose the chunk sizes NNN and MMM according to your situation. I'll open a PR that adds a meaningful error.
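To illustrate why the KeyError happens, here is a small self-contained sketch using plain xarray and dask. The grid construction only mimics the structure of xe.util.grid_global(2.5, 2.5), and the chunk sizes 36 and 48 are placeholder values chosen for illustration, not a recommendation:

```python
import numpy as np
import xarray as xr

# Mimic the structure of xe.util.grid_global(2.5, 2.5): a Dataset with
# 2-D lat/lon *coordinates* on dims ("y", "x") and no data variables.
lon, lat = np.meshgrid(np.arange(0, 360, 2.5), np.arange(-90, 90.1, 2.5))
ds_out = xr.Dataset(coords={"lat": (("y", "x"), lat), "lon": (("y", "x"), lon)})

# With no data variables there is nothing to chunk, so chunksizes has
# no "y" or "x" entry -- the source of the KeyError in the traceback.
assert "y" not in ds_out.chunksizes

# Workaround: add a variable (a mask) and chunk the dataset; now the
# "y" and "x" dimensions appear in chunksizes and xESMF can split the
# weight generation. The chunk sizes here are illustrative placeholders.
ds_out = ds_out.assign(mask=ds_out.lat.notnull() & ds_out.lon.notnull())
ds_out = ds_out.chunk({"y": 36, "x": 48})
print(dict(ds_out.chunksizes))
```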
-
I am not sure if raising an error message is an appropriate solution for the usage of parallel=True. But I followed your approach, which works perfectly:

ds_out = xe.util.grid_global(target_res_lat, target_res_lon, cf=True)
ds_out = ds_out.assign(mask=ds_out.lat.notnull() & ds_out.lon.notnull()).chunk()

Are there recommended chunk sizes for the output grid? Three further additions:
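On the chunk-size question, there is no single right answer, but the trade-off can be shown with a dask-only sketch. The grid shape is taken from the 0.25-degree dataset in the original post; the two chunk shapes are arbitrary examples, not recommendations:

```python
import dask.array as da

shape = (721, 1440)  # the 0.25-degree grid from the original question

# Many small chunks: more parallelism, but more scheduler overhead.
small = da.ones(shape, chunks=(90, 180))

# Few large chunks: little overhead, but little parallelism.
large = da.ones(shape, chunks=(721, 720))

print(small.npartitions, large.npartitions)  # 72 2
```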
-
Well, you are right, but I don't think so.

Speed

I did:

import xarray as xr
import xesmf as xe
ds = xr.tutorial.open_dataset('air_temperature')
ds_out = xe.util.grid_global(2.5, 2.5)
ds_out = ds_out.assign(mask = ds_out.lon.notnull() & ds_out.lat.notnull())
# Chunked, parallel
reg = xe.Regridder(ds, ds_out.chunk(), 'bilinear', periodic=True, parallel=True)
# took 2.94 s
# Chunked, not parallel
reg = xe.Regridder(ds, ds_out.chunk(), 'bilinear', periodic=True, parallel=False)
# took 476 ms
# Not chunked, not parallel
reg = xe.Regridder(ds, ds_out, 'bilinear', periodic=True, parallel=False)
# took 241 ms

I don't have the same input as you. As a rule of thumb, I would say:
-
Thank you for your explanation! I have to admit, I do not know a lot about how to optimize my functions using chunks yet. Until the final loading, everything runs using dask arrays in the background and is hopefully optimized for parallel computation. (That, at least, is what I was hoping to accomplish by not loading my datasets before saving the final result to a netCDF file.) All the xarray functions just work with and without dask, and there is no need to think about RAM and hardware specs or to add custom arguments for either case. Because I do not really know what the best option is, I will do further speed testing with my input dataset.
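The lazy workflow described above can be sketched with a toy dataset (the variable name, sizes, and chunking are made up for illustration):

```python
import numpy as np
import xarray as xr

# Toy stand-in for the real input, chunked so operations stay lazy.
ds = xr.Dataset(
    {"d2m": (("time", "latitude"), np.random.rand(12, 10))}
).chunk({"time": 4})

# Lazy arithmetic: this only builds a dask task graph, reads no data.
anom = ds.d2m - ds.d2m.mean("time")
assert anom.chunks is not None  # still backed by dask, nothing computed

# Work happens only at the very end, e.g. on .compute(), .load(),
# or .to_netcdf(...).
result = anom.compute()
```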
-
Just to clear some things up, xESMF works in two steps:

reg = xe.Regridder(...) # Generating the regridding function, the regridding weights
dsout = reg(ds_in) # Applying the function

Up until xESMF 0.7.1, we had two limitations. In xESMF 0.8, both have been fixed, with the limitation on the second one that I highlighted above. Thus, unless you have a very, very large output grid (you are performing upscaling), xESMF 0.8 should feel as plug-and-play as the other xarray tools. Going from 0.25° to 5°, you should not need parallel=True.
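Under the hood, the weights from step one are essentially a sparse matrix, and step two is a cheap sparse matrix multiply, which is why the regridder can be generated once and reused. A toy illustration of that idea (not the actual xESMF internals):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy "weights": average pairs of a 4-point source grid onto a
# 2-point target grid. Generating this matrix is the expensive step.
weights = csr_matrix(np.array([[0.5, 0.5, 0.0, 0.0],
                               [0.0, 0.0, 0.5, 0.5]]))

# Applying it is cheap, so the same weights serve every time step
# and every variable on the same grid.
field = np.array([1.0, 3.0, 5.0, 7.0])
out = weights @ field
print(out)  # [2. 6.]
```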
-
That actually cleared up a lot of things for me! I am sorry for not reading the available information on https://xesmf.readthedocs.io/ before starting this issue / discussion.
-
I have done some quick speed testing for my function.
There is lots of other stuff going on in this function, but 5 seconds of saved time is pretty substantial anyway!