add docs for MPI support
Alexander-Barth committed Apr 24, 2024
1 parent 4c6e191 commit 96990c4
Showing 4 changed files with 77 additions and 9 deletions.
6 changes: 5 additions & 1 deletion docs/make.jl
@@ -9,7 +9,11 @@ CommonDataModel_remote = (


makedocs(
modules = [NCDatasets, CommonDataModel],
modules = [
NCDatasets,
CommonDataModel,
Base.get_extension(NCDatasets, :NCDatasetsMPIExt)
],
remotes = Dict(
CommonDataModel_path => CommonDataModel_remote,
),
50 changes: 50 additions & 0 deletions docs/src/other.md
@@ -287,3 +287,53 @@ The use of `missing` as fill value, is thus preferable in the general case.
NCDatasets.ancillaryvariables
NCDatasets.filter
```


## Experimental MPI support

Experimental MPI support is available as a package extension. `MPI` must be loaded in addition to `NCDatasets` to enable this extension.
All metadata operations (creating dimensions, variables, attributes, groups or types) must be done *collectively*, i.e. all processes must participate.
Reading and writing the data of netCDF variables can be done *independently* (the default) or *collectively*. If a variable (or the whole dataset) is marked for *collective* data access, the underlying HDF5 library can apply additional optimizations.
More information is available in the [NetCDF documentation](https://web.archive.org/web/20240414204638/https://docs.unidata.ucar.edu/netcdf-c/current/parallel_io.html).

Currently, only the NetCDF 4 format can be used for parallel access.
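
Because the MPI methods live in a package extension, they only exist once both packages are loaded. The following is a minimal check, assuming Julia 1.9 or later (where `Base.get_extension` is available) and the extension name `NCDatasetsMPIExt` used in this repository's `ext` directory:

```julia
using MPI
using NCDatasets

# Base.get_extension returns `nothing` when the extension has not been loaded
ext = Base.get_extension(NCDatasets, :NCDatasetsMPIExt)

if ext === nothing
    error("NCDatasetsMPIExt is not loaded; run `using MPI` alongside `using NCDatasets`")
end
```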

```julia
using MPI
using NCDatasets

MPI.Init()

mpi_comm = MPI.COMM_WORLD
mpi_comm_size = MPI.Comm_size(mpi_comm)
mpi_rank = MPI.Comm_rank(mpi_comm)

# The file needs to be the same for all processes
filename = "file.nc"

# index based on MPI rank
i = mpi_rank + 1

# create the netCDF file
ds = NCDataset(mpi_comm,filename,"c")

# define the dimensions
defDim(ds,"lon",10)
defDim(ds,"lat",mpi_comm_size)
ncv = defVar(ds,"temp",Int32,("lon","lat"))

# enable collective access (:independent is the default)
NCDatasets.access(ncv.var,:collective)

ncv[:,i] .= mpi_rank

ncv.attrib["units"] = "degree Celsius"
ds.attrib["comment"] = "MPI test"
close(ds)
```
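
Reading in parallel follows the same pattern. The sketch below is an illustration rather than part of the package's test suite: it writes a small file with the default independent access, reopens it in read-only mode, marks the whole dataset for collective access and lets every process read back its own column. The file name `file_readback.nc` is a hypothetical choice for this sketch.

```julia
using MPI
using NCDatasets

# initialize MPI unless an earlier snippet in the same session already did
MPI.Initialized() || MPI.Init()

mpi_comm = MPI.COMM_WORLD
mpi_comm_size = MPI.Comm_size(mpi_comm)
mpi_rank = MPI.Comm_rank(mpi_comm)

# hypothetical file name, chosen for this sketch only
filename = "file_readback.nc"

# index based on MPI rank
i = mpi_rank + 1

# write phase: same layout as the example above, default (independent) access
ds = NCDataset(mpi_comm, filename, "c")
defDim(ds, "lon", 10)
defDim(ds, "lat", mpi_comm_size)
ncv = defVar(ds, "temp", Int32, ("lon", "lat"))
ncv[:, i] .= mpi_rank
close(ds)

# read phase: every process reopens the same file in read-only mode
ds = NCDataset(mpi_comm, filename, "r")

# mark all variables of the dataset for collective access
NCDatasets.access(ds, :collective)

# every process takes part in the read and fetches its own column
column = ds["temp"][:, i]
@assert all(column .== mpi_rank)

close(ds)
```

Such a script is started through an MPI launcher, for example `mpiexec -n 4 julia script.jl`, where `script.jl` is a placeholder name. With collective access, all processes of the communicator must take part in each read or write.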


```@docs
NCDataset(comm::MPI.Comm,filename::AbstractString,mode::AbstractString)
NCDatasets.access
```
28 changes: 21 additions & 7 deletions ext/NCDatasetsMPIExt.jl
@@ -49,15 +49,14 @@ function parallel_access_mode(par_access::Symbol)
end
end

# https://web.archive.org/web/20240414204638/https://docs.unidata.ucar.edu/netcdf-c/current/parallel_io.html
"""
NCDatasets.access(ncv::Variable,par_access::Symbol)
NCDatasets.access(ds::NCDataset,par_access::Symbol)
Change the parallel access mode of the variable `ncv` or all variables of the dataset `ds` for writing or reading data. `par_access` is either `:collective` or `:independent`. `NCDatasets.access` will raise an error if `MPI` is not loaded.
# Parallel file access is either collective (all processors must participate)
# or independent (any processor may access the data without waiting for others).
# All netCDF metadata writing operations are collective. That is, all creation
# of groups, types, variables, dimensions, or attributes. Data reads and writes
# (e.g. calls to nc_put_vara_int() and nc_get_vara_int()) may be independent
# (the default) or collective.
More information is available in the [NetCDF documentation](https://web.archive.org/web/20240414204638/https://docs.unidata.ucar.edu/netcdf-c/current/parallel_io.html).
"""
function access(ncv::Variable,par_access::Symbol)
varid = ncv.varid
ncid = dataset(ncv).ncid
@@ -69,6 +68,21 @@ function access(ds::NCDataset,par_access::Symbol)
nc_var_par_access(ds.ncid,NC_GLOBAL,parallel_access_mode(par_access))
end

"""
ds = NCDataset(comm::MPI.Comm,filename::AbstractString,
mode::AbstractString = "r";
info = MPI.INFO_NULL,
maskingvalue = missing,
attrib = [])
Open or create a netCDF file `filename` for parallel IO using the MPI
communicator `comm`. `info` is an
[MPI info object](https://juliaparallel.org/MPI.jl/stable/reference/advanced/#Info-objects)
containing IO hints or `MPI.INFO_NULL` (default). The `mode` is either `"r"` (default) to open an
existing netCDF file in read-only mode, `"c"` to create a new netCDF file (an
existing file with the same name will be overwritten) or `"a"` to append to an
existing file.
"""
function NCDataset(comm::MPI.Comm,filename::AbstractString,
mode::AbstractString = "r";
info = MPI.INFO_NULL,
2 changes: 1 addition & 1 deletion src/dataset.jl
@@ -152,7 +152,7 @@ end
memory::Union{Vector{UInt8},Nothing} = nothing,
attrib = [])
Load, create, or even overwrite a NetCDF file at `filename`, depending on `mode`
Load, create or overwrite a NetCDF file at `filename`, depending on `mode`
* `"r"` (default) : open an existing netCDF file or OPeNDAP URL
in read-only mode.
