Document gotchas when interoperation with python #804

Open
wants to merge 5 commits into
base: master
2 changes: 2 additions & 0 deletions docs/Project.toml
@@ -1,4 +1,6 @@
[deps]
Conda = "8f4d0f93-b110-5947-807f-2305c1781a2d"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
HDF5 = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
3 changes: 3 additions & 0 deletions docs/make.jl
@@ -11,6 +11,7 @@ not_low_level_api(::typeof(HDF5.h5p_get_class_name)) = false
not_low_level_api(::typeof(HDF5.h5t_get_member_name)) = false
not_low_level_api(::typeof(HDF5.h5t_get_tag)) = false


makedocs(;
sitename="HDF5.jl",
modules=[HDF5],
@@ -24,7 +25,9 @@ makedocs(;
pages=[
"Home" => "index.md",
"Low-level library bindings" => "api_bindings.md",
"Python interoperability" => "h5py.md"
],
strict=true,
)

deploydocs(;
96 changes: 96 additions & 0 deletions docs/src/h5py.md
@@ -0,0 +1,96 @@
# Python interoperability

When HDF5 files created with Python (for example via h5py) are loaded from Julia, the dimensions of arrays are reversed.
The reason is that NumPy arrays use C (row-major) memory layout by default, while Julia uses Fortran (column-major) layout.
Here is an example:

```@example h5py
using PyCall #hide
import Conda #hide
Conda.add("h5py") #hide
Conda.add("numpy") #hide
py""" #hide
import h5py
import numpy as np
path = "created_by_h5py.h5"
file = h5py.File(path, "w")
arr1d = np.array([1,2,3])
arr2d = np.array([[1,2,3], [4,5,6]])
arr3d = np.array([[[1,2,3], [4,5,6]]])
assert arr1d.shape == (3,)
assert arr2d.shape == (2,3)
assert arr3d.shape == (1,2,3)
file["1d"] = arr1d
file["2d"] = arr2d
file["3d"] = arr3d
file.close()
""" #hide
```

When we load the file from Julia, the dimensions are reversed:

```@example h5py
using HDF5
using Test
path = "created_by_h5py.h5"
h5open(path, "r") do file
    arr1d = read(file["1d"])
    arr2d = read(file["2d"])
    arr3d = read(file["3d"])
    @test size(arr1d) == (3,)
    @test size(arr2d) == (3,2)
    @test size(arr3d) == (3,2,1)
end
```
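
The reversal can be traced to how each layout maps an index to a position in the flat buffer. The following is a minimal sketch in plain Python (no NumPy or h5py). The helper names `c_offset` and `fortran_offset` are hypothetical and use 0-based indices, so they illustrate the idea rather than what either library does internally.

```python
# Flat buffer holding [[1, 2, 3], [4, 5, 6]] in C (row-major) order.
buffer = [1, 2, 3, 4, 5, 6]


def c_offset(index, shape):
    """Row-major (C) offset: the last index varies fastest."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def fortran_offset(index, shape):
    """Column-major (Fortran) offset: the first index varies fastest."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset


c_shape = (2, 3)         # shape as seen from Python
f_shape = c_shape[::-1]  # shape as seen from Julia: (3, 2)

# Element (i, j) under C layout is element (j, i) under Fortran layout,
# so the same bytes are valid for both readers once the shape is reversed.
for i in range(c_shape[0]):
    for j in range(c_shape[1]):
        assert buffer[c_offset((i, j), c_shape)] == buffer[fortran_offset((j, i), f_shape)]
```

Because the buffer itself is untouched, no data is rearranged when the file is read; only the interpretation of the shape differs.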

To fix this, we can simply reverse the dimensions again:

```@example h5py
function reversedims(arr)
    return permutedims(arr, reverse(1:ndims(arr)))
end

path = "created_by_h5py.h5"
h5open(path, "r") do file
    arr1d = reversedims(read(file["1d"]))
    arr2d = reversedims(read(file["2d"]))
    arr3d = reversedims(read(file["3d"]))
    @test arr1d == [1,2,3]
    @test arr2d == [1 2 3; 4 5 6]
    @test arr3d == reshape(arr2d, (1,2,3))
end
```

Similarly, `reversedims` can be applied before saving arrays that are intended to be read from Python.
If copying the data is undesirable, other options are:
* using Fortran memory layout on the Python side
* using C memory layout on the Julia side (e.g. replace `permutedims` with `PermutedDimsArray` above)
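
The no-copy idea behind the second option can be sketched in plain Python. `ReversedDimsView` below is a hypothetical class, not part of HDF5.jl, h5py, or NumPy. It assumes a flat column-major buffer and 0-based indices, and only illustrates how a view can reverse the index order while sharing memory with its parent.

```python
class ReversedDimsView:
    """Hypothetical no-copy view that reverses the index order of a parent array.

    The parent is assumed to be a flat buffer in column-major order with
    shape `parent_shape` (the shape a column-major reader would report).
    """

    def __init__(self, buffer, parent_shape):
        self._buffer = buffer                      # shared, never copied
        self._parent_shape = tuple(parent_shape)

    @property
    def shape(self):
        return self._parent_shape[::-1]

    def __getitem__(self, index):
        if isinstance(index, int):
            index = (index,)
        if len(index) != len(self._parent_shape):
            raise IndexError("wrong number of indices")
        # view[i, j, ...] corresponds to parent[..., j, i].
        parent_index = tuple(reversed(index))
        offset, stride = 0, 1
        for i, n in zip(parent_index, self._parent_shape):
            if not 0 <= i < n:
                raise IndexError("index out of range")
            offset += i * stride   # column-major: first index varies fastest
            stride *= n
        return self._buffer[offset]


data = [1, 2, 3, 4, 5, 6]              # parent with column-major shape (3, 2)
view = ReversedDimsView(data, (3, 2))
assert view.shape == (2, 3)
assert view[1, 2] == 6                 # same element as [[1, 2, 3], [4, 5, 6]][1][2]

data[5] = 60                           # the view sees changes to the parent
assert view[1, 2] == 60
```

The trade-off is the usual one for lazy views: no copy is made up front, but every element access pays for the index translation.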

```@example h5py
using HDF5
path = "created_by_h5py.h5"
# Reading the datasets as-is: the dimensions come back reversed.
h5open(path, "r") do file
    arr1d = read(file["1d"])
    arr2d = read(file["2d"])
    arr3d = read(file["3d"])
    @test size(arr1d) == (3,)
    @test size(arr2d) == (3,2)
    @test size(arr3d) == (3,2,1)
end

# Variant of `reversedims` that builds the permutation as a tuple instead of a range.
function reversedims(arr)
    dims = ntuple(identity, Val(ndims(arr)))
    return permutedims(arr, reverse(dims))
end

path = "created_by_h5py.h5"
h5open(path, "r") do file
    arr1d = reversedims(read(file["1d"]))
    arr2d = reversedims(read(file["2d"]))
    arr3d = reversedims(read(file["3d"]))
    @test arr1d == [1,2,3]
    @test arr2d == [1 2 3; 4 5 6]
    @test arr3d == reshape(arr2d, (1,2,3))
end
```