Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chunked broadcast with objects of smaller dimensions is broken. #41

Closed
rafaqz opened this issue Jul 19, 2021 · 3 comments
Closed

Chunked broadcast with objects of smaller dimensions is broken. #41

rafaqz opened this issue Jul 19, 2021 · 3 comments

Comments

@rafaqz
Copy link
Collaborator

rafaqz commented Jul 19, 2021

Using the setup code from the tests (below) this chunked 2*2 broadcast with a vector seems broken:

julia> a_disk = _DiskArray([1 10; 2 20; 3 30; 4 40], chunksize=(2, 2))
Disk Array with size 4 x 2

julia> a_disk .* [1, 2, 3, 4] |> collect
4×2 Matrix{Int64}:
  1    9
  4   16
 10   90
 40  160

julia> [1 10; 2 20; 3 30; 4 40] .* [1, 2, 3, 4]
4×2 Matrix{Int64}:
  1   10
  4   40
  9   90
 16  160

We get the chunks as columns. The same problem occurs for a Tuple, I found this trying to add Tuple support in #39.

Setup code:

using DiskArrays
using DiskArrays: ReshapedDiskArray, PermutedDiskArray
struct _DiskArray{T,N,A<:AbstractArray{T,N}} <: AbstractDiskArray{T,N}
  getindex_count::Ref{Int}
  setindex_count::Ref{Int}
  parent::A
  chunksize::NTuple{N,Int}
end
_DiskArray(a;chunksize=size(a)) = _DiskArray(Ref(0),Ref(0),a,chunksize)
Base.size(a::_DiskArray) = size(a.parent)
DiskArrays.haschunks(::_DiskArray) = DiskArrays.Chunked()
DiskArrays.eachchunk(a::_DiskArray) = DiskArrays.GridChunks(a,a.chunksize)
getindex_count(a::_DiskArray) = a.getindex_count[]
setindex_count(a::_DiskArray) = a.setindex_count[]
trueparent(a::_DiskArray) = a.parent
getindex_count(a::ReshapedDiskArray) = getindex_count(a.parent)
setindex_count(a::ReshapedDiskArray) = setindex_count(a.parent)
trueparent(a::ReshapedDiskArray) = trueparent(a.parent)
getindex_count(a::PermutedDiskArray) = getindex_count(a.a.parent)
setindex_count(a::PermutedDiskArray) = setindex_count(a.a.parent)
trueparent(a::PermutedDiskArray{T,N,<:PermutedDimsArray{T,N,perm,iperm}}) where {T,N,perm,iperm} = permutedims(trueparent(a.a.parent),perm)
function DiskArrays.readblock!(a::_DiskArray,aout,i::AbstractUnitRange...)
  ndims(a) == length(i) || error("Number of indices is not correct")
  all(r->isa(r,AbstractUnitRange),i) || error("Not all indices are unit ranges")
  #println("reading from indices ", join(string.(i)," "))
  a.getindex_count[] += 1
  aout .= a.parent[i...]
end
function DiskArrays.writeblock!(a::_DiskArray,v,i::AbstractUnitRange...)
  ndims(a) == length(i) || error("Number of indices is not correct")
  all(r->isa(r,AbstractUnitRange),i) || error("Not all indices are unit ranges")
  #println("Writing to indices ", join(string.(i)," "))
  a.setindex_count[] += 1
  view(a.parent,i...) .= v
end
@rafaqz
Copy link
Collaborator Author

rafaqz commented Jul 19, 2021

This seems to be the result of how iterate works on disk arrays. This is the trace to readblock!. copyto! is calling iterate on the DiskArray and iterating over readblock! which givers us the results out of order - two column from the chunks as show in the output above:

  [2] readblock!(::DiskArrays.BroadcastDiskArray{Int64, 2, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(*), Tuple{_DiskArray{Int64, 2, Matrix{Int64}}, Vector{Int64}}}}, ::Matrix{Int64}, ::UnitRange{Int64}, ::Vararg{UnitRange{Int64}, N} where N)
    @ DiskArrays ~/.julia/dev/DiskArrays/src/ops.jl:22
  [3] getindex_disk(::DiskArrays.BroadcastDiskArray{Int64, 2, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(*), Tuple{_DiskArray{Int64, 2, Matrix{Int64}}, Vector{Int64}}}}, ::UnitRange{Int64}, ::Vararg{UnitRange{Int64}, N} where N)
    @ DiskArrays ~/.julia/dev/DiskArrays/src/DiskArrays.jl:60
  [4] getindex
    @ ~/.julia/dev/DiskArrays/src/DiskArrays.jl:195 [inlined]
  [5] iterate
    @ ~/.julia/dev/DiskArrays/src/iterator.jl:8 [inlined]
  [6] copyto_unaliased!(deststyle::IndexLinear, dest::Matrix{Int64}, srcstyle::IndexCartesian, src::DiskArrays.BroadcastDiskArray{Int64, 2, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(*), Tuple{_DiskArray{Int64, 2, Matrix{Int64}}, Vector{Int64}}}})
    @ Base ./abstractarray.jl:975
  [7] copyto!
    @ ./abstractarray.jl:950 [inlined]
  [8] _collect_indices
    @ ./array.jl:620 [inlined]
  [9] collect
    @ ./array.jl:604 [inlined]
 [10] |>(x::DiskArrays.BroadcastDiskArray{Int64, 2, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, typeof(*), Tuple{_DiskArray{Int64, 2, Matrix{Int64}}, Vector{Int64}}}}, f::typeof(collect))
    @ Base ./operators.jl:858
 [11] top-level scope
    @ REPL[69]:1

Again #37 would fix this, as we would iterate over the whole column as Julia expects. Otherwise we could add `copyto! methods, but that will break for custom array types. I think iterate just has to work linearly or we will have a lot of problems like this.

@rafaqz rafaqz changed the title Chunked broadcast with a vector seems broken Chunked broadcast with objects of smaller dimensions is broken. Aug 28, 2021
@rafaqz
Copy link
Collaborator Author

rafaqz commented Aug 28, 2021

This also affects all higher dimension arrays that have less dimensions than the chunked array. It's quite a serious bug, especially as it works without complaint and just gives the wrong answer.

julia>   a = ones(10,9,2);

julia>   a_disk = _DiskArray(a, chunksize=(5,3,2));

julia>   mat = reshape(1:90, 10, 9)
10×9 reshape(::UnitRange{Int64}, 10, 9) with eltype Int64:
  1  11  21  31  41  51  61  71  81
  2  12  22  32  42  52  62  72  82
  3  13  23  33  43  53  63  73  83
  4  14  24  34  44  54  64  74  84
  5  15  25  35  45  55  65  75  85
  6  16  26  36  46  56  66  76  86
  7  17  27  37  47  57  67  77  87
  8  18  28  38  48  58  68  78  88
  9  19  29  39  49  59  69  79  89
 10  20  30  40  50  60  70  80  90

julia>   a_disk .* mat |> collect
10×9×2 Array{Float64, 3}:
[:, :, 1] =
  1.0  21.0  11.0   6.0  26.0  16.0  31.0  51.0  41.0
  2.0  22.0  12.0   7.0  27.0  17.0  32.0  52.0  42.0
  3.0  23.0  13.0   8.0  28.0  18.0  33.0  53.0  43.0
  4.0  24.0  14.0   9.0  29.0  19.0  34.0  54.0  44.0
  5.0  25.0  15.0  10.0  30.0  20.0  35.0  55.0  45.0
 11.0   1.0  21.0  16.0   6.0  26.0  41.0  31.0  51.0
 12.0   2.0  22.0  17.0   7.0  27.0  42.0  32.0  52.0
 13.0   3.0  23.0  18.0   8.0  28.0  43.0  33.0  53.0
 14.0   4.0  24.0  19.0   9.0  29.0  44.0  34.0  54.0
 15.0   5.0  25.0  20.0  10.0  30.0  45.0  35.0  55.0

[:, :, 2] =
 36.0  56.0  46.0  61.0  81.0  71.0  66.0  86.0  76.0
 37.0  57.0  47.0  62.0  82.0  72.0  67.0  87.0  77.0
 38.0  58.0  48.0  63.0  83.0  73.0  68.0  88.0  78.0
 39.0  59.0  49.0  64.0  84.0  74.0  69.0  89.0  79.0
 40.0  60.0  50.0  65.0  85.0  75.0  70.0  90.0  80.0
 46.0  36.0  56.0  71.0  61.0  81.0  76.0  66.0  86.0
 47.0  37.0  57.0  72.0  62.0  82.0  77.0  67.0  87.0
 48.0  38.0  58.0  73.0  63.0  83.0  78.0  68.0  88.0
 49.0  39.0  59.0  74.0  64.0  84.0  79.0  69.0  89.0
 50.0  40.0  60.0  75.0  65.0  85.0  80.0  70.0  90.0

julia>   a .* mat
10×9×2 Array{Float64, 3}:
[:, :, 1] =
  1.0  11.0  21.0  31.0  41.0  51.0  61.0  71.0  81.0
  2.0  12.0  22.0  32.0  42.0  52.0  62.0  72.0  82.0
  3.0  13.0  23.0  33.0  43.0  53.0  63.0  73.0  83.0
  4.0  14.0  24.0  34.0  44.0  54.0  64.0  74.0  84.0
  5.0  15.0  25.0  35.0  45.0  55.0  65.0  75.0  85.0
  6.0  16.0  26.0  36.0  46.0  56.0  66.0  76.0  86.0
  7.0  17.0  27.0  37.0  47.0  57.0  67.0  77.0  87.0
  8.0  18.0  28.0  38.0  48.0  58.0  68.0  78.0  88.0
  9.0  19.0  29.0  39.0  49.0  59.0  69.0  79.0  89.0
 10.0  20.0  30.0  40.0  50.0  60.0  70.0  80.0  90.0

[:, :, 2] =
  1.0  11.0  21.0  31.0  41.0  51.0  61.0  71.0  81.0
  2.0  12.0  22.0  32.0  42.0  52.0  62.0  72.0  82.0
  3.0  13.0  23.0  33.0  43.0  53.0  63.0  73.0  83.0
  4.0  14.0  24.0  34.0  44.0  54.0  64.0  74.0  84.0
  5.0  15.0  25.0  35.0  45.0  55.0  65.0  75.0  85.0
  6.0  16.0  26.0  36.0  46.0  56.0  66.0  76.0  86.0
  7.0  17.0  27.0  37.0  47.0  57.0  67.0  77.0  87.0
  8.0  18.0  28.0  38.0  48.0  58.0  68.0  78.0  88.0
  9.0  19.0  29.0  39.0  49.0  59.0  69.0  79.0  89.0
 10.0  20.0  30.0  40.0  50.0  60.0  70.0  80.0  90.0

@rafaqz
Copy link
Collaborator Author

rafaqz commented Sep 14, 2023

Closing, this is fixed now

@rafaqz rafaqz closed this as completed Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant