i see in src/compressors.jl that support for zlib exists. have you considered refactoring zarr.jl to access this compressor via the transcodingstream API instead? then you could get lz4, zstd, etc. too for free.
i ask, because blosc1.jl is not thread safe (in my hands anyway), and i'd like to read from multiple zarr chunks simultaneously. codecZstd.jl is thread safe i believe, and i'd like to switch my zarr data set to use it. i'm not sure about codecZlib.jl.
note that currently transcodingstreams.jl does not support blosc, however there is an issue suggesting it do so, with code in a link: JuliaIO/Blosc.jl#79. so zarr.jl wouldn't have to drop support for blosc1 were that issue followed through.
i actually attempted to add zstd myself to zarr.jl this morning through a package extension directly using codecZstd. first started refactoring the existing blosc and zlib code to use extensions, but then realized that structs defined in extension modules cannot be exported. so the user would not have access to e.g. BloscCompressor. but i think a refactoring to use transcodingstreams.jl wouldn't even need an extension. i'm filing this issue because i want to get a zarr.jl dev's opinion before proceeding further.
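a minimal sketch of what going through the TranscodingStreams.jl interface could look like, assuming CodecZlib.jl and CodecZstd.jl are installed. the `CODECS` table and the `compress_chunk` / `decompress_chunk` helpers are hypothetical names for illustration, not anything that exists in zarr.jl. the threaded loop assumes that giving each call its own codec object is enough for thread safety, which i have not verified.

```julia
using CodecZlib   # ZlibCompressor / ZlibDecompressor
using CodecZstd   # ZstdCompressor / ZstdDecompressor

# hypothetical table: compressor id => (compressor type, decompressor type)
const CODECS = Dict(
    "zlib" => (ZlibCompressor, ZlibDecompressor),
    "zstd" => (ZstdCompressor, ZstdDecompressor),
)

# hypothetical helpers; `transcode` is Base.transcode, extended by TranscodingStreams.jl
compress_chunk(id::AbstractString, raw::Vector{UInt8}) = transcode(CODECS[id][1], raw)
decompress_chunk(id::AbstractString, buf::Vector{UInt8}) = transcode(CODECS[id][2], buf)

# round trip through both codecs with the same calling code
raw = rand(UInt8, 1024)
roundtrips = [decompress_chunk(id, compress_chunk(id, raw)) == raw for id in ("zlib", "zstd")]

# decompress several chunks in parallel; each transcode(Type, data) call
# creates, uses and finalizes its own codec, so no codec state is shared here
chunks = [compress_chunk("zstd", fill(UInt8(i), 512)) for i in 1:8]
out = Vector{Vector{UInt8}}(undef, length(chunks))
Threads.@threads for i in eachindex(chunks)
    out[i] = decompress_chunk("zstd", chunks[i])
end
```

adding another codec would then only mean adding a row to the table, which is the "for free" part.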
> i see in src/compressors.jl that support for zlib exists. have you considered refactoring zarr.jl to access this compressor via the transcodingstream API instead? then you could get lz4, zstd, etc. too for free.
I considered this at the very beginning, but as you mention, the main roadblock was the missing blosc support, and blosc was used by almost all of the zarr datasets I was working with. A refactoring towards the TranscodingStreams API would be great and could lead to speedups, as we could already start decompressing while the data is being downloaded. This should be easier now that we rely on Channels to pass data down the chain from storage -> decompression -> dump into output.
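A rough sketch of such a chain, assuming CodecZstd.jl is installed. The in-memory `stored` vector is a hypothetical stand-in for a storage backend, and none of the names below are taken from Zarr.jl. Note that this only overlaps work at the chunk level; decompressing a single chunk while it is still arriving would need the streaming interface instead of `transcode`.

```julia
using CodecZstd

# hypothetical storage: four zstd-compressed chunks held in memory
stored = [transcode(ZstdCompressor, fill(UInt8(i), 256)) for i in 1:4]

# stage 1 (storage): hand over (index, compressed bytes) pairs as they become available
fetched = Channel{Tuple{Int,Vector{UInt8}}}(2) do ch
    for (i, buf) in enumerate(stored)
        put!(ch, (i, buf))
    end
end

# stage 2 (decompression): decode each chunk as soon as it arrives
decoded = Channel{Tuple{Int,Vector{UInt8}}}(2) do ch
    for (i, buf) in fetched
        put!(ch, (i, transcode(ZstdDecompressor, buf)))
    end
end

# stage 3 (output): copy each decoded chunk into its column of the output array
out = zeros(UInt8, 256, 4)
for (i, data) in decoded
    out[:, i] = data
end
```

Each channel closes when its producing task finishes, which is what ends the `for` loops downstream.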
Regarding the blosc1 context API: I have experimented with this myself but never got it working in multiple threads. For some reason I was still running into frequent segmentation faults despite making sure to use only the ctx versions of the C API. Probably I was doing something awfully wrong, and it would be interesting to look at the implementation in SmallZarrGroups.jl.