Padding is not handled gracefully #31
Comments
What code is producing a bzip2-format data stream with trailing garbage? Is this part of some other format?
Indeed, it is part of another file format, where I know how large the compressed section is. I now have to jump through some hoops to find the end markers and truncate the section so that the decompressor can handle it: https://github.com/JuliaGPU/Metal.jl/blob/60a9e34ebc98714a705af1d28b47bff67f25dcb9/src/compiler/library.jl#L339-L385
Okay, I think you want to stop at the "end of a chunk". There is no simple function for doing this, but the following should work:
julia> using CodecBzip2
julia> function decode_first_bzip2_data_stream(compressed::Vector{UInt8}; max_size=typemax(Int))
           # stop_on_end=true makes the stream report EOF at the end of the first bzip2 data stream
           stream = Bzip2DecompressorStream(IOBuffer(compressed); stop_on_end=true)
           try
               u = read(stream, max_size)
               eof(stream) || error("max_size is too small")
               return u
           finally
               close(stream) # needed to prevent memory leaks
           end
       end
decode_first_bzip2_data_stream (generic function with 1 method)
julia> u = zeros(UInt8, 1000000);
julia> c = transcode(Bzip2Compressor, u);
julia> decode_first_bzip2_data_stream(c) == u
true
julia> decode_first_bzip2_data_stream([c; c;]) == u
true
julia> decode_first_bzip2_data_stream([c; zeros(UInt8,10);]) == u
true
julia> decode_first_bzip2_data_stream(c; max_size=20)
ERROR: max_size is too small
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] decode_first_bzip2_data_stream(compressed::Vector{UInt8}; max_size::Int64)
   @ Main ./REPL[20]:5
 [3] top-level scope
   @ REPL[26]:1
With #43 this can be simplified to:
julia> function decode_first_bzip2_data_stream(compressed::Vector{UInt8}; max_size=typemax(Int))
           stream = Bzip2DecompressorStream(IOBuffer(compressed); stop_on_end=true)
           u = read(stream, max_size)
           eof(stream) || error("max_size is too small")
           return u
       end
Could you reopen this issue if it doesn't work?
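For the use case described earlier in the thread (a compressed section of known size embedded in another file format), the same `stop_on_end=true` approach could be wrapped roughly as follows. This is only a sketch: the helper name `decode_embedded_bzip2`, the `offset` and `len` parameters, and the made-up container layout are assumptions for illustration, not part of CodecBzip2.jl or Metal.jl.

```julia
using CodecBzip2

# Hypothetical helper: `offset` is the 1-based index of the first compressed byte,
# `len` is the known size of the section, which may include trailing padding.
function decode_embedded_bzip2(data::Vector{UInt8}, offset::Integer, len::Integer)
    section = data[offset:offset+len-1]   # copy the section out of the container
    stream = Bzip2DecompressorStream(IOBuffer(section); stop_on_end=true)
    try
        return read(stream)               # reads up to the end of the first bzip2 data stream
    finally
        close(stream)                     # harmless even where it is no longer required
    end
end

# Made-up container: 2 header bytes, the compressed data, 10 padding bytes, 2 trailer bytes.
payload = collect(codeunits("hello bzip2"))
compressed = transcode(Bzip2Compressor, payload)
container = [UInt8[0xde, 0xad]; compressed; zeros(UInt8, 10); UInt8[0xbe, 0xef]]
decoded = decode_embedded_bzip2(container, 3, length(compressed) + 10)
```

The `try`/`finally` is kept here so the sketch does not depend on whether the change from #43 is present in the installed version.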
MWE:
The bunzip2 tool generates a warning, but continues to decompress:
CodecBzip2.jl fails:
... where -5 seems to be BZ_DATA_ERROR_MAGIC.
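For reference, -5 does correspond to BZ_DATA_ERROR_MAGIC among the return codes defined in libbzip2's bzlib.h. A small lookup like the one below can make raw codes easier to read; the names `BZ_RETURN_CODES` and `bz_code_name` are hypothetical helpers for illustration, not something CodecBzip2.jl is known to export.

```julia
# Return codes as defined in libbzip2's bzlib.h.
const BZ_RETURN_CODES = Dict{Int,String}(
     0 => "BZ_OK",
     1 => "BZ_RUN_OK",
     2 => "BZ_FLUSH_OK",
     3 => "BZ_FINISH_OK",
     4 => "BZ_STREAM_END",
    -1 => "BZ_SEQUENCE_ERROR",
    -2 => "BZ_PARAM_ERROR",
    -3 => "BZ_MEM_ERROR",
    -4 => "BZ_DATA_ERROR",
    -5 => "BZ_DATA_ERROR_MAGIC",   # input does not start with the bzip2 magic bytes
    -6 => "BZ_IO_ERROR",
    -7 => "BZ_UNEXPECTED_EOF",
    -8 => "BZ_OUTBUFF_FULL",
    -9 => "BZ_CONFIG_ERROR",
)

# Hypothetical helper: map a numeric code to its symbolic name.
bz_code_name(code::Integer) = get(BZ_RETURN_CODES, Int(code), "unknown ($code)")
```

This fits the padding case: once the first data stream ends, the decompressor tries to start another stream on the padding bytes, which do not begin with the magic bytes.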