Deserialization is extremely slow for messages with many small sub-messages. #179
Comments
From what I understand, this looks like the cost of the per-message Dict. In that case, maybe something more lightweight than a Dict will help here. The count and names of the fields are known during code generation, so one option could be to generate specialized code for each struct instead of using a Dict.
Yes, the bottleneck is due to the dict that is used to represent the protobuf struct. So there is no real quick fix for this, because it would require a fundamentally different way of generating the code. I created a MWE that also includes some results in the profile_results subdirectory. I'm not sure what else would be a good way to share more information on this, because the flame graph is not really helpful unless you interact with it (i.e. hover etc.). I think you can ignore the large chunk on the bottom right in this chart because it is just the overhead that comes from running things in the REPL. The actual workload is the part on the left.
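To make the cost concrete, here is a minimal sketch (not ProtoBuf.jl code; `PointStruct` and the helper functions are made up for illustration) comparing a plain struct with fields known at code-generation time against a `Dict{Symbol,Any}` per message, which is roughly the trade-off being discussed:

```julia
# Minimal illustration of the overhead discussed above. A plain struct with known fields
# stores its values inline, while a Dict{Symbol,Any} per message allocates its own
# key/value storage every time one is constructed.
using BenchmarkTools

struct PointStruct   # hypothetical "specialized codegen" representation
    x::Float64
    y::Float64
    z::Float64
end

make_struct() = PointStruct(1.0, 2.0, 3.0)
make_dict()   = Dict{Symbol,Any}(:x => 1.0, :y => 2.0, :z => 3.0)

@btime make_struct()  # no heap allocations
@btime make_dict()    # several heap allocations per message
```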
Hi guys, I'm wondering what the pattern in ProtoBuf.jl is for deserialising multiple sub-messages within one message? (Not sure if I should open a separate issue.) I see that this is not officially part of the protobuf spec https://developers.google.com/protocol-buffers/docs/techniques and couldn't find it in any of the READMEs.
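If the question is about reading a stream of messages from one file (the technique in the linked doc), one common approach is to length-prefix each message yourself, since the wire format has no built-in delimiter. Below is a rough sketch using the pre-1.0 ProtoBuf.jl API (`writeproto`/`readproto`); treat the framing scheme as one possible convention, not something this thread or the spec prescribes:

```julia
# Length-delimited framing: write a 4-byte length before each encoded message, then read
# it back the same way. writeproto/readproto are from the pre-1.0 ProtoBuf.jl API.
using ProtoBuf

function write_delimited(io::IO, msg)
    buf = IOBuffer()
    writeproto(buf, msg)
    bytes = take!(buf)
    write(io, UInt32(length(bytes)))  # simple fixed-width length prefix
    write(io, bytes)
end

function read_delimited(io::IO, msg)
    len = read(io, UInt32)
    readproto(IOBuffer(read(io, len)), msg)  # fills `msg` and returns it
end
```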
We have the same issue with OpenStreetMapX: pszufe/OpenStreetMapX.jl#44 (comment)
Is there any technical issue blocking the struct code from being generated with plain fields instead of a Dict?
Technically it should be possible. The point is just that one would have to rewrite the entire code-generation logic.
Indeed. It's best to try it out by manually doing it for one package. I tried it by manually changing the generated code of OpenStreetMapX, see here for the proof of concept: pszufe/OpenStreetMapX.jl#52
Hi all, we're seeing the same thing here: allocating a small Dict per message is prohibitively expensive.

We remembered that the last time we used ProtoBuf.jl, a few years ago (for JuliaPerf/PProf.jl@d6d525c), it generated a plain struct per message. We found that this change from struct to Dict was made as part of #141, which fixed a thread-safety issue.

@tanmaykm, can you elaborate on how we came to the current design with a Dict in each object? Why did we change away from generating a struct per message?

We'd be interested to help with implementing any future work here to improve the performance, and we'd like to help get rid of this allocation. 😊 Thanks so much! :)

CC: @dewilson
For this example proto, here's what it looks like before and after #141:

```proto
syntax = "proto3";

message plant {
    string name = 3;
}

message seed {
    int32 cold_hours = 1;
    int32 germination_hours = 2;
    plant species = 3;
}
```

Here's the generated code after #141:

```julia
mutable struct seed <: ProtoType
    __protobuf_jl_internal_meta::ProtoMeta
    __protobuf_jl_internal_values::Dict{Symbol,Any}

    function seed(; kwargs...)
        obj = new(meta(seed), Dict{Symbol,Any}())
        values = obj.__protobuf_jl_internal_values
        symdict = obj.__protobuf_jl_internal_meta.symdict
        for nv in kwargs
            fldname, fldval = nv
            fldtype = symdict[fldname].jtyp
            (fldname in keys(symdict)) || error(string(typeof(obj), " has no field with name ", fldname))
            values[fldname] = isa(fldval, fldtype) ? fldval : convert(fldtype, fldval)
        end
        obj
    end
end # mutable struct seed

const __meta_seed = Ref{ProtoMeta}()
function meta(::Type{seed})
    ProtoBuf.metalock() do
        if !isassigned(__meta_seed)
            __meta_seed[] = target = ProtoMeta(seed)
            allflds = Pair{Symbol,Union{Type,String}}[:cold_hours => Int32, :germination_hours => Int32, :species => plant]
            meta(target, seed, allflds, ProtoBuf.DEF_REQ, ProtoBuf.DEF_FNUM, ProtoBuf.DEF_VAL, ProtoBuf.DEF_PACK, ProtoBuf.DEF_WTYPES, ProtoBuf.DEF_ONEOFS, ProtoBuf.DEF_ONEOF_NAMES)
        end
        __meta_seed[]
    end
end

function Base.getproperty(obj::seed, name::Symbol)
    if name === :cold_hours
        return (obj.__protobuf_jl_internal_values[name])::Int32
    elseif name === :germination_hours
        return (obj.__protobuf_jl_internal_values[name])::Int32
    elseif name === :species
        return (obj.__protobuf_jl_internal_values[name])::plant
    else
        getfield(obj, name)
    end
end
```

And before:

```julia
mutable struct plant <: ProtoType
    name::AbstractString
    plant(; kwargs...) = (o=new(); fillunset(o); isempty(kwargs) || ProtoBuf._protobuild(o, kwargs); o)
end # mutable struct plant

const __fnum_plant = Int[3]
meta(t::Type{plant}) = meta(t, ProtoBuf.DEF_REQ, __fnum_plant, ProtoBuf.DEF_VAL, true, ProtoBuf.DEF_PACK, ProtoBuf.DEF_WTYPES, ProtoBuf.DEF_ONEOFS, ProtoBuf.DEF_ONEOF_NAMES, ProtoBuf.DEF_FIELD_TYPES)

mutable struct seed <: ProtoType
    cold_hours::Int32
    germination_hours::Int32
    species::plant
    seed(; kwargs...) = (o=new(); fillunset(o); isempty(kwargs) || ProtoBuf._protobuild(o, kwargs); o)
end # mutable struct seed
```
:) We're wondering the same thing! I think it probably is related to trying to support protobuf reflection in a good way, or may have been related to supporting mutually recursive message definitions. Whatever the reason, we'd be happy to help with some group engineering to get to a solution we all love! 😊
I was hoping that some sort of lightweight dict could be used here, but that has not happened yet, unfortunately. @NHDaly has rightly pointed out the reasons for switching to a dict-based representation.

It would be great to have folks contribute to this!
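One of the things the Dict buys is knowing which fields were actually set. As a point of reference, here is a minimal sketch of my own (not ProtoBuf.jl's design) that tracks field presence without a Dict, using `Union{Nothing,T}` fields and the plant/seed example from above; small Unions like this are optimized by the compiler, so no per-message Dict allocation is needed just to record presence:

```julia
# Field presence without a per-message Dict: an unset optional field is simply `nothing`.
struct Plant
    name::Union{Nothing,String}
end

struct Seed
    cold_hours::Union{Nothing,Int32}
    germination_hours::Union{Nothing,Int32}
    species::Union{Nothing,Plant}
end

has_species(s::Seed) = s.species !== nothing

s = Seed(Int32(10), nothing, Plant("fir"))
has_species(s)                   # true
s.germination_hours === nothing  # true: this field was never set
```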
Thanks for the explanation, @tanmaykm! :) Makes sense.

Okay, I think we should be able to achieve those goals without having dictionaries in the structs. The proto implementations in other languages have all found ways to work around these issues, so I think we should be able to do so as well.

For the recursive struct definitions: for example, given this proto definition:

```proto
message RecurseA {
    required string a = 1;
    optional RecurseB b = 2;
}

message RecurseB {
    required string b = 1;
    optional RecurseA a = 2;
}
```

You could generate Julia code like this:

```julia
struct _RecurseA{__RecurseB}
    a::AbstractString
    b::__RecurseB
end

struct RecurseB
    b::AbstractString
    a::_RecurseA{RecurseB}
end

const RecurseA = _RecurseA{RecurseB}
```

I think we can work through the remaining issues, and go back to doing code generation while still avoiding the thread-safety and recursion issues you had before! :) Thanks, Tanmay.
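A small extension of the sketch above (my addition, not part of the original comment): if the optional fields are also made nullable, the recursion terminates at construction time, so finite values can be built without any Dict:

```julia
struct _RecurseA{B}
    a::String
    b::Union{Nothing,B}   # `nothing` stands in for an unset optional field
end

struct RecurseB
    b::String
    a::Union{Nothing,_RecurseA{RecurseB}}
end

const RecurseA = _RecurseA{RecurseB}

# The nesting stops wherever an optional field is left unset:
RecurseA("outer", RecurseB("inner", nothing))
```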
For the field name issue, we now have this syntax:
Hey folks, I encourage you to take the new version (1.0) of this package for a spin; it should be a bit faster. Please see the docs for some guidance on how to migrate to this new version. Closing for now as this issue is a little all over the place, but please feel free to reopen (or better, start a new issue) if you feel the performance is still bad.
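For reference, decoding a message with the 1.0 interface looks roughly like the sketch below; the names (`ProtoDecoder`, `decode`) are from my memory of the 1.0 docs and the `Scenario` type is hypothetical, so defer to the migration guide mentioned above:

```julia
using ProtoBuf  # v1.x

# Decode one message from a file; `Scenario` stands in for a generated message type.
msg = open("scenario_0001.bin") do io
    decode(ProtoDecoder(io), Scenario)
end
```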
FWIW, I returned to this recently to re-assess the feasibility of using ProtoBuf for another project. The MWE from above now only takes about 4ms on my machine --- a solid 30x improvement. Thanks, @Drvi and team!
First of all, thank you for making ProtoBuf available in Julia! I am currently using this package to read scenarios from the waymo-open-dataset. While this has worked reliably, I am running into significant performance issues due to the "topology" of the messages.
Some relevant details:
Each of the scenarios consists of approximately 2MB of binary data. Each scenario contains map features, which in turn consist of some metadata and a number of MapPoints. Each map point only encodes three floats for the x, y, and z position, so individually they are quite small. However, a single scenario can contain >100,000 MapPoints (spread over the different features of the map).
As a result, loading a single scenario takes about 150ms on my machine. This may not sound all that bad, but in my current use case I need to iterate over about 400,000 of these scenarios, potentially multiple times.
After doing some profiling, I found that 60% of the time is spent pre-allocating memory for all the small dicts that are needed for each MapPoint object. Specifically, most of the time is spent in this line of Julia Base.

Just to provide some comparison: the same kind of data is loaded in about 11ms from a JSON file, and in about 3ms from the binary format created by Serialization.serialize. Of course, there are other issues with these kinds of formats. Therefore, I would really like to use ProtoBuf here, but I cannot get it up to speed.

Are there known ways to reduce the amount of overhead that arises in this setting with many small proto sub-messages? Or is this issue inherent to the current design of the library?
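For context, here is a sketch of how a comparison like the one described above could be timed, assuming the pre-1.0 `readproto` API that was current when this issue was opened; `Scenario` and the file names are placeholders, and the timings in the comments are the numbers reported above, not fresh measurements:

```julia
using ProtoBuf, Serialization, JSON3

# Hypothetical loaders for one scenario stored in three different formats.
load_proto(path) = open(io -> readproto(io, Scenario()), path)

@time load_proto("scenario_0001.bin")         # ~150 ms reported above
@time JSON3.read(read("scenario_0001.json"))  # ~11 ms reported above
@time deserialize("scenario_0001.jls")        # ~3 ms reported above
```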