Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserialization is extremely slow for messages with many small sub-messages. #179

Closed
lassepe opened this issue May 2, 2021 · 14 comments
Closed

Comments

@lassepe
Copy link

lassepe commented May 2, 2021

First of all, thank you for making ProtoBuf available in Julia! I am currently using this package to read scenarios form the waymo-open-dataset. While this has worked reliably, I am running into quite some performance issues due the "topology" of the messages.

Some relevant details:

Each of the scenarios consists of approximately 2MB of binary data. However, each scenario contains map features which in turn consist of some meta-data and a number of MapPoints. Each map point only encodes three floats for the x, y, and z position. Thus they are quite small. However, a single scenario can contain > 100.000 MapPoints (spread over the different features of the map).

As a result, loading a single scenario takes about 150ms on my machine. This may not sound all too bad but in my current usecase I need to iterate over about 400.000 of these scenarios; potentially multiple times.

After doing some profiling, I found that 60% of the time is spent pre-allocating memory for all the small dicts that are needed for each MapPoint object. Specifically, most of the time is spent in this line of julia base.

Just to provide some comparison: The same kind of data is loaded in about 11ms from a json file, and in about 3ms from the binary format created by Serialization.serialize. Of course, there are other issues with these kinds of formats. Therefore, I would really like to use ProtoBuf here. But I cannot get it up to speed.

Are there known ways to reduce the amount of overhead that arises in this setting with many small proto sub-messages? Or is this issue inherent to the current design of the library?

@tanmaykm
Copy link
Member

From what I understand, this looks like the Dict used to store members of a protobuf struct? E.g. this? Would you be able to check/put up a more detailed profiler stack to confirm this?

In that case, maybe something more lightweight than a Dict will help here. The count and names of fields is known during code generation, so one way could be to generate specialized code for each struct instead of using Dict.

@lassepe
Copy link
Author

lassepe commented May 27, 2021

Yes, the bottleneck is due to the dict that is used to represent the protobuf stuct. So there is not really a quick-fix for this because it would require a fundamentally different way of generating the code.

I created a MWE that also includes some results in the profile_results subdirectory. I'm not sure what else would be a good way to share more information on this because the flame graph is not really helpful unless you interact with it (i.e. hover etc.). I think you can ignore the large chunk on the bottom right in this chart because it is just the overhead that comes from running things in the REPL. The actual workload is the part on the left.

image

@hs-ye
Copy link

hs-ye commented Aug 8, 2021

Hi guys, wondering what the patterns in protobuf.jl is to deserialise multiple sub messages in the one message? (not sure if this i should open a separate issue)

I'm seeing that this is not officially part of the protobuf spec https://developers.google.com/protocol-buffers/docs/techniques and couldn't see it in any of the Readme's.

@blegat
Copy link
Contributor

blegat commented Aug 11, 2021

We have the same issue with OpenStreetMapX: pszufe/OpenStreetMapX.jl#44 (comment)

The count and names of fields is known during code generation, so one way could be to generate specialized code for each struct instead of using Dict.

Is there any technical issue blocking the code of the struct to be generated without using Dict but fields instead?

@lassepe
Copy link
Author

lassepe commented Aug 12, 2021

Technically it should be possible. The point is just that one would have to re-write the entire code-generation logic.

@blegat
Copy link
Contributor

blegat commented Aug 27, 2021

The point is just that one would have to re-write the entire code-generation logic.

Indeed. It's best to try it out by manually doing it for one package. I tried it by manually changing the generated code of OpenStreetMapX, see here for the proof of concept: pszufe/OpenStreetMapX.jl#52
It was quite simple since the PBF specification of OpenStreetMap does not use oneof but it should be doable for oneof as well.

@NHDaly
Copy link

NHDaly commented Jan 10, 2022

Hi all, we're seeing the same thing here: allocating a small Dict per message is prohibitively expensive.

We remembered that the last time we used ProtoBuf.jl, a few years ago (for JuliaPerf/PProf.jl@d6d525c), it generated a struct per proto message, instead of an object containing a Dict.

We found that this change from struct to Dict was made as part of #141, which fixed a thread-safety issue regarding the use of meta. It seems that that PR also changed to a Dict representation, but the PR description doesn't contain any explanation about why we made that change.

@tanmaykm can you elaborate on how we came to the current design with a Dict in each object? Why did we change away from generating a struct per message? We'd be interested to help with implementing any future work here to improve the performance, and we'd like to help get rid of this allocation. 😊

Thanks so much! :)

CC: @dewilson

@dewilson
Copy link
Contributor

dewilson commented Jan 10, 2022

For this example proto, here's what it looks like before and after #141
Proto:

syntax = "proto3";

message plant {
    string name = 3;
}

message seed {
    int32 cold_hours = 1;
    int32 germination_hours = 2;
    plant species = 3;
}

Here's the seed proto after #141:

mutable struct seed <: ProtoType
    __protobuf_jl_internal_meta::ProtoMeta
    __protobuf_jl_internal_values::Dict{Symbol,Any}

    function seed(; kwargs...)
        obj = new(meta(seed), Dict{Symbol,Any}())
        values = obj.__protobuf_jl_internal_values
        symdict = obj.__protobuf_jl_internal_meta.symdict
        for nv in kwargs
            fldname, fldval = nv
            fldtype = symdict[fldname].jtyp
            (fldname in keys(symdict)) || error(string(typeof(obj), " has no field with name ", fldname))
            values[fldname] = isa(fldval, fldtype) ? fldval : convert(fldtype, fldval)
        end
        obj
    end
end # mutable struct seed
const __meta_seed = Ref{ProtoMeta}()
function meta(::Type{seed})
    ProtoBuf.metalock() do
        if !isassigned(__meta_seed)
            __meta_seed[] = target = ProtoMeta(seed)
            allflds = Pair{Symbol,Union{Type,String}}[:cold_hours => Int32, :germination_hours => Int32, :species => plant]
            meta(target, seed, allflds, ProtoBuf.DEF_REQ, ProtoBuf.DEF_FNUM, ProtoBuf.DEF_VAL, ProtoBuf.DEF_PACK, ProtoBuf.DEF_WTYPES, ProtoBuf.DEF_ONEOFS, ProtoBuf.DEF_ONEOF_NAMES)
        end
        __meta_seed[]
    end
end
function Base.getproperty(obj::seed, name::Symbol)
    if name === :cold_hours
        return (obj.__protobuf_jl_internal_values[name])::Int32
    elseif name === :germination_hours
        return (obj.__protobuf_jl_internal_values[name])::Int32
    elseif name === :species
        return (obj.__protobuf_jl_internal_values[name])::plant
    else
        getfield(obj, name)
    end
end

And before:

mutable struct plant <: ProtoType
    name::AbstractString
    plant(; kwargs...) = (o=new(); fillunset(o); isempty(kwargs) || ProtoBuf._protobuild(o, kwargs); o)
end #mutable struct plant
const __fnum_plant = Int[3]
meta(t::Type{plant}) = meta(t, ProtoBuf.DEF_REQ, __fnum_plant, ProtoBuf.DEF_VAL, true, ProtoBuf.DEF_PACK, ProtoBuf.DEF_WTYPES, ProtoBuf.DEF_ONEOFS, ProtoBuf.DEF_ONEOF_NAMES, ProtoBuf.DEF_FIELD_TYPES)

mutable struct seed <: ProtoType
    cold_hours::Int32
    germination_hours::Int32
    species::plant
    seed(; kwargs...) = (o=new(); fillunset(o); isempty(kwargs) || ProtoBuf._protobuild(o, kwargs); o)
end #mutable struct seed

@NHDaly
Copy link

NHDaly commented Jan 10, 2022

:) We're wondering the same thing! I think it probably is related to trying to support protobuf reflection in a good way, or may have been related to supporting mutually recursive message definitions. Whatever the reason, we'd be happy to help with some group engineering to get to a solution we all love! 😊

@tanmaykm
Copy link
Member

I was hoping that some sort of lightweight dict could be used here, but that has not happened yet unfortunately. @NHDaly has rightly pointed out reasons for switching to a dict based representation.

  • avoided the earlier global dict used to store what attributes are set in each instance of a struct, this is needed to be able to serve defaults and set/unset members
  • no need to mangle struct member names that are not Julia compatible (e.g. are keywords and such)
  • recursive struct definitions easier to handle

It will sure be great to have folks contribute to this!

@NHDaly
Copy link

NHDaly commented Jan 11, 2022

Thanks for the explanation, @tanmaykm! :) Makes sense.

Okay, I think we should be able to achieve those goals without having dictionaries in the structs. The proto implementations in other languages have all found ways to work around these, so I think we should be able to do so as well.

For the recursive struct definitions one:
I think the standard way to do mutually recursive struct definitions is to introduce a type parameter for one of them.

So for example, given this proto definition:

message RecurseA {
    required string a = 1;
    optional RecurseB b = 2;
}

message RecurseB {
    required string b = 1;
    optional RecurseA a = 2;
}

You could generate julia code like this:

struct _RecurseA{__RecurseB}
    a::AbstractString
    b::__RecurseB
end
struct RecurseB
    b::AbstractString
    a::_RecurseA{RecurseB}
end
const RecurseA = _RecurseA{RecurseB}

I think we can work through the remaining issues, and go back to doing code generation while still avoiding the thread safety and recursion issues you had before! :)

Thanks Tanmay.

@JeffBezanson
Copy link
Contributor

For the field name issue, we now have this syntax:

struct foo
    var"invalid field name!!"::Int32
end

@Drvi Drvi mentioned this issue Aug 16, 2022
@Drvi
Copy link
Member

Drvi commented Aug 18, 2022

Hey folks, I encourage you to take the new version (1.0) of this package for a spin, should be a bit faster. Please see the docs for some guidance on how to migrate to this new version. Closing for now as this issue is a little all over to place, but please, feel free to reopen (or better, start a new issue) if you feel the performance is still bad.

@Drvi Drvi closed this as completed Aug 18, 2022
@lassepe
Copy link
Author

lassepe commented Jan 22, 2024

FWIW, I returned to this recently to re-assess the feasibility of using ProtoBuf for another project. The MWE from above now only takes about 4ms on my machine --- a solid 30x improvement.

Thanks, @Drvi and team!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants