Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

options for avoiding name collisions #220

Open
ExpandingMan opened this issue Oct 19, 2022 · 4 comments
Open

options for avoiding name collisions #220

ExpandingMan opened this issue Oct 19, 2022 · 4 comments

Comments

@ExpandingMan
Copy link
Contributor

Most name collisions aren't really a problem here thanks to modules, but conflicts with Base can still be a problem (for example names like Vector and Array).

One simple solution to this would be to provide the option to prepend all generated types names with a user-provided prefix. For example with protojl(files, dir, outdir, type_prefix="Pr") a message called Vector in Julia code would be called PrVector.

@Drvi
Copy link
Member

Drvi commented Oct 20, 2022

Yeah, the list of reserved names used to be very long, I even considered including all exported names from Base. At some point we trimmed it back, don't really remember why, probably for cosmetic reasons:) So, good you bring this up.

As for custom type prefixes, there is AFAIK only one problem -- the Any (#194 (comment)) protobuf type contains two fields: the qualified name of the type (a "type_url") and then the binary payload to be decoded into that type. When we decode an Any field, since we only get the type name, we need to parse it and look it up among the ProtoBuf modules that are available in the session. For this reason, I prefer type names that are known statically -- when you get a message with an Any field, you don't need to provide more runtime info to the decode method to make it work.

My preferred option would be to expand the list of reserved names until we catch all of those that cause trouble (it shouldn't be a breaking change, since we're fixing something in that case).

Thoughts?

@ExpandingMan
Copy link
Contributor Author

Just want to make sure I'm understanding this correctly: the serialization algorithm needs to know the type names (in addition to the field names)?

The current behavior of adding # seems a bit awkward to deal with, unless I'm missing something. In my opinion the ideal solution would be to decouple the Julia type names from the protobuf names which I think could be done by adding a method, though I can see how this might be an onerous solution.

I might fool around with this solution later this evening when I go to update the EnumX import, just to see how complicated it would be, let me know whether you agree that it is desirable. A way of "cheating" which might be too hackish would be to leave the names as is but allow an option to create aliases for them via module-level const statements.

Otherwise, yes of course we should add Base type names to the reserved list, as those really do cause bugs since the generated code itself tries to use some of these names (e.g. Vector). I can take care of this later as well.

@Drvi
Copy link
Member

Drvi commented Oct 20, 2022

the serialization algorithm needs to know the type names (in addition to the field names)?

For the proto Any type, yes, we need to know the name of the type you are serializing (and if you are sending Any messages between different languages or runtimes, the names must be consistent for this to work).

The current behavior of adding # seems a bit awkward to deal with

Yeah... The nice thing about prefixing with # is that the resulting name cannot collide with any other valid ProtoBuf definition (and maybe a smaller chance it would collide with another Julia name).

In my opinion the ideal solution would be to decouple the Julia type names from the protobuf names

I mean, if it was completely unconditional (i.e. we'd prefix every definition with statically known prefix), then we could just always strip it before sending the Any type and always prepend the parsed name when decoding... (we'll have to do similar thing with the # prefix). But I don't think I understand why we'd want the Julia vs ProtoBuf names to be different (apart from avoiding collisions with Base)?

Anyway, I think I'd prefer us expanding the list of reserved keywords... cc: @quinnj

@ExpandingMan
Copy link
Contributor Author

ExpandingMan commented Oct 20, 2022

I think what you just said about why to use # pretty much says it all about why the names should be decoupled.

No worries, I'll add some keywords later and we'll call it a day.

Also once collisions are fixed, there's nothing stopping users from creating their own aliases like I suggested, so I guess there's not really anything that has to be done in the package itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants