Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Naming collision with nested enums #212

Open
qria opened this issue Feb 16, 2021 · 9 comments · May be fixed by #597
Open

Naming collision with nested enums #212

qria opened this issue Feb 16, 2021 · 9 comments · May be fixed by #597
Assignees
Labels
enhancement New feature or request help wanted Extra attention is needed high priority
Milestone

Comments

@qria
Copy link
Contributor

qria commented Feb 16, 2021

Problem

Currently these two enums in the same file will be both compiled into the same name of ContentStatus

message Content {enum Status{ ... }}
enum ContentStatus{ ... }

Currently the behaviour is to silently overwrite one of the enums, therefore making all subsequent call of the enum possibly wrong.

Workaround

I have consulted @nat-n and they say they don't see a way to work around the issue currently, except to directly modify the generated code.

@nat-n
Copy link
Collaborator

nat-n commented Feb 22, 2021

The ideal solution for this would be a (breaking) change for nested enums to instead compile to nested classes in python instead of relying on naming concatenation.

@qria
Copy link
Contributor Author

qria commented Feb 23, 2021

What is the policy on breaking changes of this project?

@cetanu
Copy link
Collaborator

cetanu commented Oct 14, 2023

I am looking into this, I note it's still a problem and was able to generate an example of it.

@cetanu cetanu self-assigned this Oct 14, 2023
@cetanu
Copy link
Collaborator

cetanu commented Oct 15, 2023

I might need a tl;dr on how the enum ends up getting named and which python module in the generated tree it ends up in.

If I look at how this case is handled in other libraries/languages, I see that they put the nested enum into a module with the name of its parent

So in the tree there would be:

  • __init__.py with the message Content and ContentStatus
  • content.py with the enum Status

I am currently reading through betterproto/plugin/models.py and betterproto/plugin/parser.py to try and understand how the decision is made to put the enum Status in the same file as the message Content. Cannot say I have reached comprehension as yet.

@cetanu
Copy link
Collaborator

cetanu commented Oct 15, 2023

I've reached some understanding after looking at the OutputTemplate that is created for this nested enum example.
The avenue of creating another file, while not impossible (and IMO the preferable way to do this), will probably take too much work. The suggestion by @nat-n to make a nested class is probably the way to go, and is how I am going to approach this. Will make an attempt soon.

@cetanu
Copy link
Collaborator

cetanu commented Oct 16, 2023

This is much harder than I thought...

I've managed to turn this protobuf

syntax = "proto3";

package nested_enum;

message Test {
    enum Inner {
        NONE = 0;
        THIS = 1;
    }
    Inner status = 1;

    message Doubly {
        enum Inner {
            NONE = 0;
            THIS = 1;
        }
        Inner status = 1;
    }
}


message TestInner {
  int32 foo = 1;
}

message TestDoublyInner {
  int32 foo = 1;
  string bar = 2;
}

enum Outer {
    foo = 0;
    bar = 1;
}

message Content {
    message Status {
        string code = 1;
    }
    Status status = 1;
}

message ContentStatus {
    int32 id = 1;
}

into this code

@dataclass(eq=False, repr=False)
class Test(betterproto.Message):
    @dataclass(eq=False, repr=False)
    class TestDoubly(betterproto.Message):
        # It would be nice if I could strip the prefix from these class names
        class TestDoublyInner(betterproto.Enum):
            NONE = 0
            THIS = 1

        # this type hint is wrong; it is referencing the global TestDoublyInner
        status: "TestDoublyInner" = betterproto.enum_field(1)

    class TestInner(betterproto.Enum):
        NONE = 0
        THIS = 1

    status: "TestInner" = betterproto.enum_field(1)


@dataclass(eq=False, repr=False)
class TestInner(betterproto.Message):
    foo: int = betterproto.int32_field(1)


@dataclass(eq=False, repr=False)
class TestDoublyInner(betterproto.Message):
    foo: int = betterproto.int32_field(1)
    bar: str = betterproto.string_field(2)


@dataclass(eq=False, repr=False)
class Content(betterproto.Message):
    @dataclass(eq=False, repr=False)
    class ContentStatus(betterproto.Message):
        code: str = betterproto.string_field(1)

    status: "ContentStatus" = betterproto.message_field(1)


@dataclass(eq=False, repr=False)
class ContentStatus(betterproto.Message):
    id: int = betterproto.int32_field(1)


class Outer(betterproto.Enum):
    foo = 0
    bar = 1

See the inline comments for two of the problems I'm having so far.

@cetanu
Copy link
Collaborator

cetanu commented Oct 16, 2023

I've parked my progress in a branch. If someone else could give it a glance that would be good, maybe I'm off-track or approaching this the wrong way.

b9ec59f

@nat-n
Copy link
Collaborator

nat-n commented Oct 16, 2023

Hey @cetanu, nice work, you're close.

In my view, the ideal approach is for the Python module/class structure to mirror that of the proto config. Unless I'm missing something following this principle should have the most elegant and least surprising or problematic result.

So just to be clear I think what we want is for the Test class to look like so:

@dataclass(eq=False, repr=False)
class Test(betterproto.Message):
    @dataclass(eq=False, repr=False)
    class Doubly(betterproto.Message):
        class Inner(betterproto.Enum):
            NONE = 0
            THIS = 1

        status: "Test.Doubly.Inner" = betterproto.enum_field(1)  # Note the use of fully qualified class name here

    class Inner(betterproto.Enum):
        NONE = 0
        THIS = 1

    status: "Test.Inner" = betterproto.enum_field(1)

I guess you agree?

So the problem is how to avoid using concatenated class names? Do you know where the concatenation is taking place? Could it be refactored to represent class identification with an object that retains the structure so it can be formatted in a context appropriate way? (e.g. a tuple subclass, sometimes so you can use take self[0], sometimes you take ".".join(self) ).

I'm afraid making progress in this codebase requires a willingness to refactor :P

I don't have time/motivation to really dig through the code right now, but I'm happy to offer this kind of high level collaboration.

@cetanu
Copy link
Collaborator

cetanu commented Oct 16, 2023

I agree completely, I just haven't quite worked out:

  1. where the names are concatenated
  2. where the type hints are also formed
    However what you've put as an example is the ultimate goal.

@Gobot1234 Gobot1234 linked a pull request Aug 14, 2024 that will close this issue
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed high priority
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants