Object class and schema definition combined, ORM-style #2000

Open
tgross35 opened this issue Jun 9, 2022 · 5 comments

tgross35 commented Jun 9, 2022

Hello all,

I would like to ask if there has been consideration about adding a model that allows for data storage and serialization/deserialization in one. This is a pretty common use case, avoiding redundancy between defining schemas and the data they produce. This could be done via an ORM-style model like SQLAlchemy has.

Something like this was discussed before (#1043), but since the recipe is simple and useful, I can see value in adding it to marshmallow.

There are some libraries that accomplish similar goals, but they have some drawbacks:

  • marshmallow-dataclass is available but requires some boilerplate, plus some workarounds (e.g. must use dataclass fields rather than marshmallow fields)
  • marshmallow-objects exists but is currently archived

A possible API could look like the following:

from datetime import datetime
from pprint import pprint
from typing import Any

from marshmallow import MarshModel, INCLUDE, fields, validates, ValidationError

class UserModel(MarshModel):
    __meta_args__ = {'unknown': INCLUDE}
    
    name: str = fields.Str()
    email: str = fields.Email()
    created_at: datetime = fields.DateTime()

    @validates('email')
    def validate_email(self, value: Any) -> None:
        if '@' not in value:
            raise ValidationError('Not an email address!')

class BlogModel(MarshModel):
    title: str = fields.String()
    author: dict = fields.Nested(UserModel)

user_data = {"name": "Ronnie", "email": "[email protected]"}
user = UserModel.load(user_data)

blog = BlogModel(title="Something Completely Different", author=user)
result = blog.dump()
pprint(result)
# {'title': 'Something Completely Different',
#  'author': {'name': 'Ronnie',
#             'email': '[email protected]',
#             'created_at': '2021-08-17T14:58:57.600623+00:00'}}

MarshModel would require:

  • Something that creates a schema from the defined class (perhaps via an __init_subclass__ hook that stores it as cls.__schema or similar)
  • An implicit __init__ function, similar to what dataclasses generate
  • An implicit load() classmethod that returns an instance
  • Typing support, similar to how SQLAlchemy handles it: https://docs.sqlalchemy.org/en/14/orm/extensions/mypy.html#installation

If there is interest, I may be able to submit a PR.

tgross35 changed the title from "Object class and schema definition in one, ORM-style" to "Object class and schema definition combined, ORM-style" on Jun 9, 2022
@tgross35
Author

I went ahead and figured out some of the needed logic. The following works:

class UserModel(MarshModel):
    __meta_args__ = {"unknown": INCLUDE}

    name: str = fields.Str()
    email: str = fields.Email()
    created_at: datetime = fields.DateTime()


class BlogModel(MarshModel):
    title: str = fields.String()
    author: UserModel = MMNested(UserModel)


user_data = {"name": "Ronnie", "email": "[email protected]"}
user = UserModel().load(user_data)

blog = BlogModel(title="Something Completely Different", author=user)
result = blog.dump()
pprint(result)

# {'author': {'created_at': None, 'email': '[email protected]', 'name': 'Ronnie'},
#  'title': 'Something Completely Different'}

The main logic that makes this work is:

class MarshModel:
    _ma_schema: Schema
    _field_names: list[str]
    __meta_args__: dict[str, typing.Any] = {}

    def __init_subclass__(cls, **kw) -> None:
        super().__init_subclass__(**kw)

        cls._field_names = [
            f for f in dir(cls) if isinstance(getattr(cls, f), fields.Field)
        ]
        cls._ma_schema = _get_meta_class(cls.__meta_args__).from_dict(
            {name: getattr(cls, name) for name in cls._field_names}
        )()

        cls.__init__ = _create_init_fn(cls._ma_schema.fields)

        _register_model(cls)

    def load(self, *args, **kw):
        loaded = self._ma_schema.load(*args, **kw)
        for k, v in loaded.items():
            setattr(self, k, v)
        return self

    def dump(self):
        return self._ma_schema.dump(self)

The full file is here: https://github.com/tgross35/marshmallow-mapper-test/blob/f346dca5ab47dbcf0971eb97d8f185b3c294cf72/mapper.py

The typing is wrong, the signatures aren't working right, and hooks don't work, but the basic implementation doesn't seem too bad.

@tgross35
Author

Let me generalize this a bit: the main goal is for the return type of .load() to be something that can be statically type checked. That helps with IDEs (autocomplete is awesome) and also allows better validation of how the code is used (e.g. catching errors with mypy). The class implementation here is one way to go about this, but I think similar results could be achieved by dynamically creating a TypedDict for the return type, which has the same benefits.

@kyleposluns

This is the most desired marshmallow feature among my colleagues.

@likeyiyy

I want this feature too.

@terra-alex

Hello! What do you think of using Desert as a workaround for now? It seems to accomplish most of this (with some issues around things like marshmallow-oneofschema, but as far as I could tell that was the only problem I ran into).

5 participants