Feature/pdct 1542 change bulk import tool to not allow non strings as metadata #263

Draft
wants to merge 5 commits into
base: main

annaCPR
Contributor

@annaCPR annaCPR commented Dec 5, 2024

Description

  • Add validation to the bulk import endpoint that checks that all metadata values are strings.
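A minimal sketch of the kind of check described above, not the code in this PR. It assumes each entity's metadata is a dict whose values are either a string or a list of strings; the function name `find_non_string_metadata` is hypothetical.

```python
from typing import Any


def find_non_string_metadata(metadata: dict[str, Any]) -> list[str]:
    """Return the keys whose values are not a string or a list of strings."""
    bad_keys = []
    for key, value in metadata.items():
        # Treat a bare value as a one-item list so both shapes share one check.
        items = value if isinstance(value, list) else [value]
        if not all(isinstance(item, str) for item in items):
            bad_keys.append(key)
    return bad_keys


print(find_non_string_metadata({"name": "John", "list": ["item1", "item2"]}))  # []
print(find_non_string_metadata({"name": "John", "age": 30}))  # ['age']
```

Returning the offending keys rather than a bare boolean lets the endpoint name the bad fields in its error response.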

Please include:

  • a summary of the changes
  • links to any related issue/ticket
  • any additional relevant motivation and context
  • details of any dependency updates that are required for this change

Proposed version

Please select the most relevant option from the list below. This
will be used to generate the next tag version name during auto-tagging.

  • Skip auto-tagging
  • Patch
  • Minor version
  • Major version

Visit the Semver website to understand the
difference between MAJOR, MINOR, and PATCH versions.

Notes:

  • If none of these options are selected, auto-tagging will fail
  • Where multiple options are selected, the most senior option ticked will be
    used -- e.g. Major > Minor > Patch
  • If you are selecting the version in the list above using the textbox, make
    sure your selected option is marked [x] with no spaces in between the
    brackets and the x

Type of change

Please select the option(s) below that are most relevant:

  • Bug fix
  • New feature
  • Breaking change
  • GitHub workflow update
  • Documentation update
  • Remove legacy code
  • Dependency update

How Has This Been Tested?

Please describe the tests that you added to verify your changes.

Reviewer Checklist

  • DB_CLIENT DEPENDENCY IS ON THE LATEST VERSION
  • The PR represents a single feature (small drive-by fixes are also ok)
  • The PR includes tests that are sufficient for the level of risk
  • The code is sufficiently commented, particularly in hard-to-understand areas
  • Any required documentation updates have been made
  • Any TODOs added are captured in future tickets
  • No FIXMEs remain

@annaCPR annaCPR requested a review from a team as a code owner December 5, 2024 12:58

linear bot commented Dec 5, 2024

Contributor

@jamesgorrie jamesgorrie left a comment

This is great. One suggestion, though: we could use a parser, which might work better at scale and would also let us define the type of Metadata.

for value in e.get("metadata", {}).values()
]

_validate_values_are_strings(metadata_values)
Contributor

@jamesgorrie jamesgorrie Dec 5, 2024

Instead of writing our own parser, could we use Pydantic's? For example:

from typing import Dict, List, Union

from pydantic import RootModel, ValidationError

# Each metadata value must be a string or a list of strings.
Metadata = RootModel[Dict[str, Union[str, List[str]]]]

if __name__ == "__main__":
    good_json = '{"name": "John", "age": "30", "list": ["item1", "item2"]}'
    bad_json = '{"name": "John", "age": 30, "list": ["item1", "item2"]}'  # "age" is an int

    try:
        print(Metadata.model_validate_json(good_json))
        print("Successful")  # reaches here
    except ValidationError as e:
        print(e)
        print("Failed")

    try:
        print(Metadata.model_validate_json(bad_json))
        print("Successful")
    except ValidationError as e:
        print(e)
        print("Failed")  # reaches here with a nice descriptive error
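As a rough sketch of how the suggestion above might be applied to a batch, the snippet below validates the metadata of several already-parsed entities and collects the failures. The helper `collect_metadata_errors` and the entity shape (a dict with an optional "metadata" key) are assumptions for illustration, not part of this PR. It passes `strict=True` so that Pydantic does not coerce other types into strings.

```python
from typing import Dict, List, Union

from pydantic import RootModel, ValidationError

# Same shape as in the suggestion: values are a string or a list of strings.
Metadata = RootModel[Dict[str, Union[str, List[str]]]]


def collect_metadata_errors(entities: List[dict]) -> Dict[int, str]:
    """Map the index of each invalid entity to Pydantic's error message."""
    errors: Dict[int, str] = {}
    for index, entity in enumerate(entities):
        try:
            # strict=True rejects values that are not already str / list[str].
            Metadata.model_validate(entity.get("metadata", {}), strict=True)
        except ValidationError as exc:
            errors[index] = str(exc)
    return errors


if __name__ == "__main__":
    entities = [
        {"metadata": {"name": "John", "list": ["item1", "item2"]}},
        {"metadata": {"name": "John", "age": 30}},  # invalid: int value
    ]
    for index, message in collect_metadata_errors(entities).items():
        print(f"Entity {index} has invalid metadata:\n{message}")
```

Collecting all failures before responding means a single request can report every invalid entity at once instead of stopping at the first.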

@annaCPR annaCPR marked this pull request as draft December 5, 2024 16:21
Labels
None yet
Projects
None yet
2 participants