Tempest coerces the type of records when querying a GSI #183

Open
szabado-faire opened this issue May 29, 2024 · 2 comments

Tempest assumes the type of records when querying a GSI, and coerces all records to be that type without checking the entity sort key prefix.


Example

Let's say we have a library hold system that tracks holds on Books and Movies. We might have:

  • Book Hold
    • Hold Token (PK): String
    • Book Token: String
    • Hold placed at: Instant
    • Title, author, other metadata
  • Movie Hold
    • Hold Token (PK): String
    • Movie Token: String
    • Hold placed at: Instant
    • Director, actors, other metadata

When a loaned book or movie is returned, the library needs to give it to the next person in line. To facilitate that, we'd need a GSI. We could build one for books and one for movies, but the idiomatic DynamoDB approach would be to share a single GSI with the following schema:

  • Book/movie token (PK)
  • Hold placed at (SK)

That way the system can easily allocate the pending holds by looking at the oldest hold for a given book or movie.
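To make the access pattern concrete, here is a minimal in-memory sketch of that shared index. It is not DynamoDB or Tempest code: the class name `SharedHoldIndex` and its methods are made up for illustration, and it assumes no two holds on the same item share a timestamp.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** In-memory stand-in for the shared GSI; illustrative only, not DynamoDB or Tempest code. */
final class SharedHoldIndex {
  // Partition key (book or movie token) -> holds ordered by sort key (hold placed at).
  private final Map<String, TreeMap<Instant, String>> holdsByItemToken = new HashMap<>();

  /** Indexes a hold under its item token, ordered by the time it was placed. */
  void put(String itemToken, Instant placedAt, String holdToken) {
    holdsByItemToken.computeIfAbsent(itemToken, k -> new TreeMap<>()).put(placedAt, holdToken);
  }

  /** Returns the hold token of the oldest pending hold for the item, if there is one. */
  Optional<String> oldestHold(String itemToken) {
    TreeMap<Instant, String> holds = holdsByItemToken.get(itemToken);
    if (holds == null || holds.isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(holds.firstEntry().getValue());
  }

  public static void main(String[] args) {
    SharedHoldIndex index = new SharedHoldIndex();
    index.put("book_token_123", Instant.parse("2024-05-29T10:00:00Z"), "hold_token_2");
    index.put("book_token_123", Instant.parse("2024-05-28T10:00:00Z"), "hold_token_1");
    System.out.println(index.oldestHold("book_token_123"));
  }
}
```

Note that nothing in the index key says whether a row is a book hold or a movie hold, which is what the rest of this issue is about.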


The Problem

Tempest has no type safety here. If you look up book_token_123 in the Tempest movie GSI, it will try to return the book hold as a movie hold (ignoring the sort key prefix), and it can very well succeed if you have enough nullable fields.
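A stripped-down sketch of the failure mode, using plain maps in place of DynamoDB items. None of this is Tempest's actual API: `MovieHold`, `decodeMovieHold`, `queryMovieHolds` and the attribute names (`hold_token`, `item_token`, `sort_key`, `director`) are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a typed GSI view; this is not Tempest's API. */
final class GsiCoercionDemo {

  /** A movie hold whose metadata is nullable, so almost any item "fits". */
  static final class MovieHold {
    final String holdToken;
    final String movieToken;
    final String director; // nullable

    MovieHold(String holdToken, String movieToken, String director) {
      this.holdToken = holdToken;
      this.movieToken = movieToken;
      this.director = director;
    }
  }

  /** Naive decoder: maps attributes by name and never looks at the sort key prefix. */
  static MovieHold decodeMovieHold(Map<String, String> item) {
    return new MovieHold(item.get("hold_token"), item.get("item_token"), item.get("director"));
  }

  /** Simulates querying the shared GSI by partition key and decoding every hit as a MovieHold. */
  static List<MovieHold> queryMovieHolds(List<Map<String, String>> index, String itemToken) {
    List<MovieHold> holds = new ArrayList<>();
    for (Map<String, String> item : index) {
      if (itemToken.equals(item.get("item_token"))) {
        holds.add(decodeMovieHold(item));
      }
    }
    return holds;
  }

  /** A book hold as it might appear in the shared index (attribute names are assumed). */
  static Map<String, String> bookHoldItem() {
    Map<String, String> item = new HashMap<>();
    item.put("hold_token", "hold_token_1");
    item.put("sort_key", "BOOK_HOLD_");
    item.put("item_token", "book_token_123");
    item.put("hold_placed_at", "2024-05-29T00:00:00Z");
    item.put("title", "Some Book");
    return item;
  }

  public static void main(String[] args) {
    List<Map<String, String>> index = new ArrayList<>();
    index.add(bookHoldItem());
    List<MovieHold> holds = queryMovieHolds(index, "book_token_123");
    System.out.println(holds.size() + " movie hold(s), director = " + holds.get(0).director);
  }
}
```

The query hands back one `MovieHold` with a null director and no error, even though the underlying item carries a book hold's sort key prefix.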

This surfaced for me as a bug: I bumped the schema version on my table (including giving it a new sort key prefix), and my code then pulled the old records out of the table, coerced them into the new record type, and caused lots of mayhem.


Solutions

This feels like it's safely in bug territory, but I wanted to consult with you folks on solutions before putting up a fix. In my mind, Tempest just needs to check that a record has the correct sort key prefix before using the Codec to parse it.
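Roughly what I have in mind, as a sketch rather than a patch against Tempest. The helper names are hypothetical, the `sort_key` attribute name is an assumption, and a plain `Function` stands in for the real Codec.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical helper; the names and item shape are illustrative, not Tempest's API. */
final class PrefixCheckedDecoder {

  /** Decodes the item only when its sort key starts with the prefix the target type expects. */
  static <T> Optional<T> decodeIfPrefixMatches(
      Map<String, String> item, String expectedPrefix, Function<Map<String, String>, T> codec) {
    String sortKey = item.get("sort_key"); // assumed attribute name
    if (sortKey == null || !sortKey.startsWith(expectedPrefix)) {
      return Optional.empty(); // a different entity type or schema version: skip it
    }
    return Optional.of(codec.apply(item));
  }

  /** Applies the check to every item in a query result, dropping the mismatches. */
  static <T> List<T> decodeMatching(
      List<Map<String, String>> items,
      String expectedPrefix,
      Function<Map<String, String>, T> codec) {
    List<T> decoded = new ArrayList<>();
    for (Map<String, String> item : items) {
      decodeIfPrefixMatches(item, expectedPrefix, codec).ifPresent(decoded::add);
    }
    return decoded;
  }

  /** Builds a minimal raw item for the demo. */
  static Map<String, String> item(String sortKey, String itemToken) {
    Map<String, String> item = new HashMap<>();
    item.put("sort_key", sortKey);
    item.put("item_token", itemToken);
    return item;
  }

  public static void main(String[] args) {
    List<Map<String, String>> queryResult = new ArrayList<>();
    queryResult.add(item("BOOK_HOLD_V1_", "book_token_123"));
    queryResult.add(item("BOOK_HOLD_V2_", "book_token_123"));
    // Only the V2 record survives; the V1 record is skipped instead of being coerced.
    List<String> kept = decodeMatching(queryResult, "BOOK_HOLD_V2_", i -> i.get("sort_key"));
    System.out.println(kept);
  }
}
```

One open question with this shape is whether a mismatch should be silently skipped, as here, or surfaced as an error.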

kyeotic (Collaborator) commented May 29, 2024

Yes, this is a known issue with GSIs. Using a separate GSI per entity type is a workaround, and one that is usually applicable. Generally, since DynamoDB's per-table GSI quota is fairly generous (20 by default), a separate GSI per entity type is the better choice, because each one will be sparse.

I definitely consider this a bug though.

szabado-faire (Contributor, Author) commented May 29, 2024

There are two main ways of resolving this in my mind:

  1. Filter out invalid records. This could conceivably break existing users, but it might also make them more correct.
  2. Add a new version of Page that supports multiple types: effectively an ItemSet that has an offset. This would probably involve deprecating the existing Page implementations, but it would let Tempest solve the same issue for scan operations as well.
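For option 2, a rough sketch of what a multi-type page could look like. `MixedPage` and `decodePage` are hypothetical names, not existing Tempest classes; the `sort_key` attribute name is assumed, and plain `Function`s stand in for the per-type Codecs. Prefixes are matched in registration order, so overlapping prefixes would need more care than this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a page holding several entity types; not an existing Tempest class. */
final class MixedPageSketch {

  /** Results of one query or scan page, grouped by sort key prefix, plus the paging offset. */
  static final class MixedPage {
    final Map<String, List<Object>> itemsByPrefix = new LinkedHashMap<>();
    final List<Map<String, String>> unrecognized = new ArrayList<>();
    final String offset; // null when there are no more pages

    MixedPage(String offset) {
      this.offset = offset;
    }

    /** Returns the decoded items registered under the prefix, cast to the requested type. */
    <T> List<T> items(String prefix, Class<T> type) {
      List<T> typed = new ArrayList<>();
      for (Object decoded : itemsByPrefix.getOrDefault(prefix, new ArrayList<>())) {
        typed.add(type.cast(decoded));
      }
      return typed;
    }
  }

  /** Routes each raw item to the codec registered for its sort key prefix. */
  static MixedPage decodePage(
      List<Map<String, String>> rawItems,
      String offset,
      Map<String, Function<Map<String, String>, Object>> codecsByPrefix) {
    MixedPage page = new MixedPage(offset);
    for (Map<String, String> item : rawItems) {
      String sortKey = item.getOrDefault("sort_key", ""); // assumed attribute name
      String matched = null;
      for (String prefix : codecsByPrefix.keySet()) {
        if (sortKey.startsWith(prefix)) {
          matched = prefix;
          break;
        }
      }
      if (matched == null) {
        page.unrecognized.add(item); // kept raw rather than coerced into some type
      } else {
        Object decoded = codecsByPrefix.get(matched).apply(item);
        page.itemsByPrefix.computeIfAbsent(matched, k -> new ArrayList<>()).add(decoded);
      }
    }
    return page;
  }

  public static void main(String[] args) {
    Map<String, Function<Map<String, String>, Object>> codecs = new LinkedHashMap<>();
    codecs.put("BOOK_HOLD_", item -> "book hold for " + item.get("item_token"));
    codecs.put("MOVIE_HOLD_", item -> "movie hold for " + item.get("item_token"));

    Map<String, String> raw = new HashMap<>();
    raw.put("sort_key", "BOOK_HOLD_");
    raw.put("item_token", "book_token_123");
    List<Map<String, String>> rawItems = new ArrayList<>();
    rawItems.add(raw);

    MixedPage page = decodePage(rawItems, null, codecs);
    System.out.println(page.items("BOOK_HOLD_", String.class));
  }
}
```

Because routing is driven by the prefix rather than by which index was queried, the same shape would work for scans, where mixed entity types are the normal case.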

Thoughts on either approach?
