Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add two new unique delimited error code for better caller handling #1449

Merged
merged 3 commits into from
Aug 29, 2023

Conversation

thomasvl
Copy link
Collaborator

No description provided.

@thomasvl
Copy link
Collaborator Author

I'm also curious what folks thoughts are on merging this to the 1.x branch.

Since it makes the api more usable, it seems like it might be a good thing. But since there are new error codes it could be considered api breaking, but the only folks that would break would be folks that tried to handle every possible error code, and the api already was throwing the underlying stream error, so I don't think anyone was completely exhaustive. There is the edge case where someone has something specific to the .truncated Error, but we could also release note the change calling it out.

@thomasvl thomasvl force-pushed the unique_delimited_errors branch from c73ca69 to 618433a Compare August 22, 2023 14:55
@thomasvl
Copy link
Collaborator Author

Ignoring the 1.x branch question.

The more I think about it, I'm starting to wonder if we should take the 2.0 time frame to change a few things up:

  • We could change BinaryDelimited.parse(...) to return an optional Message to deal with the no data case, but merge(...) is still a problem. That would still need to be an error, or we just give up on providing a the merge() api.
  • For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

Thoughts?


/// While attempting to read the length of a message on the stream, the
/// bytes were malformed for the protobuf format.
case malformedLength
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should change this enum to be a struct on main now. Furthermore, we should double check if there are any other public enums that we might want to convert. This is due to the fact that adding enum cases is a breaking change and is going to limit us down the line. Lately, we have been picking this error pattern: https://github.com/swift-server/swift-memcache-gsoc/blob/main/Sources/SwiftMemcache/MemcachedError.swift

This has a bunch of useful information and let's us extend the codes without breaking API.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had revisiting the errors on the list for a while #1223, but this small fix to get better error codes in this one case is all the time I've got at the moment.

If someone else has the type to go look at converting over the others that would be great.

@FranzBusch
Copy link
Member

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

@thomasvl
Copy link
Collaborator Author

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

I was thinking it might make things a little easier to document. i.e. - we don't have to call out individual errors, just say which sets of errors it might raise.

@thomasvl
Copy link
Collaborator Author

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

I was thinking it might make things a little easier to document. i.e. - we don't have to call out individual errors, just say which sets of errors it might raise.

Also, there's the open issue about making the different format optional, so someone could only get the binary support without the TextFormat or JSON. So to maybe eventually support that, it might make sense to keep the error classes unique to the different formats so they are ready if we ever do that kind of division to the library.

@FranzBusch
Copy link
Member

I was thinking it might make things a little easier to document. i.e. - we don't have to call out individual errors, just say which sets of errors it might raise.

I am personally not a fan of listing out all the errors that an API can throw, at least not one untyped throws which we currently have. It is just too easy to slip and throw another error from a nested try call. What we have learned in other places tough is that it is good to have one error type that is thrown across the module. @glbrntt has put a lot of thought into that recently.

Also, there's the open issue about making the different format optional, so someone could only get the binary support without the TextFormat or JSON. So to maybe eventually support that, it might make sense to keep the error classes unique to the different formats so they are ready if we ever do that kind of division to the library.

If we separate the serialisation, I think it would still be fine to have one error struct in the core module.

However, we decide on the top two we should really convert the enum to struct for now.

@glbrntt
Copy link
Contributor

glbrntt commented Aug 23, 2023

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

I like the single top-level error type approach where the error contains a code a more detailed underlying error. The code gives the user an idea of the "domain" of the error which is useful for bucketing errors. The message and underlying error are useful for providing additional context about the error and for debugging.

However, I don't think we should necessarily prescribe this error throughout the ecosystem. I have found it very useful in recent projects where the causes of failure across the library were wide and errors could be thrown from deep in the stack. This allows you to compose a useful error at the top-level while retaining the information about underlying cause. If there are very few causes of failure then this approach might not make as much sense.

I am personally not a fan of listing out all the errors that an API can throw, at least not one untyped throws which we currently have. It is just too easy to slip and throw another error from a nested try call.

This is an important one. As Franz points out: it's very easy for an undocumented error to escape and once you list different errors which can be thrown they effectively become part of a loose API contract.

However, we decide on the top two we should really convert the enum to struct for now.

💯

@thomasvl
Copy link
Collaborator Author

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

I like the single top-level error type approach where the error contains a code a more detailed underlying error. The code gives the user an idea of the "domain" of the error which is useful for bucketing errors. The message and underlying error are useful for providing additional context about the error and for debugging.

However, I don't think we should necessarily prescribe this error throughout the ecosystem. I have found it very useful in recent projects where the causes of failure across the library were wide and errors could be thrown from deep in the stack. This allows you to compose a useful error at the top-level while retaining the information about underlying cause. If there are very few causes of failure then this approach might not make as much sense.

I think I generally agree with this, but at the moment, I think the two apis around binary delimited are sorta the only ones that can have a few types of errors. The apis for parsing/serialization for the three formats main formats, only throw their specific errors, I don't believe anything other errors can come into play. But the open issue for revisiting errors (moving to a Struct, etc.) make sense to someone to look at eventually.

In the mean time, can folks look at this from the POV of a simple improvement of the current api (vs. taking on the larger api design change)? This PR addresses two shortcoming of the existing api:

  • Developers can't tell what I'll call framing errors from message payload errors.
  • Since some InputStreams lie about available bytes, a developer can't tell when they have reached the end of the stream compared to when they have actually gotten wire corruption. Both cases current result in .truncated, which isn't helpful.

So I'd sorta like to land this to address those to usability issues now.

This allows the caller to tell the difference between when a message *within*
the delimited stream was bad vs. when the framing on stream itself was bad.
`InputStream` documents that `hasBytesAvailable` might return `True` when a
`read` is needed to determine the real status. This case caused some attempts to
use the delimited api then fall into seeing "end" as an error because it caused
folks to see `.truncated` without really knowing nothing was there to read. This
provide a distinct case for this condition so callers then have the option to
gracefully handle reaching the end.

Closed apple#1410
Don't call out too specific of things and instead just mention the Errors types
that are possible to avoid being overly specific and leading folks in the wrong
direction.
@thomasvl thomasvl force-pushed the unique_delimited_errors branch from 78a6260 to 27414bb Compare August 24, 2023 15:20
@thomasvl
Copy link
Collaborator Author

Bump on landing this to improve the current situation until someone takes on the larger errors revision.

Copy link
Contributor

@glbrntt glbrntt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the mean time, can folks look at this from the POV of a simple improvement of the current api (vs. taking on the larger api design change)? This PR addresses two shortcoming of the existing api:

  • Developers can't tell what I'll call framing errors from message payload errors.
  • Since some InputStreams lie about available bytes, a developer can't tell when they have reached the end of the stream compared to when they have actually gotten wire corruption. Both cases current result in .truncated, which isn't helpful.

So I'd sorta like to land this to address those to usability issues now.

Assuming this doesn't get backported to 1.x, this looks good in the context of the above.

@thomasvl
Copy link
Collaborator Author

Assuming this doesn't get backported to 1.x, this looks good in the context of the above.

Ack, will go ahead and land as is, while the end of stream case might be nice to also get on 1.x, I don't think we have too do it.

@thomasvl thomasvl merged commit a3ac2e8 into apple:main Aug 29, 2023
@thomasvl thomasvl deleted the unique_delimited_errors branch August 29, 2023 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants