Add two new unique delimited error code for better caller handling #1449

thomasvl · 2023-08-21T14:21:48Z

No description provided.

thomasvl · 2023-08-21T14:24:03Z

I'm also curious what folks thoughts are on merging this to the 1.x branch.

Since it makes the api more usable, it seems like it might be a good thing. But since there are new error codes it could be considered api breaking, but the only folks that would break would be folks that tried to handle every possible error code, and the api already was throwing the underlying stream error, so I don't think anyone was completely exhaustive. There is the edge case where someone has something specific to the .truncated Error, but we could also release note the change calling it out.

Tests/SwiftProtobufTests/Test_BinaryDelimited.swift

Sources/SwiftProtobuf/BinaryDelimited.swift

thomasvl · 2023-08-22T14:59:22Z

Ignoring the 1.x branch question.

The more I think about it, I'm starting to wonder if we should take the 2.0 time frame to change a few things up:

We could change BinaryDelimited.parse(...) to return an optional Message to deal with the no data case, but merge(...) is still a problem. That would still need to be an error, or we just give up on providing a the merge() api.
For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

Thoughts?

FranzBusch · 2023-08-23T10:07:28Z

Sources/SwiftProtobuf/BinaryDelimited.swift

+
+    /// While attempting to read the length of a message on the stream, the
+    /// bytes were malformed for the protobuf format.
+    case malformedLength


We should change this enum to be a struct on main now. Furthermore, we should double check if there are any other public enums that we might want to convert. This is due to the fact that adding enum cases is a breaking change and is going to limit us down the line. Lately, we have been picking this error pattern: https://github.com/swift-server/swift-memcache-gsoc/blob/main/Sources/SwiftMemcache/MemcachedError.swift

This has a bunch of useful information and let's us extend the codes without breaking API.

We've had revisiting the errors on the list for a while #1223, but this small fix to get better error codes in this one case is all the time I've got at the moment.

If someone else has the type to go look at converting over the others that would be great.

FranzBusch · 2023-08-23T10:08:56Z

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

thomasvl · 2023-08-23T12:29:04Z

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

I was thinking it might make things a little easier to document. i.e. - we don't have to call out individual errors, just say which sets of errors it might raise.

thomasvl · 2023-08-23T12:34:34Z

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

I was thinking it might make things a little easier to document. i.e. - we don't have to call out individual errors, just say which sets of errors it might raise.

Also, there's the open issue about making the different format optional, so someone could only get the binary support without the TextFormat or JSON. So to maybe eventually support that, it might make sense to keep the error classes unique to the different formats so they are ready if we ever do that kind of division to the library.

FranzBusch · 2023-08-23T12:36:40Z

I was thinking it might make things a little easier to document. i.e. - we don't have to call out individual errors, just say which sets of errors it might raise.

I am personally not a fan of listing out all the errors that an API can throw, at least not one untyped throws which we currently have. It is just too easy to slip and throw another error from a nested try call. What we have learned in other places tough is that it is good to have one error type that is thrown across the module. @glbrntt has put a lot of thought into that recently.

Also, there's the open issue about making the different format optional, so someone could only get the binary support without the TextFormat or JSON. So to maybe eventually support that, it might make sense to keep the error classes unique to the different formats so they are ready if we ever do that kind of division to the library.

If we separate the serialisation, I think it would still be fine to have one error struct in the core module.

However, we decide on the top two we should really convert the enum to struct for now.

glbrntt · 2023-08-23T16:06:18Z

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

I like the single top-level error type approach where the error contains a code a more detailed underlying error. The code gives the user an idea of the "domain" of the error which is useful for bucketing errors. The message and underlying error are useful for providing additional context about the error and for debugging.

However, I don't think we should necessarily prescribe this error throughout the ecosystem. I have found it very useful in recent projects where the causes of failure across the library were wide and errors could be thrown from deep in the stack. This allows you to compose a useful error at the top-level while retaining the information about underlying cause. If there are very few causes of failure then this approach might not make as much sense.

I am personally not a fan of listing out all the errors that an API can throw, at least not one untyped throws which we currently have. It is just too easy to slip and throw another error from a nested try call.

This is an important one. As Franz points out: it's very easy for an undocumented error to escape and once you list different errors which can be thrown they effectively become part of a loose API contract.

However, we decide on the top two we should really convert the enum to struct for now.

💯

thomasvl · 2023-08-24T15:02:09Z

For all the Errors, we did unique enums for Decode vs. Encode and per api. Should we take the time to do that now? That way the docs become much more clear - BinaryDelimited.parse(...) can throw StreamErrors, BinaryDelimitedDecodeError or BinaryDecodeErrors based on which operation failed, etc. Likewise, we could give the stream api it's own Error enums, so it is more clear which Errors are possible from each api instead of having only a subset apply to them.

I am fine with just having a single ProtobufError for the whole module to be honest. I don't think we have to create multiple different errors for each action. @glbrntt WDYT?

I like the single top-level error type approach where the error contains a code a more detailed underlying error. The code gives the user an idea of the "domain" of the error which is useful for bucketing errors. The message and underlying error are useful for providing additional context about the error and for debugging.

However, I don't think we should necessarily prescribe this error throughout the ecosystem. I have found it very useful in recent projects where the causes of failure across the library were wide and errors could be thrown from deep in the stack. This allows you to compose a useful error at the top-level while retaining the information about underlying cause. If there are very few causes of failure then this approach might not make as much sense.

I think I generally agree with this, but at the moment, I think the two apis around binary delimited are sorta the only ones that can have a few types of errors. The apis for parsing/serialization for the three formats main formats, only throw their specific errors, I don't believe anything other errors can come into play. But the open issue for revisiting errors (moving to a Struct, etc.) make sense to someone to look at eventually.

In the mean time, can folks look at this from the POV of a simple improvement of the current api (vs. taking on the larger api design change)? This PR addresses two shortcoming of the existing api:

Developers can't tell what I'll call framing errors from message payload errors.
Since some InputStreams lie about available bytes, a developer can't tell when they have reached the end of the stream compared to when they have actually gotten wire corruption. Both cases current result in .truncated, which isn't helpful.

So I'd sorta like to land this to address those to usability issues now.

This allows the caller to tell the difference between when a message *within* the delimited stream was bad vs. when the framing on stream itself was bad.

`InputStream` documents that `hasBytesAvailable` might return `True` when a `read` is needed to determine the real status. This case caused some attempts to use the delimited api then fall into seeing "end" as an error because it caused folks to see `.truncated` without really knowing nothing was there to read. This provide a distinct case for this condition so callers then have the option to gracefully handle reaching the end. Closed apple#1410

Don't call out too specific of things and instead just mention the Errors types that are possible to avoid being overly specific and leading folks in the wrong direction.

thomasvl · 2023-08-28T16:03:56Z

Bump on landing this to improve the current situation until someone takes on the larger errors revision.

glbrntt

In the mean time, can folks look at this from the POV of a simple improvement of the current api (vs. taking on the larger api design change)? This PR addresses two shortcoming of the existing api:

Developers can't tell what I'll call framing errors from message payload errors.

Since some InputStreams lie about available bytes, a developer can't tell when they have reached the end of the stream compared to when they have actually gotten wire corruption. Both cases current result in .truncated, which isn't helpful.

So I'd sorta like to land this to address those to usability issues now.

Assuming this doesn't get backported to 1.x, this looks good in the context of the above.

thomasvl · 2023-08-29T15:03:30Z

Assuming this doesn't get backported to 1.x, this looks good in the context of the above.

Ack, will go ahead and land as is, while the end of stream case might be nice to also get on 1.x, I don't think we have too do it.

thomasvl requested review from allevato, Lukasa, FranzBusch and tbkka August 21, 2023 14:21

allevato reviewed Aug 22, 2023

View reviewed changes

Tests/SwiftProtobufTests/Test_BinaryDelimited.swift Outdated Show resolved Hide resolved

Sources/SwiftProtobuf/BinaryDelimited.swift Outdated Show resolved Hide resolved

thomasvl force-pushed the unique_delimited_errors branch from c73ca69 to 618433a Compare August 22, 2023 14:55

FranzBusch reviewed Aug 23, 2023

View reviewed changes

thomasvl added 3 commits August 24, 2023 11:20

Use distinct errors for errors during delimited parsing.

c69c730

This allows the caller to tell the difference between when a message *within* the delimited stream was bad vs. when the framing on stream itself was bad.

Tweak the docs on the errors throw.

27414bb

Don't call out too specific of things and instead just mention the Errors types that are possible to avoid being overly specific and leading folks in the wrong direction.

thomasvl force-pushed the unique_delimited_errors branch from 78a6260 to 27414bb Compare August 24, 2023 15:20

glbrntt approved these changes Aug 29, 2023

View reviewed changes

thomasvl merged commit a3ac2e8 into apple:main Aug 29, 2023

thomasvl deleted the unique_delimited_errors branch August 29, 2023 15:03

gjcairo mentioned this pull request May 22, 2024

Un-revert breaking changes before releasing v2 #1657

Open

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add two new unique delimited error code for better caller handling #1449

Add two new unique delimited error code for better caller handling #1449

thomasvl commented Aug 21, 2023

thomasvl commented Aug 21, 2023

thomasvl commented Aug 22, 2023

FranzBusch Aug 23, 2023

thomasvl Aug 23, 2023

FranzBusch commented Aug 23, 2023

thomasvl commented Aug 23, 2023

thomasvl commented Aug 23, 2023

FranzBusch commented Aug 23, 2023

glbrntt commented Aug 23, 2023

thomasvl commented Aug 24, 2023

thomasvl commented Aug 28, 2023

glbrntt left a comment

thomasvl commented Aug 29, 2023

Add two new unique delimited error code for better caller handling #1449

Add two new unique delimited error code for better caller handling #1449

Conversation

thomasvl commented Aug 21, 2023

thomasvl commented Aug 21, 2023

thomasvl commented Aug 22, 2023

FranzBusch Aug 23, 2023

Choose a reason for hiding this comment

thomasvl Aug 23, 2023

Choose a reason for hiding this comment

FranzBusch commented Aug 23, 2023

thomasvl commented Aug 23, 2023

thomasvl commented Aug 23, 2023

FranzBusch commented Aug 23, 2023

glbrntt commented Aug 23, 2023

thomasvl commented Aug 24, 2023

thomasvl commented Aug 28, 2023

glbrntt left a comment

Choose a reason for hiding this comment

thomasvl commented Aug 29, 2023