Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved Ruby error classes #113

Open
jhawthorn opened this issue Aug 9, 2023 · 6 comments
Open

Improved Ruby error classes #113

jhawthorn opened this issue Aug 9, 2023 · 6 comments
Assignees

Comments

@jhawthorn
Copy link
Member

cc @adrianna-chang-shopify @casperisfine @composerinteralia @dhruvCW @matthewd

We made some improvements to the Ruby extension's error classes recently but based on recent discussions I think still have a way to go. I wanted to gather opinions in this issue and am happy to implement what we can come to a consensus on.

Problems I see:

  • QueryError is our default fallback error class. I don't think this makes any sense, as this is a ProtocolError subclass but all errors other than TRILOGY_ERR are IMO connection-level errors. I think we should make BaseConnectionError the default error class for things which haven't been given a specific meaning (alternatively we could exhaustively define error classes for all trilogy status codes).
  • "TRILOGY_CLOSED_CONNECTION". This error represents essentially an EOF over our socket, and it's one of the more common errors we come across. It's also the error we'll see if we attempt communication after a trilogy_sock_shutdown. Currently this appears as QueryError: TRILOGY_CLOSED_CONNECTION and we're forced to do string matching to detect it. I think this deserves its own error class, but also it should probably be distinct from ConnectionClosed, which is only raised from an explicitly closed connection. How about Trilogy::EOFError? For backwards compatibility I'd probably include "TRILOGY_CLOSED_CONNECTION" in the string.
    • Another possibility I'd like to entertain is deleting the translation of EPIPE errors to TRILOGY_CLOSED_CONNECTION, which would break compatibility with those doing string matching but I think otherwise is just better and more accurate information.
  • SyscallError vs ConnectionError. As raised in Handle EADDRNOTAVAIL as a BaseConnectionError instead of SyscallError #98, there's some confusion between SyscallError and BaseConnectionError. One option is that we can make Trilogy::SyscallError inherit from ConnectionError and then have only one thing to rescue, or we could better document that users should rescue both classes (I don't feel very strongly either way).
  • I think ProtocolError's name is confusing. I read this as "protocol violation" rather than "we were told over the protocol that an error occurred", and I've seen confusion from others about the same. I preferred the previous DatabaseError naming but am open to other options. How about ServerError "an error reported by the server"? Whatever we choose we can make the classes aliases of each other, so there should be minimal compatibility problems.
@jhawthorn jhawthorn self-assigned this Aug 9, 2023
@casperisfine
Copy link
Contributor

I think the key thing to consider when designing error hierarchies, is the use case.

Which errors do users need to rescue?

For network libraries my answer is generally that errors fall in two main groups:

  • All network / IO related errors: you likely want to rescue them and eventually be able to retry.
  • Client errors (or query error), e.g. invalid query: you want them to be very distinct so you don't rescue them by accident.

You can then subclass this as much as you want for readability.

e.g. https://github.com/redis-rb/redis-client/blob/68dbe6b27d259f29af099128fe6ad83193160253/lib/redis_client.rb#L80-L131

@adrianna-chang-shopify
Copy link
Collaborator

I think we should make BaseConnectionError the default error class for things which haven't been given a specific meaning

This makes sense to me 👍 Also makes treating all the syscall errors as connection errors easy to reason about.

"TRILOGY_CLOSED_CONNECTION"

Definitely agree that this shouldn't be a QueryError as it currently is -- we're matching against TRILOGY_CLOSED_CONNECTION all over the place in our Rails monolith at Shopify, which is definitely a pain. Trilogy::EOFError would make sense to me.

Another possibility I'd like to entertain is deleting the translation of EPIPE errors to TRILOGY_CLOSED_CONNECTION

What are you thinking instead @jhawthorn ? Just let it bubble up as a syscall error?

SyscallError vs ConnectionError.

I think it's fair to treat all of the errno errors as connection errors. Seems like most of the ones consumers would actually encounter would be connection errors. That gets us around the headache of trying to figure out which ones we should classify as conn errors versus not, and I think, as the issue opened around EADDRNOTAVAIL demonstrates, it's confusing to users to have to rescue these as separate error classes.

I think ProtocolError's name is confusing.

Yeah, might as well just revert back to DatabaseError, since it's still there for compatibility reasons? Versus introducing a new error that needs to be aliased to both DatabaseError and ProtocolError.

@dhruvCW
Copy link

dhruvCW commented Aug 16, 2023

I entirely agree with the opinions shared here 💯 .

SyscallError vs ConnectionError. As raised in #98, there's some confusion between SyscallError and BaseConnectionError. One option is that we can make Trilogy::SyscallError inherit from ConnectionError and then have only one thing to rescue, or we could better document that users should rescue both classes (I don't feel very strongly either way).

specifically about this usecase, IMO Trilogy::SyscallError should use BaseConnectionError as the base class. Thus if a user cares for details they can rescue the SyscallError and evaluate it. Otherwise if they don't they can only handle BaseConnectionError.

@matthewd
Copy link
Contributor

All network / IO related errors: you likely want to rescue them and eventually be able to retry.

The thing that jumps into my head here is something like CommunicationError -- "[especially if you have a raw syscall error,] we potentially have no idea exactly what went wrong, it's (very likely, but) not certainly an issue between our networking stack and the remote end... but either way, we didn't manage to talk to the remote server".

(But that's just a passing thought, and even if we did think it's more descriptive, I'm doubtful it's worth the rename effort to change; ConnectionError is already a very reasonable spelling of the same concept: there are extremely few possible sources of syserror that aren't strictly a "connection" problem.)

I agree that SyscallError should include ConnectionError.

I don't like that BaseConnectionError sounds like it's the root of all connection errors, when it's actually the opposite... GeneralConnectionError? I was going to suggest "Unknown..", but while that might be reasonable for the fallback case (and honestly, many of the currently-enumerated code that boil down to "protocol violation"), it sounds too uncertain for the "expected" errors.

Agree not-TRILOGY_ERR should default to some sort of connection error, not a query error.

Also agree that I've misread ProtocolError as "error in protocol implementation" rather than "error encoded via protocol" more than once. I don't love DatabaseError just because of how generic it sounds: we're here to talk to a database, so that's arguably true of anything that could go wrong.

One idea I had was that we could consciously deviate from the "FooError" pattern as a gentle nudge that we're [using an exception as a means of] communicating an error from elsewhere, rather than directly reporting a "local" error up the stack: ErrorResponse, ErrorFromServer. Alternatively: QueryError? ISTM most of the other things that can go wrong all boil down to problems inside the supplied query; is there a lot of value in us uniquely distinguishing syntax/parse errors from other ("that column doesn't exist") errors?

@composerinteralia
Copy link
Contributor

In support of our SyscallError being a ConnectionError, turns out GitHub is running a small patch that didn't make it upstream for whatever reason that basically adds this around the call to ::Trilogy.new:

rescue Errno::ETIMEDOUT, Errno::ECONNREFUSED, Errno::ECONNRESET => error
  raise ActiveRecord::ConnectionNotEstablished.new(error.message)

@jhawthorn
Copy link
Member Author

  • QueryError is our default fallback error class [...] this is a ProtocolError

I learned I was wrong about this. QueryError isn't a subclass of ProtocolError, but I think it should be (or whatever ProtocolError is renamed to) as it is just a specialization of that error in the "parse error" case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants