Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new customizable retry logic introduced unnecessary retries and retry_check is not executed #988

Open
MarkusSchildhauer opened this issue Jan 11, 2023 · 8 comments

Comments

@MarkusSchildhauer
Copy link

This commit aimed to provide a customizable retry logic. It can be seen in line 48, that the new retry_check function is only executed if retryable(req) returns false:

retry = retryable(req) || retryablebody(req) && _retry_check(s, ex, req, retry_check)

But with default arguments on retry_delays and retries, retryable(req) is typically true.
An ugly workaround to make retry_check accessible is this:

HTTP.get(<url>, retries=1, retry_delays=ExponentialBackOff(n=100),  retry_check=(args...)->(@info(: retry_check); true))

I think those changes are critical since by default there are no further retry checks anymore, which causes a lot of unnecessary time-consuming retries.
Was it actually intended that neither the response status is checked with "retryable" nor all the other possibilities ("isrecoverable") that make a recovery very unlikely?

What is actually the use of the retryattempt[] counter, because the ExponentialBackOff iterator is already limiting the number of retries?

  • Julia 1.8.4
  • HTTP.jl 1.6.3
  • MbedTLS.jl 1.1.7
@nickrobinson251
Copy link
Collaborator

oops! looks like a typo, @quinnj ?

i.e. we meant

- retry = retryable(req) || retryablebody(req) && _retry_check(s, ex, req, retry_check)
+ retry = (retryable(req) || retryablebody(req)) && _retry_check(s, ex, req, retry_check)

What we have:

julia> for (x, y) in (
           (true, true),
           (true, false),
           (false, true),
           (false, false),
       )
           @show x, y
           x || y && println(" yes!")
       end
(x, y) = (true, true)
(x, y) = (true, false)
(x, y) = (false, true)
 yes!
(x, y) = (false, false)

what we want:

julia> for (x, y) in (
           (true, true),
           (true, false),
           (false, true),
           (false, false),
       )
           @show x, y
           (x || y) && println(" yes!")
       end
(x, y) = (true, true)
 yes!
(x, y) = (true, false)
 yes!
(x, y) = (false, true)
 yes!
(x, y) = (false, false)

@MarkusSchildhauer
Copy link
Author

Many thanks for the quick response. But I doubt that this is all about it. If retryablebody(req) returns false retryable(req) returns false, too (see here). So your proposed statement reduces to:

retry = retryablebody(req) && _retry_check(s, ex, req, retry_check)

Which basically means, that by default there are no retires at all, since retry_check=FALSE.

Please don't forget about the other subissues which are related and somehow influenced by your proposal:

  1. currently unnecessary retries (or no retries at all -> your proposal) if retry_check is not configured manually
  2. retryattempt[] counter. More precise: currently, when I set retry_delays with n > 4, I additionally have to set retries=n. This seems unexpected and not intentional. The proposal would solve that but the retryattempt counter is entirely useless than.

@nickrobinson251
Copy link
Collaborator

ah, yeah, that retry_check would have to default to always true.

But even still, having thought about it, i relise i think i just misunderstood the goal of retry_check -- in the proposal of my comment above it'd change to be a check that let's you opt out of retrying certain retryable requests, whereas i think it's intended to let you opt in to retrying certain requests which for which retryable(req) == false

Probably best we wait for Jacob to answer this one 😊

@MarkusSchildhauer
Copy link
Author

I think this pull request could be helpful.

@quinnj
Copy link
Member

quinnj commented Jan 11, 2023

i think it's intended to let you opt in to retrying certain requests which for which retryable(req) == false

This is correct. The idea is that there is a default set of cases defined by the HTTP spec that dictate whether you should retry or not. Those are covered by retryable. If those default cases return false, however, retry_check provides an escape hatch where you can still force a retry to happen.

@MarkusSchildhauer
Copy link
Author

I'm still not sure if the problem is understood. A little example:

HTTP.get("http://www.google.com", body="bad request")

causes a 400 status exception. But it retries unnecessarily 4 times which takes altogether almost 10s. Which HTTP Spec dictates this retry? The old implementation would have avoided retries on 400.
If I wanted to avoid retries specifically on a 400 status with the new implementation I would expect something like:

HTTP.get("http://www.google.com", body="bad request", retry_check=(s,e,args...) -> !(e isa HTTP.StatusError && e.status==400))

which is not working. The best I can do is to reduce it to one retry on 400 while keeping 4 retries on all others:

HTTP.get("http://www.google.com", body="bad request", retries=1, retry_delays=ExponentialBackOff(n=4),  retry_check=(s,e,args...) -> !(e isa HTTP.StatusError && e.status==400))

It really doesn't look intentional to me, are you really happy with this idea?

@quinnj
Copy link
Member

quinnj commented Jan 11, 2023

But it retries unnecessarily 4 times which takes altogether almost 10s

There was indeed a bug from the new retry_check change in that we weren't considering the response status when a request successfully completed. This has been fixed in #990.

@MarkusSchildhauer
Copy link
Author

MarkusSchildhauer commented Jan 12, 2023

Thanks for solving the main issue and reflecting the behaviour in the docs. I still would like to promote my pull request, because it also ensures expected behaviour (esp. retry_delays should determine retries) and it interprets retry_check as a general way of customizing instead of a providing only an opt-in. Without the docs that is what I would have expected, since the list of hardcoded retry-statuses (403, 408, 409, 429, 500, 502, 503, 504, 599) is discussible according to the RFC 9110:

  • 403: The client SHOULD NOT automatically repeat the request with the same credentials. The client MAY repeat the request with new or different credentials
  • 408, 409, 429: MAY retry under certain circumstances
  • 503: recovery is likely after delay
  • other 5xx: no spec on retry

On top of this we should not rely on a spec-compliant server behaviour. That said, I would be glad if we could treat that status-list as a "may-retry" or "recovery-is-likely-list" and 403 should be removed anyway (I'm not sure about 500).

An usage example for opt-out 408, 409 and opt-in 404 combined with a custom retry_delay-setting could look like this:

isstatuserror(e, status) = e isa HTTP.StatusError && e.status in status
HTTP.get("http://www.google.com", retry_delays = 1:9, retry_check = (s,e,args...) -> 
    HTTP.retry_check(s,e,args...) && !isstatuserror(e, (408, 409)) || isstatuserror(e, 404)
)

May be you could have a second thought on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants