-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Transfer retries fix #329
base: master
Are you sure you want to change the base?
Transfer retries fix #329
Conversation
492229d
to
f549bf9
Compare
Rebased against master |
Looks good to me, thanks! I suspect this would not result in a "green-but-actually-failed" job since Galaxy will still expect the output to exist in its own finish method, did you happen to include that in testing? |
Haven't tested that Nate, @cat-bro is the only one who got this running on an actual Pulsar server. Unfortunately, I can't figure out why the tests are failing. Apparently this change causes the output file transfer test to fail, which I guess must mean that the source file path doesn't exist in the test case, which doesn't make any sense to me... I spent some time looking at the tests and couldn't figure it out or debug it. @mvdbeek I don't suppose you'd be able to take a look at this? I assume the explanation lies in https://github.com/galaxyproject/pulsar/blob/master/pulsar/client/test/check.py. Maybe there is a test that checks whether a transfer will retry when the source file is missing on the first attempt...? |
I haven't forgotten about this, but I didn't think this was quite the right layer to fix this at. You'd probably want to parameterize this on the job state. |
Sorry @mvdbeek I'm not sure what you mean. Only skip retrying if the job has failed and the output file doesn't exist? Isn't job state decided by Galaxy and not Pulsar anyway? 🤔 |
Pulsar knows the exit code, but you're right, we don't really know when that's a failure or not. I'll try getting back to it. |
Thanks! I'm interested to see what you find. |
Potentially resolves #298.
Tested manually on one of our dev pulsar nodes by @cat-bro.