fix(operator): allow retries to consider exit code from init container and don't consider node as pending if init failed. Fixes #11354/#10717/#10045 #13858

Open: wants to merge 2 commits into main

Conversation

tooptoop4 (Contributor) commented Nov 3, 2024

Fixes #11354, #10717, and #10045.

Before this fix, the node would always go into Pending because the main container was in a waiting state (woc.markNodePhase(ctrNodeName, wfv1.NodePending)), even though the init container had already terminated with a non-zero exit code.
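
The ordering problem described above can be sketched as follows. This is only an illustration, not the code in this PR: the types `containerStatus` and `terminatedState`, the phase constants, and the functions `failedInitExitCode` and `assessPhase` are all hypothetical, simplified stand-ins rather than the real Kubernetes or Argo Workflows types. The idea is to check the init containers for a non-zero exit code before falling back to Pending because the main container is still waiting.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for container status data.
// These are NOT the real k8s.io/api/core/v1 or Argo Workflows types.
type terminatedState struct {
	ExitCode int32
}

type containerStatus struct {
	Name       string
	Waiting    bool             // true while the container has not started
	Terminated *terminatedState // non-nil once the container has exited
}

type nodePhase string

const (
	phasePending nodePhase = "Pending"
	phaseRunning nodePhase = "Running"
	phaseFailed  nodePhase = "Failed"
)

// failedInitExitCode returns the exit code of the first init container that
// terminated with a non-zero code, and whether such a container was found.
func failedInitExitCode(initStatuses []containerStatus) (int32, bool) {
	for _, s := range initStatuses {
		if s.Terminated != nil && s.Terminated.ExitCode != 0 {
			return s.Terminated.ExitCode, true
		}
	}
	return 0, false
}

// assessPhase inspects init containers first. Only if none of them failed
// does a waiting main container result in Pending.
func assessPhase(initStatuses []containerStatus, mainCtr containerStatus) (nodePhase, int32) {
	if code, failed := failedInitExitCode(initStatuses); failed {
		return phaseFailed, code
	}
	if mainCtr.Waiting {
		return phasePending, 0
	}
	return phaseRunning, 0
}

func main() {
	initStatuses := []containerStatus{
		{Name: "init", Terminated: &terminatedState{ExitCode: 255}},
	}
	mainCtr := containerStatus{Name: "main", Waiting: true}

	phase, code := assessPhase(initStatuses, mainCtr)
	fmt.Println(phase, code) // prints: Failed 255
}
```

With the exit code surfaced alongside the failed phase, a retry policy has something to evaluate, which is the behaviour the PR title asks for.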

This supersedes #13852

cc @terrytangyuan

tooptoop4 changed the title from "fix(operator): allow retries to consider exit code from init container and don't consider node as pending if init failed. Fixes #11354" to "fix(operator): allow retries to consider exit code from init container and don't consider node as pending if init failed. Fixes #11354/#10717/#10045" on Nov 3, 2024
jswxstw (Member) commented Nov 4, 2024

> Before this fix, the node would always go into Pending because the main container was in a waiting state, even though the init container had already terminated with a non-zero exit code.

This will only be encountered when using ContainerSet, right?

tooptoop4 (Contributor, Author) commented

> > Before this fix, the node would always go into Pending because the main container was in a waiting state, even though the init container had already terminated with a non-zero exit code.
>
> This will only be encountered when using ContainerSet, right?

No, see the logs and comments in the linked issues.

Successfully merging this pull request may close these issues.

Pod failed: Error (exit code 255) but no retry
2 participants