Exit group coordinator if member pid gone #611

Merged

fmcgeough

Fixes #607.

Terminate the group coordinator if MemberPid is no longer alive at the point where assignments_revoked,
assign_partitions, or assignments_received must be called.

This is a first attempt to explore how to fix the issue raised in #607.


zmstone commented Dec 1, 2024

Thank you for the PR, @fmcgeough.
However, this will not eliminate the race condition completely.
Instead, each call should be wrapped like this:

try
  MemberModule:assignments_revoked(MemberPid)
catch
  exit:{noproc, {gen_server, call, [MemberPid | _]}} ->
    exit({shutdown, member_down})
end,

To avoid repeating this for all three calls, a macro would be better:

-define(CALL_MEMBER(MemberPid, EXPR),
  try
    EXPR
  catch
    exit:{noproc, {gen_server, call, [MemberPid | _]}} ->
      exit({shutdown, member_down})
  end).

Use it like this: ?CALL_MEMBER(MemberPid, MemberModule:assignments_revoked(MemberPid))

The alive check does not work because the member process can still exit between the check and the call, so the race condition remains.
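The window between the check and the call can be illustrated with a small stand-alone sketch. The module name alive_check_race and the forced kill are assumptions made for the demonstration and are not part of brod. The forced kill only makes the interleaving deterministic; in a real system the member could exit at that point on its own.

```erlang
-module(alive_check_race).
-export([run/0]).

%% Hypothetical demo, not brod code: a successful is_process_alive/1
%% check says nothing about whether the later call will succeed.
run() ->
  %% Stand-in for the member process; it just waits for a message.
  {Pid, Ref} = spawn_monitor(fun() -> receive stop -> ok end end),
  %% Step 1: the alive check passes.
  true = is_process_alive(Pid),
  %% Step 2: the member dies in the window between check and call.
  %% The kill is forced here to make the interleaving deterministic.
  exit(Pid, kill),
  receive {'DOWN', Ref, process, Pid, killed} -> ok end,
  %% Step 3: the call still exits with noproc, so it has to be caught anyway.
  try
    gen_server:call(Pid, assignments_revoked)
  catch
    exit:{noproc, {gen_server, call, [Pid | _]}} ->
      noproc_after_successful_check
  end.
```

Because the exit has to be handled in any case, catching it makes the preceding alive check redundant.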
fmcgeough (Contributor Author) commented

I've modified the PR to catch the exception rather than attempting to check if the process is alive. I'd appreciate any suggestions on how to do unit testing for this change.
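One possible way to unit test the behaviour, sketched with EUnit under stated assumptions: the module name call_member_tests, the dead_pid/0 helper, and the fake member process are hypothetical and do not come from brod's test suite. The macro is copied from the review comment above so that the module compiles on its own. The fake member is a plain process that answers a single gen_server call, which avoids needing a full callback module.

```erlang
-module(call_member_tests).
-include_lib("eunit/include/eunit.hrl").

%% Copied from the review suggestion so this sketch is self-contained.
-define(CALL_MEMBER(MemberPid, EXPR),
        try
          EXPR
        catch
          exit:{noproc, {gen_server, call, [MemberPid | _]}} ->
            exit({shutdown, member_down})
        end).

%% Hypothetical helper: returns the pid of a process that is already dead.
dead_pid() ->
  {Pid, Ref} = spawn_monitor(fun() -> ok end),
  receive {'DOWN', Ref, process, Pid, _} -> Pid end.

%% A call to a dead member is turned into a clean shutdown exit.
member_down_test() ->
  Pid = dead_pid(),
  ?assertExit({shutdown, member_down},
              ?CALL_MEMBER(Pid, gen_server:call(Pid, assignments_revoked))).

%% A call to a live member passes the reply through unchanged.
member_alive_test() ->
  Pid = spawn(fun() ->
                  receive
                    {'$gen_call', From, assignments_revoked} ->
                      gen_server:reply(From, ok)
                  end
              end),
  ?assertEqual(ok,
               ?CALL_MEMBER(Pid, gen_server:call(Pid, assignments_revoked))).
```

This only exercises the macro in isolation. Covering the coordinator itself would additionally require starting a group coordinator with a member pid that is stopped before a rebalance, which is outside what this sketch attempts.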

zmstone merged commit e18151c into kafka4beam:master on Dec 12, 2024
14 checks passed

zmstone commented Dec 12, 2024

Thank you for the PR, @fmcgeough.

fmcgeough deleted the exit_group_coordinator_if_member_pid_gone branch on December 13, 2024, 14:21
Successfully merging this pull request may close these issues.

Process abnormal exit during termination when custom partitioning strategy is used