Set externalTrafficPolicy as Local for agones-allocator #4019

Open
osterante opened this issue Oct 17, 2024 · 3 comments · May be fixed by #4022
Labels
kind/feature New features for Agones

Comments

@osterante

Is your feature request related to a problem? Please describe.
When an Agones cluster uses two node pools, one for Agones and one for GameServers, allocation requests sometimes fail while the GameServers node pool is being scaled down, especially when many nodes are removed at once. Allocation should not be affected by changes in the GameServers node pool.

Describe the solution you'd like
Change externalTrafficPolicy for the agones-allocator Service from Cluster (the default) to Local.
https://cloud.google.com/kubernetes-engine/docs/concepts/service-load-balancer?hl=ja#health_check


@osterante osterante added the kind/feature New features for Agones label Oct 17, 2024
@gongmax
Collaborator

gongmax commented Nov 5, 2024

Hi @osterante, thanks for the contribution. A few questions: can you explain the problem in more detail? What metrics did you observe, and why do you think it is related to the health check? Did you test the fix in #4022 and verify that it mitigates the issue?

@osterante
Author

osterante commented Nov 8, 2024

@gongmax
While nodes are being removed, allocation requests to the agones-allocator (via gRPC) sometimes fail with the following error:

rpc error: code = Unavailable desc = error reading from server: EOF

The agones-allocator is a Kubernetes Service of type LoadBalancer. In this case, GKE creates a passthrough load balancer.
When externalTrafficPolicy is set to Cluster, an allocation request can be received by any node, even one that runs only GameServer Pods and no agones-allocator Pods. That node then forwards the packets to another node that does run an agones-allocator Pod. However, if the receiving node terminates while it is forwarding the packets, the allocation request fails with the error shown above.
When externalTrafficPolicy is set to Local, allocation requests are handled only by nodes that run an agones-allocator Pod.

ref: https://cloud.google.com/kubernetes-engine/docs/concepts/service-load-balancer?hl=ja#node-packet-processing
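
For illustration, the requested setting corresponds to the standard `spec.externalTrafficPolicy` field on the Service. The fragment below is only a sketch of what the resulting manifest might look like, not the chart's actual template: the namespace, selector label, port name, and port numbers are assumptions based on a typical Agones install and may differ in your cluster.

```yaml
# Sketch of a LoadBalancer Service with externalTrafficPolicy set to Local.
apiVersion: v1
kind: Service
metadata:
  name: agones-allocator
  namespace: agones-system        # assumption: default install namespace
spec:
  type: LoadBalancer
  # Local: only nodes running a ready agones-allocator Pod receive traffic
  # from the load balancer. Cluster (the default) lets any node receive it
  # and forward it to another node.
  externalTrafficPolicy: Local
  selector:
    multicluster.agones.dev/role: allocator   # assumption: allocator Pod label
  ports:
    - name: grpc                  # hypothetical port name
      port: 443                   # assumption: external port
      targetPort: 8443            # assumption: container port
      protocol: TCP
```

With Local, Kubernetes also allocates a health check node port for the Service, so the load balancer can stop sending traffic to nodes that have no local agones-allocator Pod.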

@osterante
Author

Hi @gongmax, do you have some time to review the PR? If there are any concerns, I can keep the default value unchanged for backward compatibility. What I need is the ability to change that value.
