Clarification on the metric gameserver_creation_duration #3980

Closed
ldufresnegs opened this issue Sep 9, 2024 · 5 comments

@ldufresnegs

Hi,

I wanted to ask for clarification about the intent of the metric gameserver_creation_duration. When I read its description, either I don't understand the intent or the metric doesn't behave as expected.

My understanding is that it measures the time, in seconds, from the moment a gameserver is requested by the user to the moment it is ready. However, the largest histogram bucket is 3 seconds, and the data I get is not realistic. I tried to follow the code and, if I didn't make a mistake, the metric seems to measure the duration of this function.
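To illustrate why a 3-second top bucket matters, here is a minimal sketch of how a bucketed histogram classifies observations. It is a generic illustration, not the Agones implementation: the `bucketIndex` helper and every bound other than the 3-second maximum are assumptions made for the example.

```go
package main

import "fmt"

// bucketIndex returns the index of the first upper bound that is >= v.
// It returns len(bounds) when v exceeds every finite bound, which stands
// for the implicit overflow (+Inf) bucket.
func bucketIndex(bounds []float64, v float64) int {
	for i, b := range bounds {
		if v <= b {
			return i
		}
	}
	return len(bounds)
}

func main() {
	// Hypothetical upper bounds in seconds; only the 3s maximum comes from this thread.
	bounds := []float64{0.5, 1, 2, 3}

	for _, v := range []float64{0.2, 2.5, 90} {
		i := bucketIndex(bounds, v)
		if i == len(bounds) {
			fmt.Printf("%.1fs falls in the overflow bucket\n", v)
		} else {
			fmt.Printf("%.1fs falls in the bucket bounded by %.1fs\n", v, bounds[i])
		}
	}
}
```

With bounds like these, any duration that includes a node scale-up would land in the overflow bucket, so the histogram could not distinguish a 10-second creation from a 5-minute one.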

So, is the intention of the metric to measure only this POST request, or is it meant to also include the time it takes for the gameserver to reach the Ready state?

For some additional context, I noticed this because I wanted to use the metric to track how often a new node has to be created before the gameserver can be created, which takes a significant amount of time.

Thanks

@gongmax
Collaborator

gongmax commented Oct 3, 2024

@vicentefb You may want to take a look at this

@vicentefb
Collaborator

The metric effectively captures the total time spent within the addMoreGameServers function:

func (c *Controller) addMoreGameServers(ctx context.Context, gsSet *agonesv1.GameServerSet, count int) (err error) {

Once the function completes and returns, the metric recording stops.

This aligns with the intention of measuring the performance of the server-side creation process handled by this specific function.

The latency.record() call specifically measures the time taken for this function to execute, which includes:

  • Creating GameServer objects in Kubernetes.
  • Potentially waiting for resources to become available.
  • Updating internal state and event logs.

But it doesn't include time for the game server to be Ready.

Additionally, PR #3947, currently in review, adds a dashboard for agones_gameserver_state_duration, which measures the average time spent in each stage of a game server.

@ldufresnegs
Author

ldufresnegs commented Oct 8, 2024

Thanks for the follow up, but when you say:

Potentially waiting for resources to become available.

it usually doesn't include the time needed to allocate a pod, right? In my tests it didn't seem so (and the metric durations were way too low to include that), but I wonder if I didn't test correctly.

@vicentefb
Collaborator

vicentefb commented Oct 8, 2024

Correct, it doesn't include the time needed to allocate a pod. You can see the detailed Game Server state diagram here: https://agones.dev/site/docs/reference/gameserver/. It shows the Creation state as independent from Starting and from allocating the pod.

If you'd like to, you can share with us your current test setup details so that we can try and reproduce it.

@ldufresnegs
Author

Ok, thanks for the confirmation. It all makes sense, though I wonder whether the most interesting metric isn't the actual time from "start to create game server" up to "game server can be joined", but that might just be our case. I was able to get something along those lines by using the metrics that track time spent in states, but it's still not quite the same.

In any case, thank you and I believe the issue can be closed.
