
mtail fails to scrape with > 70k metrics presented. #903

Open
JWThorne opened this issue Jul 18, 2024 · 3 comments

Comments

@JWThorne

We find that with our current deployment, even though the scrape time is under 2.5 seconds, an HTTP GET on the /metrics endpoint simply fails if mtail is presenting more than 70k metrics. There are no errors in the logs; we just get a failed response and a connection close after 2 seconds. Reducing the metric count restores normal operation.

However, we need more metrics.

@jaqx0r
Contributor

jaqx0r commented Jul 19, 2024

Which version please?

https://github.com/google/mtail/blob/main/docs/Troubleshooting.md#reporting-a-problem

Does it look like mtail has also stopped processing lines when a GET is being processed?

@terencehonles
Contributor

This may indeed be related to the issue I was seeing and the change in #908. I was testing with the /json handler, which does emit the headers and then stream the response. I had noticed that /metrics was returning an empty response (when testing in the browser).

For /json I was seeing the message E0805 16:26:05.490112 435505 json.go:27] write tcp [::1]:3903->[::1]:55250: i/o timeout.

From curl with verbose logging I was seeing:

* transfer closed with outstanding read data remaining
* Closing connection
curl: (18) transfer closed with outstanding read data remaining

When rebuilding mtail without #908 (I need #906 for my mtail program) and testing /metrics again, I see that nothing is written to the logs, and curl reports:

* Empty reply from server
* Closing connection
curl: (52) Empty reply from server

@JWThorne you can probably check one of the other exporters to confirm you're seeing partial output from them as well, and you can either build from source or wait until #908 is released.

@terencehonles
Contributor

> the /metrics endpoint will just fail if mtail has more than 70k metrics

Is this the number of output metrics, or the number of log lines you're processing?

In my case we had a number of counters with a large number of labels, so mtail was generating a large JSON payload and hitting the write timeout.

3 participants