Conformance with Prometheus metric names and types #72

Open
bwerthmann opened this issue Feb 11, 2022 · 0 comments
bwerthmann commented Feb 11, 2022

Conformance to Prometheus metric name standards.

Suggested Prerequisite: #71

Complying with the Prometheus metric name and TYPE conventions improves the operator and developer experience by enabling advanced features in Prometheus and related tools such as PromLens and Grafana, for example context-aware auto-complete and PromQL linters.

Prometheus is unable to infer the type because of non-compliance

image

Prometheus is able to infer the type

image

promtool has a linter for this

The linter compares each metric name against its declared type, but it cannot tell whether the declared type is correct. That is why #71 is suggested first.

curl -sS localhost:7777/metrics | ./prometheus-2.32.1.linux-amd64/promtool check metrics
nats_core_active_account_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_connection_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_core_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_gateway_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_gateway_inbound_msg_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_gateway_recv_msg_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_gateway_sent_msgs_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_recv_msgs_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_route_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_route_recv_msg_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_route_sent_msg_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_sent_msgs_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_slow_consumer_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_subs_count non-histogram and non-summary metrics should not have "_count" suffix
nats_core_total_connection_count non-histogram and non-summary metrics should not have "_count" suffix
nats_jetstream_advisory_count non-histogram and non-summary metrics should not have "_count" suffix
nats_latency_observations_count non-histogram and non-summary metrics should not have "_count" suffix
nats_survey_expected_count non-histogram and non-summary metrics should not have "_count" suffix
nats_survey_late_replies_count counter metrics should have "_total" suffix
nats_survey_late_replies_count non-histogram and non-summary metrics should not have "_count" suffix
nats_survey_no_replies_count counter metrics should have "_total" suffix
nats_survey_no_replies_count non-histogram and non-summary metrics should not have "_count" suffix
nats_survey_surveyed_count non-histogram and non-summary metrics should not have "_count" suffix
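
The two messages in the output above can be reproduced with a small sketch. This is not promtool's implementation, only a simplified illustration of the two rules as they appear in this output; the function name `lint_metric` is hypothetical.

```python
from __future__ import annotations


def lint_metric(name: str, declared_type: str) -> list[str]:
    """Return lint messages for one metric, mimicking the two rules shown above."""
    problems = []
    # Rule 1: counters are expected to end in "_total".
    if declared_type == "counter" and not name.endswith("_total"):
        problems.append(f'{name} counter metrics should have "_total" suffix')
    # Rule 2: "_count" is reserved for histogram and summary series.
    if declared_type not in ("histogram", "summary") and name.endswith("_count"):
        problems.append(
            f'{name} non-histogram and non-summary metrics should not have "_count" suffix'
        )
    return problems


# Declared as a gauge, only the "_count" rule fires.
print(lint_metric("nats_core_total_connection_count", "gauge"))
# Declared as a counter, both rules fire.
print(lint_metric("nats_core_total_connection_count", "counter"))
```

The result depends on the declared type, so a metric that is wrongly declared as a gauge hides the missing "_total" suffix.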

Two other cases:

nats_core_rtt_nanoseconds use base unit "seconds" instead of "nanoseconds"
nats_survey_nats_reconnects counter metrics should have "_total" suffix
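
For the base-unit case, a fix would need to change both the name and the value. The sketch below assumes the replacement name simply swaps the "_nanoseconds" suffix for "_seconds"; the issue does not name the replacement metric, and the helper `to_base_unit` is hypothetical.

```python
from __future__ import annotations

NANOSECONDS_PER_SECOND = 1_000_000_000


def to_base_unit(name: str, value: float) -> tuple[str, float]:
    """Convert a "_nanoseconds" sample to the "seconds" base unit.

    Samples whose name does not end in "_nanoseconds" are returned unchanged.
    """
    suffix = "_nanoseconds"
    if name.endswith(suffix):
        # Assumed naming: replace the unit suffix and scale the value to match.
        return name[: -len(suffix)] + "_seconds", value / NANOSECONDS_PER_SECOND
    return name, value


# A round-trip time of 1,500,000 ns expressed in seconds.
print(to_base_unit("nats_core_rtt_nanoseconds", 1_500_000))
```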

Example counter metric: nats_core_total_connection_count

Example Scrape

# HELP nats_core_total_connection_count Total number of client connections serviced gauge
# TYPE nats_core_total_connection_count gauge
nats_core_total_connection_count{server_cluster="...",server_id="...",server_name="..."} 6345789

Linter for nats_core_total_connection_count

curl -sS localhost:7777/metrics | grep -F 'nats_core_total_connection_count' |head -n3 | ./prometheus-2.32.1.linux-amd64/promtool check metrics
nats_core_total_connection_count non-histogram and non-summary metrics should not have "_count" suffix

Once #71 is addressed, this metric will be reported as a counter. Simulating that by rewriting the type with sed:

curl -sS localhost:7777/metrics | grep -F 'nats_core_total_connection_count' |head -n3 |sed 's/gauge/counter/'| ./prometheus-2.32.1.linux-amd64/promtool check metrics

The linter now reports two problems:

nats_core_total_connection_count counter metrics should have "_total" suffix
nats_core_total_connection_count non-histogram and non-summary metrics should not have "_count" suffix

Example "fixed" scrape

# HELP nats_core_total_connection_total Total number of client connections serviced counter
# TYPE nats_core_total_connection_total counter
nats_core_total_connection_total{server_cluster="...",server_id="...",server_name="..."} 42224222
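
The rename in the "fixed" scrape can be sketched as a small helper. It applies only to metrics that really are counters; gauges with a "_count" suffix need a different name, which this issue does not specify. The function name `counter_name` is hypothetical, and only the first mapping below appears in the issue. The second follows from the linter message.

```python
def counter_name(name: str) -> str:
    """Rename a counter so that it ends in "_total" and not in "_count"."""
    # Drop the "_count" suffix that is reserved for histograms and summaries.
    if name.endswith("_count"):
        name = name[: -len("_count")]
    # Add the "_total" suffix expected on counters, unless already present.
    if not name.endswith("_total"):
        name += "_total"
    return name


print(counter_name("nats_core_total_connection_count"))
print(counter_name("nats_survey_nats_reconnects"))
```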
@bwerthmann bwerthmann added the enhancement Enhancement to existing functionality label Feb 23, 2022