Undiagnosed rare bug will set `heartbeatStats.LongestMissingBeat` to a much higher number than is accurate #29

5HT2 · 2022-04-03T06:06:41Z

Note that this bug does not affect uptime (nor device.LongestMissingBeat). This is only an issue with where time.Now() is used, I think.

The text was updated successfully, but these errors were encountered:

5HT2 · 2022-04-09T00:11:42Z

Bumping priority, I can't tell why but this issue seemingly still happens after rebooting heartbeat. It appears to always set around ~447 hours. I don't know if this number has any significance (447 - 12 (my initial hours) = 435 hours).

This seems to come back even after the system time has been set and heartbeat has been restarted.

5HT2 · 2022-04-09T00:15:13Z

Oh, it's ALWAYS 1610833 seconds

5HT2 · 2022-04-09T00:16:40Z

I don't think it's caused by malformed data as my devices look fine

5HT2 · 2022-04-09T02:42:41Z

I suspect this might have just been caused by erroneous data in the DB, I fixed it manually and will monitor for further changes.

5HT2 · 2022-12-18T10:30:33Z

Can't reproduce. This issue has not happened since April 8th.

5HT2 · 2023-01-24T22:49:14Z

Reopening this as some sort of weird Redis behavior. Here's the appendonly.aof from the last time this happened on production, which is fairly recently:

{"last_beat_formatted":"2 minutes, 4 seconds","total_devices_formatted":"3","total_visits_formatted":"903,733","total_uptime_formatted":"11,773 hours, 27 minutes, 8 seconds"{"device_name":"mango","timestamp":1673974698}
*4
$14
JSON.ARRAPPEND
$5
beats
$1
.
$46
{"device_name":"mango","timestamp":1673974698}
*4
$8
JSON.SET
$5
stats
$1
.
$303
{"last_beat_formatted":"30 seconds","total_devices_formatted":"3","total_visits_formatted":"900,588","total_uptime_formatted":"11,599 hours, 47 minutes, 8 seconds","total_beats_formatted":"376,612","total_visits":900588,"total_uptime_milli":41759228479,"total_beats":376612,"longest_missing_beat":63127}
*4
$8
JSON.SET
$7
devices
$1
.
$399
[{"device_name":"lychee","last_beat":{"device_name":"lychee","timestamp":1673928714},"total_beats":4951,"longest_missing_beat":219481},{"device_name":"mango","last_beat":{"device_name":"mango","timestamp":1673974698},"total_beats":15873,"longest_missing_beat":40847},{"device_name":"blueberry","last_beat":{"device_name":"blueberry","timestamp":1671482520},"total_beats":1,"longest_missing_beat":0}]
*4
$8
JSON.SET
$9
last_beat
$1
.
$46
{"device_name":"mango","timestamp":1673974758}
*4
$14

5HT2 · 2023-01-24T22:50:39Z

This time it is 692 hours (exactly 2492358 seconds in the DB). Unrelated to system timezone updates, I believe this issue might either be correlated to restarting Docker or adding a new device??? idfk I will solve this later, it happens so rarely that switching to the Rust rewrite might just be the way to go

5HT2 · 2023-02-19T00:42:42Z

Happened again. 1681153 seconds, or 466 hours. This HAS to be fixed ASAP because of canary / dead man switch reasons.

{"last_beat_formatted":"11 seconds","total_devices_formatted":"3","total_visits_formatted":"912,240","total_uptime_formatted":"12,354 hours, 39 minutes, 8 seconds","total_beats_formatted":"397,665","total_visits":912240,"total_uptime_milli":44476748495,"total_beats":397665,"longest_missing_beat":63127}
*4
$8
JSON.SET
$7
devices
$1
.
$407
[{"device_name":"lychee","last_beat":{"device_name":"lychee","timestamp":1676693440},"total_beats":10839,"longest_missing_beat":219481},{"device_name":"mango","last_beat":{"device_name":"mango","timestamp":1676693414},"total_beats":31029,"longest_missing_beat":47177},{"device_name":"blueberry","last_beat":{"device_name":"blueberry","timestamp":1675012321},"total_beats":10,"longest_missing_beat":2499120}]
*4
$8
JSON.SET
$9
last_beat
$1
.
$46
{"device_name":"mango","timestamp":1676693474}
*4
$14
JSON.ARRAPPEND
$5
beats
$1
.
$46
{"device_name":"mango","timestamp":1676693474}
*4
$8
JSON.SET
$9
last_beat
$1
.
$47
{"device_name":"lychee","timestamp":1676693500}
*4
$14
JSON.ARRAPPEND
$5
beats
$1
.
$47
{"device_name":"lychee","timestamp":1676693500}
*4
$8
JSON.SET
$5
stats
$1
.
$305
{"last_beat_formatted":"11 seconds","total_devices_formatted":"3","total_visits_formatted":"912,241","total_uptime_formatted":"12,354 hours, 40 minutes, 8 seconds","total_beats_formatted":"397,667","total_visits":912241,"total_uptime_milli":44476808495,"total_beats":397667,"longest_missing_beat":1681153}

5HT2 · 2023-02-19T00:46:09Z

For future reference, if this happens again before I fix it, this is handy.

curl -X POST -H "Auth: $HEARTBEAT_AUTH" -i https://frog.icu/api/update/stats -d "$(curl -s https://frog.icu/api/stats | jq '.longest_missing_beat = 63127')"

5HT2 · 2023-03-14T00:16:49Z

This is happening more frequently, and I need to fix it.

{"last_beat_formatted":"28 seconds","total_devices_formatted":"3","total_visits_formatted":"919,081","total_uptime_formatted":"12,806 hours, 35 minutes, 8 seconds","total_beats_formatted":"412,428","total_visits":919081,"total_uptime_milli":46103708499,"total_beats":412428,"longest_missing_beat":1618410}

5HT2 added bug Something isn't working priority:critical labels Apr 3, 2022

5HT2 closed this as not planned Won't fix, can't repro, duplicate, stale Dec 18, 2022

5HT2 reopened this Jan 24, 2023

5HT2 pinned this issue Jan 24, 2023

5HT2 changed the title ~~Changing system timezone while heartbeat is running will set heartbeatStats.LongestMissingBeat to a much higher number than is accurate~~ Undiagnosed rare bug will set heartbeatStats.LongestMissingBeat to a much higher number than is accurate Jan 24, 2023

lmaotrigine mentioned this issue Aug 6, 2023

Rewrite lmaotrigine/heartbeat#1

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Undiagnosed rare bug will set `heartbeatStats.LongestMissingBeat` to a much higher number than is accurate #29

Undiagnosed rare bug will set `heartbeatStats.LongestMissingBeat` to a much higher number than is accurate #29

5HT2 commented Apr 3, 2022 •

edited

Loading

5HT2 commented Apr 9, 2022

5HT2 commented Apr 9, 2022

5HT2 commented Apr 9, 2022

5HT2 commented Apr 9, 2022

5HT2 commented Dec 18, 2022

5HT2 commented Jan 24, 2023 •

edited

Loading

5HT2 commented Jan 24, 2023

5HT2 commented Feb 19, 2023

5HT2 commented Feb 19, 2023 •

edited

Loading

5HT2 commented Mar 14, 2023

Undiagnosed rare bug will set heartbeatStats.LongestMissingBeat to a much higher number than is accurate #29

Undiagnosed rare bug will set heartbeatStats.LongestMissingBeat to a much higher number than is accurate #29

Comments

5HT2 commented Apr 3, 2022 • edited Loading

5HT2 commented Apr 9, 2022

5HT2 commented Apr 9, 2022

5HT2 commented Apr 9, 2022

5HT2 commented Apr 9, 2022

5HT2 commented Dec 18, 2022

5HT2 commented Jan 24, 2023 • edited Loading

5HT2 commented Jan 24, 2023

5HT2 commented Feb 19, 2023

5HT2 commented Feb 19, 2023 • edited Loading

5HT2 commented Mar 14, 2023

Undiagnosed rare bug will set `heartbeatStats.LongestMissingBeat` to a much higher number than is accurate #29

Undiagnosed rare bug will set `heartbeatStats.LongestMissingBeat` to a much higher number than is accurate #29

5HT2 commented Apr 3, 2022 •

edited

Loading

5HT2 commented Jan 24, 2023 •

edited

Loading

5HT2 commented Feb 19, 2023 •

edited

Loading