-
-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Undiagnosed rare bug will set heartbeatStats.LongestMissingBeat
to a much higher number than is accurate
#29
Comments
Bumping priority, I can't tell why but this issue seemingly still happens after rebooting heartbeat. It appears to always set around ~447 hours. I don't know if this number has any significance (447 - 12 (my initial hours) = 435 hours). This seems to come back even after the system time has been set and heartbeat has been restarted. |
Oh, it's ALWAYS 1610833 seconds |
I suspect this might have just been caused by erroneous data in the DB, I fixed it manually and will monitor for further changes. |
Can't reproduce. This issue has not happened since April 8th. |
Reopening this as some sort of weird Redis behavior. Here's the {"last_beat_formatted":"2 minutes, 4 seconds","total_devices_formatted":"3","total_visits_formatted":"903,733","total_uptime_formatted":"11,773 hours, 27 minutes, 8 seconds"{"device_name":"mango","timestamp":1673974698}
*4
$14
JSON.ARRAPPEND
$5
beats
$1
.
$46
{"device_name":"mango","timestamp":1673974698}
*4
$8
JSON.SET
$5
stats
$1
.
$303
{"last_beat_formatted":"30 seconds","total_devices_formatted":"3","total_visits_formatted":"900,588","total_uptime_formatted":"11,599 hours, 47 minutes, 8 seconds","total_beats_formatted":"376,612","total_visits":900588,"total_uptime_milli":41759228479,"total_beats":376612,"longest_missing_beat":63127}
*4
$8
JSON.SET
$7
devices
$1
.
$399
[{"device_name":"lychee","last_beat":{"device_name":"lychee","timestamp":1673928714},"total_beats":4951,"longest_missing_beat":219481},{"device_name":"mango","last_beat":{"device_name":"mango","timestamp":1673974698},"total_beats":15873,"longest_missing_beat":40847},{"device_name":"blueberry","last_beat":{"device_name":"blueberry","timestamp":1671482520},"total_beats":1,"longest_missing_beat":0}]
*4
$8
JSON.SET
$9
last_beat
$1
.
$46
{"device_name":"mango","timestamp":1673974758}
*4
$14 |
This time it is 692 hours (exactly 2492358 seconds in the DB). Unrelated to system timezone updates, I believe this issue might either be correlated to restarting Docker or adding a new device??? idfk I will solve this later, it happens so rarely that switching to the Rust rewrite might just be the way to go |
heartbeatStats.LongestMissingBeat
to a much higher number than is accurateheartbeatStats.LongestMissingBeat
to a much higher number than is accurate
Happened again. 1681153 seconds, or 466 hours. This HAS to be fixed ASAP because of canary / dead man switch reasons. {"last_beat_formatted":"11 seconds","total_devices_formatted":"3","total_visits_formatted":"912,240","total_uptime_formatted":"12,354 hours, 39 minutes, 8 seconds","total_beats_formatted":"397,665","total_visits":912240,"total_uptime_milli":44476748495,"total_beats":397665,"longest_missing_beat":63127}
*4
$8
JSON.SET
$7
devices
$1
.
$407
[{"device_name":"lychee","last_beat":{"device_name":"lychee","timestamp":1676693440},"total_beats":10839,"longest_missing_beat":219481},{"device_name":"mango","last_beat":{"device_name":"mango","timestamp":1676693414},"total_beats":31029,"longest_missing_beat":47177},{"device_name":"blueberry","last_beat":{"device_name":"blueberry","timestamp":1675012321},"total_beats":10,"longest_missing_beat":2499120}]
*4
$8
JSON.SET
$9
last_beat
$1
.
$46
{"device_name":"mango","timestamp":1676693474}
*4
$14
JSON.ARRAPPEND
$5
beats
$1
.
$46
{"device_name":"mango","timestamp":1676693474}
*4
$8
JSON.SET
$9
last_beat
$1
.
$47
{"device_name":"lychee","timestamp":1676693500}
*4
$14
JSON.ARRAPPEND
$5
beats
$1
.
$47
{"device_name":"lychee","timestamp":1676693500}
*4
$8
JSON.SET
$5
stats
$1
.
$305
{"last_beat_formatted":"11 seconds","total_devices_formatted":"3","total_visits_formatted":"912,241","total_uptime_formatted":"12,354 hours, 40 minutes, 8 seconds","total_beats_formatted":"397,667","total_visits":912241,"total_uptime_milli":44476808495,"total_beats":397667,"longest_missing_beat":1681153} |
For future reference, if this happens again before I fix it, this is handy. curl -X POST -H "Auth: $HEARTBEAT_AUTH" -i https://frog.icu/api/update/stats -d "$(curl -s https://frog.icu/api/stats | jq '.longest_missing_beat = 63127')" |
This is happening more frequently, and I need to fix it.
|
Note that this bug does not affect uptime (nor
device.LongestMissingBeat
). This is only an issue with wheretime.Now()
is used, I think.The text was updated successfully, but these errors were encountered: