Event loop occasionally hangs after redisClusterAsyncDisconnect under high connection error conditions #219

plainbanana · 2024-05-18T08:21:25Z

Hello hiredis-cluster team,

We have encountered a situation where the event loop hangs after issuing redisClusterAsyncDisconnect in conditions where connection errors are likely to occur.

Here is a reproducible example on the current master branch. The program hangs only when .heavyoperation is set to true. Sorry for the rough example.
https://gist.github.com/plainbanana/0f287c6afec4527dba566d01e98aca63

I may be wrong, but I have looked into possible causes:

1. A debug sleep 1 added in redisClusterAsyncCommandToNode timed out.
2. This timeout triggered updateSlotMapAsync, adding a CLUSTER NODES command.
3. A heavy operation (e.g. sleep 1) was executed in the commandCallback.
4. redisClusterAsyncDisconnect was executed in the commandCallback.
5. Due to the timeout of the CLUSTER NODES command (?), a retry was attempted in clusterNodesReplyCallback, which established a new connection and added another CLUSTER NODES command.
6. The hiredis redisProcessCallbacks did not recognize REDIS_DISCONNECTING, leading to a hang.

Is there a way to avoid the hang and successfully terminate the connections in such a scenario? In my humble opinion, it might be better not to initiate a new connection in (5) if redisClusterAsyncDisconnect has already been issued. I would appreciate your input.

Thank you.


bjosv · 2024-05-20T08:02:49Z

> In my humble opinion, it might be better not to initiate a new connection in (5) if redisClusterAsyncDisconnect has already been issued.

Yes, it sounds reasonable that hiredis-cluster shouldn't initiate new connections when a user wants to disconnect.
When disconnecting, the flag REDIS_DISCONNECTING is set in hiredis, which blocks new commands on existing nodes.
But we should probably add a similar flag in hiredis-cluster to avoid creating new connections to nodes.
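
To make the idea concrete, here is a minimal, self-contained sketch of such a guard. It is a toy model only, not hiredis-cluster code and not necessarily what the eventual fix looks like: the `sketch_*` types and functions and the `SKETCH_CLUSTER_DISCONNECTING` flag are hypothetical names invented for illustration. The only assumption is that a retry has to go through a "get or create connection" step, and that this step can consult a cluster-level flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cluster-level flag, modelled on hiredis' REDIS_DISCONNECTING.
 * The name and value are assumptions made for this sketch. */
#define SKETCH_CLUSTER_DISCONNECTING 0x1

#define SKETCH_MAX_NODES 4

/* Toy stand-in for a per-node async connection. */
typedef struct {
    bool connected;
} sketch_node;

/* Toy stand-in for the cluster async context. */
typedef struct {
    int flags;
    sketch_node nodes[SKETCH_MAX_NODES];
} sketch_cluster;

/* Models the user's disconnect request: record the intent at cluster
 * level first, then close every existing connection. */
static void sketch_disconnect(sketch_cluster *cc) {
    assert(cc != NULL);
    cc->flags |= SKETCH_CLUSTER_DISCONNECTING;
    for (size_t i = 0; i < SKETCH_MAX_NODES; i++) {
        cc->nodes[i].connected = false;
    }
}

/* Models the step a slot-map retry takes to reach a node.
 * Returns true if a usable connection exists afterwards.
 * Once a disconnect has been requested, no new connection is opened. */
static bool sketch_get_connection(sketch_cluster *cc, size_t idx) {
    assert(cc != NULL && idx < SKETCH_MAX_NODES);
    if (cc->nodes[idx].connected) {
        return true; /* reuse the existing connection */
    }
    if (cc->flags & SKETCH_CLUSTER_DISCONNECTING) {
        return false; /* shutting down: refuse to connect */
    }
    cc->nodes[idx].connected = true; /* normal path: connect lazily */
    return true;
}

/* Number of connections an event loop would still have to service. */
static size_t sketch_active_connections(const sketch_cluster *cc) {
    size_t n = 0;
    for (size_t i = 0; i < SKETCH_MAX_NODES; i++) {
        if (cc->nodes[i].connected) {
            n++;
        }
    }
    return n;
}
```

In this model, a retry that calls `sketch_get_connection` after `sketch_disconnect` gets `false` back and leaves the active connection count at zero, so nothing remains to keep the event loop alive. Without the flag check, the same retry would open a fresh connection that the disconnect request never sees.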

bjosv · 2024-08-15T07:59:23Z

Closed by #225

plainbanana mentioned this issue May 22, 2024

Hangs during the termination process in version 0.091. plainbanana/Redis-Cluster-Fast#45

Closed

bjosv mentioned this issue Jun 10, 2024

Don't initiate new connections during a client shutdown (async API) #225

Merged

bjosv closed this as completed Aug 15, 2024
