-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Graceful stop is ignored when running SQL tests via xk6-sql plugin #3958
Comments
Hi @alex-dubrouski, I'll allocate some time to try to reproduce this issue, but meanwhile, here's a couple of questions:
Also, could you please share a redacted example of how you set up Thanks! |
Hi @joanlopez, Here is the config
I also asked TiDB team to provide details on their co fig and experiment. |
Hi @joanlopez, We currently don't know if this is specific to TiDB, but I think it could be reproducible with MySQL as well. In our experiment we started the load using K6 that supposed to complete in X mins and after some time we sent In our test we used: 1 HAProxy (that round robins TiDB hosts) and a TiDB Cluster of (3 PD nodes, 4 TiDB nodes, and 4 TiKV nodes). Bu I think it might be reproducible with 1-2 TiDB Hosts as well. Potentially even with MySql. |
Great, thanks @alex-dubrouski and @ol-neostyle for the additional details. |
Brief summary
We are using K6 with xk6-sql plugin to load test our local TiDB cluster. On top of regular load tests we run basic chaos engineering like cluster failover and node restart during the test. We noticed that TiDB cluster disruptions cause K6 process to hang for a long time.
k6 version
v0.53.0
OS
Linux x86-64
Docker version and image (if applicable)
No response
Steps to reproduce the problem
When running simple select test and manually restarting one of the cluster nodes:
this test is expected to finish in 4mins 50sec, but it actually takes:
Please note
running (19m19.8s)
, traffic correctly drops to 0 after 4mins 50sec, but process keeps sitting idle for over 15 more minutesI built K6 locally from the trunk with debug symbols and profiled the app when it hangs with perf and bcc-tools:
Offcpu (idle descheduled threads)
Offwake (re-scheduled threads and threads that woke them up combined in one stack)
and perf:
Test script is very simple it just randomly selects data range from a set of configured tables:
Expected behaviour
K6 process supposed to exit after configured graceful stop
Actual behaviour
K6 process keeps sitting idle for a prolonged amount of time
The text was updated successfully, but these errors were encountered: