performance waves in seconds #1303
Comments
Hi @divfor, thanks for reporting this. It's a bit difficult to follow all your screenshots, but can you summarize your main complaint in text? Is the issue that k6 is using 100% of 2 CPU cores, or that the CPU usage fluctuates, or that the RPS itself fluctuates? I would consider RPS fluctuation to be the only problem, as the other two can be expected, depending on your environment. RPS variation can also have many causes (network, system under test, etc.), so it's difficult to pin it down as a k6 issue.
I think there are one (or two) lock-wait bottlenecks inside k6, each using a core. When scaling VUs up to 500, the two cores reach 100% and then k6 starts fluctuating. If I set max VUs to less than 500 (so the two CPU cores never hit 100%), the fluctuation does not appear.
I am going to do some experiments and try to reproduce the issue on my dual X5660 server this week, but I am currently making some changes on the machine, so doing performance benchmarking right now is probably not a good idea ;).
Hi, I did some testing ... finally. Sorry it took so long; I needed to finish some work on #1007 and #1285 in particular. My tests were over a 1gbps connection and I was serving a 10kb and a 100kb file through Nginx (as that is what I assumed "10k" and "100k" meant in your file names), with 10 IPs on the interface of the k6 machine, which as mentioned before has 2 X5660 CPUs and 24GB of RAM. The Nginx machine is a 4-core i7. I ran more or less the exact same script (only with smaller durations and VU counts, as I ... don't have the time to wait for it or the memory it requires).
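(For reference, a minimal sketch of the kind of script being discussed: a ramping-stages test that repeatedly fetches one of the static files. The host, file name, stage durations, and VU targets are illustrative placeholders, not the reporter's actual values.)

```javascript
import http from 'k6/http';

// Assumed shape of the test: ramp VUs up in stages and repeatedly fetch a
// static file served by Nginx. Host and file name are illustrative only.
export let options = {
  stages: [
    { duration: '1m', target: 500 }, // ramp up
    { duration: '5m', target: 500 }, // hold
    { duration: '1m', target: 0 },   // ramp down
  ],
};

export default function () {
  http.get('http://test-server.local/10k.bin');
}
```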
As seen above, I get the same amount of traffic, iterations and requests, but with 50x fewer VUs (and lower CPU and memory usage).

In my particular case I didn't have two CPUs pinned at 100%; I had one that was ... close, with 50-80% of that CPU being system time, and I am pretty sure this is the thread doing epoll for us in Go. It wasn't affected by me disabling or enabling thresholds and so on, as we first thought it would be. It also disappears completely if I just run an empty default function (which still eats a lot of CPU, for the record), so my epoll theory seems good. I did run with ...

So my recommendation to get ... better performance is to try and see if lowering your VUs will still yield the same amount of throughput (you can try with 5-minute tests, for example :)). You can also try the new JavaScript compatibility mode, which will lower the memory per VU significantly for your test; as your script seems pretty simple, you won't need to rewrite a lot to keep it ES5.1 compatible :).

On a related note ... the stages implementation pre #1007 is kind of ... not great, and given what you are doing you are probably better served by just setting ...

I hope this was useful, and I will try to take another stab at it with the changes from #1007 at some point, that time using some profiling and possibly some tracing in order to better understand what is happening :D
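(A minimal sketch of those suggestions, assuming fixed `vus` and `duration` options in place of ramping stages; the VU count, duration, and URL are placeholders to illustrate the idea, not measured values. It is written in ES5.1 style so it works with k6's base compatibility mode.)

```javascript
// Sketch only: fixed VUs and a plain duration instead of stages,
// with values chosen for illustration rather than tuned for this workload.
var http = require('k6/http');

module.exports.options = {
  discardResponseBodies: true, // skip keeping bodies if only throughput matters
  vus: 100,                    // start low; raise only while throughput still grows
  duration: '5m',              // a plain duration instead of ramping stages
};

module.exports.default = function () {
  http.get('http://test-server.local/100k.bin');
};
```

Run it with something like `k6 run --compatibility-mode=base script.js`; the file name is just an example, the flag is how the base (ES5.1) compatibility mode is selected.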
The team discussed this during our backlog grooming, and we decided we would likely not pursue addressing this (ourselves, anyway). If you encounter this issue and have the interest and capacity to address it, please feel free to reopen and discuss it further. 🙇‍♂️
I noticed that when I use a powerful machine to stress another two servers, the request rate fluctuates periodically, with two CPU cores pinned at 100%:
CPU% at the high point of the cycle: (screenshot)
CPU% at the low point of the cycle: (screenshot)