Incorrect current user count after user stops in distributed mode #2906

Open · CharleneHu-42 opened this issue Sep 18, 2024 · 3 comments

Labels: bug, stale

@CharleneHu-42

Description

I have a test scenario where a custom load shape generates a random number of new users at each tick, and each user stops itself after receiving the response to an HTTP GET request.

We observed that when we use distributed load generation by setting the --processes parameter when starting locust, and users stop between two ticks, the get_current_user_count() method sometimes fails to return the correct user count: some stopped users are still counted by get_current_user_count() in subsequent ticks. This issue does not occur with the local runner (without --processes).

See below logs:

[2024-09-18 16:35:56,625] host/DEBUG/test_load_shape: Current users: 4, New users: 0, Target users: 4
[2024-09-18 16:35:57,075] host/DEBUG/urllib3.connectionpool: http://host-ip:80 "GET /sleep/5 HTTP/1.1" 200 None
[2024-09-18 16:35:57,075] host/INFO/root: Stopping current user: <test_load_shape.MyUser object at 0x7f362c904310>
[2024-09-18 16:35:57,626] host/DEBUG/test_load_shape: Current users: 4, New users: 0, Target users: 4
[2024-09-18 16:35:58,628] host/DEBUG/test_load_shape: Current users: 4, New users: 2, Target users: 6

One user has received the response and stopped, so the correct user count should be 3, but in the subsequent ticks the count returned from get_current_user_count() still remains at 4.
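For anyone digging into this, here is a rough, untested sketch of extra debug logging that could go inside tick() to show the per-worker counts the master currently believes in. It assumes the shape runs on a MasterRunner that keeps a clients mapping of worker nodes carrying the user counts each worker last reported; the attribute names are taken from my reading of the Locust source, so treat them as an assumption rather than a documented API:

        # Hypothetical extra debug logging inside tick(); assumes a MasterRunner
        # whose `clients` mapping holds worker nodes with the user counts they
        # last reported to the master (attribute names are an assumption).
        if hasattr(self.runner, "clients"):
            for worker in self.runner.clients.values():
                logger.debug(
                    "Worker %s last reported user_classes_count: %s",
                    worker.id,
                    worker.user_classes_count,
                )

If those last-reported counts lag behind what the workers are actually running, that would explain why get_current_user_count() keeps returning 4 after a user has already stopped itself.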

Command line

locust -f test_load_shape.py -H http:///sleep/5 --headless --process 1 --loglevel DEBUG -s 120

Locustfile contents

import numpy as np
import logging
from locust import HttpUser, task, LoadTestShape

logger = logging.getLogger(__name__)

class PoissonLoadShape(LoadTestShape):
    use_common_options = True

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.arrival_rate = 0.4

    def tick(self):
        new_users = np.random.poisson(lam=self.arrival_rate)
        current_users = self.get_current_user_count()
        user_count = current_users + new_users
        logger.debug(
            "Current users: {current_users}, New users: {new_users}, Target users: {target_users}".format(
                current_users=current_users, new_users=new_users, target_users=user_count
            )
        )
        # Avoid illegal spawn_rate value of 0
        spawn_rate = max(0.01, new_users)
        return (user_count, spawn_rate)

class MyUser(HttpUser):
    
    @task
    def my_task(self):
        self.client.get("")
        logging.info("Stopping current user: {user_id}".format(user_id=self))
        self.stop(force=True)

Python version

3.11.9

Locust version

2.31.4

Operating system

Ubuntu 22.04.1

@cyberw (Collaborator) commented Sep 18, 2024

Hi!

Users stopping themselves is not really a supported scenario. I think it should be, but right now it isn't.

I'm not sure if this is an easy or hard fix, but either way I don't have time to fix it myself so someone would have to volunteer.

@AdityaS8804

Hey,
It seems that the issue is related to the get_current_user_count() method not returning the correct number of active users when running Locust with distributed processes. The problem occurs when users stop between ticks, and the count doesn't get updated across processes correctly, leading to stale data.

One potential fix could involve synchronizing the user count across all processes before the tick() method uses it for decision-making. By ensuring that the user count is up-to-date across distributed workers, we can avoid counting stopped users in subsequent ticks. Additionally, making sure that users are cleanly stopped and immediately removed from the active count will help prevent race conditions.
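For illustration, here is a rough, untested sketch of that synchronization idea using Locust's cross-node messaging API (runner.register_message / runner.send_message). The message type "user_self_stopped", the module-level counter, and the way the shape consumes it are made up for this sketch, not existing Locust behaviour:

from locust import events
from locust.runners import MasterRunner

# Master-side tally of users that reported stopping themselves.
self_stopped_users = 0

def on_user_self_stopped(environment, msg, **kwargs):
    # Runs on the master whenever a worker reports a self-stopped user.
    global self_stopped_users
    self_stopped_users += msg.data

@events.init.add_listener
def on_locust_init(environment, **_kwargs):
    if isinstance(environment.runner, MasterRunner):
        environment.runner.register_message("user_self_stopped", on_user_self_stopped)

Inside MyUser.my_task, just before self.stop(force=True), the worker side would send:

        self.environment.runner.send_message("user_self_stopped", 1)

and the load shape on the master could subtract that tally from get_current_user_count() when computing the next target. This only illustrates the direction; making sure stopped users are not subtracted twice once the workers' reported counts catch up is exactly the tricky part.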

I am interested in contributing to this issue and I'd appreciate any thoughts or feedback on this approach!


This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions bot added the stale label on Dec 22, 2024