Jobs randomly changing position only if throttle is used #86

Open
feliperaul opened this issue Jul 9, 2020 · 4 comments
Labels
enhancement help wanted Contributions are highly appreciated

Comments

@feliperaul

Ruby version: 2.6.5
Sidekiq / Pro / Enterprise version(s): 6.0.7

Initializer:

:concurrency: 25
:queues:
  - mailers
  - default
  - importers
  - searchkick
  - other

We are currently inspecting a queue that is taking a long time to run ("importers").

On investigating, we noticed a very unusual behavior that we think is a bug and might be causing performance degradation and/or unexpected behavior.

If we activate "Live poll" and watch the queue in the Web UI, the jobs appear in a completely different order every 2 seconds, even though no new jobs have been added and none have been processed (these jobs take a few minutes each, and the queue only receives jobs manually, so it should be very stable during processing).

Using the Rails console to inspect it further, we confirmed that on every poll (a few seconds apart) we get a differently ordered array of job jids, even though the set of job IDs stays the same: only the order changes, so no jobs are being added or removed.

We're using this debug code to confirm the order is changing very rapidly:

def debug
  queue = Sidekiq::Queue.new("importers")
  jids = queue.map {|job| job.jid}; p "I have #{jids.size} jobs and my first one id is #{jids[0]}"
  sleep 1
  jids2 = queue.map {|job| job.jid}; p "I have #{jids2.size} jobs and my first one id is #{jids2[0]}"
  p "This is jids2 - jids"
  p jids2 - jids
  p "This is jids - jids2:"
  p jids - jids2
  p "Are they the same?"
  p jids == jids2
  p "But what if I order them?"
  p jids.sort == jids2.sort
end

This is the result:

(main)> debug
"I have 151 jobs and my first one id is 20480233f53e0b8d62231615"
"I have 151 jobs and my first one id is 0dc7e77f17c1f6b3fc6e59a8"
"This is jids2 - jids"
[]
"This is jids - jids2:"
[]
"Are they the same?"
false
"But what if I order them?"
true

AFAIK, Sidekiq guarantees that jobs are fetched in order (even though, of course, it's an async job queue, so there's no guarantee they will finish in the order they were added), so we think this is a bug.

@feliperaul
Author

I just stumbled upon #52

So this appears to be by design :)

Closing.

@ixti
Owner

ixti commented Jul 22, 2020

@feliperaul can you point me to where Sidekiq guarantees that jobs are going to be fetched in order? Because if that's the case, we need to think about how to make sure we follow that.

@feliperaul
Author

Hi @ixti, sure.

It's on the FAQ, here: https://github.com/mperham/sidekiq/wiki/FAQ#how-can-i-process-a-certain-queue-in-serial

It reads:

How can I process a certain queue in serial?

You can't, by design. Sidekiq is designed for asynchronous processing of jobs that can be completed in isolation and independent of each other. Jobs will be popped off of Redis in the order in which they were pushed but there's no guarantee that Job #1 will execute fully before Job #2 is started.

If you need serial execution, you should look into other systems which give those types of guarantees.

To be quite honest, it hasn't been a big deal for us, but if #52 states that the job queue, when throttled, is paused from polling for two seconds, I think that it would be much better to push the jobs back to the top of the queue, instead of pushing them back to the end of the line.

Imagine a huge e-mail sending queue (say, 100,000 jobs) that you are throttling to use only 5 workers. If throttled jobs are pushed to the END of the queue, they take much, much longer to send than if they were pushed back to the top of the queue to wait for the next poll window (2 seconds).
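To make the difference concrete, here is a minimal sketch in plain Ruby (no Redis, no Sidekiq). It assumes a simplified model: the queue is an Array whose first element is the next job to be fetched, and every fetched job is throttled and pushed straight back. The `requeue_throttled` helper and its `to:` option are hypothetical names used only for this illustration; they are not part of Sidekiq or sidekiq-throttled.

```ruby
# Hypothetical model: index 0 is the job a worker would fetch next.
# Every fetched job is treated as throttled and pushed back right away.
def requeue_throttled(queue, fetches:, to:)
  queue = queue.dup # leave the caller's array untouched
  fetches.times do
    job = queue.shift # a worker fetches the next job
    case to
    when :tail then queue.push(job)    # back of the line
    when :head then queue.unshift(job) # front of the line
    else raise ArgumentError, "unknown strategy: #{to.inspect}"
    end
  end
  queue
end

jobs = %w[a b c d e]

tail_order = requeue_throttled(jobs, fetches: 3, to: :tail)
head_order = requeue_throttled(jobs, fetches: 3, to: :head)

p tail_order # => ["d", "e", "a", "b", "c"]
p head_order # => ["a", "b", "c", "d", "e"]
```

Under this model, pushing to the tail rotates the queue on every throttled fetch, which matches the symptom in the debug output above (same jids, different first element), while pushing to the head keeps the original order. With several worker threads popping and pushing concurrently, the real reordering may not be a clean rotation.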

@feliperaul feliperaul reopened this Jul 22, 2020
@ixti
Owner

ixti commented May 30, 2023

I think there's no one-size-fits-all solution. But we can make throttling as customizable as possible:

  • allowing throttling per queue rather than per class, which can easily guarantee FIFO
  • allowing configuration of how throttled jobs are pushed back: to the head or to the tail (as it is now)

I'll be happy to review and merge any improvements.
