Allow a slave to "fail" a piece of work #7

mvr · 2015-07-31T00:55:36Z

In Kesha, some of the zones might be way harder than others. If a zone takes too long, we want the slave to "fail" it and send that piece of work to the end of the queue to be tried again later. This way we can get most of the zones done early.

jamiecook · 2015-07-31T01:06:29Z

why would we want to repeat the work?

mvr · 2015-07-31T03:53:01Z

The idea is we would kill a particular bit of work after some preset timeout, then put it at the end of the queue to be done for real after everything else.
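
Roughly, the queue behaviour would look like the sketch below. This is only an illustration, not code from the project: `run_queue`, the `cost` table and the zone names are made up, and the timeout is simulated by comparing a known cost against the limit rather than by actually killing a running job.

```python
from collections import deque


def run_queue(zones, cost, timeout):
    """Return the order in which zones complete.

    A zone whose (hypothetical) cost exceeds `timeout` is "failed" once,
    moved to the back of the queue, and retried without a limit after
    everything else has had its turn.
    """
    # Each entry is (zone, limited); limited=True means the timeout applies.
    queue = deque((zone, True) for zone in zones)
    completed = []
    while queue:
        zone, limited = queue.popleft()
        if limited and cost[zone] > timeout:
            # Simulated timeout: give up on this zone and requeue it at the end.
            queue.append((zone, False))
            continue
        completed.append(zone)
    return completed


order = run_queue(["a", "b", "c", "d"], {"a": 1, "b": 50, "c": 2, "d": 3}, timeout=10)
print(order)  # ['a', 'c', 'd', 'b']
```

The slow zone `b` ends up last, so the three quick zones finish early instead of waiting behind it.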

jamiecook · 2015-07-31T05:16:27Z

To what end? Are we streaming the results back so that partial results are useful?

mvr · 2015-07-31T06:21:10Z

Yeah, results get streamed back. The Ruby code around Kesha does some post-processing that can happen as soon as each bit of work is done by a slave. Also, if for some reason one of the slaves hangs, we don't want it to be useless for the rest of a run.

jamiecook · 2015-07-31T06:24:56Z

Cool - just wanted to know the motivation
