
Request for Comment on setting a batch queue maximum wallclock and a new "long" queue #407

Open · tatarsky opened this issue May 3, 2016 · 9 comments

@tatarsky (Contributor) commented May 3, 2016

We are attempting to let the batch queue use a set of nodes purchased by a specific group when those nodes are idle. However, concerns have been raised that the batch queue currently has no walltime limit, so long-running jobs could be scheduled on those nodes and interfere with the group's needs.

Setting a walltime limit on batch has been discussed off and on over the years, but per discussions with @juanperin I wanted to ask for comments here.

We have a few options, which I'll try to explain below; if folks have opinions, please comment so we can make sure we understand all possible impacts. Our goal is maximizing node usage, and we will make no changes without discussion.

Modification to batch queue maximum walltime

  1. We would set a wallclock limit on batch as a way of preventing these additional nodes from being tied up for long durations. For example, four days (96 hours) could become the new batch maximum walltime (a rough sketch of the setting follows this list).
  2. I do not believe I can, or would want to, set a "per node" walltime limit; I feel that would be very confusing. Walltime should be a queue-level configuration, and all nodes in a queue should support the same limits.
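
For concreteness, the Torque side of option 1 is roughly a couple of qmgr settings (a sketch only; the 96-hour cap is the example figure above and the 24-hour default is an arbitrary placeholder):

```
# Cap the walltime a batch job may request (example figure only):
qmgr -c "set queue batch resources_max.walltime = 96:00:00"
# Optionally give jobs that omit a walltime a default, so they do not
# simply inherit the maximum (placeholder value):
qmgr -c "set queue batch resources_default.walltime = 24:00:00"
```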

Creation of a queue for longer running jobs

  1. Longer jobs would be allowed via a queue with a limited per-node slot count (perhaps 12). Its access to nodes would be controlled by a different Torque attribute than batch, allowing us to adjust its pool of nodes based on monitored demand (a rough sketch follows this list).
  2. The longer queue could have a walltime limit representing some rational job length decided upon here. I will suggest "42 days" based on some of the 1000-hour jobs I've seen lately, but I defer to comments on the job mix people find useful.
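
A sketch of what option 2 might look like on the Torque side (the queue name, node name, limits, and the "long" node property are placeholders for discussion, not a decided configuration; the per-node slot cap is not shown and would likely be handled on the scheduler side):

```
# Create the longer-running queue (placeholder name and limits):
qmgr -c "create queue long queue_type = Execution"
qmgr -c "set queue long resources_max.walltime = 1008:00:00"   # 42 days
qmgr -c "set queue long enabled = true"
qmgr -c "set queue long started = true"
# Tie the queue to a node property distinct from batch so its pool can be
# adjusted independently of the batch pool:
qmgr -c "set queue long resources_default.neednodes = long"
# Tag or untag nodes to grow/shrink that pool based on monitored demand
# ("gpu-1-10" is a placeholder node name):
qmgr -c "set node gpu-1-10 properties += long"
```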

Thank you for any opinions you might have.

@jchodera (Member) commented May 3, 2016

I wonder if there is a simpler way. Is there a way to restrict jobs that run on the nodes from this mysterious "specific group" to only those with a walltime limit set to, say, <24h?

That way, only jobs that are guaranteed to complete in a reasonable amount of time are ever run on the "specific group" hardware, and no special queues are needed---it is transparent to the submitter. The "specific group" can have a special queue with no such limits.

@akahles commented May 3, 2016

I think having a separate queue for long-running jobs is a good idea. One thing I have seen implemented in other systems I use is automatic queue assignment based on walltime, so to the end user everything would more or less stay the same: the default queue stays batch, but a job exceeding a 96h walltime requirement is automatically scheduled to a different queue, e.g. batch_long. A subset of nodes could be excluded from this queue, which essentially means that if I have a very long job I might need to wait a little longer until it gets scheduled, which I personally find OK.
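
For illustration, nothing would need to change from the submitter's side; roughly (the batch_long name and the 96h cut-off are just the numbers from this thread):

```
qsub -l walltime=48:00:00  short_job.sh   # stays in batch, as today
qsub -l walltime=200:00:00 long_job.sh    # would be placed in e.g. batch_long automatically
```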

@tatarsky (Contributor, Author) commented May 3, 2016

The group isn't mysterious; I just don't expose names. It's the two groups that purchased nodes for their queues that were added over the shutdown. If you pop into Slack I can elaborate.

Your suggestion to only run jobs with a lower walltime will be added to the possible options. To be honest, I do not know whether I can do that, but I will look at the Moab config options.
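
One thing I will check (a completely unverified sketch; the reservation name, host list, group name, and especially whether MAXTIME behaves this way on our Moab version all need confirming against the docs) is whether a standing reservation over the group's nodes could admit outside jobs only below a walltime threshold:

```
# moab.cfg sketch, unverified:
SRCFG[grpnodes] HOSTLIST=gpu-3-1,gpu-3-2    # placeholder node list
SRCFG[grpnodes] PERIOD=INFINITY
SRCFG[grpnodes] GROUPLIST=ownergroup        # placeholder owning group
SRCFG[grpnodes] MAXTIME=24:00:00            # intent: only short outside jobs land on these nodes
```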

@jchodera (Member) commented May 3, 2016

@akahles's idea of "automatic queue assignment based on walltime" may be another way to implement my suggestion.

@tatarsky (Contributor, Author) commented May 3, 2016

@akahles I will look into whether that is possible, so people do not have to select the longer queue themselves. I believe that would be handled by Torque, as it does the initial queue placement.

@tatarsky (Contributor, Author) commented May 3, 2016

It may also be possible to do this via a submit filter.
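
For reference, a submit filter is just a script that qsub feeds the job script to on stdin (configured via SUBMITFILTER in torque.cfg, if memory serves). A hypothetical sketch, with placeholder threshold and message; note a walltime given on the qsub command line rather than as a #PBS directive would need separate handling:

```bash
#!/bin/bash
# Hypothetical submit filter sketch: stdout becomes the script actually
# submitted, and a non-zero exit rejects the job. A real filter would also
# skip jobs already destined for the long queue.
max_batch_secs=$((96 * 3600))   # placeholder 96-hour batch limit

status=0
while IFS= read -r line; do
  echo "$line"
  if [[ "$line" =~ ^#PBS.*walltime=([0-9]+):([0-9]{2}):([0-9]{2}) ]]; then
    secs=$(( 10#${BASH_REMATCH[1]} * 3600 + 10#${BASH_REMATCH[2]} * 60 + 10#${BASH_REMATCH[3]} ))
    if (( secs > max_batch_secs )); then
      echo "submit filter: requested walltime exceeds the batch limit; please use the long queue" >&2
      status=1
    fi
  fi
done
exit $status
```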

@callosciurus commented

+1 for limiting the walltime of jobs run on group-specific nodes. Segmentation into more queues ultimately penalizes certain types of jobs and also creates inefficiencies, as @akahles has pointed out. That would be OK if walltime were always very predictable, but in many instances it is not.

@tatarsky (Contributor, Author) commented May 3, 2016

Note to myself: the concept of a routing queue is supported by Torque. However, I am still trying to determine whether walltime is a supported routing attribute.
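
If it is, the shape would be roughly the following (an unverified sketch; "submit", "batch96", and "batch_long" are placeholder names, and whether routing actually tests walltime is exactly the open question):

```
# Users submit to a routing queue, which hands jobs to the first
# destination whose resource limits they satisfy:
qmgr -c "create queue submit queue_type = Route"
qmgr -c "set queue submit route_destinations = batch96"
qmgr -c "set queue submit route_destinations += batch_long"
qmgr -c "set queue submit enabled = true"
qmgr -c "set queue submit started = true"

# Destination limits that would (hopefully) drive the routing decision:
qmgr -c "set queue batch96 resources_max.walltime = 96:00:00"
qmgr -c "set queue batch_long resources_min.walltime = 96:00:01"
qmgr -c "set queue batch_long resources_max.walltime = 1008:00:00"
```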

@lzamparo commented May 5, 2016

+1 to @akahles's suggestion of assigning queues automatically based on the walltime requested in the submission script.

I am also in favour of a maximum wallclock time on the batch queue (say 48 or 96 hrs) in order to decrease queue latency, even if it means we may all have to make more use of checkpointing for our longer-running jobs.
