generating slurm.conf values #34

Open
vphan13 opened this issue May 28, 2023 · 3 comments

vphan13 commented May 28, 2023

I'm a bit of an Ansible noob here, but when generating the slurm.conf file, instead of hard-coded values like these:
slurm_nodes:
  - name: "{{ headnode }}"
    CoresPerSocket: "6"
    CPUs: "12"
    Gres: "gpu:p620:1"
    NodeAddr: "{{ headnode }}"
    RealMemory: "31846"
    Sockets: "1"
    ThreadsPerCore: "2"
    Feature: "gpu,intel,ht"
    State: "UNKNOWN"

Is it possible to get the values from ansible_facts instead? Something along the lines of:

slurm_nodes:
  - name: "{{ headnode }}"
    CoresPerSocket: "{{ ansible_facts['ansible_processor_cores'] }}"
    CPUs: "{{ ansible_facts['ansible_processor_vcpu'] }}"
    Gres: "gpu:p620:1"
    NodeAddr: "{{ headnode }}"
    RealMemory: {{ ansible_facts['ansible_memory_mb.real.total'] }}
    Sockets: "1"
    ThreadsPerCore: "{{ ansible_facts['ansible_processor_threads_per_core'] }}"
    Feature: "gpu,intel,ht"
    State: "UNKNOWN"
@jakob1379

Yes, it's possible to use values from ansible_facts to populate your slurm.conf dynamically. Your template is close, but there are a few issues to be aware of:

Ensure Ansible facts are collected: ansible_facts is only populated once facts have been gathered. Ansible gathers facts for the hosts in a play by default, but this behavior can be disabled. If you're not seeing the facts you expect, check the gather_facts setting in your playbook.

Fact key names: Inside the ansible_facts dictionary the keys do not carry the ansible_ prefix, so the vCPU count is ansible_facts['processor_vcpus'] (the same value is also exposed as the top-level variable ansible_processor_vcpus when fact injection is enabled). Nested values need one lookup per level: ansible_facts['memory_mb']['real']['total'], not ansible_facts['ansible_memory_mb.real.total'].

Variable existence: Not every fact exists on every system. It's good practice to supply a default value for the case where a fact is missing. You can do this with the default filter, like so: {{ ansible_facts['processor_vcpus'] | default(1) }}.

Value types and quoting: ansible_facts['memory_mb']['real']['total'] is a number, not a string. In your example you left this value unquoted, but a YAML value that begins with {{ must be quoted, otherwise the YAML parser reads the braces as the start of a mapping. If the consumer expects a string, you can make the conversion explicit with the string filter, like so: {{ ansible_facts['memory_mb']['real']['total'] | string }}.

Here's your example with the modifications:

slurm_nodes:
  - name: "{{ headnode }}"
    CoresPerSocket: "{{ ansible_facts['processor_cores'] | default(1) | string }}"
    CPUs: "{{ ansible_facts['processor_vcpus'] | default(1) | string }}"
    Gres: "gpu:p620:1"
    NodeAddr: "{{ headnode }}"
    RealMemory: "{{ ansible_facts['memory_mb']['real']['total'] | default(1024) | string }}"
    Sockets: "1"
    ThreadsPerCore: "{{ ansible_facts['processor_threads_per_core'] | default(1) | string }}"
    Feature: "gpu,intel,ht"
    State: "UNKNOWN"

Replace default(1) and default(1024) with the fallback values that make sense for your use case.
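
Before wiring facts into slurm_nodes, it can help to print the ones that matter and confirm they exist on the target. The playbook below is only a sketch: the `hosts: all` pattern and the labels inside `msg` are assumptions made for illustration, and the fact keys are the standard ones Ansible's setup module reports on Linux hosts.

```yaml
# Sketch: print the CPU and memory facts that map onto slurm.conf node fields.
# "hosts: all" is an assumption; narrow it to your own inventory group.
- name: Inspect facts relevant to slurm.conf
  hosts: all
  gather_facts: true   # explicit, in case fact gathering is disabled elsewhere
  tasks:
    - name: Show CPU topology and memory as Ansible sees them
      ansible.builtin.debug:
        msg:
          sockets: "{{ ansible_facts['processor_count'] }}"
          cores_per_socket: "{{ ansible_facts['processor_cores'] }}"
          threads_per_core: "{{ ansible_facts['processor_threads_per_core'] }}"
          cpus: "{{ ansible_facts['processor_vcpus'] }}"
          real_memory_mb: "{{ ansible_facts['memory_mb']['real']['total'] }}"
```

If one of these keys is missing on a host, the task fails with an undefined-variable error for that host, which points at the fact that needs a default.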

vphan13 commented Jun 2, 2023

Thanks for the detailed reply. Your example didn't work for me, although I'm pretty sure I have gather_facts set to true somewhere. I think I have enough info to figure it out. I will post what worked for me in case others have the same question.

vphan13 commented Sep 22, 2023

Here is a working configuration that queries the values for every node using ansible facts. For large clusters, this is still a manual process since we'd still need to create this stanza for every node in the cluster. It would be nice to be able to loop through the host members of a group to generate the slurm.conf values.

- name: "nodec"
  NodeAddr: "nodec"
  CPUs: "{{ hostvars['nodec']['ansible_facts']['processor_vcpus'] }}"
  RealMemory: "{{ hostvars['nodec']['ansible_memory_mb']['real']['total'] }}"
  Sockets: "{{ hostvars['nodec']['ansible_processor_count'] }}"
  CoresPerSocket: "{{ hostvars['nodec']['ansible_processor_cores'] }}"
  ThreadsPerCore: "2"
  State: "UNKNOWN"

Edit: Never mind, I saw the example in the README. I believe the following should work:

NodeAddr: "node[1-10][a-d]"
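
A range expression covers nodes that share the same hardware values. Where nodes differ, the loop over a group mentioned earlier could be sketched roughly as follows. This is an assumption-laden sketch, not something taken from the role's README: the group name `compute` is hypothetical, facts for those hosts must already have been gathered earlier in the run (or be cached), and slurm_nodes must not already be defined elsewhere, since the task appends to it.

```yaml
# Sketch: build one slurm_nodes entry per member of an inventory group.
# "compute" is a hypothetical group name; replace it with your own.
- name: Build slurm_nodes from gathered facts
  ansible.builtin.set_fact:
    slurm_nodes: "{{ slurm_nodes | default([]) + [node] }}"
  vars:
    # Facts of the host being iterated over, not of the host running the task.
    node_facts: "{{ hostvars[item]['ansible_facts'] }}"
    node:
      name: "{{ item }}"
      NodeAddr: "{{ item }}"
      CPUs: "{{ node_facts['processor_vcpus'] }}"
      RealMemory: "{{ node_facts['memory_mb']['real']['total'] }}"
      Sockets: "{{ node_facts['processor_count'] }}"
      CoresPerSocket: "{{ node_facts['processor_cores'] }}"
      ThreadsPerCore: "{{ node_facts['processor_threads_per_core'] }}"
      State: "UNKNOWN"
  loop: "{{ groups['compute'] }}"
```

Reading facts through hostvars[item] matters here: a bare ansible_facts lookup returns the facts of the host the task runs on, which would give every node the head node's values.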
