Starting with SLURM

by Sebastien Mirolo on Sun, 1 Jul 2012

While condor focuses on bringing heterogeneous machines into a compute cloud, the Simple Linux Utility for Resource Management (SLURM) deals mostly with homogeneous clusters. Condor assumes that machines are not dedicated to the cluster and are only available when no one is sitting in front of them. SLURM was made for dedicated High-Performance Computing clusters. As a result, SLURM's daemon architecture is a lot simpler than condor's.

For some reason, SLURM does not come packaged in Fedora 17. It is straightforward, though, to build the RPM packages by following the quick start instructions.

$ yum install readline-devel openssl-devel \
    munge-devel pam-devel perl-ExtUtils-MakeMaker
# Because we want accounting
$ yum install mysql-server mysql-devel mysql
$ QA_RPATHS=$[ 0x0001|0x004 ] rpmbuild -ta slurm*.tar.bz2

That creates a lot of packages, most of which need to be installed at some point or another.

$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-plugins-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-munge-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-2.3.5-1.fc17.x86_64.rpm
$ adduser slurm

Deploying on a single machine

We will first try to run the whole thing on a single node to figure out the different configuration options.

$ cp /etc/slurm/slurm.conf.example /etc/slurm/slurm.conf
$ diff -u prev /etc/slurm/slurm.conf
-ControlMachine=linux0
-#ControlAddr=
+ControlMachine=fedora17
+ControlAddr=127.0.0.1
...
-NodeName=linux[1-32] Procs=1 State=UNKNOWN
-PartitionName=debug Nodes=linux[1-32] Default=YES MaxTime=INFINITE State=UP
+NodeName=localhost Procs=1 State=UNKNOWN
+PartitionName=debug Nodes=localhost Default=YES MaxTime=INFINITE State=UP

You will also want to specify a directory where all state information goes. That comes in very handy when you need to delete the state files and start again.

-StateSaveLocation=/tmp
+StateSaveLocation=/tmp/slurm
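
The state directory has to exist and be writable by the slurm user. Starting over from a clean slate then boils down to something like the following sketch (paths as configured above; -c tells slurmctld to ignore any previously saved state):

$ sudo mkdir -p /tmp/slurm
$ sudo chown slurm /tmp/slurm
# Later on, to start again from scratch:
$ sudo rm -rf /tmp/slurm/*
$ sudo -u slurm slurmctld -c -D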

Then create a munge secret key.

$ sudo sh -c 'dd if=/dev/urandom bs=1 count=1024 >/etc/munge/munge.key'
$ sudo chown munge /etc/munge/munge.key
$ sudo chmod 0400 /etc/munge/munge.key
$ sudo mkdir -p /var/run/munge
$ sudo chown munge /var/run/munge
$ sudo service munged start
$ sudo -u slurm slurmctld -D -vvvvvv
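
If the controller refuses connections, a quick sanity check is to make sure munge itself can encode and decode a credential locally (both tools ship with the munge package):

$ munge -n | unmunge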

Then start the daemon that accepts jobs.

$ sudo slurmd -D -vvvvv

Some commands to check that everything is running properly.

$ srun -N1 /bin/hostname
$ scontrol
[scontrol]$ show job
[scontrol]$ quit
$ sudo -u slurm scontrol shutdown

So far so good, both daemons are running and responding. We don't want to stop there though. SLURM has a great feature in its accounting plug-in and we intend to use it. Let's install a few more packages and configure slurmdbd.

$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-sql-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-slurmdbd-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-perlapi-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-slurmdb-direct-2.3.5-1.fc17.x86_64.rpm
$ cp /etc/slurm/slurmdbd.conf.example /etc/slurm/slurmdbd.conf
$ diff -u prev /etc/slurm/slurm.conf
-#AccountingStorageType=accounting_storage/slurmdbd
+AccountingStorageType=accounting_storage/slurmdbd
+# The documentation says 'associations' is the default
+# but that does not seem to be the case.
+AccountingStorageEnforce=associations
# Setting up the accounting database
$ service mysqld start
$ mysql -u root -e "GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost' identified by 'password' with grant option;"
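
The copied slurmdbd.conf also needs to point at that database. The relevant lines should end up looking something like this sketch (the values are assumptions to adjust; StoragePass must match the GRANT above):

AuthType=auth/munge
DbdHost=localhost
SlurmUser=slurm
StorageType=accounting_storage/mysql
StorageHost=localhost
StorageUser=slurm
StoragePass=password
StorageLoc=slurm_acct_db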

# populate the accounting database
$ /usr/sbin/slurmdbd -D -vvvv
# Add the cluster (name as specified by ClusterName in /etc/slurm/slurm.conf)
$ sacctmgr add cluster linux
# Add billable account
$ sacctmgr add account none,test Cluster=linux \
  Description="none" Organization="none"
# Add user
$ sacctmgr add user steel DefaultAccount=test
$ mysql -u slurm -p slurm_acct_db
> show tables;
...
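
With AccountingStorageEnforce=associations, jobs submitted by a user without an association on the cluster get rejected, so it is worth double-checking what was just created:

$ sacctmgr show associations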

Here are a few commands to check that everything works correctly.

$ srun -N1 /bin/hostname
$ sacct -a
$ sreport User TopUsage Account start=2012-01-01 end=2012-12-31
$ sinfo -R -l

That's it for a quick run-through. It is now time to deploy SLURM on a multiple-machine configuration.

Deploying on a cluster

SLURM is straightforward to deploy on a cluster of machines but there are three caveats to think about first.

  • The configuration file (/etc/slurm/slurm.conf) needs to be identical on each machine.
  • The result of the hostname command determines which daemons (slurmctld/slurmd) are started on the machine by service slurm start.
  • srun will open ports back on the slurmctld machine, while sbatch will open ports on the slurmd machine (potentially through eth0). The relevant port settings are shown together with the firewall rules below.
# Finding out which daemons will run on a machine
$ scontrol show daemons

# Configuring a cluster with two nodes
(slurmctld)$ diff -u prev /etc/sysconfig/network
-HOSTNAME=fedora.localdomain
+HOSTNAME=linux0.localdomain

(slurmd)$ diff -u prev /etc/sysconfig/network
-HOSTNAME=fedora.localdomain
+HOSTNAME=linux1.localdomain

$ diff -u prev /etc/slurm/slurm.conf
-ControlMachine=fedora
-ControlAddr=127.0.0.1
+ControlMachine=linux0
+ControlAddr=192.168.144.16
...
-NodeName=localhost Procs=1 State=UNKNOWN
-PartitionName=debug Nodes=localhost Default=YES MaxTime=INFINITE State=UP
+NodeName=linux1 NodeAddr=192.168.144.17 Procs=1 State=UNKNOWN
+PartitionName=debug Nodes=linux1 Default=YES MaxTime=INFINITE State=UP
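
Since the configuration file must be identical on every machine (first caveat above), the simplest approach is to copy it from the controller to each compute node, for instance:

$ scp /etc/slurm/slurm.conf linux1:/etc/slurm/slurm.conf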

$ diff prev /etc/sysconfig/iptables
 -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
 -A INPUT -s 127.0.0.1 -j ACCEPT
+-A INPUT -s 192.168.144.17 -j ACCEPT
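
Opening everything coming from the compute node works, but the rule can also be narrowed down to the SLURM ports. Those can be made explicit in slurm.conf; 6817 and 6818 are the stock defaults for slurmctld and slurmd respectively:

SlurmctldPort=6817
SlurmdPort=6818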

$ cat myscript
#!/bin/sh
#SBATCH --time=1
srun hostname | sort
$ sbatch -N1 myscript
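
To follow the job and read its output afterwards (by default sbatch writes the output to a slurm-<jobid>.out file in the submission directory):

$ squeue
$ cat slurm-*.out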

Moving towards the goal of deploying SLURM on a cloud infrastructure, I realized we will most likely have to upgrade to version 2.4+ at some point (see here).
