Starting with SLURM
by Sebastien Mirolo on Sun, 1 Jul 2012

While condor focuses on bringing heterogeneous machines into a compute cloud, the Simple Linux Utility for Resource Management (SLURM) deals mostly with homogeneous clusters. Condor assumes that machines are not dedicated to a cluster and are only available when no one is sitting in front of them. SLURM was made for dedicated High-Performance Computing clusters. As a result, SLURM's daemon architecture is a lot simpler than condor's.
For some reason, SLURM doesn't come packaged in Fedora 17. It is, though, straightforward to build the RPM packages following the starting instructions.
$ yum install readline-devel openssl-devel \
    munge-devel pam-devel perl-ExtUtils-MakeMaker
# Because we want accounting
$ yum install mysql-server mysql-devel mysql
$ QA_RPATHS=$[ 0x0001|0x004 ] rpmbuild -ta slurm*.tar.bz2
That creates a lot of packages, most of which need to be installed at some point or another.
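To get a feel for what the build produced before installing anything, listing the RPMS directory is the quickest check. The listing below is only a sketch showing the packages used later in this post; your build will produce a few more and the exact version strings may differ.

$ ls ~/rpmbuild/RPMS/x86_64/
slurm-2.3.5-1.fc17.x86_64.rpm
slurm-munge-2.3.5-1.fc17.x86_64.rpm
slurm-perlapi-2.3.5-1.fc17.x86_64.rpm
slurm-plugins-2.3.5-1.fc17.x86_64.rpm
slurm-slurmdb-direct-2.3.5-1.fc17.x86_64.rpm
slurm-slurmdbd-2.3.5-1.fc17.x86_64.rpm
slurm-sql-2.3.5-1.fc17.x86_64.rpm
...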
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-plugins-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-munge-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-2.3.5-1.fc17.x86_64.rpm
$ adduser slurm
Deploying on a single machine
We will first run the whole thing on a single node to figure out the different configuration options.
$ cp /etc/slurm/slurm.conf.example /etc/slurm/slurm.conf
$ diff -u prev /etc/slurm/slurm.conf
-ControlMachine=linux0
-#ControlAddr=
+ControlMachine=fedora17
+ControlAddr=127.0.0.1
...
-NodeName=linux[1-32] Procs=1 State=UNKNOWN
-PartitionName=debug Nodes=linux[1-32] Default=YES MaxTime=INFINITE State=UP
+NodeName=localhost Procs=1 State=UNKNOWN
+PartitionName=debug Nodes=localhost Default=YES MaxTime=INFINITE State=UP
You will also want to specify a directory where all state information goes. That comes in very handy when you need to delete the state files and start again.
-StateSaveLocation=/tmp
+StateSaveLocation=/tmp/slurm
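With the state confined to its own directory, starting from a clean slate is just a matter of stopping the daemons and wiping that directory. A minimal sketch, assuming the /tmp/slurm location configured above:

# stop the daemons first, then wipe the saved state
$ sudo -u slurm scontrol shutdown
$ sudo rm -rf /tmp/slurm
$ sudo -u slurm mkdir -p /tmp/slurm
# then start slurmctld and slurmd again as shown below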
Then create a munge secret key.
$ sudo sh -c 'dd if=/dev/urandom bs=1 count=1024 >/etc/munge/munge.key'
$ chown munge /etc/munge/munge.key
$ chmod 0400 /etc/munge/munge.key
$ sudo mkdir -p /var/run/munge
$ sudo chown munge /var/run/munge
$ sudo service munged start
$ sudo -u slurm slurmctld -D -vvvvvv
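If slurmctld complains about credentials at this point, it is worth checking that munge itself round-trips a credential before digging into SLURM. A quick sanity check using munge's own test commands:

# encode and immediately decode a credential locally
$ munge -n | unmunge
# later, once more nodes exist, the same test across machines
# verifies that the keys match, e.g.: munge -n | ssh linux1 unmunge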
Next, start the daemon accepting jobs.
$ sudo slurmd -D -vvvvv
Here are some commands to check that everything is running properly.
$ srun -N1 /bin/hostname
$ scontrol
[scontrol]$ show job
[scontrol]$ quit
$ sudo -u slurm scontrol shutdown
So far so good: both daemons are running and responding. We don't want to stop there though. SLURM has a great feature in its accounting plug-in, and we intend to use it. Let's install a few more packages and configure slurmdbd.
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-sql-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-slurmdbd-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-perlapi-2.3.5-1.fc17.x86_64.rpm
$ rpm -i ~/rpmbuild/RPMS/x86_64/slurm-slurmdb-direct-2.3.5-1.fc17.x86_64.rpm
$ cp /etc/slurm/slurmdbd.conf.example /etc/slurm/slurmdbd.conf
$ diff -u prev /etc/slurm/slurm.conf
-#AccountingStorageType=accounting_storage/slurmdbd
+AccountingStorageType=accounting_storage/slurmdbd
+# documentation says 'associations' is the default
+# but that does not seem to be the case.
+AccountingStorageEnforce=associations

# Setting up the accounting database
$ service mysqld start
$ mysql -u root -e "GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost' identified by 'password' with grant option;"

# populate the accounting database
$ /usr/sbin/slurmdbd -D -vvvv

# Add cluster (name as specified in /etc/slurm/slurm.conf:ClusterName)
$ sacctmgr add cluster linux
# Add billable account
$ sacctmgr add account none,test Cluster=linux \
    Description="none" Organization="none"
# Add user
$ sacctmgr add user steel DefaultAccount=test

$ mysql -u slurm -p slurm_acct_db
> show tables;
...
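The slurmdbd.conf copied from the example also needs to point at the MySQL database and credentials created above. A minimal sketch of the fields involved, assuming the 'slurm'/'password' credentials from the GRANT statement (double-check against the example file shipped with your version):

# /etc/slurm/slurmdbd.conf (excerpt)
AuthType=auth/munge
DbdHost=localhost
StorageType=accounting_storage/mysql
StorageHost=localhost
StorageUser=slurm
StoragePass=password
StorageLoc=slurm_acct_db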
Here are a few commands to test that everything works correctly.
$ srun -N1 /bin/hostname
$ sacct -a
$ sreport User TopUsage Account start=2012-01-01 end=2012-12-31
$ sinfo -R -l
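Since AccountingStorageEnforce=associations is set, jobs submitted by a user without an association in the database will be rejected, so it is also worth double-checking what slurmdbd knows about. A quick look, assuming the cluster, account and user added above:

$ sacctmgr show associations
$ sacctmgr show user name=steel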
That's it for a quick run-through. It is now time to deploy SLURM in a multiple-machine configuration.
Deploying on a cluster
SLURM is straightforward to deploy on a cluster of machines, but there are three caveats to think about first.
- The configuration file (/etc/slurm/slurm.conf) needs to be identical on each machine (see the sketch after this list for one way to keep it in sync).
- The result of the hostname command will determine which daemons (slurmctld/slurmd) are started on the machine by service slurm start.
- srun will open ports back on the slurmctld machine. sbatch will open ports on the slurmd machine (potentially through eth0).
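As mentioned in the first caveat, here is one way to keep slurm.conf identical across nodes: a minimal sketch that simply pushes the controller's copy around, assuming the linux0/linux1 hostnames used below and root ssh access between the machines.

# run on the controller (linux0) after each change to slurm.conf
$ for host in linux1; do \
    scp /etc/slurm/slurm.conf root@$host:/etc/slurm/slurm.conf; \
  done
# then ask the running daemons to re-read the configuration
$ scontrol reconfigure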
# Finding out which daemons will run on a machine
$ scontrol show daemons

# Configuring a cluster with two nodes
(slurmctld)$ diff -u prev /etc/sysconfig/network
-HOSTNAME=fedora.localdomain
+HOSTNAME=linux0.localdomain
(slurmd)$ diff -u prev /etc/sysconfig/network
-HOSTNAME=fedora.localdomain
+HOSTNAME=linux1.localdomain

$ diff -u prev /etc/slurm/slurm.conf
-ControlMachine=fedora
-ControlAddr=127.0.0.1
+ControlMachine=linux0
+ControlAddr=192.168.144.16
...
-NodeName=localhost Procs=1 State=UNKNOWN
-PartitionName=debug Nodes=localhost Default=YES MaxTime=INFINITE State=UP
+NodeName=linux1 NodeAddr=192.168.144.17 Procs=1 State=UNKNOWN
+PartitionName=debug Nodes=linux1 Default=YES MaxTime=INFINITE State=UP

$ diff prev /etc/sysconfig/iptables
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -s 127.0.0.1 -j ACCEPT
+-A INPUT -s 192.168.144.17 -j ACCEPT

$ cat myscript
#!/bin/sh
#SBATCH --time=1
srun hostname | sort

$ sbatch -N1 myscript
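Once both machines agree on slurm.conf, starting the daemons through the init scripts and checking that the compute node registers makes a reasonable smoke test. A sketch, assuming the linux0/linux1 setup above:

# on each machine; the hostname decides whether slurmctld or slurmd comes up
$ sudo service munged start
$ sudo service slurm start
# on the controller: linux1 should show up as idle in the debug partition
$ sinfo -N -l
$ srun -N1 hostname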
Moving towards the goal of deploying SLURM on a cloud infrastructure, I realized we will most likely have to upgrade to version 2.4+ at some point (see here).