
(EEMCS) HPC Cluster

Introduction

One of the clusters at the University is the (EEMCS) HPC Cluster. This cluster, funded by the DSI research institute (formerly known as CTIT), started in 2017 as a joint operation of several research groups working on deep learning / AI methods. Over the years more groups joined, and the cluster was expanded with several more nodes containing multi-CPU/GPU combinations. The currently participating faculties are BMS, EEMCS, TNW and ET of the University of Twente.

This HPC cluster is a collection of many separate servers (computers), called compute nodes, which are connected via a fast interconnect.

There may be different types of nodes for different types of tasks. The HPC cluster listed on this wiki has

All cluster nodes have the same components as a laptop or desktop: CPU cores, memory and disk space. The difference between a personal computer and a cluster node lies in the quantity, quality and power of these components.

For more information on the list of used hardware for this cluster, see the EEMCS-HPC Hardware page.

This cluster is based on the Slurm scheduler 21.08.5 running on Ubuntu 22.04 LTS.

Login Nodes

You can connect to one of the head nodes: hpc-head1.ewi.utwente.nl or hpc-head2.ewi.utwente.nl.
See the connection info page on how to connect.

Do NOT log in to the compute nodes, either directly or through ssh; log in ONLY to one of the head nodes!

Slurm Scheduler

To monitor the jobs and progress you can use the EEMCS-HPC Slurm dashboard page or the available command line tools like squeue or scontrol.
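For day-to-day use, the standard command line tools can be bundled in a small helper. The sketch below is only a convenience wrapper, not an official cluster tool; run the resulting script on a head node.

```shell
# Write a small wrapper around the usual Slurm monitoring commands.
cat > myjobs.sh <<'EOF'
#!/bin/sh
squeue -u "$USER"            # list my queued and running jobs
if [ -n "$1" ]; then
    scontrol show job "$1"   # full scheduler record for one job ID
fi
EOF
chmod +x myjobs.sh
```

Invoke it as `./myjobs.sh` for an overview, or `./myjobs.sh <jobid>` to inspect a single job.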

To use specific resources please check the EEMCS-HPC features, resources and partitions page.

See the Slurm/HPC scheduler info page for more information.

Maintenance

Upcoming maintenance:

During the maintenance day, the whole cluster will go offline.

Alternatives

For smaller experiments and interactive jobs, please try other resources like:

or external providers like:

Access

Who has access?

To get access, you need an AD account of the University of Twente. All students and employees have such an account, and accounts can be arranged for external persons. To get your AD account enabled for these clusters, you need to contact one of the contact persons.

Partitions

Access to the following partitions is limited to the funders during the first year of investment; these resources can be reached using the group's own partition.

The HPC/SLURM cluster contains multiple common partitions:

Partition name   Available to
main             All (default)
dmb              eemcs-dmb
ram              eemcs-ram
bdsi             bms-bdsi
mia              eemcs-mia
am               eemcs-(dmmp/macs/mast/mia/mms/sor/stat)
mia-pof          eemcs-mia & tnw-pof
students         eemcs-students

* For now the students partition is only for course-related work; BSc and MSc students will have access to the partition of the related research group.

Check the EEMCS-HPC specifics page, partition option, on how to select these.
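A partition is selected at submission time, either on the command line or as a directive in the job script. The sketch below assumes the `main` partition from the table above; the script name and payload are placeholders.

```shell
# Two equivalent ways to pick a partition:
#   1. On the command line:   sbatch --partition=main job.sbatch
#   2. As a directive inside the job script, written out below.
cat > partition-example.sbatch <<'EOF'
#!/bin/bash
#SBATCH --partition=main   # default partition, available to all groups
hostname                   # placeholder payload
EOF
```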

HPC Priority

The participating groups have invested in the HPC cluster and therefore have higher priority than groups that do not participate.
To gain more priority, your group can invest in the HPC cluster; depending on the kind of investment, this will result in:

This combination will guarantee more priority and calculation time on the cluster.

Please consult the corresponding contact for this:

Contact persons.

Admin Page

See the EEMCS-HPC Admin page for more information.

Credentials

Accounts

For staff, the username is probably your family name followed by your initials; for students it is your student number starting with an “s”; for guest accounts it starts with an “x”.

The DSI Computing Lab does not store your password, and we are unable to reset it. If you require password assistance, please visit the ICTS/LISA Servicedesk.

Mailing list

For the HPC/SLURM cluster, two mailing lists have been created:

Connecting to the cluster

Access to DSI Computing Lab resources is provided via secure shell (SSH) login.

Most Unix-like operating systems (Mac OS X, Linux, etc) provide an ssh utility by default that can be accessed by typing the command ssh in a terminal window.

You can connect to one of the head nodes: hpc-head1.ewi.utwente.nl or hpc-head2.ewi.utwente.nl.

See the connecting page for more information.
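As a convenience, you can define a host alias in your SSH configuration. The sketch below writes to a local example file rather than your real `~/.ssh/config`, and the username is a placeholder; substitute your own AD account name.

```shell
# Append a host alias for the first head node to a local example file.
# Merge this into ~/.ssh/config yourself; the User value is a placeholder.
cat > ssh_config_example <<'EOF'
Host hpc
    HostName hpc-head1.ewi.utwente.nl
    User s1234567
EOF
# After merging, you could connect with just: ssh hpc
```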

Setting up

Software.

The cluster machines run Ubuntu Server 22.04 LTS, with some basic packages from the repositories installed. Additional software is available via module files.
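A sketch of using module files inside a job script follows. The module name `monitor/node` is taken from the monitoring section of this wiki; run `module avail` on a head node for the actual list of available software.

```shell
# Example job-script fragment that loads software via module files.
cat > software-example.sbatch <<'EOF'
#!/bin/bash
#SBATCH --partition=main
module load monitor/node   # load extra tools provided as a module
module list                # record the loaded modules in the job output
EOF
```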

See the EEMCS-HPC Software page for more information.

Storage

The following folders are available:

Quota

Quota is activated on the /home/<username> folder; this means we limit the amount of data in your personal folder.

Due to the change of the file system to ZFS, a soft limit at 1 TB and a hard limit at 2 TB are not possible anymore.
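To see how much of your home-directory quota is in use, a standard `df` query suffices; on ZFS-backed home directories, `df` reports the dataset quota as the filesystem size, so this gives a quick usage check.

```shell
# Show size, usage and free space for your home directory.
df -h "$HOME"
```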

Submitting Jobs

Batch Jobs

Slurm sbatch is used to submit a job script for later execution.
The script will typically contain the scheduler parameters, setup commands and the processing task(s) or (if required) multiple Slurm srun commands to launch parallel tasks.
See the Slurm sbatch and Slurm srun wiki page for more details.
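A minimal job script following this pattern might look as below. The partition, resource values and payload are assumptions for illustration; adjust them to your account and workload.

```shell
cat > example-job.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=example     # name shown in squeue
#SBATCH --partition=main       # default partition, available to all
#SBATCH --time=00:10:00        # wall-clock limit: 10 minutes
#SBATCH --cpus-per-task=1      # scheduler parameters
module load monitor/node       # setup commands (module name from this wiki)
srun hostname                  # the processing task, launched via srun
EOF
# Submit with: sbatch example-job.sbatch
```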

Before submitting jobs, please note the maximum number of jobs and resources related to your account's Quality Of Service (QOS).
These numbers can be obtained from the QOS tab on the EEMCS-HPC Slurm dashboard page.

Interactive Jobs

It is possible to request an interactive job, within which you can execute small experiments. Use this only for a short time (max 1 hour).
For this you can use the additional Slurm sinteractive command.
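The sketch below wraps a one-hour interactive request in a small script. sinteractive is a site-provided wrapper, so the `--time` flag is an assumption based on common Slurm wrappers; check `sinteractive --help` on a head node before relying on it.

```shell
# Convenience script that starts a one-hour interactive session.
cat > interactive.sh <<'EOF'
#!/bin/sh
exec sinteractive --time=01:00:00 "$@"   # --time flag is an assumed option
EOF
chmod +x interactive.sh
```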

Monitoring Jobs

The following commands are located in the software module monitor/node; you should load it beforehand. Check the Monitoring Computenodes page for more information.

During the job

You can monitor your jobs using the

After the job

When your job is finished you can check the: