Running Ollama on the HPC Cluster

  • Load slurm utils (sinteractive)
module load slurm/utils
sinteractive --gres=gpu:1

Wait until resources are assigned to you before continuing.
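While waiting, you can watch the state of your request with standard Slurm commands (a sketch; `squeue` is part of standard Slurm and should be available once the slurm module is loaded):

```shell
# Show your own jobs; state PD means pending, R means running
squeue -u "$USER"

# Also show why a pending job has not started yet (last column)
squeue -u "$USER" -o "%.10i %.9P %.8T %.20R"
```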

  • Once you have an interactive session with a GPU (the GPU nodes are a bit crowded right now, so this may take a while), load the ollama module and start the user daemon
module load ollama/0.1.41
start-ollama
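Before running a model, it can be worth confirming that the session actually has a GPU attached (assuming NVIDIA hardware, as `--gres=gpu:1` implies):

```shell
# Should list exactly one GPU; ollama will pick it up automatically
nvidia-smi
```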
  • Now you can use ollama's pull, list, and run commands (for example, run llama3)
ollama run llama3
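A few more invocations that may be useful; `pull`, `list`, and `run` are standard ollama subcommands, and the model name is just an example:

```shell
ollama pull llama3                         # download the model without starting a chat
ollama list                                # show models already downloaded locally
ollama run llama3 "Why is the sky blue?"   # one-shot prompt, exits after the answer
```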
  • When you are done, stop the user daemon and terminate the interactive session
stop-ollama
exit
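The whole workflow above, collected into one sequence for reference (the `sinteractive` wrapper, the `start-ollama`/`stop-ollama` helpers, and the module versions are cluster-specific and taken from this page):

```shell
module load slurm/utils      # provides sinteractive
sinteractive --gres=gpu:1    # request one GPU; blocks until resources are assigned

module load ollama/0.1.41    # on the allocated compute node
start-ollama                 # start the per-user ollama daemon

ollama run llama3            # interactive chat; type /bye to quit

stop-ollama                  # stop the daemon
exit                         # end the interactive session
```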