Computing Infrastructure
GPU cluster
We run a GPU cluster with several different compute nodes. Students and assistants have access to only a limited set of nodes. Employees of the working groups that have contributed financially to the cluster can compute on all nodes. Documentation and instructions for typical installations can be found on the documentation page.
Compute
We run a few compute machines equipped with multiple CPU cores and memory. These computers run under Netboot and are designed for longer calculations. You can reach one of them via SSH from within our network, for example from shell.techfak.de, with the command ssh compute.techfak.de.
Several machines - one name
The individual machines have their own names, but several of them can be reached under the single name compute. This distributes the load evenly. You should therefore always connect to compute unless you need to check on processes that are already running. Some machines may also be down for maintenance and therefore not accessible.
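Because compute can resolve to a different machine on each login, it helps to note which machine a job was started on. The following is a minimal sketch, not part of the official setup: the file name in JOB_HOST_FILE and both function names are hypothetical, and it assumes your home directory is also visible from the machine you log in from later.

```shell
# Hypothetical location for the note; any file you can read later will do.
JOB_HOST_FILE="${JOB_HOST_FILE:-$HOME/.last_compute_host}"

# Store the real name of the current machine (e.g. april), not the alias compute.
remember_host() {
    uname -n > "$JOB_HOST_FILE"
}

# Print the machine name that was stored by remember_host.
recall_host() {
    cat "$JOB_HOST_FILE"
}
```

Run remember_host right after logging in to compute and before starting the job. Later, recall_host prints the name of the machine to connect to directly.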
Starting processes
You can start processes and then log off again (e.g. for very long-running programs) by working with Tmux (or, alternatively, with Screen).
Example
juser@foobar:$ ssh compute
juser@april:$ tmux #starting tmux for the first time. A normal shell will open.
juser@april:$
To communicate with tmux there is an escape character, usually Ctrl+B. This allows you to send commands to tmux itself and not to the terminal, e.g. Ctrl+B and then 'c' to open another shell, or Ctrl+B and then 'w' for an overview of the open windows.
Now the desired program can be started.
juser@april:$ nice mylongrunningscript.sh
To detach from the tmux session, press Ctrl+B and then 'd'. A message like
[detached (from session 0)]
will appear. Now you can log out as normal and come back later.
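The interactive steps above can also be done in one go by starting the program in a tmux session that is detached from the beginning. This is a sketch under assumptions: the function name start_detached, the session name and the DRY_RUN switch are hypothetical helpers for illustration. The tmux options used are new-session -d (do not attach) and -s (session name).

```shell
# start_detached SESSION COMMAND
# Starts COMMAND under nice in a new, detached tmux session named SESSION.
# With DRY_RUN=1 the tmux command line is only printed, not executed.
start_detached() {
    session=$1
    command=$2
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'tmux new-session -d -s %s %s\n' "$session" "nice $command"
    else
        tmux new-session -d -s "$session" "nice $command"
    fi
}
```

For example, start_detached longrun ./mylongrunningscript.sh starts the script without opening a tmux window, and tmux attach -t longrun on the same machine brings the session to the foreground later. Note that a session started this way ends as soon as the command finishes.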
Log in again
To get back the shell in which you started the program, connect to the same machine again and run the command tmux a.
juser@foobar:$ ssh april #only(!) in this case connect directly to one of the compute machines
juser@april:$ tmux a
juser@april:$ nice mylongrunningscript.sh #You get the shell back in the same state as you left it
Example output of the script.
30e23e024rd32
juser@april:$ #The program seems to have ended
juser@april:$ echo $? #Check if the exit code is 0
0
juser@april:$ exit
[exited]
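Checking the exit code with echo $? only works while the shell that ran the program is still open. As a possible alternative, the output and the exit code can be written to a log file so they remain available afterwards. The function name run_logged and the log format are hypothetical and only illustrate the idea.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND under nice, writes its stdout and stderr to LOGFILE,
# appends the exit code as the last line and returns that exit code.
run_logged() {
    logfile=$1
    shift
    nice "$@" > "$logfile" 2>&1
    status=$?
    printf 'exit code: %s\n' "$status" >> "$logfile"
    return "$status"
}
```

Inside tmux this could be called as run_logged "$HOME/longrun.log" mylongrunningscript.sh. After logging in again, the last line of the log file shows whether the program ended with exit code 0.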