Using Jupyter notebooks

Jupyter notebooks are a convenient way to work interactively. The notebook server is accessed through your web browser, but the Atlas servers you want to run it on are not directly reachable from your laptop, so you need to create a network path from your laptop to the target server.

Using a head node

The easiest way is to use a head or submit host. You first log into the host and start the notebook server under tmux to keep the server running even when you have logged out of your ssh session (lines starting with # are just comments):

tmux
# assuming you have already installed Jupyter notebook
# in a virtual environment (virtualenv, pipenv, ...), e.g.
# virtualenv --python=/usr/bin/python3 ~/jupyter
# . ~/jupyter/bin/activate
# pip3 install notebook

# set a good password for the service!
# Anyone with access could run nefarious code! E.g.
# import os; os.system('rm -rf ~')
jupyter notebook password

# start the notebook server
jupyter notebook --no-browser
[I 07:34:13.196 NotebookApp] Jupyter Notebook 6.4.10 is running at:
[I 07:34:13.196 NotebookApp] http://localhost:8888/
[I 07:34:13.196 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation)

The important information to be taken from here is the local port number 8888 from the line http://localhost:8888.

You can now detach from the tmux session by pressing CTRL+b followed by d; the notebook server keeps running in the background. When you are logged in again, you can reattach to the session via tmux a.

After stopping your ssh session, you can start a new ssh session which will create a data channel from your laptop to the just started service by adding -L1234:localhost:8888 to your usual ssh command, e.g. ssh -L1234:localhost:8888 condor7.atlas.aei.uni-hannover.de if you started the notebook on condor7. This command tells ssh to connect the local port number 1234 on your laptop to the remote port 8888 on the machine you log into.
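As a minimal sketch of how the pieces of the forwarding option fit together, the following helper only prints the ssh command instead of running it. The function name tunnel_cmd and its argument order are made up for illustration; the port numbers and host are the ones from the example above.

```shell
#!/bin/bash
# Print (not run) an ssh command that forwards a local port to a port
# on the machine you log into. The function name is hypothetical.
tunnel_cmd() {
    local local_port="$1"    # free port on your laptop, e.g. 1234
    local remote_port="$2"   # port reported by the notebook server, e.g. 8888
    local login_host="$3"    # host you ssh into, i.e. where the notebook runs
    printf 'ssh -L %s:localhost:%s %s\n' "$local_port" "$remote_port" "$login_host"
}

tunnel_cmd 1234 8888 condor7.atlas.aei.uni-hannover.de
```

Note that the localhost in the middle of the forwarding option is resolved on the remote side, i.e. it refers to the machine you log into, not to your laptop. If you only want the tunnel and no interactive shell, ssh also accepts -N to skip running a remote command.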

You can freely choose any port number from 1024 up to 65535 which is currently not in use. If you request a port that is already taken, ssh will complain with something like

bind [127.0.0.1]:1234: Address already in use
channel_setup_fwd_listener_tcpip: cannot listen to port: 1234

then simply choose another port number (and remember it!).
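If you would rather check for a free port before starting ssh, a rough sketch could look like the one below. It assumes your laptop runs bash with /dev/tcp support and only tests whether something accepts TCP connections on 127.0.0.1, so treat the result as a hint rather than a guarantee. The function names port_in_use and first_free_port are hypothetical.

```shell
#!/bin/bash
# Succeed if something already accepts connections on 127.0.0.1:PORT.
# Assumption: bash's /dev/tcp pseudo-device is available in your shell.
port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Print the first port >= START (default 1234) that nothing listens on.
first_free_port() {
    local port="${1:-1234}"
    while [ "$port" -le 65535 ]; do
        if ! port_in_use "$port"; then
            echo "$port"
            return 0
        fi
        port=$((port + 1))
    done
    return 1    # no free port found in the allowed range
}

first_free_port 1234
```

The printed number can then be used as the local port in the -L option.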

After you are logged into the target host, you should be able to connect to the notebook in your browser by opening the URL http://localhost:1234/. Please ensure that you use http and not https! You should be greeted by a web page asking you to enter your password chosen above and then you should be able to browse your file system and start a notebook.

The notebook server can be stopped by pressing CTRL+c twice within the tmux session.

Jupyter Lab

Instead of running a single Jupyter notebook, you can also install JupyterLab via pip (pip3 install jupyterlab) and start this instead (e.g. jupyter lab password and jupyter lab --no-browser).

Jupyterlab on Atlas

Assuming you have your virtual python/Jupyter environment under ~/jupyter you could simply create a shell script (jlab.sh) like this:

#!/bin/bash

# activate python environment
. ~/jupyter/bin/activate

# change to directory where your notebooks are living
cd ~/notebooks

# start jupyterlab and ensure that idle kernels are killed after some
# time (half an hour in this example) and the server 10 minutes after
# the final kernel was killed - this is just a safeguard so that one
# does not forget a running service and block the resources long term
jupyter lab --no-browser \
        --MappingKernelManager.cull_idle_timeout=1800 \
        --ServerApp.shutdown_no_activity_timeout=600 \
        --ServerApp.ip="$(hostname -f)"

Make it executable (chmod a+x jlab.sh) and let condor start it via a simple submit file, e.g. jlab.sub:

Executable = jlab.sh

Error   = jlab.error.$(ClusterId)
Output  = jlab.output.$(ClusterId)
Log = jlab.log
RequestCpus = 1
Request_GPUs = 0
RequestMemory = 2500
Universe = vanilla
accounting_group = cbc.test.jlab
Queue 1

Obviously, tailor these settings to better suit your use case/needs.

Once submitted via condor_submit jlab.sub, you need to wait until the job is matched to a host and then look at the Error output. You can just use tail -f jlab.error.NUMBER or multitail jlab.error.NUMBER where NUMBER is the cluster id of your submitted job.

Once the job is running, it will print out the host name and port number the service is now listening on, e.g.

[...]
[I 2022-04-07 06:51:56.083 ServerApp] http://a3828.atlas.local:8888/lab
[...]
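Instead of watching the file by hand, you could poll it until the URL appears. The following is only a sketch: the function name wait_for_url and the default of 60 attempts are made up, and it assumes that the first http:// URL in the file is the one you want.

```shell
#!/bin/bash
# Poll FILE until a line containing an http:// URL shows up, then print
# that URL. Gives up after TRIES attempts (default 60), one per second.
wait_for_url() {
    local file="$1" tries="${2:-60}" url
    while [ "$tries" -gt 0 ]; do
        # the file may not exist yet while the job is still idle
        url="$(grep -o 'http://[^[:space:]]*' "$file" 2>/dev/null | head -n 1)"
        if [ -n "$url" ]; then
            echo "$url"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# usage (NUMBER is the cluster id of your submitted job):
# wait_for_url jlab.error.NUMBER
```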

You can then log out of the submit host and log back in (or use a new ssh connection) and as before bind a local port to the remote one, e.g. in this case

ssh condor5.atlas.aei.uni-hannover.de -L 1235:a3828.atlas.local:8888

then you can access the remote jupyter lab within your browser via http://localhost:1235.
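To avoid typos when copying host name and port from the log, the argument for -L can be derived from the printed URL. This is a small sketch with a hypothetical helper name (forward_spec); it assumes a plain http:// URL of the form shown in the log excerpt above.

```shell
#!/bin/bash
# Turn a URL such as http://a3828.atlas.local:8888/lab into the argument
# for ssh -L, using LOCAL_PORT on the laptop side.
forward_spec() {
    local local_port="$1" url="$2" hostport
    hostport="${url#http://}"     # strip the scheme -> a3828.atlas.local:8888/lab
    hostport="${hostport%%/*}"    # strip the path   -> a3828.atlas.local:8888
    echo "${local_port}:${hostport}"
}

forward_spec 1235 http://a3828.atlas.local:8888/lab
# prints 1235:a3828.atlas.local:8888
```

The result can be passed directly to ssh, e.g. ssh condor5.atlas.aei.uni-hannover.de -L "$(forward_spec 1235 http://a3828.atlas.local:8888/lab)".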