Using Jupyter notebooks
Jupyter notebooks are a convenient way to work interactively. They are accessed through your web browser, but the Atlas servers you want to run them on are not directly reachable from your laptop, so you need to create a network path from your laptop to the target server.
Using a head node
The easiest way is to use a head or submit host. You first log into the host and start the notebook server under tmux to keep the server running even when you have logged out of your ssh session (lines starting with # are just comments):
tmux
# assuming you have already installed Jupyter notebook
# in a virtual environment (virtualenv, pipenv, ...), e.g.
# virtualenv --python=/usr/bin/python3 ~/jupyter
# . ~/jupyter/bin/activate
# pip3 install notebook
# set a good password for the service!
# Anyone with access could run nefarious code! E.g.
# import os; os.system('rm -rf ~')
jupyter notebook password
# start the notebook server
jupyter notebook --no-browser
[I 07:34:13.196 NotebookApp] Jupyter Notebook 6.4.10 is running at:
[I 07:34:13.196 NotebookApp] http://localhost:8888/
[I 07:34:13.196 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation)
The important information to take from this output is the local port number 8888 from the line http://localhost:8888/.
You can now detach from the tmux session by pressing CTRL+b followed by d; the notebook server keeps running inside it. When you are logged in again, you can reattach to the session via tmux a.
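A named tmux session makes it easier to find the notebook server again later. The following is a minimal sketch rather than part of the setup above: the function name jupyter_tmux and the default session name jupyter are hypothetical, and it relies only on the standard tmux subcommands has-session, attach-session and new-session.

```shell
#!/bin/bash
# jupyter_tmux [SESSION]
# Reattach to the named tmux session if it exists, otherwise create it.
# The function name and the default session name "jupyter" are
# illustrative assumptions, not something the cluster setup requires.
jupyter_tmux() {
    local session=${1:-jupyter}
    if tmux has-session -t "$session" 2>/dev/null; then
        # the session is already running: reattach to it
        tmux attach-session -t "$session"
    else
        # no such session yet: start a fresh one under that name
        tmux new-session -s "$session"
    fi
}
```

Calling jupyter_tmux on the head node right after logging in either brings back the running notebook session or opens a new one in which the server can be started.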
After ending your ssh session, you can start a new ssh session that creates a data channel from your laptop to the service you just started by adding -L1234:localhost:8888 to your usual ssh command, e.g. ssh -L1234:localhost:8888 condor7.atlas.aei.uni-hannover.de if you started the notebook on condor7. This option tells ssh to connect the local port number 1234 on your laptop to the remote port 8888 on the machine you log into.
You can freely choose any local port number from 1024 to 65535 that is not currently in use. If you request one that is already taken, ssh will complain with something like
bind [127.0.0.1]:1234: Address already in use
channel_setup_fwd_listener_tcpip: cannot listen to port: 1234
In that case, simply choose another port number (and remember it!).
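If you would rather check for a free local port before calling ssh, something along the following lines may help. It is a sketch under a few assumptions: it is meant to run on your laptop, it needs bash (it uses bash's /dev/tcp redirection), it only probes the IPv4 loopback address 127.0.0.1, and the function names port_in_use and find_free_port are hypothetical.

```shell
#!/bin/bash
# port_in_use PORT
# Succeeds if something accepts TCP connections on 127.0.0.1:PORT.
# Uses bash's built-in /dev/tcp redirection inside a subshell, so the
# test connection is closed again as soon as the subshell exits.
port_in_use() {
    (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# find_free_port [START]
# Prints the first port >= START (default 1234) that nothing listens on.
find_free_port() {
    local port=${1:-1234}
    while [ "$port" -le 65535 ]; do
        if ! port_in_use "$port"; then
            echo "$port"
            return 0
        fi
        port=$((port + 1))
    done
    return 1
}
```

For example, port=$(find_free_port 1234) followed by ssh -L "${port}:localhost:8888" condor7.atlas.aei.uni-hannover.de would pick the first unused port at or above 1234. Another program could still grab the port between the check and the ssh call, so the error message above remains the authoritative signal.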
After you are logged into the target host, you should be able to connect to the notebook in your browser by opening the URL http://localhost:1234/. Please ensure that you use http and not https! You should be greeted by a web page asking for the password chosen above, after which you can browse your file system and start a notebook.
The notebook server can be stopped by pressing CTRL+c twice within the tmux session.
Jupyter Lab
Instead of the classic Jupyter notebook server, you can also install JupyterLab via pip (pip3 install jupyterlab) and start that instead, e.g. with jupyter lab password followed by jupyter lab --no-browser.
Jupyterlab on Atlas
Assuming you have your virtual Python/Jupyter environment under ~/jupyter, you could simply create a shell script (jlab.sh) like this:
#!/bin/bash
# activate python environment
. ~/jupyter/bin/activate
# change to directory where your notebooks are living
cd ~/notebooks
# start JupyterLab and ensure that idle kernels are killed after some time
# (half an hour in this example) and that the server shuts down 10 minutes
# after the final kernel was killed; this is just a safeguard so that a
# forgotten service does not block resources long term
jupyter lab --no-browser \
--MappingKernelManager.cull_idle_timeout=1800 \
--ServerApp.shutdown_no_activity_timeout=600 \
--ServerApp.ip="$(hostname -f)"
Make it executable (chmod a+x jlab.sh) and let condor start it via a simple submit file, e.g. jlab.sub:
Executable = jlab.sh
Error = jlab.error.$(ClusterId)
Output = jlab.output.$(ClusterId)
Log = jlab.log
RequestCpus = 1
Request_GPUs = 0
RequestMemory = 2500
Universe = vanilla
accounting_group = cbc.test.jlab
Queue 1
Obviously, tailor these settings to better suit your use case/needs.
Once submitted via condor_submit jlab.sub, you need to wait until the job is matched to a host and then look at the Error output. You can just use tail -f jlab.error.NUMBER or multitail jlab.error.NUMBER, where NUMBER is the cluster id of your submitted job.
Once the job is running, it will print out the host name and port number the service is now listening on, e.g.
[...]
[I 2022-04-07 06:51:56.083 ServerApp] http://a3828.atlas.local:8888/lab
[...]
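Instead of watching the file by hand, you could poll it until the URL line shows up. The sketch below assumes the log line contains the literal text "ServerApp] http://" as in the example output; the function name wait_for_jlab and the default timeout of 600 seconds are hypothetical choices.

```shell
#!/bin/bash
# wait_for_jlab LOGFILE [TIMEOUT_SECONDS]
# Polls LOGFILE once per second until a line containing
# "ServerApp] http://" appears, then prints the first such line.
# Gives up with a non-zero exit status after TIMEOUT_SECONDS.
wait_for_jlab() {
    local log=$1 timeout=${2:-600} waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # -F: match the pattern as a fixed string, not a regular expression
        if [ -f "$log" ] && grep -qF 'ServerApp] http://' "$log"; then
            grep -F 'ServerApp] http://' "$log" | head -n 1
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "no server URL found in $log after ${timeout}s" >&2
    return 1
}
```

On the submit host this could be called as wait_for_jlab jlab.error.NUMBER, with NUMBER again being the cluster id of the job.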
You can then log out of the submit host and log back in (or use a new ssh connection) and, as before, bind a local port to the remote one, e.g. in this case:
ssh condor5.atlas.aei.uni-hannover.de -L 1235:a3828.atlas.local:8888
Then you can access the remote JupyterLab in your browser via http://localhost:1235/.
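The host name and port for the -L option can also be read from the job's Error file automatically. The following sketch assumes the file contains a URL of the form http://HOST:PORT/... as in the example output and takes the first one it finds; the function names jlab_endpoint and jlab_tunnel_cmd are hypothetical. It only prints the ssh command, so you can check it before running it on your laptop.

```shell
#!/bin/bash
# jlab_endpoint LOGFILE
# Prints HOST:PORT taken from the first http://HOST:PORT URL in LOGFILE.
jlab_endpoint() {
    grep -o 'http://[^/ ]*:[0-9]*' "$1" | head -n 1 | sed 's|^http://||'
}

# jlab_tunnel_cmd LOGFILE LOCAL_PORT LOGIN_HOST
# Prints the ssh command that forwards LOCAL_PORT on the laptop, via
# LOGIN_HOST, to the JupyterLab endpoint found in LOGFILE.
jlab_tunnel_cmd() {
    local log=$1 local_port=$2 login_host=$3 endpoint
    endpoint=$(jlab_endpoint "$log")
    if [ -z "$endpoint" ]; then
        echo "no server URL found in $log" >&2
        return 1
    fi
    echo "ssh -L ${local_port}:${endpoint} ${login_host}"
}
```

With the example output above, jlab_tunnel_cmd jlab.error.NUMBER 1235 condor5.atlas.aei.uni-hannover.de would print an ssh command equivalent to the one shown earlier, with the -L option placed before the host name.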