
Note: these are internal instructions for deployment on the CERN OpenStack infrastructure.

Therefore, unless you're deploying HEPData.net, you may safely ignore this document.

Running tests locally

It's good practice to check that the tests pass locally before pushing to GitHub. First create a local database called hepdata_test, then follow the instructions to run the tests, including the end-to-end tests using Selenium. You can then push to the master branch on GitHub. Check that the tests pass on Travis for the master branch on GitHub.
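For example, with a local PostgreSQL server (a sketch; the test-runner command itself is whatever the testing instructions specify):

createdb hepdata_test   # create the local database used by the test suite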

Pushing to the production branch

The production branch is used for deployment. Each time you want to synchronise it with the master branch, run the following steps locally:

git rebase master production
git push
git checkout master

Deploying via hepdata-builder1

First ssh to aiadm.cern.ch, then to root@hepdata-builder1.

If any secrets have been updated using tbag (see below) or the Puppet configuration has changed, first run puppet agent --test to pull the latest catalog. Then run the following commands:

workon hepdata
cdvirtualenv scripts

fab update
fab build
fab -u <user_name> -I deploy

Besides deploy, the fab tasks stop and start are also available (see the sketch below).
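For example, by analogy with the deploy command above (a sketch; the exact options depend on the fabfile):

fab -u <user_name> -I stop
fab -u <user_name> -I start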

Debugging online

Some bugs are difficult or impossible to reproduce locally, so debugging online is sometimes necessary. This can be done by editing the source code directly on the web servers; any changes will be overwritten the next time the code is deployed. First ssh to aiadm.cern.ch, then to root@hepdata-web1 or root@hepdata-web2. The source code is in the /opt/hepdata/src/hepdata directory. Files need to be edited on both hepdata-web1 and hepdata-web2. The updated .pyc files can then be generated from the edited .py files using supervisorctl restart gunicorn, before taking some action from a web browser directed at the hepdata.net site. Log files can be found in the /var/log/supervisor directory.
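A typical session might look like the following (a sketch, using the paths and process names given above):

ssh aiadm.cern.ch
ssh root@hepdata-web1              # repeat on hepdata-web2 unless it is disabled (see below)
cd /opt/hepdata/src/hepdata        # edit the relevant .py files here
supervisorctl restart gunicorn     # regenerate the .pyc files from the edited .py files
tail -f /var/log/supervisor/*.log  # watch the logs while exercising the site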

To avoid the need to simultaneously edit files on both machines, one of the two web servers can be temporarily disabled from the load balancer machine:

root@hepdata-lb1$ echo 'disable server ssl_hepdata_production_app/hepdata-web2.cern.ch' | nc -U /var/lib/haproxy/stats

then after debugging on hepdata-web1 (for example), re-enable the other server by running the same command with disable replaced by enable, as shown below. Note that disabling both web servers will bring the site down.
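For example, to re-enable hepdata-web2:

root@hepdata-lb1$ echo 'enable server ssl_hepdata_production_app/hepdata-web2.cern.ch' | nc -U /var/lib/haproxy/stats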

Celery

Celery is used for tasks such as sending emails and registering DOIs with DataCite. The log files can be found at /var/log/supervisor/celery-worker.err on hepdata-task1 or hepdata-task2. Tasks are generally completed within a few seconds of being received. If tasks are received but held in the queue, you might need to restart the Celery daemon using supervisorctl restart celeryd on hepdata-task1 or hepdata-task2.
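For example, on hepdata-task1 or hepdata-task2:

tail -f /var/log/supervisor/celery-worker.err   # watch the Celery log
supervisorctl restart celeryd                   # restart the Celery daemon if tasks are stuck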

CERN EOS disk

The hepdata.net site occasionally goes down, or tables fail to load, and redeploying doesn't fix the problem. In such cases it might be necessary to restart the EOS daemon on hepdata-web1 and hepdata-web2 with service eosd restart.

If the /eos/hepdata/ disk is still completely inaccessible from the hepdata-* machines, there might be a global problem with the EOSPUBLIC disk. Check https://cern.service-now.com/service-portal/ssb.do for any relevant service incidents and wait for them to be resolved.

Check the quota with eos root://eospublic.cern.ch quota /eos/hepdata on any of the hepdata-* machines. Note that there is a quota on both disk space (currently up to a maximum of 5 TB) and number of files (currently up to a maximum of 4 million).

The EOS recycle bin allows deleted files to be recovered. Check the (global) quota with eos root://eospublic.cern.ch recycle (currently 2.20 PB with a lifetime of 30 days). Unfortunately, there is no per-directory quota for /eos/hepdata. There was recently (13/07/2018) an incident where the global quota was filled, preventing files from being deleted by the HEPData web application. A directory subtree can be removed from the recycle bin policy by contacting CERN IT. List files contained in the recycle bin with eos root://eospublic.cern.ch recycle ls and purge them with eos root://eospublic.cern.ch recycle purge. Check eos root://eospublic.cern.ch recycle help for further information such as how to restore a deleted file.
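For reference, the quota and recycle-bin commands mentioned above, runnable from any of the hepdata-* machines:

eos root://eospublic.cern.ch quota /eos/hepdata   # disk-space and file-count quotas
eos root://eospublic.cern.ch recycle              # global recycle-bin quota
eos root://eospublic.cern.ch recycle ls           # list files in the recycle bin
eos root://eospublic.cern.ch recycle purge        # purge files from the recycle bin
eos root://eospublic.cern.ch recycle help         # further options, e.g. restoring a file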

Managing secrets

Secrets such as passwords can be accessed from aiadm.cern.ch via the tbag command (see man tbag or tbag -h for help). Here are some basic commands:

# Show all keys for the hepdata hostgroup:
tbag showkeys --hg hepdata

# Show the secret with key MY_PASSWORD:
tbag show MY_PASSWORD --hg hepdata

# Set the secret with key MY_PASSWORD:
tbag set MY_PASSWORD --hg hepdata

Monitoring

Service status can only be monitored from inside the CERN network, e.g. by starting Firefox from lxplus.cern.ch with X11 forwarding:

ssh -X lxplus.cern.ch firefox https://hepdata-monitor1.cern.ch/nagios/cgi-bin/status.cgi

The user name (MONITOR_USER) and password (MONITOR_PASS) can be obtained using tbag as described above.

hepdata-converter

The hepdata-converter-docker repository uses Travis to build the Docker image containing the dependencies of the converter software such as ROOT and YODA. The Docker image hepdata/hepdata-converter is pushed to DockerHub from the Travis build. The hepdata-converter repository contains the converter software itself. Tagging a new release will make a new version available from PyPI. Finally, the hepdata-converter-ws-docker repository uses Travis to build the Docker image hepdata/hepdata-converter-ws from the hepdata/hepdata-converter Docker image and the latest released versions of the hepdata-converter and hepdata-converter-ws packages. The Docker image hepdata/hepdata-converter-ws then interacts with the main HEPData web application via the hepdata-converter-ws-client package.

The Docker container for the hepdata/hepdata-converter-ws image is currently deployed on a CERN OpenStack cluster called hepdata-converter-2018, created on 3rd January 2018. See some documentation at http://clouddocs.web.cern.ch/clouddocs/containers/ . To gain access to the cluster, log in to https://openstack.cern.ch with your personal CERN account and select the "GS HepData critical power" project. (Clusters can be created via the web interface at https://openstack.cern.ch/project/clusters .) Select "API Access", then "DOWNLOAD OPENSTACK RC FILE", then "OPENSTACK RC FILE (IDENTITY API V3)". SSH to lxplus-cloud.cern.ch and copy the file there, then run source hepdata-openrc.sh and enter your CERN password. Then run openstack coe cluster config hepdata-converter-2018 > env.sh. These steps only need to be done once; see the sketch below.
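A sketch of the one-time setup, assuming the RC file was saved as hepdata-openrc.sh in your home directory on lxplus-cloud:

ssh lxplus-cloud.cern.ch
source hepdata-openrc.sh                                      # prompts for your CERN password
openstack coe cluster config hepdata-converter-2018 > env.sh  # writes the Docker environment file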

Subsequently, just log in to lxplus-cloud.cern.ch and do . env.sh to set up the Docker environment; you can then use normal Docker commands to manage the container, e.g.

docker ps
docker pull hepdata/hepdata-converter-ws
docker run --restart=always -d --name=hepdata_converter -p 0.0.0.0:80:5000 hepdata/hepdata-converter-ws hepdata-converter-ws
docker restart db1ec55cb69e
docker stop db1ec55cb69e
docker rm db1ec55cb69e
docker exec -it db1ec55cb69e bash

The Docker container will be accessible within CERN from the URL http://188.184.65.191 . This URL is specified in the config.py file of the main HEPData code.

Domain name and email forwarding

The hepdata.net domain name is registered with names.co.uk until 15/12/2024, when it will need to be renewed. Log in to names.co.uk with either NAMESCO_USERNAME or NAMESCO_ACCOUNT_REFERENCE and NAMESCO_PASSWORD obtained using tbag. The DNS settings can be changed, e.g. to specify the IP address (188.184.64.140) associated with the hepdata.net domain name. Email forwarding is provided for *@hepdata.net email addresses, and the recipient (currently the HEPData Project Manager) can also be changed. There was recently (11/04/2018) a problem where the forwarding IP address (85.233.160.23) was listed by Spamhaus, resulting in bounced emails to *@hepdata.net addresses; contact names.co.uk support if this happens again.

SSL certificate

The SSL certificate is registered with Gandi until 05/08/2020, when it will need to be renewed. It was last renewed on 27/07/2018. Log in to Gandi.net with GANDI_USERNAME and GANDI_PASSWORD obtained using tbag. Check the status of the SSL certificate at https://www.ssllabs.com/ssltest/analyze.html?d=hepdata.net . Gandi is a paid, but inexpensive, service, e.g. 20.40 EUR for an SSL certificate for two years. In the longer term, we could consider moving to the free Let's Encrypt service, which would need to be integrated with the Puppet configuration using the Certbot ACME client (see https://forge.puppet.com/puppet/letsencrypt ).

Email delivery

We originally sent email using a free plan from SMTP2GO, which allows 1000 emails per month and has a rate limit of 25 emails per hour. Log in to SMTP2GO with email address [email protected] and password given by MAIL_PASSWORD obtained using tbag to see details of emails sent. At the start of September 2018, the limit of 1000 emails was reached for the first time and we temporarily moved to a paid plan with a limit of 2000 emails per month.

On 13th September 2018, we changed the mail server from mail.smtp2go.com to cernmx.cern.ch. The latter is the anonymous mail gateway available from inside the CERN network. The HEPData code itself sends email via Celery tasks from either hepdata-task1 or hepdata-task2, which have been added as trusted machines by CERN IT. Email is also sent by Invenio (using Flask-Mail) from the hepdata-web1 or hepdata-web2 machines: either MAIL_USERNAME (set in config.py) or MAIL_PASSWORD (set via the tbag command) needs to be an empty string to prevent Flask-Mail from trying to log in to the mail server (which would give an authentication exception).

Database access

The PostgreSQL (version 9.6) database is provided by CERN's Database On Demand service. Backups are made every 24 hours. The database instance is not reachable outside the CERN network. To access the database from aiadm.cern.ch, do:

psql -d hepdata -h dbod-hepdata.cern.ch -p 6603 -U admin

and enter the password given by POSTGRES_DB_PASSWORD obtained using tbag.