
Decision Engine integration test for 1.7

Vito Di Benedetto edited this page Feb 24, 2022 · 1 revision

VM setup

On FermiCloud, set up your VM and make sure to include the "gwms-ports" Security Group. If you already have a running VM, the Security Group can also be added later.

Decision Engine setup

Now set up Decision Engine as described here, making sure to have SQLAlchemy and Redis set up.

Make sure to install the following packages:

yum install -y glideinwms-vofrontend
yum install -y fermilab-util_kx509

Make sure to use decisionengine/decisionengine_modules RPMs for DE 1.7.1 or newer.
(These RPMs can be retrieved from Jenkins by selecting the CI builds of the master branch for decisionengine and decisionengine_modules.)

Create frontend and user proxies

To create the frontend proxy run (as root):

pushd /etc/grid-security/
grid-proxy-init -cert hostcert.pem -key hostkey.pem -valid 999:0 -out /etc/gwms-frontend/fe_proxy
popd

To create the user proxy run (as root):

kx509
voms-proxy-init -rfc -dont-verify-ac -noregen -voms fermilab -valid 120:0
/bin/cp /tmp/x509up_u0 /etc/gwms-frontend/vo_proxy

Check your host and user DNs; they are needed to register your VM and user with the factory.
Marco M. can help with this step.

The host DN can be retrieved with:

openssl x509 -noout -subject -in  /etc/gwms-frontend/fe_proxy | cut -c 10- | sed -re 's#/CN=[0-9]{10}##'

The user DN can be retrieved with:

openssl x509 -noout -subject -in  /etc/gwms-frontend/vo_proxy | cut -c 10- | sed -re 's#/CN=[0-9]{10}##'
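
The two commands above differ only in the proxy path, so they can be wrapped in a small helper. This is a minimal sketch, not part of GlideinWMS or Decision Engine: the function names strip_proxy_cn and proxy_dn are hypothetical, and it assumes openssl prints the subject in the "subject= /DC=..." form that the commands above rely on.

```shell
# Hypothetical helpers; the names are illustrative only.

# Remove the 10-digit proxy CN component, using the same sed
# expression as the commands above. Reads from standard input.
strip_proxy_cn() {
    sed -E 's#/CN=[0-9]{10}##'
}

# Print the DN of the proxy file given as $1.
# "cut -c 10-" drops the leading "subject= " label, as above.
proxy_dn() {
    openssl x509 -noout -subject -in "$1" | cut -c 10- | strip_proxy_cn
}
```

With these defined, `proxy_dn /etc/gwms-frontend/fe_proxy` and `proxy_dn /etc/gwms-frontend/vo_proxy` should print the host and user DNs respectively.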

Configure GWMS frontend and HTCondor

  • edit /etc/gwms-frontend/frontend.xml
    A template is available on fermicloud508 at /etc/gwms-frontend/frontend.xml_template.
    In it, replace @FERMICLOUDNODE@ with your actual FermiCloud node name, for example fermicloud508.

  • Create the DE frontend configuration from the gwms-frontend configuration by running:

python3 /usr/lib/python3.6/site-packages/decisionengine_modules/glideinwms/configure_gwms_frontend.py
  • edit /etc/condor/certs/condor_mapfile
    A template is available on fermicloud508 at /etc/condor/certs/condor_mapfile_template.
    Update the lines ending with vofrontend_service2:
    in the first line, replace @FERMICLOUDNODE@ with your actual FermiCloud node name, for example fermicloud508;
    in the second line, update the first and last name and the user name to match your DN.

  • Set ownership of the gwms-frontend files and directories to decisionengine:

chown -R decisionengine: /etc/gwms-frontend
chown -R decisionengine: /var/lib/gwms-frontend
chown -R decisionengine: /var/log/gwms-frontend
  • Start the services required for the schedd:
/bin/systemctl start httpd
/bin/systemctl start condor
/bin/systemctl start fetch-crl-cron
/bin/systemctl start fetch-crl-boot
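
The @FERMICLOUDNODE@ substitution described above for frontend.xml and condor_mapfile can be scripted instead of done by hand. The following is a minimal sketch under stated assumptions: the function name fill_node_placeholder is hypothetical, the template has already been copied from fermicloud508 to the local machine, and the node name contains no "/" or "&" characters (which sed would treat specially).

```shell
# Hypothetical helper: write a copy of a template with every
# @FERMICLOUDNODE@ placeholder replaced by the given node name.
#   $1 = template file, $2 = destination file, $3 = node name
fill_node_placeholder() {
    template=$1
    destination=$2
    node=$3
    sed "s/@FERMICLOUDNODE@/${node}/g" "$template" > "$destination"
}
```

For example, `fill_node_placeholder frontend.xml_template /etc/gwms-frontend/frontend.xml "$(hostname -s)"` would fill in the short host name of the current node. The first and last name and the user name in the second condor_mapfile line still have to be edited manually, and the substitution has to happen before the configuration and ownership steps are run.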

2-channel test configuration

Templates of the jsonnet configuration files for the 2-channel test configuration are available on fermicloud508 at the following paths:

/etc/decisionengine/template/job_classification.jsonnet
/etc/decisionengine/template/resource_request.jsonnet

In the job_classification.jsonnet file, update the collector_host, the schedds at the beginning of the sources section, and the graphite_context in the publishers section. In each of these entries, replace the placeholder @FERMICLOUDNODE@ with your FermiCloud node name.
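
Before starting the service, it can help to confirm that no placeholder was left behind in the edited files. This is a minimal sketch; the function name no_placeholder_left is hypothetical, and the paths you pass to it are those of your own edited copies of the templates.

```shell
# Hypothetical check: succeed only if none of the given files still
# contains the @FERMICLOUDNODE@ placeholder. Remaining occurrences are
# printed with file name and line number.
no_placeholder_left() {
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f" >&2
            return 2
        fi
    done
    if grep -n -H '@FERMICLOUDNODE@' "$@"; then
        return 1
    fi
    return 0
}
```

Calling it with the paths of the edited job_classification.jsonnet and resource_request.jsonnet files returns 0 when both are clean and 1 when any placeholder remains.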

Start Decision Engine

At this point we are ready to start the Decision Engine service:

systemctl start decisionengine.service

Submit a test job

  • prepare a Condor submission file, mytest.submit, with the following content:
#  A test Condor submission file - mytest.submit
executable = /bin/hostname
universe = vanilla
+DESIRED_Sites = "ITB_FC_CE2"
log = test.log
output = test.out.$(Cluster).$(Process)
error = test.err.$(Cluster).$(Process)
queue 1
  • submit the test job
condor_submit mytest.submit
  • check jobs in the queue
condor_q
  • check for available glideins
condor_status

After the test job is submitted, it takes a few minutes (usually no more than 10) for glideins to become available and for the job to start running.
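
Rather than re-running condor_status by hand during this wait, the check can be polled. The following is a minimal sketch: wait_for and glideins_present are hypothetical helper names, and it assumes that any slot listed by condor_status on this node is a glidein.

```shell
# Hypothetical helper: retry a command until it succeeds.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = command to run
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Succeeds once condor_status lists at least one slot
# (assumed here to be a glidein).
glideins_present() {
    [ -n "$(condor_status -af Name 2>/dev/null)" ]
}
```

For example, `wait_for 20 30 glideins_present && condor_q` polls every 30 seconds for up to 10 minutes and then shows the queue once a glidein appears.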