CELPP on TSCC Cluster
This page describes the steps needed to run CELPP on the Triton Shared Computing Cluster (TSCC).
The easiest way to get CELPP running on TSCC is to create a Singularity image. Instructions for this can be found here.
Once the above is done, upload the image file to TSCC and put it in the $HOME/bin
directory. Then run the following (assuming the image file is named d3r.img):
chmod a+x $HOME/bin/d3r.img
Log into TSCC and create directories on TSCC Oasis filesystem by running these commands:
mkdir -p $HOME/bin
mkdir -p /oasis/tscc/scratch/$USER/data
mkdir -p /oasis/tscc/scratch/$USER/pdb
mkdir -p /oasis/tscc/scratch/$USER/archive
These directories will store the local copy of the PDB as well as the outputs of the CELPP runs.
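As a sanity check, the layout above can be recreated and verified with a short loop. In this sketch a temporary directory stands in for /oasis/tscc/scratch/$USER so it can run on any machine:

```shell
# Sketch of the scratch layout; a temporary directory stands in for
# /oasis/tscc/scratch/$USER so this can run anywhere.
base=$(mktemp -d)
for d in data pdb archive ; do
    mkdir -p "$base/$d"
done
# Confirm all three directories exist
for d in data pdb archive ; do
    test -d "$base/$d" || echo "missing: $d"
done
echo "layout ok under $base"
```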
Step 3 Create pdb download script
The following script downloads the PDB using rsync and tars the data into a single file for each week, since the Oasis filesystem performs poorly with large numbers of small files.
Write the following code to a file named $HOME/bin/pdb-data-updater.sh
on the TSCC cluster, replacing any <> placeholders (such as <PUT YOUR EMAIL ADDRESS HERE>) with valid values:
#!/bin/bash
CONTACTS="<PUT YOUR EMAIL ADDRESS HERE>"
if [ $# -ne 2 ] ; then
echo "$0 <tmpdir> <base download directory>"
exit 1
fi
base_dir=$1
pdb_dir_name="pdb.`date +%s`"
pdb_dir_name_tar="${pdb_dir_name}.tar"
pdb_dir="$base_dir/${pdb_dir_name}"
dest_dir="$2/pdb"
#
# TODO: If $dest_dir/latest_pdb exists and points to a
# pdb entry, need to copy that tar file to
# $base_dir, uncompress it and move its contents
# into $pdb_dir folder so that the rsync will
# run faster
#
rsync -rlpt -v -z --delete --port=33444 \
rsync.rcsb.org::ftp_data/structures/divided/pdb/ $pdb_dir
if [ $? != 0 ]; then
echo -e "RSYNC failed rsync.rcsb.org::ftp_data/structures/divided/pdb/\n\nSincerely,\n$0"
exit 1
fi
echo "Tarring $pdb_dir"
cd $base_dir
tar -cf $pdb_dir_name_tar $pdb_dir_name
if [ $? != 0 ] ; then
echo "Error running tar -cf $pdb_dir_name_tar $pdb_dir_name"
exit 2
fi
echo "Copying $pdb_dir_name_tar to $dest_dir"
cp $pdb_dir_name_tar $dest_dir/.
if [ $? != 0 ] ; then
echo "Error running cp $pdb_dir_name_tar $dest_dir/."
exit 3
fi
echo "Updating latest_pdb symbolic link"
if [ -e "$dest_dir/latest_pdb" ] ; then
rm $dest_dir/latest_pdb
if [ $? != 0 ] ; then
echo "Error running rm $dest_dir/latest_pdb"
exit 4
fi
fi
ln -s $dest_dir/$pdb_dir_name_tar $dest_dir/latest_pdb
if [ $? != 0 ] ; then
echo "Error running ln -s $dest_dir/$pdb_dir_name_tar $dest_dir/latest_pdb"
exit 5
fi
exit 0
Make the above script executable by running:
chmod a+x $HOME/bin/pdb-data-updater.sh
Write the following to $HOME/bin/pdb-data-updater.qsub,
replacing values within <> (such as <ACCOUNT>):
#!/bin/bash
#PBS -q condo
#PBS -N pdbdownload
#PBS -l nodes=1:ppn=1
#PBS -l walltime=4:00:00
#PBS -j oe
#PBS -o /oasis/tscc/scratch/<USER>/celpp/joblogs/pdbdownload.$PBS_JOBID.out
#PBS -w /oasis/tscc/scratch/<USER>/celpp
#PBS -V
#PBS -M <UCSD EMAIL ADDRESS>
#PBS -m abe
#PBS -A <ACCOUNT>
/usr/bin/time -v $HOME/bin/pdb-data-updater.sh /state/partition1/$USER/$PBS_JOBID /oasis/tscc/scratch/$USER/celpp
Make the above script executable by running:
chmod a+x $HOME/bin/pdb-data-updater.qsub
The following cron entry runs the PDB download at 8pm every Tuesday. Add it to cron on a TSCC login node; it's easiest to put all the cron entries on the same login node:
0 20 * * 2 /opt/torque/bin/qsub /home/$USER/bin/pdb-data-updater.qsub
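For reference, the five schedule fields of a cron entry are minute, hour, day-of-month, month, and weekday (0-6, with Sunday = 0). A quick sketch that splits the entry above into its fields:

```shell
# Field breakdown of the cron schedule "0 20 * * 2":
# minute 0, hour 20 (8pm), any day-of-month, any month, weekday 2 (Tuesday).
set -f                        # disable globbing so the *'s stay literal
schedule="0 20 * * 2"
set -- $schedule
minute=$1; hour=$2; dom=$3; month=$4; weekday=$5
set +f
echo "minute=$minute hour=$hour day-of-month=$dom month=$month weekday=$weekday"
# prints: minute=0 hour=20 day-of-month=* month=* weekday=2
```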
Step 6 Set up OpenEye license file
The OpenEye license file should be placed in the $HOME
directory and be visible only to the user.
Name this file oe_license.txt
and run the following command to restrict visibility:
chmod go-rwx $HOME/oe_license.txt
Also set the environment variable OE_LICENSE to $HOME/oe_license.txt.
This can be done automatically by adding the following to the $HOME/.bash_profile
file:
export OE_LICENSE=$HOME/oe_license.txt
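To confirm the chmod above had the intended effect, the group/other permission bits can be checked directly. In this sketch a temporary file stands in for the real $HOME/oe_license.txt so it runs anywhere:

```shell
# Verify group/other have no access to the license file. A temporary
# file stands in for $HOME/oe_license.txt for illustration.
lic=$(mktemp)
chmod go-rwx "$lic"
perms=$(stat -c %a "$lic")
# The last two octal digits (group, other) should both be 0, e.g. 600
echo "permissions: $perms"
```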
Create the following configuration files in the $HOME
directory:
box.config -- Click here for what to put in this file
rest.config -- Click here for what to put in this file
smtp.config -- Click here for what to put in this file
Since these files contain passwords, be sure to restrict their visibility with the following command:
chmod go-rwx $HOME/box.config $HOME/rest.config $HOME/smtp.config
Step 8 Create challenge generation script
Write the following to the $HOME/bin/genchallenge.qsub
file, replacing any <> text with valid values:
#!/bin/bash
#PBS -q home
#PBS -N genchallenge
#PBS -l nodes=1:ppn=3
#PBS -l walltime=96:00:00
#PBS -j oe
#PBS -o /oasis/tscc/scratch/<USER>/celpp/joblogs/genchall.$PBS_JOBID.out
#PBS -w /oasis/tscc/scratch/<USER>/celpp/testy
#PBS -V
#PBS -M <UCSD EMAIL ADDRESS>
#PBS -m abe
#PBS -A <ACCOUNT>
module load python/1
module load singularity
export SCHRODINGER=/opt/schrodinger
export SCHROD_LICENSE_FILE=<ADD SCHRODINGER LICENSE SERVER>
echo "Current time is `date` ... createchallenge job"
echo "Copying pdb from /oasis/tscc/scratch/$USER/celpp/pdb/latest_pdb to $TMPDIR"
cd $TMPDIR
pdb_dir=""
for Y in `seq 1 10` ; do
/usr/bin/time -p tar -xf /oasis/tscc/scratch/$USER/celpp/pdb/latest_pdb
ecode=$?
if [ $ecode == 0 ] ; then
pdb_dir=`find $TMPDIR -maxdepth 1 -name "pdb*" -type d`
break
fi
echo "`date` : Untar of pdb failed. Sleeping 60 seconds and trying again"
sleep 60
done
if [ "$pdb_dir" == "" ] ; then
echo "Error unable to uncompress /oasis/tscc/scratch/$USER/celpp/pdb/latest_pdb to $TMPDIR"
exit 1
fi
echo "PDB dir: $pdb_dir"
cd $pdb_dir
if [ $? != 0 ] ; then
echo "Error unable to cd to $pdb_dir"
exit 2
fi
echo "Uncompressing pdb files"
/usr/bin/time -p find . -name "*.gz" -exec gunzip {} \;
export MGL_ROOT=/usr/local/mgltools/
export PATH=$PATH:/opt/UCSF/Chimera64-1.10.2/bin
singularity run --bind /oasis --bind /proc --bind /state $HOME/d3r.img --importsleep 1200 --importretry 144 --blastnfiltertimeout 72000 --stage createchallenge --pdbdb $pdb_dir --compinchi http://ligand-expo.rcsb.org/dictionaries --ftpconfig $HOME/box.config --rdkitpython /opt/miniconda2 --log DEBUG --createweekdir --email <UCSD EMAIL ADDRESS> --smtpconfig $HOME/smtp.config /oasis/tscc/scratch/$USER/celpp/data
ecode=$?
echo "Done time is `date` and exit code is: $ecode"
exit $ecode
Make $HOME/bin/genchallenge.qsub executable
by running:
chmod a+x $HOME/bin/genchallenge.qsub
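The untar retry loop in genchallenge.qsub follows a simple pattern: try up to 10 times, sleeping 60 seconds after each failure. A standalone sketch of that pattern, with `true` standing in for the real tar command so it runs anywhere:

```shell
# Retry pattern used in genchallenge.qsub: up to 10 attempts with a
# 60-second pause between failures.
ok=""
attempts=0
for Y in `seq 1 10` ; do
    attempts=$((attempts + 1))
    true                          # stand-in for: tar -xf .../latest_pdb
    if [ $? -eq 0 ] ; then
        ok=yes
        break
    fi
    echo "`date` : attempt $attempts failed, sleeping 60 seconds"
    sleep 60
done
if [ "$ok" != "yes" ] ; then
    echo "all attempts failed"
    exit 1
fi
echo "succeeded on attempt $attempts"
# prints: succeeded on attempt 1
```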
Step 9 Add Challenge generation job to cron
This cron entry runs the genchallenge.qsub script we created in Step 8 at 9:01pm on Friday nights. Add it to cron on the TSCC login node; it's easiest to put all the cron entries on the same login node:
1 21 * * 5 . $HOME/.bash_profile;/opt/torque/bin/qsub $HOME/bin/genchallenge.qsub
Step 10 Create evaluation script
Write the following to the $HOME/bin/evaluation.qsub
file, replacing any <> text with valid values:
#!/bin/bash
#PBS -q home
#PBS -N evaluation
#PBS -l nodes=1:ppn=8
#PBS -l walltime=96:00:00
#PBS -j oe
#PBS -o /oasis/tscc/scratch/<USER>/celpp/joblogs/evaluation.$PBS_JOBID.out
#PBS -w /oasis/tscc/scratch/<USER>/celpp/testy
#PBS -V
#PBS -M <UCSD EMAIL ADDRESS>
#PBS -m abe
#PBS -A <ACCOUNT>
module load python/1
module load singularity
export SCHRODINGER=/opt/schrodinger
export SCHROD_LICENSE_FILE=<ADD SCHRODINGER LICENSE SERVER>
echo "Current time is `date` ... evaluation job"
c_epoch=`date +%s`
link_age=`stat -c %Z /oasis/tscc/scratch/$USER/celpp/pdb/latest_pdb`
age_of_symlink=`echo "$c_epoch - $link_age" | bc -l`
while [ $age_of_symlink -gt 259200 ] ; do
echo "Sleeping 600 seconds"
sleep 600
c_epoch=`date +%s`
link_age=`stat -c %Z /oasis/tscc/scratch/$USER/celpp/pdb/latest_pdb`
age_of_symlink=`echo "$c_epoch - $link_age" | bc -l`
done
echo "Copying pdb from /oasis/tscc/scratch/$USER/celpp/pdb/latest_pdb to $TMPDIR"
cd $TMPDIR
pdb_dir=""
for Y in `seq 1 10` ; do
/usr/bin/time -p tar -xf /oasis/tscc/scratch/$USER/celpp/pdb/latest_pdb
ecode=$?
if [ $ecode == 0 ] ; then
pdb_dir=`find $TMPDIR -maxdepth 1 -name "pdb*" -type d`
break
fi
echo "`date` : Untar of pdb failed. Sleeping 60 seconds and trying again"
sleep 60
done
if [ "$pdb_dir" == "" ] ; then
echo "Error unable to uncompress /oasis/tscc/scratch/$USER/celpp/pdb/latest_pdb to $TMPDIR"
exit 1
fi
echo "PDB dir: $pdb_dir"
cd $pdb_dir
if [ $? != 0 ] ; then
echo "Error unable to cd to $pdb_dir"
exit 2
fi
echo "Uncompressing pdb files"
/usr/bin/time -p find . -name "*.gz" -exec gunzip {} \;
export MGL_ROOT=/usr/local/mgltools/
export PATH=$PATH:/opt/UCSF/Chimera64-1.10.2/bin
singularity run --bind /oasis --bind /proc --bind /state $HOME/d3r.img --stage evaluation,postevaluation --pdbdb $pdb_dir --evaluation evaluate.py --ftpconfig $HOME/box.config --rdkitpython /opt/miniconda2 --log DEBUG --createweekdir --email <UCSD EMAIL ADDRESS> --smtpconfig $HOME/smtp.config --websiteserviceconfig $HOME/rest.config /oasis/tscc/scratch/$USER/celpp/data
ecode=$?
echo "Done time is `date` and exit code is: $ecode"
exit $ecode
Make $HOME/bin/evaluation.qsub executable
by running:
chmod a+x $HOME/bin/evaluation.qsub
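The stat-based freshness check at the top of evaluation.qsub can be exercised on its own. In this sketch a freshly created temporary file stands in for latest_pdb, so its computed age is well under the 259200-second (3-day) threshold and the job would proceed:

```shell
# Standalone sketch of the freshness check in evaluation.qsub. A fresh
# temp file stands in for /oasis/tscc/scratch/$USER/celpp/pdb/latest_pdb.
link=`mktemp`
c_epoch=`date +%s`
link_age=`stat -c %Z "$link"`
age_of_symlink=$((c_epoch - link_age))
if [ $age_of_symlink -gt 259200 ] ; then
    echo "stale: evaluation.qsub would sleep 600 seconds and re-check"
else
    echo "fresh: evaluation.qsub would proceed"
fi
```

The real script uses `bc` for the subtraction; plain shell arithmetic, as here, gives the same result.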
Step 11 Add Evaluation generation job to cron
This cron entry runs the evaluation.qsub script we created in the previous step at 1am on Wednesday mornings. Add it to cron on the TSCC login node; it's easiest to put all the cron entries on the same login node:
0 1 * * 3 /opt/torque/bin/qsub $HOME/bin/evaluation.qsub
Step 12 Create external submission download script
Write the following to $HOME/bin/extsubdownload_datamover.sh, replacing any <> text with valid values:
#!/bin/bash -l
module load python/1
module load singularity
export SCHRODINGER=/opt/schrodinger
export SCHROD_LICENSE_FILE=<ADD SCHRODINGER LICENSE SERVER>
celppdir="/oasis/tscc/scratch/$USER/celpp/data"
cd $celppdir
singularity run --bind /oasis --bind /proc --bind /state $HOME/d3r.img --stage extsubmission --log DEBUG --email <UCSD EMAIL ADDRESS> --ftpconfig $HOME/box.config --smtpconfig $HOME/smtp.config $celppdir
ecode=$?
echo "Done time is `date` and exit code is: $ecode"
exit $ecode
Make the above script executable by running the following command:
chmod a+x $HOME/bin/extsubdownload_datamover.sh
Step 13 Add external submission download script to datamover cron
Log into the TSCC data mover server. This can be done by logging into a login node and then running ssh tscc-dm1
from that machine. Add the following to cron, which will run the external submission download at 3:10pm every Tuesday:
10 15 * * 2 $HOME/bin/extsubdownload_datamover.sh >> $HOME/bin/extsubdownload_datamover.sh.log 2>&1
NOTE: If this is production, change the time to 3:00pm by changing the 10 above to 0.
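Since the cron entry appends stdout and stderr to the .log file, the most recent exit code can be recovered from the log's final "exit code is:" line. A sketch, using a temporary file with an illustrative log line in place of the real log:

```shell
# The datamover script ends with "... exit code is: <N>"; this pulls the
# most recent code out of the appended log. A temp file stands in for
# $HOME/bin/extsubdownload_datamover.sh.log.
log=$(mktemp)
echo "Done time is Tue Jan 1 15:10:00 PST 2019 and exit code is: 0" >> "$log"
last_code=$(grep 'exit code is:' "$log" | tail -1 | sed 's/.*exit code is: //')
echo "last run exit code: $last_code"
# prints: last run exit code: 0
```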