Merge pull request #71 from NCATS-Gamma/prevent-redos
The previous approach, splitting the input file into chunks, made it difficult to identify which records had already been generated and which remained to be processed. This PR replaces that system with one in which each run is told which rows to process, and adds tools so that the operator does not have to specify those rows by hand. Specifically, this PR:
- Modifies RoboCORD.scala so that, rather than specifying the "chunk" ID, you specify which rows should be processed in that run.
- Writes output to an intermediate file (named `result_from_${startIndex}_until_${endIndex}.in-progress.txt`) before renaming it to a final output file (named `result_from_${startIndex}_until_${endIndex}.tsv`). This doesn't matter much right now, since I think we store everything in memory before writing it all out to disk, but it will become important once we replace processing with FS2 streams.
- Modifies robocord.job so that a job can be submitted with the number of chunks to run (using the syntax `--array=0-[number of chunks]`), and the script then calculates which rows should be processed in each run.
- Adds a new program called RoboCORDManager that goes through the list of output files produced and confirms that every row in the metadata.csv file has been processed. If not, it can batch the remaining rows into jobs and start them using the `robocord-sbatch.sh` script, which itself uses `srun`.

These changes are documented in the README file.
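The coverage check described in the last bullet can be sketched in shell. This is not RoboCORDManager's actual code, which this page does not show. It assumes only the `result_from_${startIndex}_until_${endIndex}.tsv` naming described above and treats the `until` index as exclusive. The function name `find_missing_ranges` is hypothetical.

```shell
# Hypothetical helper: print the row ranges not covered by any finished
# output file, one "<start> <end>" pair per line (<end> exclusive).
# Usage: find_missing_ranges OUTPUT_DIR TOTAL_ROWS
# Files still named *.in-progress.txt do not match the glob, so rows
# from unfinished runs are reported as missing.
find_missing_ranges() {
  dir=$1
  total=$2
  for f in "$dir"/result_from_*_until_*.tsv; do
    [ -e "$f" ] || continue            # glob matched nothing
    name=${f##*/}                      # strip the directory
    name=${name#result_from_}          # strip the prefix
    name=${name%.tsv}                  # strip the extension
    echo "${name%%_until_*} ${name##*_until_}"
  done | sort -n | {
    next=0                             # first row not yet known to be covered
    while read -r start end; do
      if [ "$start" -gt "$next" ]; then
        echo "$next $start"            # gap before this file
      fi
      if [ "$end" -gt "$next" ]; then
        next=$end
      fi
    done
    if [ "$next" -lt "$total" ]; then
      echo "$next $total"              # rows after the last file
    fi
  }
}
```

Each printed pair could then be passed on as `--from-row` and `--until-row` for a follow-up run.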
Showing 8 changed files with 280 additions and 41 deletions.
@@ -0,0 +1,29 @@
+#!/bin/bash
+
+# All arguments are passed on to the RoboCORD instance.
+
+sbatch <<EOT
+#!/bin/bash
+#
+#SBATCH --job-name=RoboCORD
+#SBATCH --output=robocord-output/log-output-%A.txt
+#SBATCH --error=robocord-output/log-error-%A.txt
+#SBATCH --cpus-per-task 16
+#SBATCH --mem=50000
+#SBATCH --time=2:00:00
+#SBATCH [email protected]
+set -e # Exit immediately if a pipeline fails.
+export JAVA_OPTS="-Xmx50G"
+export MY_SCIGRAPH=omnicorp-scigraph-\$SLURM_JOB_ID
+echo "Duplicating omnicorp-scigraph so we can use it on multiple clusters"
+cp -R omnicorp-scigraph "scigraphs/\$MY_SCIGRAPH"
+echo "Starting RoboCORD with arguments: --neo4j-location scigraphs/\$MY_SCIGRAPH $@"
+sbt "runMain org.renci.robocord.RoboCORD --neo4j-location scigraphs/\$MY_SCIGRAPH $@"
+rm -rf "scigraphs/\$MY_SCIGRAPH"
+echo "Deleted duplicated omnicorp-scigraph"
+EOT
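The script above mixes escaped (`\$SLURM_JOB_ID`, `\$MY_SCIGRAPH`) and unescaped (`$@`) expansions inside an unquoted `<<EOT` heredoc. A minimal sketch of the difference follows. It uses `cat` in place of `sbatch` so that it runs without Slurm, and the function name `render_job` is hypothetical.

```shell
# Hypothetical stand-in for the submission script: it prints the generated
# job script instead of piping it to sbatch, so the expansion timing is visible.
render_job() {
  cat <<EOT
echo "job id: \$SLURM_JOB_ID"
echo "arguments: $@"
EOT
}

# $@ is expanded now, by the submitting shell; \$SLURM_JOB_ID is emitted
# as a literal $SLURM_JOB_ID and would only be expanded when the job runs.
render_job --from-row 0 --until-row 100
```

Because `$@` is flattened into the text of the job script, an argument containing spaces would be split again when the job runs.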
@@ -1,23 +1,38 @@
 #!/bin/bash
 #
+# This script should be run:
+# sbatch --array=0-3999 robocord.job
+# Where the total number of jobs (3999 in the example above) can be
+# any number.
+#
 #SBATCH --job-name=RoboCORD
 #SBATCH --output=robocord-output/log-output-%a.txt
 #SBATCH --error=robocord-output/log-error-%a.txt
 #SBATCH --cpus-per-task 16
 #SBATCH --mem=50000
-#SBATCH --time=12:00:00
+#SBATCH --time=4:00:00
 #SBATCH [email protected]

 set -e # Exit immediately if a pipeline fails.

 export JAVA_OPTS="-Xmx50G"
 export MY_SCIGRAPH=omnicorp-scigraph-$SLURM_ARRAY_TASK_ID

-echo "Duplicating omnicorp-scigraph so we can use it on multiple clusters"
-cp -R omnicorp-scigraph "scigraphs/$MY_SCIGRAPH"
+export METADATA_SIZE=$(wc -l < robocord-data/metadata.csv)
+export CHUNK_SIZE=$(($METADATA_SIZE/$SLURM_ARRAY_TASK_MAX))
+export FROM_ROW=$(($SLURM_ARRAY_TASK_ID * $CHUNK_SIZE))
+export UNTIL_ROW=$(($FROM_ROW + $CHUNK_SIZE))
+export OUTPUT_FILENAME=robocord-output/result_from_${FROM_ROW}_until_${UNTIL_ROW}.tsv

+if [ -f $OUTPUT_FILENAME ]; then
+  echo Output filename $OUTPUT_FILENAME already exists, skipping.
+else
+  echo "Duplicating omnicorp-scigraph so we can use it on multiple clusters"
+  cp -R omnicorp-scigraph "scigraphs/$MY_SCIGRAPH"
+
-echo "Starting RoboCORD"
-sbt "runMain org.renci.robocord.RoboCORD --metadata robocord-data/metadata.csv --current-chunk $SLURM_ARRAY_TASK_ID --total-chunks $SLURM_ARRAY_TASK_MAX --output-prefix robocord-output/results --neo4j-location scigraphs/$MY_SCIGRAPH robocord-data"
+  echo "Starting RoboCORD from row $FROM_ROW until $UNTIL_ROW."
+  sbt "runMain org.renci.robocord.RoboCORD --metadata robocord-data/metadata.csv --from-row $FROM_ROW --until-row $UNTIL_ROW --output-prefix robocord-output/result --neo4j-location scigraphs/$MY_SCIGRAPH robocord-data"

-rm -rf "scigraphs/$MY_SCIGRAPH"
-echo "Deleted duplicated omnicorp-scigraph"
+  rm -rf "scigraphs/$MY_SCIGRAPH"
+  echo "Deleted duplicated omnicorp-scigraph"
+fi
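The row arithmetic in the job script above can be checked by hand. The sketch below reproduces the same calculation with the values passed as arguments rather than read from Slurm or from `wc -l`. The function name `row_range` is hypothetical.

```shell
# Hypothetical helper reproducing the arithmetic in the job script.
# Usage: row_range TASK_ID TASK_MAX METADATA_SIZE
# Prints "<from_row> <until_row>" for one array task.
row_range() {
  task_id=$1
  task_max=$2
  metadata_size=$3
  chunk_size=$((metadata_size / task_max))   # integer division, as in the job script
  from_row=$((task_id * chunk_size))
  until_row=$((from_row + chunk_size))
  echo "$from_row $until_row"
}

# 100 rows submitted with --array=0-3, i.e. four tasks with
# SLURM_ARRAY_TASK_MAX=3:
for id in 0 1 2 3; do
  row_range "$id" 3 100
done
```

This prints the ranges 0 to 33, 33 to 66, 66 to 99 and 99 to 132. `--array=0-N` starts N+1 tasks while the chunk size is divided by N, so the last task's range can run past the end of the file. In the opposite case, with many tasks and few rows (for example 10 rows with `--array=0-7`, where the chunk size is 1), trailing rows fall outside every range. Those would presumably be among the rows that RoboCORDManager is meant to detect and resubmit. Note also that `--array=0-0` would make the divisor zero.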