Slurm processes_per_node (#103)
* processes_per_node

* processes per node docs
jmineau authored Oct 3, 2024
1 parent 0b81580 commit bc45316
Showing 4 changed files with 15 additions and 8 deletions.
13 changes: 7 additions & 6 deletions docs/configuration.md
@@ -13,12 +13,13 @@ The following parameters are found in `r/run_stilt.r` and are used to configure

### Parallel simulation settings

-| Arg | Description |
-| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `n_nodes` | If using SLURM for job submission, number of nodes to utilize |
-| `n_cores` | Number of cores per node to parallelize simulations by receptor locations and times |
-| `slurm` | Logical indicating the use of rSLURM to submit job(s). When using SLURM, a `<stilt_wd>/_rslurm` directory is created to contain the SLURM submission scripts and node-specific log files. |
-| `slurm_options` | Named list of options passed to `sbatch` using `rslurm::slurm_apply()`. This typically includes `time`, `account`, and `partition` values |
+| Arg | Description |
+| -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `n_nodes` | If using SLURM for job submission, number of nodes to utilize |
+| `n_cores` | Number of cores per node to parallelize simulations by receptor locations and times |
+| `processes_per_node` | Number of processes to run on each node. Can be set higher than n_cores for nodes which support [hyperthreading](https://scicomp.ethz.ch/wiki/Using_hyperthreading) |
+| `slurm` | Logical indicating the use of rSLURM to submit job(s). When using SLURM, a `<stilt_wd>/_rslurm` directory is created to contain the SLURM submission scripts and node-specific log files. |
+| `slurm_options` | Named list of options passed to `sbatch` using `rslurm::slurm_apply()`. This typically includes `time`, `account`, and `partition` values |
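
As a rough illustration of how these settings might look in `r/run_stilt.r` on a hyperthreaded cluster, here is a minimal sketch. The node count, core count, threads-per-core factor, and the `account` and `partition` values are all assumptions chosen for the example, not values taken from this commit.

```r
# Parallel simulation settings (illustrative values only, not recommendations)
n_nodes <- 2                        # assumed: two SLURM nodes
n_cores <- 16                       # assumed: 16 physical cores per node
processes_per_node <- 2 * n_cores   # assumed: 2 hardware threads per core
slurm   <- n_nodes > 1
slurm_options <- list(
  time      = '300:00:00',
  account   = 'my-account',         # hypothetical account name
  partition = 'my-partition'        # hypothetical partition name
)
```

Leaving `processes_per_node` at its default of `n_cores` keeps the previous behavior.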

### Receptor placement

2 changes: 1 addition & 1 deletion docs/execution.md
@@ -48,7 +48,7 @@ Rscript r/run_stilt.r

![Parallel simulations with SLURM](static/img/chart-parallel.png)

-If `slurm = TRUE` STILT will distribute the simulations across `n_nodes` using `n_cores` on each node (total parallel worker count is `n_nodes * n_cores`). This will create a `<stilt_wd>/_rslurm` directory which contains SLURM submission scripts and logs from each node.
+If `slurm = TRUE` STILT will distribute the simulations across `n_nodes` using `n_cores` on each node (total parallel worker count is `n_nodes * n_cores`). This will create a `<stilt_wd>/_rslurm` directory which contains SLURM submission scripts and logs from each node. For nodes which support [hyperthreading](https://scicomp.ethz.ch/wiki/Using_hyperthreading), the job allocation per node can be increased beyond the number of cores per node via `processes_per_node`.
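
The worker-count arithmetic can be sketched as follows. This assumes that rslurm starts `processes_per_node` R processes on each node, which is how its `slurm_apply()` documentation describes the argument; under that assumption the `n_nodes * n_cores` total holds only while `processes_per_node` is left at its default. The helper `total_workers()` is hypothetical and not part of STILT.

```r
# Hypothetical helper (not part of STILT): total parallel processes for a run.
# Assumes rslurm launches `processes_per_node` processes on every node.
total_workers <- function(n_nodes, n_cores, processes_per_node = n_cores) {
  n_nodes * processes_per_node
}

# Default: processes_per_node falls back to n_cores, so 2 * 16 = 32
default_total <- total_workers(n_nodes = 2, n_cores = 16)

# Hyperthreaded nodes, assuming 2 threads per core: 2 * 32 = 64
hyperthreaded_total <- total_workers(n_nodes = 2, n_cores = 16,
                                     processes_per_node = 32)
```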

```bash
Rscript r/run_stilt.r
2 changes: 2 additions & 0 deletions r/run_stilt.r
@@ -12,6 +12,7 @@ lib.loc <- .libPaths()[1]
# Parallel simulation settings
n_cores <- 1
n_nodes <- 1
+processes_per_node <- n_cores
slurm <- n_nodes > 1
slurm_options <- list(
time = '300:00:00',
@@ -189,6 +190,7 @@ stilt_apply(FUN = simulation_step,
slurm_options = slurm_options,
n_cores = n_cores,
n_nodes = n_nodes,
+processes_per_node = processes_per_node,
before_footprint = list(before_footprint),
before_trajec = list(before_trajec),
lib.loc = lib.loc,
6 changes: 5 additions & 1 deletion r/src/stilt_apply.r
@@ -11,6 +11,8 @@
#' passed to rslurm::slurm_apply()
#' @param n_nodes number of nodes to submit SLURM jobs to using \code{sbatch}
#' @param n_cores number of CPUs to utilize per node
+#' @param processes_per_node number of processes to run per node. Can be set
+#' higher than n_cores for nodes which support hyperthreading
#' @param ... arguments to FUN
#'
#' @return if using slurm, returns sjob information. Otherwise, will return a
@@ -19,7 +21,8 @@
#' @export

stilt_apply <- function(FUN, slurm = F, slurm_options = list(),
-n_nodes = 1, n_cores = 1, ...) {
+n_nodes = 1, n_cores = 1, processes_per_node = n_cores,
+...) {

if (!slurm && n_nodes > 1) {
stop('n_nodes > 1 but slurm is disabled. ',
@@ -53,6 +56,7 @@ stilt_apply <- function(FUN, slurm = F, slurm_options = list(),
jobname = basename(getwd()), pkgs = 'base',
nodes = n_nodes,
cpus_per_node = n_cores,
+processes_per_node = processes_per_node,
preschedule_cores = F,
slurm_options = slurm_options)
return(invisible(sjob))