
Halve the median runtime of gs_power_ahr() #295

Merged 1 commit from ahr-and-pw-info into Merck:main on Dec 15, 2023

Conversation

@jdblischak (Collaborator) commented Dec 14, 2023

Inspired by Issue #219, I was able to halve the runtime of the gs_power_ahr() example in the documentation for gs_spending_bound(). I achieved this by:

  • converting ahr() to data.table
  • converting almost all of pw_info() to data.table (there is a tricky grouped arrange operation at the end that I'd need to investigate further)
  • updating expected_event() to return a data frame instead of a tibble
  • updating pw_info() to pre-allocate a list and combine it once at the end, instead of growing the data frame via rbind() within each loop iteration (see the sketch after this list)
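
A minimal sketch of the pre-allocation pattern (hypothetical variable names, not the actual pw_info() internals): fill a pre-allocated list inside the loop, then combine it once with data.table::rbindlist().

library("data.table")

n_intervals <- 5
pieces <- vector(mode = "list", length = n_intervals)  # pre-allocate the list
for (i in seq_len(n_intervals)) {
  # build one small data frame per iteration instead of rbind()-ing onto a
  # growing data frame
  pieces[[i]] <- data.frame(interval = i, event = 10 * i)
}
result <- rbindlist(pieces)  # combine once at the end
setDF(result)                # hand back a plain data frame rather than a tibble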

While optimizing, I performed the following cleanups:

  • added library(gsDesign) to tests/testthat/test-independent-gs_power_ahr.R so that it can be run via devtools::test()
  • rearranged the NAMESPACE so that := can be used from both the data.table and rlang packages while still keeping R CMD check happy
  • added some missing @importFrom dplyr filter tags (I didn't get to all of them; see the sketch after this list)
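
The roxygen2 tags behind the @importFrom bullets look roughly like this (a generic sketch with a hypothetical helper, not code from this PR):

# roxygen2 @importFrom tags placed above a function definition generate the
# corresponding importFrom() directives in NAMESPACE when devtools::document()
# is run.
#' @importFrom dplyr filter
#' @importFrom data.table :=
keep_observed <- function(df) {
  dplyr::filter(df, !is.na(event))
}
# After devtools::document(), NAMESPACE gains:
#   importFrom(dplyr, filter)
#   importFrom(data.table, ":=")

The benchmark below installs the package at commit 50067c4 (before these changes), times gs_power_ahr(), and then re-times the same call with this branch loaded via devtools::load_all():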
library("microbenchmark")

# Install and benchmark the package as of commit 50067c4, before these changes
remotes::install_github("Merck/gsDesign2@50067c4", INSTALL_opts = "--with-keep.source")
library("gsDesign2")
packageVersion("gsDesign2")
## [1] ‘1.1.0’

microbenchmark(
  gs_power_ahr = gs_power_ahr(
    analysis_time = c(12, 24, 36),
    event = c(30, 40, 50),
    binding = TRUE,
    upper = gs_spending_bound,
    upar = list(sf = gsDesign::sfLDOF, total_spend = 0.025, param = NULL, timing = NULL),
    lower = gs_spending_bound,
    lpar = list(sf = gsDesign::sfLDOF, total_spend = 0.025, param = NULL, timing = NULL)
  ),
  times = 10
)
## Unit: milliseconds
##          expr      min       lq     mean   median       uq      max neval
##  gs_power_ahr 808.6144 840.0866 940.8113 925.5892 988.4552 1201.223    10

# Switch to the development version containing the optimizations
detach("package:gsDesign2")
devtools::load_all()
packageVersion("gsDesign2")
## [1] ‘1.1.0.1’

microbenchmark(
  gs_power_ahr = gs_power_ahr(
    analysis_time = c(12, 24, 36),
    event = c(30, 40, 50),
    binding = TRUE,
    upper = gs_spending_bound,
    upar = list(sf = gsDesign::sfLDOF, total_spend = 0.025, param = NULL, timing = NULL),
    lower = gs_spending_bound,
    lpar = list(sf = gsDesign::sfLDOF, total_spend = 0.025, param = NULL, timing = NULL)
  ),
  times = 10
)
## Unit: milliseconds
##          expr      min       lq     mean   median       uq      max neval
##  gs_power_ahr 405.3252 427.6431 507.1905 431.4219 538.8579 779.9118    10

# Ratio of the median runtimes (before / after)
925 / 431
## [1] 2.146172

@jdblischak self-assigned this on Dec 14, 2023
@nanxstats (Collaborator) commented:
This looks good from a technical perspective. We should discuss this at the meeting this week, especially the use of data.table.

@nanxstats (Collaborator) left a review:
Nice speed improvements. It's exciting that we have decided to use data.table to accelerate the gsDesign2 backend, too!

It seems the only explicit rlang usage (and := usage) is in R/utility_tidy_tbl.R:

dplyr::rename(!!aname := !!"_alab")

I think it would be ideal to use an alternative approach for the line above, so that (1) := is only ever used by data.table, (2) rlang can be removed from Imports, and (3) the ignore_unused_imports() workaround can be removed.
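
For what it's worth, a couple of rlang-free ways to do that rename (a sketch, assuming the intent is simply to rename the "_alab" column to the name stored in aname; the hypothetical objects below stand in for the ones in R/utility_tidy_tbl.R):

aname <- "Analysis"
tbl <- data.frame(`_alab` = 1:3, check.names = FALSE)

# Option 1: base R
names(tbl)[names(tbl) == "_alab"] <- aname

# Option 2: data.table, which renames by reference and fits this PR's direction
# data.table::setnames(tbl, old = "_alab", new = aname)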

@nanxstats merged commit 2ba6c33 into Merck:main on Dec 15, 2023
8 checks passed
@jdblischak deleted the ahr-and-pw-info branch on Dec 16, 2023