Ensure integer sample size and number of events #472

Open
LittleBeannie opened this issue Oct 23, 2024 · 0 comments

LittleBeannie commented Oct 23, 2024

Problem overview

The following code generates an integer design, i.e., a design with an integer number of events and a sample size that is a multiple of 2.

library(gsDesign2)

enroll_rate <- define_enroll_rate(duration = c(2, 2, 2, 6), rate = 1:4)
fail_rate <- define_fail_rate(
  duration = Inf, fail_rate = log(2) / 10, hr = 0.7, dropout_rate = 0.001
)

alpha <- 0.025 
beta <- 0.1
ratio <- 1 

x <- gs_design_ahr(
  enroll_rate = enroll_rate, fail_rate = fail_rate,
  ratio = ratio,
  beta = beta,
  alpha = alpha,
  # Information fraction at analyses and trial duration
  info_frac = c(0.6, 0.8, 1), 
  analysis_time = 48,
  # Function and parameter(s) for upper spending bound
  upper = gs_spending_bound, 
  upar = list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL),
  test_upper = c(FALSE, TRUE, TRUE),
  lower = gs_spending_bound, 
  lpar = list(sf = gsDesign::sfHSD, total_spend = beta, param = -4),
  test_lower = c(TRUE, FALSE, FALSE),
  binding = FALSE
)

xi <- x |> to_integer()

Although the design uses equal randomization (ratio = 1), the statistical information under H0 is not exactly equal to events / 4.
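For reference, the expected relationship can be written as a small helper. This is only an illustrative sketch: `info_h0()` is a hypothetical name, not a gsDesign2 function, and it uses the standard result that the null-hypothesis information for a log-rank-type test is events × ratio / (1 + ratio)^2, which reduces to events / 4 when ratio = 1. The 342 events are the final-analysis events used later in this issue.

```r
# Hypothetical helper (not part of gsDesign2): information under H0
# for a given event count and randomization ratio (experimental:control).
info_h0 <- function(event, ratio = 1) {
  event * ratio / (1 + ratio)^2
}

# With equal randomization this is simply event / 4.
info_h0(342) # 85.5
```

If the info0 reported by the integer design differs from this value, the event count or enrollment used internally is not exactly what was requested.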


Root cause

The root cause lies in gs_info_ahr(). Although the sample size is an integer (denoted N), sum(enroll_rate$rate * enroll_rate$duration) may no longer equal N exactly; it is only very close to N.

y <- gs_info_ahr(
  enroll_rate = xi$enroll_rate,
  fail_rate = xi$fail_rate,
  ratio = xi$input$ratio,
  event = 342, # events at the final analysis (FA)
  analysis_time = NULL
)



Reasons

When gs_info_ahr() is called with event but without analysis_time, it does the following:

  • First, it calculates the expected time to reach the target number of events using the expected_time() function.
  • Then, it calculates the statistical information at that expected time using the ahr() function.
  • The ahr() function calculates the statistical information based on enroll_rate, fail_rate, total_duration and ratio.
  • The problem lies in the enroll_rate passed to ahr().
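The two steps above can be sketched with the exported gsDesign2 functions expected_time() and ahr(). This is a rough illustration rather than the internal code of gs_info_ahr(), and the enrollment rates are an assumption: they are rebuilt by hand from the planned rates and only approximate xi$enroll_rate.

```r
if (requireNamespace("gsDesign2", quietly = TRUE)) {
  library(gsDesign2)

  # Assumption: hand-scaled rates standing in for xi$enroll_rate
  enroll_rate <- define_enroll_rate(
    duration = c(2, 2, 2, 6),
    rate = 386 / 36 * c(1, 2, 3, 4)
  )
  fail_rate <- define_fail_rate(
    duration = Inf, fail_rate = log(2) / 10, hr = 0.7, dropout_rate = 0.001
  )

  # Step 1: expected calendar time at which 342 events are reached
  et <- expected_time(
    enroll_rate = enroll_rate, fail_rate = fail_rate,
    ratio = 1, target_event = 342
  )

  # Step 2: statistical information at that expected time
  res <- ahr(
    enroll_rate = enroll_rate, fail_rate = fail_rate,
    total_duration = et$time, ratio = 1
  )

  # Compare the reported info0 with the nominal 342 / 4
  print(res[, c("time", "event", "info0")])
  print(res$info0 - 342 / 4)
}
```

Both steps take the same enroll_rate, so any rounding error in the scaled rates carries through to the reported event count and information.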

The planned relative enrollment rates are c(1, 2, 3, 4), which give a sample size of (c(1, 2, 3, 4) * c(2, 2, 2, 6)) |> sum() = 36. Since the integer sample size is 386, the planned relative enrollment rates are scaled to 386 / 36 * c(1, 2, 3, 4).

Although sum(386 / 36 * c(1, 2, 3, 4) * c(2, 2, 2, 6)) = 386 holds in exact arithmetic, it may not hold exactly in floating-point arithmetic.
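The discrepancy can be checked in base R with the numbers above. This is a minimal sketch: the variable names are illustrative, and the rounding step at the end is only one conceivable workaround, not necessarily the fix proposed in this issue. Whether the exact comparison returns TRUE depends on the platform's floating-point rounding, so no output is claimed for it.

```r
duration <- c(2, 2, 2, 6)
rel_rate <- c(1, 2, 3, 4)
n_target <- 386

# Scale the relative rates so that enrollment nominally totals n_target
rate <- n_target / sum(rel_rate * duration) * rel_rate
n_implied <- sum(rate * duration)

# Exact comparison: may be TRUE or FALSE depending on rounding error
n_implied == n_target
n_implied - n_target

# Tolerance-based comparison is robust to rounding error
isTRUE(all.equal(n_implied, n_target))

# Illustrative workaround (an assumption): snap back to the integer
n_rounded <- round(n_implied)
```

Any downstream code that relies on the implied total being an exact integer should either compare with a tolerance or round explicitly.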


Possible solution

image

image


Some references from gsDesign:
