Efficacy and futility boundary update
Yujie Zhao and Keaven M. Anderson
Source: vignettes/articles/story-update-boundary.Rmd
Design assumptions

We assume two analyses: an interim analysis (IA) and a final analysis (FA). The IA is planned 20 months after opening enrollment, followed by the FA at month 36. The planned enrollment period spans 14 months, with the first 2 months having an enrollment rate of 1/3 the final rate, the next 2 months a rate of 2/3 of the final rate, and the final rate for the remaining 10 months. To obtain the targeted 90% power, these rates will be multiplied by a constant. The control arm is assumed to follow an exponential distribution with a median of 9 months, and the dropout rate is 0.0001 per month regardless of treatment group. Finally, the experimental treatment group is piecewise exponential with a 3-month delayed treatment effect; that is, in the first 3 months HR = 1 and the HR is 0.6 thereafter.

We use the null hypothesis information for boundary crossing probability calculations under both the null and alternate hypotheses. This also implies that the null hypothesis information is used for the information fraction in the spending functions that derive the design.
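The code in the following sections refers to objects such as alpha, beta, ratio, enroll_rate, fail_rate, and analysis_time without showing where they are defined. The sketch below is one way to set them up from the assumptions stated above. It assumes the gsDesign2 helpers define_enroll_rate() and define_fail_rate(); the exact setup used to produce the reported results is not shown in this article, so treat this as an illustration.

```r
library(gsDesign2)

# One-sided Type I error, Type II error (1 - 90% power), randomization ratio
alpha <- 0.025
beta <- 0.1
ratio <- 1

# Enrollment over 14 months: relative rates 1/3, 2/3, then the full rate.
# Only the relative sizes matter here, since the design step rescales them
# to reach the targeted power.
enroll_rate <- define_enroll_rate(
  duration = c(2, 2, 10),
  rate = (1:3) / 3
)

# Control median of 9 months (hazard = log(2) / 9), HR = 1 for the first
# 3 months and 0.6 afterwards, dropout rate of 0.0001 per month
fail_rate <- define_fail_rate(
  duration = c(3, Inf),
  fail_rate = log(2) / 9,
  hr = c(1, 0.6),
  dropout_rate = 0.0001
)

# Planned calendar times (months) of the IA and the FA
analysis_time <- c(20, 36)
```

The enrollment rates are relative because, as noted above, they are multiplied by a constant to obtain the targeted power.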
One-sided design

For the design, we have efficacy bounds at both the IA and FA. We use the Lan and DeMets (1983) spending function with a total alpha of 0.025, which approximates an O’Brien-Fleming bound.
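To make the spending function concrete, here is a minimal base R sketch of the Lan-DeMets O’Brien-Fleming-type spending function, \(\alpha(t) = 2 - 2\Phi(z_{1-\alpha/2}/\sqrt{t})\), which is the form gsDesign::sfLDOF computes with its default parameter. The helper name ldof_spend is hypothetical, and the information fraction 193 / 297 is taken from the planned events reported for this design.

```r
# Lan-DeMets O'Brien-Fleming-type spending: cumulative alpha spent at
# information fraction t (hypothetical helper, base R only)
ldof_spend <- function(t, alpha) {
  2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(t))
}

alpha <- 0.025
t_ia <- 193 / 297 # planned IA events / planned FA events

# Cumulative alpha spent at the IA and the matching first-analysis Z bound
alpha_ia <- ldof_spend(t_ia, alpha)
z_ia <- qnorm(1 - alpha_ia)

round(alpha_ia, 4) # 0.0054
round(z_ia, 3)     # 2.547
```

At the first analysis the bound is simply the normal quantile of the alpha spent so far. Later bounds also depend on the correlation between analyses, so they need the numerical integration that gs_design_ahr() and gs_power_npe() perform.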
Planned design

# Design functions (gs_design_ahr(), gs_spending_bound(), gs_b(), to_integer())
library(gsDesign2)

upper <- gs_spending_bound
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)

x <- gs_design_ahr(
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  alpha = alpha,
  beta = beta,
  info_frac = NULL,
  info_scale = "h0_info",
  analysis_time = analysis_time,
  ratio = ratio,
  upper = gs_spending_bound,
  upar = upar,
  test_upper = TRUE,
  lower = gs_b,
  lpar = rep(-Inf, 2),
  test_lower = FALSE
) |> to_integer()
The planned design targets:

- Planned events: 193, 297
- Planned information fraction for interim and final analysis: 0.6498, 1
- Planned alpha spending: 0.0054, 0.025
- Planned efficacy bounds: 2.5473, 1.9896

We note that rounding up the final targeted events increases power slightly over the targeted 90%.
Original design

Bound | Z | Nominal p [1] | ~HR at bound [2] | Cumulative boundary crossing probability: Alternate hypothesis | Cumulative boundary crossing probability: Null hypothesis
---|---|---|---|---|---
Analysis: 1 Time: 19.9 N: 366 Event: 193 AHR: 0.73 Information fraction: 0.65 |||||
Efficacy | 2.55 | 0.0054 | 0.6930 | 0.3494 | 0.0054
Analysis: 2 Time: 35.8 N: 366 Event: 297 AHR: 0.68 Information fraction: 1 |||||
Efficacy | 1.99 | 0.0233 | 0.7938 | 0.9023 | 0.0250

[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
Update bounds at time of analysis

We assume 180 and 280 events observed at the IA and FA, respectively. We will assume the differences from planned are due to logistical considerations. We also assume the protocol specifies that the full \(\alpha\) will be spent at the final analysis even in a case like this where there is a shortfall of events versus the design plan.

# Observed vs. planned events
event_observed <- c(180, 280)
event_planned <- x$analysis$event

Planned vs. Observed events

Analysis | Planned events | Observed events
---|---|---
IA | 193 | 180
FA | 297 | 280

We will utilize the gs_power_npe() function to update efficacy bounds based on the observed events. The details of its arguments and implementations are explained in the Appendix.

# Take spending setup from original design
upar <- x$upper$upar
# Now update timing parameter for the interim analysis.
# Interim spending based on observed events divided by final planned events.
# The remaining alpha will be allocated to FA.
upar$timing <- c(event_observed[1] / max(event_planned), 1)

x_updated <- gs_power_npe(
  # `theta = 0` provides the crossing probability under H0
  theta = 0,
  # Observed statistical information under H0
  info = event_observed * x$input$ratio / (1 + x$input$ratio)^2,
  info_scale = "h0_info",
  # Upper bound uses spending function from planned design
  upper = x$input$upper,
  upar = x$input$upar,
  test_upper = x$input$test_upper,
  # No lower bound, but copy this from input
  lower = x$input$lower,
  lpar = x$input$lpar,
  test_lower = x$input$test_lower,
  # Binding
  binding = x$input$binding
)

The updated efficacy bounds are 2.547, 1.99.

Updated design with observed 180, 280 events

analysis | bound | z | probability [1] | theta | theta1 | info_frac | info | info0 | info1
---|---|---|---|---|---|---|---|---|---
1 | upper | 2.547307 | 0.005427893 | 0 | 0 | 0.6428571 | 45 | 45 | 45
2 | upper | 1.990466 | 0.025000000 | 0 | 0 | 1.0000000 | 70 | 70 | 70

[1] Crossing probability under H0.

Two-sided asymmetric design, beta-spending with non-binding lower bound

In this section, we investigate a two-sided asymmetric design with non-binding beta-spending futility bounds. Beta-spending refers to error spending for the lower bound crossing probabilities under the alternative hypothesis. Non-binding means the trial is assumed to continue if the lower bound is crossed when computing Type I error, but not when computing Type II error.

Planned design

In the original design, we employ the Lan-DeMets spending function used to approximate O’Brien-Fleming bounds (Lan and DeMets 1983) for both efficacy and futility bounds. The total spending for efficacy is 0.025, and for futility is 0.1. In addition, we assume the futility test only happens at the IA.

upper <- gs_spending_bound
upar <- list(sf = gsDesign::sfLDOF, total_spend = alpha, param = NULL)
lower <- gs_spending_bound
lpar <- list(sf = gsDesign::sfLDOF, total_spend = beta, param = NULL)

x <- gs_design_ahr(
  enroll_rate = enroll_rate,
  fail_rate = fail_rate,
  alpha = alpha,
  beta = beta,
  info_frac = NULL,
  info_scale = "h0_info",
  analysis_time = c(20, 36),
  ratio = ratio,
  upper = gs_spending_bound,
  upar = upar,
  test_upper = TRUE,
  lower = lower,
  lpar = lpar,
  test_lower = c(TRUE, FALSE),
  binding = FALSE
) |> to_integer()

In the planned design, we have

- Planned events: 202, 311
- Planned information fraction (timing): 0.6495, 1
- Planned alpha spending: 0.0054167, 0.025
- Planned efficacy bounds: 2.548, 1.9895
- Planned futility bounds: 0.4778

Since we added futility bounds, the sample size and number of events are larger than what we have in the 1-sided example.

Original design

Bound | Z | Nominal p [1] | ~HR at bound [2] | Cumulative boundary crossing probability: Alternate hypothesis | Cumulative boundary crossing probability: Null hypothesis
---|---|---|---|---|---
Analysis: 1 Time: 19.9 N: 382 Event: 202 AHR: 0.73 Information fraction: 0.65 |||||
Futility | 0.48 | 0.3164 | 0.9350 | 0.0413 | 0.6836
Efficacy | 2.55 | 0.0054 | 0.6987 | 0.3692 | 0.0054
Analysis: 2 Time: 36.1 N: 382 Event: 311 AHR: 0.68 Information fraction: 1 |||||
Efficacy | 1.99 | 0.0233 | 0.7980 | 0.9030 | 0.0247 [3]

[1] One-sided p-value for experimental vs control treatment. Value < 0.5 favors experimental, > 0.5 favors control.
[2] Approximate hazard ratio to cross bound.
[3] Cumulative alpha for final analysis (0.0247) is less than the full alpha (0.025) when the futility bound is non-binding. The smaller value subtracts the probability of crossing a futility bound before crossing an efficacy bound at a later analysis (0.025 - 0.0003 = 0.0247) under the null hypothesis.

Update bounds at time of analysis

In practice, let us assume the observed data is generated by simtrial::sim_pw_surv().

set.seed(42)

observed_data <- simtrial::sim_pw_surv(
  n = x$analysis$n[x$analysis$analysis == 2],
  stratum = data.frame(stratum = "All", p = 1),
  block = c(rep("control", 2), rep("experimental", 2)),
  enroll_rate = x$enroll_rate,
  fail_rate = (fail_rate |> simtrial::to_sim_pw_surv())$fail_rate,
  dropout_rate = (fail_rate |> simtrial::to_sim_pw_surv())$dropout_rate
)

observed_ia_data <- observed_data |> simtrial::cut_data_by_date(analysis_time[1])
observed_fa_data <- observed_data |> simtrial::cut_data_by_date(analysis_time[2])

# Observed vs. planned events
event_observed <- c(sum(observed_ia_data$event), sum(observed_fa_data$event))
event_planned <- x$analysis$event

Planned vs. Observed events

Analysis | Planned number of events | Observed number of events
---|---|---
IA | 202 | 199
FA | 311 | 312

Again, we use gs_power_npe() to calculate the updated efficacy and futility bounds. The details of its arguments and implementations are explained in the Appendix. We initially set up theta = 0 for the crossing probability under the null hypothesis and theta0 = 0 for the null hypothesis. As for theta1, there are 2 options to set it up. These 2 options arrive at the same value of theta1.

- Option 1: theta1 is the negative of the weighted average of the piecewise log HR, that is,

\[
  -\log(\text{AHR}) = -\sum_{i=1}^m w_i \log(\text{HR}_i),
\]

where the weight \(w_i\) is the ratio of the observed number of events in interval \(i\) to the observed number of final events. In this example, the number of observed events in the first 3 months (HR = 1) can be derived as

event_first_3_month <- sum(observed_data$fail_time < 3)
event_first_3_month
## [1] 72

and the number of observed events after 3 months (HR = 0.6) is

event_after_3_month <- event_observed[2] - event_first_3_month
event_after_3_month
## [1] 240

We can derive theta1 as -[72 / 312 \(\times \log(1) +\) 240 / 312 \(\times \log(0.6)\)] = 0.3929.
## [1] 0.3929428
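
The arithmetic of Option 1 can be checked with a few lines of base R. This is a sketch that hard-codes the event counts reported above (72 and 240) rather than recomputing them from the simulated data; the variable names w, hr, and ahr are chosen here for illustration.

```r
# Observed event counts reported above
event_first_3_month <- 72  # events while HR = 1
event_after_3_month <- 240 # events while HR = 0.6
event_total <- event_first_3_month + event_after_3_month # 312

# Event-fraction weights and the piecewise hazard ratios
w <- c(event_first_3_month, event_after_3_month) / event_total
hr <- c(1, 0.6)

# theta1 is the negative weighted average of log(HR);
# the AHR is its back-transform
theta1 <- -sum(w * log(hr))
ahr <- exp(-theta1)

round(theta1, 7) # 0.3929428
round(ahr, 7)    # 0.6750674
```

Because log(1) = 0, only the events after month 3 contribute, so theta1 reduces to -(240 / 312) * log(0.6).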

- Option 2: we use the blinded data to estimate the AHR, and theta1 is the negative logarithm of the estimated AHR.

blinded_ahr <- ahr_blinded(
  surv = survival::Surv(
    time = observed_fa_data$tte,
    event = observed_fa_data$event
  ),
  intervals = c(3, Inf),
  hr = c(1, 0.6),
  ratio = 1
)

Blinded estimation of average hazard ratio

event | ahr | theta | info0
---|---|---|---
312 | 0.6750674 | 0.3929428 | 78

## [1] 0.3929428

Both Option 1 and Option 2 yield the same value of theta1. Option 1 is applicable when the event counts are available, while Option 2 is suitable for scenarios where time-to-event data is available, without the corresponding event counts. A value of theta1 is computed with one of the above methods and incorporated below.

x_updated <- gs_power_npe(
  # `theta = 0` provides the crossing probability under H0.
  # Users have the flexibility to modify the value of `theta`,
  # if they are interested in the crossing probability under H1 or other scenarios.
  theta = 0,
  # -log(AHR) = 0 for H0 is used for determining upper spending
  theta0 = 0,
  # -log(AHR) for H1 is used for determining the lower spending and bounds
  theta1 = theta1,
  # Observed statistical information under H0 with equal randomization
  info = event_observed * ratio / (1 + ratio)^2,
  # Upper bound
  upper = x$input$upper,
  upar = list(
    sf = gsDesign::sfLDOF,
    total_spend = alpha,
    param = NULL,
    # The remaining alpha will be allocated to the FA stage
    timing = c(event_observed[1] / max(event_planned), 1)
  ),
  test_upper = TRUE,
  # Lower bound
  lower = x$input$lower,
  lpar = list(
    sf = gsDesign::sfLDOF,
    total_spend = beta,
    param = NULL,
    # Beta spending at the IA uses the same event-fraction timing
    timing = c(event_observed[1] / max(event_planned), 1)
  ),
  test_lower = c(TRUE, FALSE),
  binding = x$input$binding
)

Updated design by event fraction based timing

analysis | bound | z | probability | theta | theta1 | info_frac | info | info0 | info1
---|---|---|---|---|---|---|---|---|---
1 | upper | 2.570463 | 0.005078136 | 0 | 0.3929428 | 0.6378205 | 49.75 | 49.75 | 49.75
2 | upper | 1.987979 | 0.022865874 | 0 | 0.3929428 | 1.0000000 | 78.00 | 78.00 | 78.00
1 | lower | 1.018047 | 0.845672222 | 0 | 0.3929428 | 0.6378205 | 49.75 | 49.75 | 49.75

Appendix

Arguments of gs_power_npe()

We provide an introduction to its arguments.

- 3 theta-related arguments:
  - theta: a vector with the natural parameter for the group sequential design; represents expected \(-\log(\text{AHR})\) in this case. This is used for boundary crossing probability calculation. For example, if we set theta = 0, then the crossing probability is the Type I error.
  - theta0: natural parameter for H0; used for determining the upper (efficacy) spending and bounds. If the default of NULL is given, then this is replaced by 0.
  - theta1: natural parameter for H1; used for determining the lower spending and bounds. The default is NULL, in which case the value is replaced with the input theta.
- 3 statistical information-related arguments:
  - info: statistical information at all analyses for input theta. Default is 1.
  - info0: statistical information under H0, if different from info. It impacts null hypothesis bound calculation. Default is NULL.
  - info1: statistical information under the hypothesis used for futility bound calculation, if different from info. It impacts futility bound calculation. Default is NULL.
- Efficacy/Futility boundary-related arguments:
  - upper, upar and test_upper: efficacy bounds related, same as gs_design_ahr().
  - lower, lpar and test_lower: futility bounds related, same as gs_design_ahr().
  - binding: TRUE or FALSE, indicating whether the design is binding or non-binding.

Explanations of gs_power_npe() arguments set up in the one-sided design

We initially set up theta = 0, which provides the crossing probability under the null hypothesis. Users have the flexibility to modify the value of theta if they are interested in the crossing probability under alternative hypotheses or other scenarios. In this implementation, we leave the setup of theta0 with its default imputed value of 0, indicating the null hypothesis. Additionally, as we are solely dealing with efficacy bounds, we do not adjust theta1, which influences the futility bound.

Moving forward, we proceed to set up the values of the info arguments for the input theta = 0. Since the statistical information is event-based, and theta = 0 (null hypothesis), the observed statistical information under the null hypothesis is

\[
  \text{observed number of events} \times \frac{r}{1 + r} \times \frac{1}{1 + r},
\]

where \(r\) is the randomization ratio (experimental : control). In this example, the observed statistical information is calculated as info = event_observed / 4, and thus, we input 45, 70 for the info argument. In this case, we retain the default imputed value for info0, which is set as info. The info represents the statistical information under the null hypothesis (theta = 0). Furthermore, we leave info1 as it is, considering our focus on efficacy bounds, while noting that info1 does affect the futility bounds.
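
As a quick check of the formula above, the following base R sketch computes the null hypothesis information for the one-sided example (180 and 280 observed events, 1:1 randomization). The helper name h0_info is hypothetical and is not part of gsDesign2.

```r
# Null hypothesis information for an event-based test:
# events * r / (1 + r) * 1 / (1 + r) = events * r / (1 + r)^2
h0_info <- function(event, ratio = 1) {
  event * ratio / (1 + ratio)^2
}

event_observed <- c(180, 280) # observed events at the IA and FA
info <- h0_info(event_observed, ratio = 1)

info             # 45 70
info / max(info) # 0.6428571 1.0000000
```

With 1:1 randomization the factor is 1/4, which is why info = event_observed / 4. The resulting values and the information fraction 45 / 70 match the info and info_frac columns of the updated one-sided design table.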

Lastly, we establish the parameters for the upper and lower bounds. The setup of upper, upar, test_upper, lower, lpar, and test_lower closely resembles the original design, with the exception of the upar setup. Here, we have introduced the timing parameter. At the IA, the observed number of events is 180. If we employ the interim analysis alpha spending strategy, the spending timing at IA is calculated as 180 / 297. Here, 297 represents the planned number of events at FA. The remaining alpha will be allocated to the FA stage.

Explanations of gs_power_npe() arguments set up in the two-sided asymmetric design, beta-spending with non-binding lower bound

Moving forward, we proceed to set up the values of the info arguments for the input theta = 0. Similar to the examples in the 1-sided design above, we set info as c(199, 312) / 4 = 49.75, 78, where 199, 312 are the observed numbers of events. Because info0 is under the null hypothesis, we leave it at its default imputed value, which is the same as info. Furthermore, we leave info1 at its default imputed value, the same as info, following the local null hypothesis approach. Though people can explicitly write the formula of info1 in terms of the observed events, the difference is very minor, and it would be easiest with unblinded data.

Lastly, we establish the parameters for the upper and lower bounds, following a procedure similar to the one in the 1-sided design.
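
The interim efficacy numbers in the updated two-sided design can be reproduced by hand. The base R sketch below assumes the Lan-DeMets O’Brien-Fleming-type spending formula (the form gsDesign::sfLDOF computes with its default parameter) and hard-codes the observed events (199, 312) and planned final events (311) reported above. It covers only the IA efficacy bound, since later bounds and the futility bound require the numerical integration done inside gs_power_npe().

```r
alpha <- 0.025
event_observed <- c(199, 312) # observed events at the IA and FA
event_planned_fa <- 311       # planned events at the FA

# Observed information under H0 with 1:1 randomization
info <- event_observed / 4 # 49.75 78.00

# Spending time at the IA: observed IA events over planned FA events
timing_ia <- event_observed[1] / event_planned_fa

# Cumulative alpha spent at the IA and the corresponding Z bound
alpha_ia <- 2 - 2 * pnorm(qnorm(1 - alpha / 2) / sqrt(timing_ia))
z_ia <- qnorm(1 - alpha_ia)

round(alpha_ia, 5) # 0.00508
round(z_ia, 2)     # 2.57
```

These values agree with the first upper-bound row of the updated design table (z = 2.570463, probability = 0.005078136). The spending time 199 / 311 differs from the information fraction 49.75 / 78 shown in the info_frac column because spending is timed against the planned final events, not the observed ones.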