The test at test-independent-fixed_design.R:141 is not stable #451

Closed
yihui opened this issue Aug 13, 2024 · 4 comments · Fixed by #455

Comments

@yihui
Contributor

yihui commented Aug 13, 2024

expect_equal(y$analysis$power, 0.9, tolerance = testthat_tolerance() * 1000)

It throws an error sporadically.

The tolerance was recently raised in e9317fd, but that may still not be enough.
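
For scale, a quick sketch of what tolerance = testthat_tolerance() * 1000 works out to (assuming testthat_tolerance() returns its documented value, sqrt(.Machine$double.eps)):

library(testthat)
testthat_tolerance()          # sqrt(.Machine$double.eps), roughly 1.5e-8
testthat_tolerance() * 1000   # roughly 1.5e-5
# With a relative tolerance of about 1.5e-5, the computed power has to land
# within roughly 0.9 * 1.5e-5, i.e. about 1.4e-5, of 0.9 for the check to pass.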

@yihui yihui changed the title The test at test-independent-fixed_design.R:141 is not stable on macOS The test at test-independent-fixed_design.R:141 is not stable Aug 13, 2024
@nanxstats
Collaborator

Apparently, I'm not the stats expert here, but @elong0527, how about we set tolerance = 1e-4 explicitly? That would mean a power of 90.009% is ok but 90.01% is not.
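
A minimal sketch of what that explicit check would look like (the numeric values below are illustrative, not taken from the test output):

library(testthat)
expect_equal(0.899991, 0.9, tolerance = 1e-4)    # relative difference ~1e-5, passes
# expect_equal(0.8990, 0.9, tolerance = 1e-4)    # relative difference ~1.1e-3, would fail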

@elong0527
Collaborator

I like an explicit criterion as well.

@LittleBeannie
Collaborator

Thanks for pointing it out, @yihui! I guess there are two issues leading to the instability of this test:

  1. A numerical issue. In lines 118-127, I calculate the sample size needed to get 90% power, which is around 415.7804. Then, in lines 130-139, I feed the sample size of 415.7804 back in via line 132, to check whether that sample size gives me 90% power or not.
    If you look into line 132, you will notice I did something like x/y*y. As we discussed before, this is an identity in exact arithmetic, but it does not hold exactly in floating-point computation on a laptop (see the sketch after this list). So y's sample size (line 130) is slightly different from 415.7804, which leads to a power slightly different from 90%. On my laptop, y's power is 0.899991, which fails when I run expect_equal(0.899991, 0.9).

  2. A random number issue. For the MaxCombo test, there are some random number issues, as Nan brought up in Add random seed for running the non-deterministic gs_power_combo()? #340.
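
Here is a minimal sketch of the x/y*y round-trip effect (the divisor and the numbers are illustrative, not taken from the test code):

x <- 415.7804
y <- 12
x / y * y == x        # not guaranteed to be TRUE in double precision
abs(x / y * y - x)    # either 0 or a tiny residual on the order of 1e-13
# A residual of that size in the sample size is enough to shift the computed
# power slightly away from 0.9, e.g. to something like 0.899991.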

@yihui
Contributor Author

yihui commented Aug 15, 2024

Got it. Then it seems we need to either raise the tolerance (as I did in PR #455) or set a seed. I noticed that we were using a higher tolerance (0.01) for other tests in https://github.com/Merck/gsDesign2/blob/main/tests/testthat/test-independent-fixed_design.R, so I wonder why this test was given a much smaller tolerance.
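
For the seed route, a minimal sketch of pinning the RNG around the non-deterministic piece (the seed value is arbitrary and the body below is only a placeholder, not the change made in #455):

library(withr)
# Fix the RNG state locally so the simulation-based MaxCombo computation
# gives the same result on every test run; the body stands in for the
# actual power calculation.
with_seed(2024, {
  rnorm(1)  # placeholder for the non-deterministic computation
})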
