Feat/min impurity decrease #165

agnesbao · 2021-01-20T23:19:57Z

Checklist

closes #144 (Consider opening up min_impurity_decrease in survival tree for early stoppping?)
py.test passes
tests are included
code is well formatted
documentation renders correctly

What does this implement/fix? Explain your changes
Add a min_logrank_split parameter to SurvivalTree, which plays the same role as min_impurity_decrease in sklearn's tree models. Just as DecisionTree sets a minimum threshold on the impurity decrease to decide whether to keep splitting, here we set a minimum threshold on the absolute log-rank statistic (how well separated the survival curves are).
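
To make the stopping rule concrete, here is a minimal sketch in plain Python. It is not the code in this PR: `logrank_z` and `should_split` are hypothetical helpers written for illustration, and the statistic is the textbook two-group log-rank z-score, which may be normalized differently from what sksurv computes internally. Only the parameter name `min_logrank_split` and the `>=` comparison come from this PR.

```python
import math


def logrank_z(times, events, in_left):
    """Textbook two-group log-rank z statistic (hypothetical helper).

    times: observed times; events: True if the event occurred (not censored);
    in_left: True if the sample would go to the left child of a candidate split.
    """
    obs_minus_exp = 0.0
    variance = 0.0
    # Iterate over the distinct times at which at least one event occurred.
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)  # at risk, both groups
        n_left = sum(1 for u, g in zip(times, in_left) if u >= t and g)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d_left = sum(
            1 for u, e, g in zip(times, events, in_left) if u == t and e and g
        )
        obs_minus_exp += d_left - d * n_left / n
        if n > 1:
            frac = n_left / n
            variance += d * frac * (1 - frac) * (n - d) / (n - 1)
    return obs_minus_exp / math.sqrt(variance) if variance > 0 else 0.0


def should_split(times, events, in_left, min_logrank_split=0.0):
    """Accept a candidate split only if the curves are separated enough."""
    return abs(logrank_z(times, events, in_left)) >= min_logrank_split


times = [1, 2, 3, 4, 5, 6]
events = [True] * 6
separated = [True, True, True, False, False, False]
interleaved = [True, False, True, False, True, False]

print(should_split(times, events, separated, min_logrank_split=2.0))
print(should_split(times, events, interleaved, min_logrank_split=2.0))
```

With the default of 0.0 every candidate split passes, so the check only takes effect once the threshold is raised.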

I'm still working on writing a functional test for this param, but want to open this PR first to seek feedback on whether it makes sense.

codecov · 2021-01-20T23:41:33Z

Codecov Report

Merging #165 (2db2b11) into master (ccaf30e) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #165   +/-   ##
=======================================
  Coverage   98.33%   98.33%           
=======================================
  Files          37       37           
  Lines        3126     3132    +6     
  Branches      460      461    +1     
=======================================
+ Hits         3074     3080    +6     
  Misses         28       28           
  Partials       24       24

| Impacted Files | Coverage Δ |
|---|---|
| sksurv/ensemble/forest.py | `100.00% <100.00%> (ø)` |
| sksurv/tree/tree.py | `99.30% <100.00%> (+0.01%)` ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

sebp

Thanks for helping to improve sksurv. The changes look reasonable. However, you would also need to modify RandomSurvivalForest to include the new min_logrank_split parameter. In addition, you need to add tests to ensure that the change does what it is supposed to do.

I'd suggest creating some instances where min_logrank_split gets triggered at different depths of the tree building process. You can check the log-rank statistic in a fully grown tree and design tests based on that. You might find the LogrankTreeBuilder class helpful in that regard.

Please let me know if you have any questions.

sebp

First, I think the min log-rank check in the unit test differs from what LogrankCriterion does.

Second, I'm not sure whether the unit test is sufficient in the sense that this condition is actually triggered.

sebp · 2021-07-10T09:06:04Z

tests/test_tree.py

+                    s, p = compare_survival(y, groups)
+                    abs_z = abs(norm.ppf(p/2))
+                    if s > best_stat and abs_z >= min_logrank:


The LogrankCriterion class uses the absolute value of the log-rank test statistic as splitting criterion, whereas here, the condition is based on the p-value.
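
As a rough illustration of how the two quantities in the snippet relate, the sketch below converts a chi-square statistic to a p-value and back to an absolute z-score, mirroring `abs(norm.ppf(p/2))`. It assumes the statistic follows a chi-square distribution with one degree of freedom (the two-group case) and uses the standard library in place of scipy; the helper names are hypothetical. Whether either value matches what LogrankCriterion computes is not shown in this thread and would need to be checked against the implementation.

```python
import math
from statistics import NormalDist


def p_from_chi2_1dof(stat):
    # Survival function of a chi-square variable with 1 degree of freedom:
    # P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2))


def abs_z_from_p(p_value):
    # Standard-library stand-in for abs(scipy.stats.norm.ppf(p_value / 2)).
    return abs(NormalDist().inv_cdf(p_value / 2))


for stat in (0.5, 4.0, 25.0):
    p = p_from_chi2_1dof(stat)
    # Under the 1-dof assumption, the round trip recovers sqrt(stat).
    print(stat, math.sqrt(stat), abs_z_from_p(p))
```

Under that assumption the round trip yields the square root of the chi-square statistic, so the threshold in the test is applied on a different scale than the `s` used to pick the best split. For very large statistics the p-value underflows to zero and the round trip breaks down, so thresholding the statistic directly avoids the detour.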

sebp · 2021-10-26T20:40:04Z

@agnesbao Are you still going to work on improving this PR, or should I consider it as abandoned?

sebp · 2021-11-23T08:35:28Z

I'm closing this PR as no further information has been provided. Please feel free to reopen this PR if you can provide the requested updates.

Xiaojun Bao added 2 commits January 20, 2021 16:39

add min_logrank_split

ac2cae5

test

9e44a04

sebp requested changes Jan 31, 2021

Xiaojun Bao added 4 commits June 24, 2021 13:51

merge master

24c2091

add min_logrank_split to forest model arg

6fa2760

add min_impurity_decrease model test

3edcf2d

>=

2db2b11

agnesbao requested a review from sebp June 25, 2021 02:16

sebp requested changes Jul 10, 2021

sebp closed this Nov 23, 2021
