added uncertainty scaling based on the validation set #33

donerancl · 2024-09-13T16:17:50Z

No description provided.

mjohnson541 · 2024-09-13T21:03:30Z

pysidt/sidt.py

@@ -1102,7 +1104,20 @@ def estimate_uncertainty(self,rel_node_dof_tolerance=1e-5):
                node.rule.uncertainty = node.parent.rule.uncertainty
            elif node.rule.num_data == 0:
                node.rule.uncertainty = 0.0 #if n=0 the LASSO should drive node.rule.value to zero so there should be approximately no variance contribution 
-
+
+        if self.validation_set and len(node_uncertainties) > 0:


Allow validation_set to be assigned in this function, so that someone can easily tune it after generating the tree.
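One way to read this suggestion is as an optional keyword argument. The sketch below is a minimal illustration under that assumption: `TreeWithValidation` is a hypothetical stand-in, not the pysidt class, and the `validation_set` keyword does not exist in the diff above.

```python
class TreeWithValidation:
    """Hypothetical stand-in for the tree estimator; not the pysidt class."""

    def __init__(self, validation_set=None):
        self.validation_set = validation_set

    def estimate_uncertainty(self, rel_node_dof_tolerance=1e-5, validation_set=None):
        # Assumed keyword: only overwrite the stored set when the caller passes one,
        # so existing calls without the argument behave as before.
        if validation_set is not None:
            self.validation_set = validation_set
        # ... the per-node uncertainty estimation and scaling would follow here ...


tree = TreeWithValidation()
tree.estimate_uncertainty()                           # stored set stays None
tree.estimate_uncertainty(validation_set=["datum"])   # set after generation
```

Defaulting the argument to `None` keeps the change backward compatible with callers that set `validation_set` elsewhere.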

mjohnson541 · 2024-09-13T21:05:18Z

pysidt/sidt.py

+    return confidence_levels, proportion_correct
+
+def objective_function(scaling_factor, errs, uncs, n = 500):


Can we either give "objective_function" a more specific name or embed it in another function?
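A sketch of the "embed it" option follows. The body of `objective_function` is not shown in this thread, so the calibration metric here (mean squared gap between each confidence level and the fraction of errors it bounds) is an assumption, as are the names `fit_uncertainty_scaling_factor`, `calibration_error`, and `candidates`. It treats `stds` as standard deviations, uses a plain grid search in place of whatever optimizer the PR calls, and uses `statistics.NormalDist` from the standard library in place of `scipy.stats.norm.ppf`.

```python
from statistics import NormalDist


def fit_uncertainty_scaling_factor(errs, stds, candidates, n=50):
    """Pick the candidate factor whose scaled intervals are best calibrated."""
    # n confidence levels strictly inside (0, 1) and their two-sided z-scores.
    levels = [(i + 1) / (n + 1) for i in range(n)]
    z_scores = [NormalDist().inv_cdf((1 + c) / 2) for c in levels]

    def calibration_error(scaling_factor):
        # Nested so the generic objective never appears at module level.
        total = 0.0
        for level, z in zip(levels, z_scores):
            covered = sum(
                1 for e, s in zip(errs, stds) if scaling_factor * s * z >= abs(e)
            )
            total += (covered / len(errs) - level) ** 2
        return total / n

    return min(candidates, key=calibration_error)


# Synthetic check: errors follow a normal with std 2, but the reported std is 1,
# so the best-calibrated factor among the candidates should be 2.0.
errs = [2 * NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
stds = [1.0] * 200
best = fit_uncertainty_scaling_factor(errs, stds, [0.5, 1.0, 2.0, 4.0])
```

Nesting the objective keeps a single public entry point whose name says what is being fitted.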

mjohnson541 · 2024-09-13T21:07:28Z

pysidt/sidt.py

-
+
+        if self.validation_set and len(node_uncertainties) > 0:
+            val_predictions_uncertainties = [self.evaluate(d.mol, estimate_uncertainty=True) for d in self.validation_set]


We probably only want to run this once, perhaps at the end of generation. Either we put this in a separate function, "scale_uncertainties", or we add a flag to this function that disables it, so that we don't scale the uncertainties every time this is called during tree generation, only one final time at the end.
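Both options can be combined, as in the minimal sketch below. `TreeSketch`, the `scale` flag, and the `scaling_calls` counter are hypothetical and only illustrate the control flow. The scaling factor is passed in directly rather than fitted from the validation set.

```python
class TreeSketch:
    """Hypothetical stand-in for the tree estimator; not the pysidt class."""

    def __init__(self, node_uncertainties, validation_set=None):
        self.node_uncertainties = dict(node_uncertainties)
        self.validation_set = validation_set
        self.scaling_calls = 0  # bookkeeping for this example only

    def estimate_uncertainty(self, scale=False, scaling_factor=1.0):
        # The per-node uncertainty estimation would run here on every call.
        if scale:
            self.scale_uncertainties(scaling_factor)

    def scale_uncertainties(self, scaling_factor):
        # Skip when there is nothing to calibrate against or nothing to scale.
        if not self.validation_set or not self.node_uncertainties:
            return
        self.scaling_calls += 1
        for name in self.node_uncertainties:
            self.node_uncertainties[name] *= scaling_factor


tree = TreeSketch({"root": 2.0, "child": 0.5}, validation_set=["datum"])
for _ in range(3):
    tree.estimate_uncertainty()  # during generation: no scaling
tree.estimate_uncertainty(scale=True, scaling_factor=1.5)  # one final call
```

With the flag off by default, the repeated calls during generation never evaluate the validation set.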

mjohnson541 · 2024-09-16T19:58:12Z

pysidt/sidt.py

+def get_bounded_fraction(errs, uncs, confidence_level):
+    t = scipy.stats.norm.ppf((1 + confidence_level) / 2)
+    return np.sum(uncs * t >= np.abs(errs)) / len(errs)


I think this formula is wrong. I believe the uncs you're pulling are variances, not standard deviations. Separately, I think the len(errs) needs a sqrt?
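If the inputs are indeed variances, the interval half-width needs a square root. The sketch below assumes that reading. It leaves the division by `len(errs)` unchanged, since that second point is still an open question in this thread. The parameter name `variances` is an assumption, and `statistics.NormalDist` from the standard library stands in for `scipy.stats.norm.ppf`.

```python
import math
from statistics import NormalDist


def get_bounded_fraction(errs, variances, confidence_level):
    """Fraction of errors that fall inside the two-sided normal interval."""
    # Two-sided z-score; the same quantity as scipy.stats.norm.ppf((1 + c) / 2).
    z = NormalDist().inv_cdf((1 + confidence_level) / 2)
    covered = sum(
        1
        for err, var in zip(errs, variances)
        if z * math.sqrt(var) >= abs(err)  # sqrt converts variance to std
    )
    return covered / len(errs)


errs = [0.5, -1.0, 3.0, 0.1]
variances = [4.0, 4.0, 1.0, 0.25]  # standard deviations 2, 2, 1, 0.5
fraction = get_bounded_fraction(errs, variances, 0.95)
```

Here only the third error (3.0 against a standard deviation of 1) falls outside its 95% interval.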

donerancl and others added 2 commits September 13, 2024 12:13

added uncertainty scaling based on the validation set

5258de3

removing extraneous comments

073b81e

