-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Nice binning of time variables (Distributions, SOM) #4123
Conversation
cc749d9
to
f7d6551
Compare
Codecov Report
@@ Coverage Diff @@
## master #4123 +/- ##
==========================================
+ Coverage 85.62% 85.66% +0.03%
==========================================
Files 388 389 +1
Lines 69435 69638 +203
==========================================
+ Hits 59454 59654 +200
- Misses 9981 9984 +3 |
0b3c15d
to
947facb
Compare
I ❤️ it! |
So far I was not able to find any faults with it. It is responsive, labels are correct, output is as expected. This makes Distributions really useful for visualizing datetime data! The only thing that doesn't seem to be implemented yet is the legend in SOM - it shows floats for datetime. |
Orange/widgets/unsupervised/owsom.py
Outdated
self.colors = [QColor(*color) for color in self.attr_color.colors] | ||
else: | ||
col = self.data.get_column_view(self.attr_color)[0].astype(float) | ||
self.thresholds = decimal_binnings(col, min_bins=4)[0][1:-1] | ||
binning = decimal_binnings(col, min_bins=4)[-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you forget to check whether self.attr_color is a Time Variable. This doesn't use time_binnings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops. I just ported it to new signature of decimal_binning
. It works now.
I'm not sure I like the legend. It uses short labels.
Perhaps the lower bound for each interval should be in the long form. That is, the last two should be 17 Apr - Jul
and >= 17 Jul
. The second would be in short form, except of course when the longer is needed, like in 16 Oct - 17 Jan
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
if not return_defs: | ||
bins = [get_bins(bin) for bin in bins] | ||
def time_binnings(data, *, min_bins=2, max_bins=50, min_unique=5, add_unique=0): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output of time_binnings
and decimal_binnings
are not the same. The latter uses BinDefinition
, which has thresholds and labels properties, while the former doesn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't be. OWDistributions calls either time_binnings
or decimal_binnings
, stores the result and then doesn't care where the bins came from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But OWSOM doesn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was OWSOM's problem. Fixed.
97ff40b
to
04d9602
Compare
94fc142
to
29f8b43
Compare
Fixes #3964.
Description of changes
time_binning
function, similar todecimal_binning
, which can be used in any visualization widget that discretizes numeric featuresdecimal_binning
BinDefinition
decimal_binning
)Includes