Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Add labels to combo when data comes from distance matrix #2866

Merged
merged 2 commits into from
Jan 19, 2018

Conversation

janezd
Copy link
Contributor

@janezd janezd commented Jan 12, 2018

Fixes #2863.

Includes
  • Code changes
  • Tests

@janezd janezd added this to the 3.9 milestone Jan 12, 2018
@codecov-io
Copy link

codecov-io commented Jan 12, 2018

Codecov Report

Merging #2866 into master will increase coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2866      +/-   ##
==========================================
+ Coverage   82.06%   82.09%   +0.02%     
==========================================
  Files         327      327              
  Lines       56111    56166      +55     
==========================================
+ Hits        46046    46107      +61     
+ Misses      10065    10059       -6

for var in domain.metas:
if var.is_string:
self.graph.attr_label = var
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to fixing labels for distance matrices, this PR changes some default behaviour in the same commit.
And I am not so sure we even want to auto-set labels to the first string var. The labels a user would want are just as often in a categorical var. But even if the first string var does contain instance labels, we might not want to show them unless explcitly selected, because of our notorious label-overlapping problems. It will not look nice for any dataset with a couple hundred instances (e.g. brown-selected has short labels, <200 instances, shows the correct thing, and I find it borderline ok)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd guess that showing string labels (as opposed to showing nothing) is going to minimize the expected number of mouse clicks. If I have string labels - and I'll often have a single one, I'd usually want to see them.

For class labels, we already have colors. For attribute labels, we can't guess which one to show and we'll usually miss.

@BlazZupan?

But I agree that we shouldn't show labels by default if there are too many. (You already said this on Friday, but I forgot.) Should I add an and len(data) <= 200 above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A popular thing from text mining, which we showed several times in workshops (!), is to load a corpus (use the default book-excerpts), connect to Bag of Words and then to MDS to show a map of documents.
Try that after this PR ;)

I am still in favour of not showing any labels automatically, but if I had to, then in addition to len(data)<=200 I would add that the max len of all string values from the variable should be <12 or something like that.

@lanzagar lanzagar self-assigned this Jan 18, 2018
@janezd janezd force-pushed the fix-mds-label branch 2 times, most recently from 711be6f to 1a8901e Compare January 18, 2018 18:09
@janezd
Copy link
Contributor Author

janezd commented Jan 18, 2018

I removed default labels. I haven't squashed the commits, so we can bring them back (now or in the future).

@lanzagar lanzagar merged commit a6d4f7f into biolab:master Jan 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants