This is the first nasty thing of the type I feared when removing Variable.make.
File -> kMeans -> Merge Data. Connect another File -> kMeans and then to the same merge data. In Merge data, set the key to "Row Index". On the output from Merge data I want to have heart_disease data with two additional columns with cluster labels, so I can compare clusterings.
After [FIX] Merge data: Rename variables with duplicated names #4076 I don't use Edit Domains and I expected to have columns "Clusters (1)" and "Clusters (2)". This however does not happen because attributes are now matched by name and type, so Merge data believes that both tables have the same attribute "Clusters", and it doesn't duplicate it. It takes the column from the first table and ignores the one from the second.
If we decide that, no, it should keep two columns, it would also duplicate all other columns (age, max HR...).
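The two behaviours described above can be sketched in plain Python. This is only an illustration, not Orange's actual code: `Var`, `merge_by_identity` and `merge_keep_all` are hypothetical names, and a variable is assumed to be identified solely by its name and type.

```python
from collections import namedtuple

# Hypothetical stand-in for a variable: equality depends on (name, kind) only.
Var = namedtuple("Var", ["name", "kind"])


def merge_by_identity(left, right):
    """Keep all left columns; add a right column only if no equal one exists."""
    merged = list(left)
    for var in right:
        if var not in merged:
            merged.append(var)
    return merged


def merge_keep_all(left, right):
    """Keep every column from both tables, numbering names that occur in both."""
    shared = {v.name for v in left} & {v.name for v in right}

    def tag(var, index):
        if var.name in shared:
            return Var(f"{var.name} ({index})", var.kind)
        return var

    return [tag(v, 1) for v in left] + [tag(v, 2) for v in right]


# Two File -> kMeans branches produce the same domain.
left = [Var("age", "continuous"), Var("max HR", "continuous"),
        Var("Clusters", "discrete")]
right = [Var("age", "continuous"), Var("max HR", "continuous"),
         Var("Clusters", "discrete")]

# Matching by name and type: the second "Clusters" is silently dropped.
by_identity = [v.name for v in merge_by_identity(left, right)]

# Keeping both: "Clusters" is duplicated, but so are "age" and "max HR".
keep_all = [v.name for v in merge_keep_all(left, right)]
```

Under these assumptions neither rule gives the desired result of one copy of the data columns plus two cluster columns, which is the dilemma behind the options below.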
Options:
Revert removal of Variable.make.
Do nothing. The user has to rename the columns with duplicated names.
Note that with outer join (which is exactly the situation you've drawn on the paper), the tables are equivalent, so it makes sense to keep both columns. This is what the first if in var_needed does: if there are any rows from the right table that are concatenated at the bottom, all right attributes are kept. Without this, outer join could add rows that would come from the right table but contain only columns from the left table -- so all these rows would have just nans.
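A minimal sketch of why the right attributes have to be kept in an outer join. It does not reproduce var_needed; the function `outer_join`, the `left.`/`right.` column prefixes and the sample rows are assumptions made for illustration.

```python
import math

NAN = float("nan")


def outer_join(left, right, keep_right):
    """Outer-join two {row_key: {column: value}} tables on the row key.

    When keep_right is False, only the left table's columns appear in the
    output, so rows found only in the right table have nothing to show.
    """
    left_cols = list(next(iter(left.values())))
    right_cols = list(next(iter(right.values()))) if keep_right else []
    joined = {}
    for key in sorted(left.keys() | right.keys()):
        row = {f"left.{c}": left.get(key, {}).get(c, NAN) for c in left_cols}
        row.update(
            {f"right.{c}": right.get(key, {}).get(c, NAN) for c in right_cols}
        )
        joined[key] = row
    return joined


def all_nan(row):
    """True if every value in the row is a float NaN."""
    return all(isinstance(v, float) and math.isnan(v) for v in row.values())


# Row 2 exists only in the right table, so it is appended at the bottom.
left = {0: {"age": 63, "Clusters": "C1"}, 1: {"age": 67, "Clusters": "C2"}}
right = {1: {"age": 67, "Clusters": "C3"}, 2: {"age": 41, "Clusters": "C1"}}

without_right = outer_join(left, right, keep_right=False)
with_right = outer_join(left, right, keep_right=True)
```

In `without_right`, row 2 consists only of nans; in `with_right`, it carries the right table's values.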
My 4 cents:
Variable.make. Something much worse must happen to reintroduce it.
This problem was not caused by #4076. #4076 just didn't (and couldn't have) fixed it.