Feature Description
Draft functionality (to be discussed):
update_interactions method of the Dataset accepts:
interactions_df: pd.DataFrame
method: tp.Literal["add", "replace"]
The main goal is to take the old dataset id maps and extend them with new users and items, then convert the new interactions to internal ids and append them to the old ones.
One very important thing to remember: the dataset id_map always places hot users (who have interactions) before warm users (who have no interactions but do have features). New hot users should therefore start from the first internal id after the old hot users.
Those ids previously belonged to warm users, so warm users will have their ids changed. The user_features array must be reordered accordingly, because row numbers in user_features correspond to internal user ids.
Same for items.
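To make the id shift concrete, here is a minimal sketch of the remapping in plain NumPy/pandas. It is not the Dataset API: `extend_id_map`, its arguments and the returned `old_to_new` array are hypothetical names used only for illustration, and ordering newly hot ids by first appearance is an assumption.

```python
import numpy as np
import pandas as pd


def extend_id_map(old_ids, n_hot, new_ids):
    """Hypothetical helper: rebuild an id map so hot ids stay before warm ids.

    old_ids: external ids ordered by internal id (hot first, then warm).
    n_hot: number of hot ids at the start of old_ids.
    new_ids: external ids seen in the new interactions (may repeat).
    Returns (ids, n_hot_new, old_to_new), where old_to_new[i] is the new
    internal id of the object that previously had internal id i.
    """
    old_ids = np.asarray(old_ids)
    old_hot, old_warm = old_ids[:n_hot], old_ids[n_hot:]
    # Unique ids from the new interactions, in order of first appearance.
    seen = pd.Series(new_ids).drop_duplicates().to_numpy()
    # Newly hot: warm ids that now interact, plus ids never seen before.
    newly_hot = seen[~np.isin(seen, old_hot)]
    still_warm = old_warm[~np.isin(old_warm, seen)]
    ids = np.concatenate([old_hot, newly_hot, still_warm])
    # Look up the new position of every old external id.
    position = pd.Series(np.arange(len(ids)), index=ids)
    old_to_new = position.loc[old_ids].to_numpy()
    return ids, n_hot + len(newly_hot), old_to_new


# Old map: 10 and 20 are hot, 30 and 40 are warm (features only).
ids, n_hot_new, old_to_new = extend_id_map([10, 20, 30, 40], 2, [40, 50, 10, 50])

# Move old feature rows to their new internal ids. Brand-new users get a
# NaN placeholder row here; how to fill it is not covered by this proposal.
old_features = np.array([[1.0], [2.0], [3.0], [4.0]])
new_features = np.full((len(ids), old_features.shape[1]), np.nan)
new_features[old_to_new] = old_features
```

Old hot users keep their internal ids, so already trained embeddings stay aligned. Only warm rows move, which is why the features array has to be rebuilt together with the id map.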
In the first PR we can implement just one of the methods.
The "add" method should just append new interactions to the old ones. Duplicate user-item pairs will have multiple entries, so their weights will be summed in the user-item matrix.
The "replace" method should remove old interactions.
Why this feature?
This allows for incremental training for models that support fit_partial.
Additional context
Discussed here: #176
Updating user and item features should be done in subsequent PRs.
Maybe we should call this method update_data or something like that, so there is only one method for interactions and features, with optional arguments. To be discussed.