
feat: lazy tensor datatype #139

Merged: 94 commits from feat/lazy_tensor into main on Feb 7, 2024

Conversation

@sebffischer (Member) commented Oct 11, 2023

We need this from mlr3pipelines: mlr-org/mlr3pipelines#737

@sebffischer marked this pull request as draft on October 12, 2023 07:49
@sebffischer (Member, Author) commented:

How to eventually solve the cloning issue for the preprocessing:

We can create a special PipeOp / Graph class that only has a $forward() method and no state or parameter values, i.e. the use case of PipeOpModule.

We can then use this for the internal graph, doing away with all the deep clones that are unnecessary in 99% of cases.
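A minimal sketch of what such a forward-only class could look like (the class `ForwardOp` and its fields are hypothetical, not the actual mlr3torch / mlr3pipelines API):

```r
library(R6)

# Hypothetical sketch: a PipeOp-like object that only exposes $forward()
# and carries no state or parameter values, so sharing it never requires
# a deep clone of learned state.
ForwardOp = R6Class("ForwardOp",
  public = list(
    fn = NULL,
    initialize = function(fn) {
      self$fn = fn
    },
    forward = function(input) {
      # purely functional: the output depends only on the input and fn
      self$fn(input)
    }
  )
)

op = ForwardOp$new(function(x) x * 2)
op$forward(10)  # 20
```

Because such an object carries no mutable state, a shallow reference is as good as a clone, which is the point of the proposal.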

@sebffischer (Member, Author) commented:

Maybe: when augment is TRUE, the input shape must equal the output shape (are there cases where this restriction would be limiting?)
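As an illustration, such a restriction could be enforced with a simple check (the helper name `assert_augment_shapes` is made up for this sketch):

```r
# Hypothetical helper: enforce that an augmentation is shape-preserving.
assert_augment_shapes = function(shapes_in, shapes_out, augment) {
  if (isTRUE(augment) && !identical(shapes_in, shapes_out)) {
    stop("when augment is TRUE, the output shape must equal the input shape")
  }
  invisible(TRUE)
}

assert_augment_shapes(c(NA, 3, 64, 64), c(NA, 3, 64, 64), augment = TRUE)  # ok
```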

@sebffischer (Member, Author) commented Dec 5, 2023

Things to do:

  • The autotest should check that the shapes applied to a batch are coherent, i.e. declaring NA and declaring a specific value for the batch dimension must behave identically except in the first dim (see the sketch after this list)
  • Check that the predict shapes are added correctly and that an error is thrown where appropriate (only use the public API for this test)
  • Replace imguri with the lazy_tensor type (example tasks, alexnet learner)
  • Fix the substitution bug
  • DataDescriptor should be R6 for more efficient serialization of lazy_tensor
  • Fully remove the dollar method for the lazy_tensor data type
  • Remove the "hash" attribute in the c() function at the end
  • Remove the rlang, vctrs, and fs dependencies
  • Move torchvision to Suggests
  • Don't include the batch dim when returning a lazy_tensor as a list
  • Address the open TODOs in the code
  • Maybe refactor materialize_internal to make it easier to understand, and fix the bug
  • Fix dataset_ltnsr
  • Customize the hash input for lazy_tensor in the backend
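A sketch of the shape-coherence check from the first item (the helper `check_shape_coherent` is hypothetical, not the package's autotest API):

```r
# Hypothetical autotest helper: a declared shape like c(NA, 3, 64, 64),
# with NA marking the batch dimension, must agree with the shape observed
# on a materialized batch in every dimension except the first.
check_shape_coherent = function(declared_shape, batch_shape) {
  stopifnot(length(declared_shape) == length(batch_shape))
  # compare all dims but the batch (first) dim
  identical(declared_shape[-1L], batch_shape[-1L])
}

check_shape_coherent(c(NA, 3, 64, 64), c(16, 3, 64, 64))  # TRUE
```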

* Fix the substitution bug (by removing these shenanigans)
* Implement DataDescriptor as R6 (a rough sketch follows below)
* Remove the "." prefix from some "private" fields (collapsed into the DataDescriptor change; R6 classes use "." only for private methods, but these fields should be accessible)
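A rough sketch of the R6 shape of such a DataDescriptor (the fields shown are assumptions for illustration, not the actual implementation):

```r
library(R6)

# Illustrative sketch: as an R6 class, the descriptor is an environment,
# so all elements of a lazy_tensor can share one descriptor by reference
# instead of each element carrying its own copy, which makes
# serialization cheaper.
DataDescriptor = R6Class("DataDescriptor",
  public = list(
    dataset = NULL,        # the underlying torch dataset
    graph = NULL,          # preprocessing graph applied on materialization
    dataset_shapes = NULL, # declared shapes, with NA for the batch dim
    initialize = function(dataset, graph, dataset_shapes) {
      self$dataset = dataset
      self$graph = graph
      self$dataset_shapes = dataset_shapes
    }
  )
)
```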
Review threads (outdated, resolved): DESCRIPTION, R/materialize.R, R/task_dataset.R, R/utils.R
@sebffischer (Member, Author) left a comment:

fix stuff

@sebffischer merged commit a655242 into main on Feb 7, 2024
5 checks passed
@sebffischer deleted the feat/lazy_tensor branch on February 7, 2024 17:44