-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: framework agnostic preprocessing #781
*: framework agnostic preprocessing #781
Conversation
a9d1035
to
0ba102d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huge amount of work! You seem to be pushing typescript to its limits haha
I have an error when trying to build the webapp:
Error: Cannot find module @rollup/rollup-darwin-arm64. npm has a bug related to optional dependencies (https://github.com/npm/cli/issues/4828). Please try 'npm i again after removing both package-lock.json and node_modules directory.
at requireWithFriendlyError (/Users/vignoud/Disco/review/node_modules/rollup
/dist/native.js: 59:9)
at Object. ‹anonymous> (/Users/vignoud/Disco/review/node_modules/rollup/dist/
native. js: 68:76)
•. 3 lines matching cause stack trace ...
at Module._load (node:internal/modules/cjs/loader:1023:12) at cjsLoader (node:internal/modules/esm/translators:356:17)
at ModuleWrap. ‹anonymous> (node:internal/modules/esm/translators:305:7) at ModuleJob. run (node:internal/modules/esm/module_job:218:25)
at async ModuleLoader. import (node:internal/modules/esm/loader:329:24) {
[cause]: Error: Cannot find module '@rollup/rollup-darwin-arm64'
Require stack:
- /Users/vignoud/Disco/review/node_modules/rollup/dist/native.js
at Module._resolveFilename (node:internal/modules/cjs/loader:1144:15)
at Module._load (node:internal/modules/cjs/loader:985:27) at Module. require (node:internal/modules/cjs/loader:1235:19)
at require (node:internal/modules/helpers:176:18)
at requireWithFriendlyError (/Users/vignoud/Disco/review/node_modules/roll up/dist/native.js:41:10)
at Object. ‹anonymous> (/Users/vignoud/Disco/review/node_modules/rollup/dis
t/native. js: 68:76)
at Module._compile (node:internal/modules/cjs/loader:1376:14)
at Module._extensions..js (node: internal/modules/cjs/Loader:1435:10) at Module. load (node:internal/modules/cjs/loader:1207:32)
at Module._load (node:internal/modules/cjs/loader: 1023:12) {
code: 'MODULE_NOT_FOUND' requireStack: [
'/Users/vignoud/Disco/review/node_modules/rollup/dist/native.js'
]
}
}
I didn't manage to solve the issue even by re-creating the package-lock.json (it actually solves this error but then I can't build the server anymore because of new errors)
I couldn't review everything yet because of this but I already left a lot of questions (nothing blocking except the ropllup error so far)
If you can't solve the rollup error without having access to a Mac we can organize a call next week if you want
5323bed
to
88b20b3
Compare
7d520b8
to
0ac6c29
Compare
héhé, yeah, it's what I do in pretty much any coding languages, pushing it to its limits then complaining about it (:
yeah,I think that the more we progress, the long the review will be, as we both know more of the code, what can get wrong, and are getting opiniated on various components. I feel quite good about it in fact, that means we care more for disco 🪩
fwiw, it was indeed a mac specific issue, that gets only triggered there, and will probably happen again if there is an update of rollup that I do on my side. well, now we know |
9a611b7
to
dbbf432
Compare
fe0e1c8
to
cca6dc3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing refactor!!
-
Mainly, I noticed that testing an image model always shows the images as red (while the accuracy is 100% in this case). That can be a simple UI mistake but maybe a deeper problem in the validation.
-
Not sure if this was introduced in this PR or the dark theme but when there is only one datapoint the chart now shows a bar instead of a dot
-
Testing with a language model shows a "Download CSV" button that doesn't work
cca6dc3
to
4f0bc55
Compare
ouf, it was a simple UI mistake, I inversed red & green in #682
ha yes, good point, I reworked it to make it a bit nicer. the curve is now "cubic" instead of "smooth" which makes it less jittery at the expense of precision.
right, it is not implemented. I wasn't sure of what to show in it so I prefered to throw a TODO. now, do we want to even have this feature? I feel that it is not really useful (for testing that is, for prediction it is useful). we need a way to present testing results but CSV is too technical.
WDYT? (for a next iteration) |
on the road to (closer by every PR) ONNX' land.
preprocessing revamp
tf.TensorContainer
NormalizedImage
{Image,Tabular,Text}Data
classes by replacing theses by correctly typed datasetdataset improvements
Dataset.cached
to avoid recomputation when iterating multiple timespreprocessOnce
option toDisco
to avoid all recomputation by keeping a preprocessed version of memory (more hardcore version ofDataset.cached
)Dataset.batch
constructs batch concurrentlytype safety
Model
,Task
,Trainer
, … generic on their datatype ("image" | "tabular" | "text"
) to check soundness at compile timeTask<'tabular'>
, is does have the required{input,output}Columns
{Disco,Trainer,Validator}<'image'>
doesn't get feed aDataset<{Tabular,Text}>
but only aDataset<[Image, label: string]>
TrainingInformation<'tabular'>.outputColumn
instead of multiple columDisco
&Validator
(Raw
), and what is internal (ModelEncoded
, after preprocessing and before postprocessing)misc
sadly, there is a performance drawback to that: as the preprocessing computation is solely done in JS which is fundamentally monothread (as python is), processing is slower than the tfjs-node accelerated version we have. I tried a few ways to workaround it and push V8 to do magic but it didn't work. there is some way to make make all that work (#758) but that would require platform specific accelerators. and this PR was already a bit long so pushing it to another iteration.