I have an analytics dataset with >100m rows and want to run a query over it and dump the result to Parquet or CSV for further analysis. Is it possible to do this without storing the intermediate result in memory?
It doesn't look like streaming. It downloads everything first, uses a lot of memory in the process, and only then writes everything to disk.
In my case it works with 5 columns and 60m rows (using around 60 GB of RAM), but fails with 80m rows because of the OOM killer (same query, only the 'limit' differs).
You're correct. I think it should be possible to add streaming support to the analytics command. I'll leave this issue open as a request for that feature.
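For illustration, streaming here would mean roughly the following: pull rows in fixed-size batches and append each batch to the output file, so peak memory is bounded by the batch size rather than by the full result set. This is only a minimal Python sketch, not how the analytics command works today. `fetch_rows` is a hypothetical stand-in for whatever lazily yields query results, and `stream_to_csv` and `batch_size` are names made up for the example.

```python
import csv
from itertools import islice
from typing import Iterable, Iterator, Sequence


def fetch_rows(total: int) -> Iterator[tuple]:
    """Hypothetical stand-in for a query cursor that yields rows lazily."""
    for i in range(total):
        yield (i, f"user_{i}", i * 0.5)


def stream_to_csv(rows: Iterable[Sequence], header: Sequence[str],
                  path: str, batch_size: int = 10_000) -> int:
    """Write rows to `path` one batch at a time; return the row count."""
    written = 0
    it = iter(rows)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        while True:
            # Only `batch_size` rows are held in memory at any moment.
            batch = list(islice(it, batch_size))
            if not batch:
                break
            writer.writerows(batch)
            written += len(batch)
    return written


if __name__ == "__main__":
    n = stream_to_csv(fetch_rows(100_000), ["id", "name", "score"], "dump.csv")
    print(f"wrote {n} rows")
```

The same loop shape should carry over to Parquet by writing each batch as a separate row group instead of as CSV lines.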