Thanks for sharing. We calculate this information already (and I am pretty sure we do send it to the front end). We can surface this.
Does the column stats mode ('off', 'compact', 'detailed') apply to all dataframes? Is it essentially a user setting or a dataframe setting? When you close and reopen the notebook, is the setting persisted?
Just tested: the mode is set individually per cell. Also, whenever a cell is rerun, it is reset to "off" (which is bad design, in my opinion).
I would prefer a global option for all statistics, plus maybe a cell-level option if necessary, which preserves its state when the cell is rerun 😆
Here is a simple screenshot of what it looks like in PyCharm Professional.
Aside from the statistics, PyCharm and Databricks notebooks also have amazing built-in visualization support (Databricks also has tabs for creating multiple visuals per DataFrame in a compact and easy way).
If you are interested, I can open another issue and explain in more detail what I mean and would love to see 😉
Description
It is super helpful to quickly see meaningful column statistics in the DataFrame output,
such as: min, max, mean, median, percentiles, distinct values, null count, frequency, top values, distribution, ...
The (annoying) alternative is to have multiple (temporary) cells for looking at different column stats.
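To illustrate the kind of temporary cell this refers to, here is a minimal sketch using pandas. The helper name `column_stats` and the sample data are hypothetical, and this is not how marimo computes its column statistics; it only shows the per-column values one typically ends up collecting by hand.

```python
import pandas as pd


def column_stats(df: pd.DataFrame) -> pd.DataFrame:
    """Collect per-column summary statistics into one table (one row per column)."""
    rows = {}
    for name in df.columns:
        col = df[name]
        stats = {
            "dtype": str(col.dtype),
            "nulls": int(col.isna().sum()),
            "distinct": int(col.nunique()),  # nulls are excluded by default
        }
        is_numeric = pd.api.types.is_numeric_dtype(col)
        if is_numeric and not pd.api.types.is_bool_dtype(col):
            # Numeric columns: range, central tendency, and quartiles.
            stats.update(
                min=col.min(),
                max=col.max(),
                mean=col.mean(),
                median=col.median(),
                p25=col.quantile(0.25),
                p75=col.quantile(0.75),
            )
        else:
            # Other columns: most frequent value and how often it occurs.
            counts = col.value_counts()
            if not counts.empty:
                stats.update(top=counts.index[0], top_freq=int(counts.iloc[0]))
        rows[name] = stats
    return pd.DataFrame.from_dict(rows, orient="index")


# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {
        "price": [1.0, 2.0, 3.0, None],
        "city": ["Berlin", "Paris", "Berlin", None],
    }
)
summary = column_stats(df)
print(summary)
```

Having to write and later delete a cell like this for every DataFrame is exactly the overhead that built-in column statistics would remove.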
Suggested solution
Marimo already has some column statistics available which is great(!) but there is some room for improvement.
Currently, int/float columns only show a graph, while all other columns show only the null and unique counts.
PyCharm does this very well, imo! (See for example: https://blog.jetbrains.com/pycharm/2024/10/data-exploration-with-pandas/)
They show many different statistics depending on the data type.
They also have the option to choose the detail level:
Alternative
No response
Additional context
No response