Move the runner into a separate package with minimal dependencies #557
Comments
Hm. It's true that Jinja2, Rich, and Textual aren't needed for collecting profile data, but they are needed for generating reports from the profile data, and as a general rule, we expect the reporters to be run on the same system as the report was generated on... In the case of
I realize it may need extra refactoring, but I believe it would be possible to implement the collection phase (including the extraction of the required information from the running environment, e.g. from shared libraries) without those libraries, and to make the resulting files completely portable. To clarify, I envision roughly the following flow:
I am unsure whether a CLI for steps 1 and 2 would be desired or useful.
It's better in the sense that it's more modular and allows what you are suggesting, but it has other significant drawbacks:
In any case, we will discuss this, as there is some potential gain here. My guess is that we will unfortunately need to reject this suggestion, because it complicates maintenance and releases and entails extra development to support a workflow that is not the most common one.
Update: we agreed that this makes sense to do and we will investigate ways to handle the complexity. Not sure how long it will take, but we will probably do some form of this 👍
After talking about it, we decided we would take the following approach:
That last one means we also need to:
Cheers to this effort! 🎉 Wearing my reproducibility hat: So in the "more packages" architecture, a pure-python
Ignoring one vs many packages, as for plugins: there are really very few approaches as simple and robust as declarative

```toml
[project.entry-points."memray.v0.reporter"]
console = "memray.reporter.console:ConsoleReporter"
html = "memray.reporter.html:HTMLReporter"
```

Where the reporter is a well-typed class description of the minimum it needs to do the thing, e.g.

This has the follow-on of allowing downstreams to also participate as first class citizens, e.g. for a speedscope reporter:

```toml
[project.entry-points."memray.v0.reporter"]
speedscope-json = "memray_speedscope:SpeedScopeJSONReporter"
speedscope-html = "memray_speedscope:SpeedScopeHTMLReporter"
```

Good luck, looking forward to this!
Really, what we want is almost the opposite of extras. In a perfect world, we wouldn't have opt-in dependencies, we'd have opt-out dependencies.
See https://github.com/tiangolo/typer/blob/master/pdm_build.py and https://github.com/tiangolo/typer/blob/master/pyproject.toml#L62-L105 as an example for a real-world package that implements "negative optional dependencies" (and similarly, FastAPI and others). There is no documentation about this since it is an internal configuration, but following their footsteps could be a viable option.
xref for the much-requested (and awaited)
As a downstream repackager of some of those packages... yeah, please don't. If one were to I humbly submit that multiple lightweight packages with real pins (hard
Is there an existing proposal for this?
Is your feature request related to a problem?
The system I am currently working on involves confidential customer data to which regular developers do not have access. Due to the complex nature of the ML algorithms and libraries involved, we sometimes encounter anomalous memory usage which we cannot easily reproduce with our test data set. I am considering integrating a memory profiling mechanism, possibly involving memray, which would be selectively enabled in the production environment to give us insight into problematic cases without requiring access to customer data.
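As a rough sketch of what "selectively enabled" could look like, the wrapper below activates `memray.Tracker` only when an environment variable is set. The variable name `ENABLE_MEMRAY`, the helper name `maybe_track`, and the output path are assumptions made for illustration; only `memray.Tracker` used as a context manager is memray's own API.

```python
import contextlib
import os


@contextlib.contextmanager
def maybe_track(output_path: str, env_var: str = "ENABLE_MEMRAY"):
    """Run the wrapped block under memray.Tracker only when env_var is "1".

    Yields True when tracking is active and False otherwise.
    """
    if os.environ.get(env_var) != "1":
        # Profiling disabled: run the block untouched, memray is never imported.
        yield False
        return

    import memray  # imported lazily, so it is only needed when profiling is on

    # Tracker writes the capture file to output_path; it must not already exist.
    with memray.Tracker(output_path):
        yield True


if __name__ == "__main__":
    with maybe_track("capture.bin") as tracking:
        data = [bytes(1024) for _ in range(1000)]  # stand-in for the real workload
    print("tracking enabled:", tracking)
```

The resulting capture file could then be copied off the production host and analyzed elsewhere, which is the workflow this request is about.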
For multiple reasons (security, reliability, docker image size to name a few) we try to minimize the number of our dependencies. Memray has a number of dependencies which are unnecessary for trace collection:
memray/setup.py
Lines 91 to 96 in 578e02d
and those have their own dependencies as well.
Describe the solution you'd like
It would be great to split memray into two packages, tentatively named `memray-core` and `memray`. The former would only be able to perform collection, and the latter would do everything memray does today.
Alternatives you considered
Another possibility would be to put all the dependencies not related to `memray.Tracker` under "extras", but that would probably be a nuisance for most ordinary users, who would want a simple `pip install memray` to install everything.