Is your feature request related to a problem?
I have a long-running process defined as an asynchronous pipeline of multiple steps. You may think of it as an ETL pipeline. These steps can involve activities such as modifying pandas dataframes.
Not all steps have memory issues, but I would like to use memray to understand how memory is used and passed around in my pipeline over time. There is a memory explosion that occurs in one of the final steps of the pipeline.
Since the current flamegraph report only displays a snapshot of memory use at the time of peak memory, it is hard for me to investigate what or where in prior steps of the pipeline might be contributing to the explosion. While analyzing the point of peak memory is very insightful, it does not capture enough information about the parts of my pipeline that could be optimized, as opposed to the areas that cannot (some known causes of the explosion, such as a join between pandas dataframes, are always memory expensive). Focusing instead on reducing memory in prior steps would lessen the impact of the memory explosion in later steps and still lead to an overall reduction in memory.
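For reference, this is roughly how I profile the pipeline today; the step names and the asyncio driver below are simplified placeholders for my real code:

```python
import asyncio
import memray

# Simplified placeholders for my real pipeline steps (the real ones
# mostly build and transform pandas dataframes).
async def extract(): ...
async def transform(): ...
async def load(): ...

async def run_pipeline():
    # The whole pipeline runs under a single Tracker, so the resulting
    # flamegraph only reflects the single point of peak memory usage.
    with memray.Tracker("pipeline.bin"):
        await extract()
        await transform()
        await load()  # <- the memory explosion happens around here

asyncio.run(run_pipeline())
```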
Describe the solution you'd like
Is there a way to place some sort of marker in my code (e.g. a function decorator or an API call) that signals to memray to take a snapshot of the memory usage at that point?
Ideally I'd like to look at a plot of heap memory usage over time and see markers where my code called into memray to record a point in time and code. This functionality would let users make custom calls to memray from their own code (e.g. my pipeline code) to generate timestamped snapshots of memory and compare them over time, not just at peak memory usage.
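To make the idea concrete, something along these lines is what I have in mind. memray.mark and memray.snapshot are invented names used purely to illustrate the proposal; they are not existing memray API:

```python
import memray

# Hypothetical API -- memray.snapshot / memray.mark do not exist today;
# they only sketch the kind of marker hooks I'd like to have.
# (extract/load reuse the placeholder steps from the sketch above.)

@memray.snapshot("transform")  # decorator form: snapshot around a whole step
async def transform(df):
    ...

async def run_pipeline():
    with memray.Tracker("pipeline.bin"):
        df = await extract()
        memray.mark("after extract")    # explicit form: a labelled,
                                        # timestamped point in the capture file
        df = await transform(df)
        memray.mark("after transform")
        await load(df)
```

Either form would then appear as a labelled marker on the memory-over-time plot, so I could compare memory at each step boundary rather than only at the peak.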
Alternatives you considered
No response