Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

force garbage collection after large objects are removed #140

Merged
merged 2 commits into from
Feb 16, 2024

Conversation

cjyetman
Copy link
Member

@cjyetman cjyetman commented Feb 16, 2024

By forcing garbage collection after large objects are removed, significantly better memory management is achieved.

I don't relish doing this, but it works...

Before
Screenshot 2024-02-16 at 09 14 04

After
Screenshot 2024-02-16 at 10 39 30

@cjyetman cjyetman marked this pull request as ready for review February 16, 2024 09:44
@jdhoffa
Copy link
Member

jdhoffa commented Feb 16, 2024

NIT: Maybe we should...
Re-write data prep in a non-gc'd language where we can explicitly and safely manage memory, and as a bonus fosters safe multi-threading in an intuitive way 😳

🦀 🦀 🦀 🦀
crab walks out of the room

Copy link
Member

@jdhoffa jdhoffa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving as a "fine if we HAVE to" approval. And on the condition that we open a new issue to further explore a better way to get memory use within a reasonable limit.

There has to be a better way! But agree that this is an effective band-aid for now.

@cjyetman
Copy link
Member Author

NIT: Maybe we should... Re-write data prep in a non-gc'd language where we can explicitly and safely manage memory, and as a bonus fosters safe multi-threading in an intuitive way 😳

@jdhoffa do you know how to do that?

@jdhoffa
Copy link
Member

jdhoffa commented Feb 16, 2024

NIT: Maybe we should... Re-write data prep in a non-gc'd language where we can explicitly and safely manage memory, and as a bonus fosters safe multi-threading in an intuitive way 😳

@jdhoffa do you know how to do that?

Today, no certainly not. But I'd be very happy to give it a shot as a {pedagogical, experimental, pet project} type prototype if we think it's valuable.

@cjyetman
Copy link
Member Author

cjyetman commented Feb 16, 2024

just a note: using gc() is not total witchcraft... it's just explicitly determining when garbage collection occurs rather than waiting for R to guess when it's appropriate, which is a perfectly legitimate thing to do when you're using a lot of memory and you know for sure when is an appropriate time to run garbage collection

in fact, in some (many?) languages, you have to decide if/when to do it

@jdhoffa
Copy link
Member

jdhoffa commented Feb 16, 2024

I understand. I think the reliance on a garbage collector (either intentionally invoked or passively relied on) is gonna result in a memory hit. But I'm not even sure that's the biggest problem here frankly.

I'd love to explore #94 more and see if we can get better performance from that, I imagine we can

Copy link
Collaborator

@AlexAxthelm AlexAxthelm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DNLGTM (Does not look good to me)

Approved despite my visceral reaction.

@cjyetman
Copy link
Member Author

I'm happy to not merge this also. It's just an easy option for significantly improving the memory footprint, which has been a big problem for @jdhoffa and @AlexAxthelm running it.

@cjyetman cjyetman closed this Feb 16, 2024
@jdhoffa
Copy link
Member

jdhoffa commented Feb 16, 2024

Oh what? I very much do approve it! I just wanted to make sure we documented the follow-up steps (which was done in #141).

There's nothing functionally wrong with this, only NIT/ NB/style/"visceral" hesitancy

But for the record, I would much rather this be merged then the status quo of not being able to run data prep

@AlexAxthelm
Copy link
Collaborator

But for the record, I would much rather this be merged then the status quo of not being able to run data prep

Same. My approval stands.

@cjyetman cjyetman reopened this Feb 16, 2024
@cjyetman
Copy link
Member Author

ok, sorry... I misunderstood the hesitancy... re-opened and will merge

@cjyetman cjyetman merged commit 6de425b into main Feb 16, 2024
@cjyetman cjyetman deleted the improve-memory-management branch February 16, 2024 11:07
@AlexAxthelm
Copy link
Collaborator

😆 Just because I don't like it doesn't mean it's wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants