Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes memory leak in prepared statement cache. Fixes #1143 #1157

Merged
merged 14 commits into from
Jan 13, 2025

Conversation

matthughes
Copy link
Contributor

Prepared statements were created on the server and cached as part of the "prepare cache." However, despite entries being removed from the in-memory cache, the evicted entries were not removed from Postgres itself.

This modifies SemispaceCache to keep track of evicted elements. When a Session is recycled, we close any prepared statements that have been evicted.

Prepared statements were created on the server and cached as part of the
"prepare cache."  However, despite entries being removed from the
in-memory cache, the evicted entries were not removed from Postgres
itself.

This modifies SemispaceCache to keep track of evicted elements.  When a
Session is recycled, we close any prepared statements that have been
evicted.
def insert(k: K, v: V): SemispaceCache[K, V] = {
if (max == 0) this.withEvicted(v :: evicted) // special case, can't insert!
else if (gen0.size < max) SemispaceCache(gen0 + (k -> v), gen1, max, evicted) // room in gen0, done!
else SemispaceCache(Map(k -> v), gen0, max, gen1.values.toList ::: evicted)// no room in gen0, slide it down
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some way we can eagerly close the evicted gen 1 items here? Instead of waiting until session end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. I was trying to do that for a while but kept getting stuck on the best place to do it. I don't like adding a method to Session but it seems the easiest to ensure everything that is evicted is cleaned up.

For example, if someone used Session.prepare... how would I know when I could safely get rid of that? It's somewhat easier for the execute methods.

The prepare methods seem to conflict with this cache model. Before the cache, you had docs saying to prepare the queries ahead of time and then re-use them each session. But now they are getting cached both manually by the user and in the SemispaceCache. So you could imagine a bug where user has prepared cache size of 10, then they call

session.prepare { pq =>

...
}

Then they create 10 more prepared statements using execute. This pq will get evicted from semispace cache and closed on the Postgres side and when they go to execute it again it will blow up.

Copy link
Member

@mpilquist mpilquist Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't thought much about it yet but what about something like providing SemispaceCache an onEvicted callback? Then at least we know that any statements that have been evicted have also been closed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could say methods that use Session.prepare/prepareR just aren't cached via SemispaceCache and then update the execute/option/stream methods to close any evicted statements immediately?

@mpilquist
Copy link
Member

@matthughes I pushed a commit which eagerly closes evicted statements. Will add a few review comments with details.

@@ -365,7 +365,7 @@ object Session {
* isn't running arbitrary statements then `minimal` might be more efficient.
*/
def full[F[_]: Monad]: Recycler[F, Session[F]] =
closeEvictedPreparedStatements[F] <+> ensureIdle[F] <+> unlistenAll <+> resetAll
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer needed since we now close any evicted statement as soon as possible after cache mutation (either on an insert or get).


/** Like [[apply]] but doesn't acquire a mutex, allowing usage from within an existing exchange. */
private[skunk] def midExchange[F[_]: FlatMap: MessageSocket: Tracer]: Close[F] =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Total hack to re-use the logic in closing a statement from within the ParseDescribe exchange without acquiring the session mutex and hence deadlocking.


case Right(os) =>
OptionT(parseCache.value.get(stmt)).map(id => (id.pure, (_:StatementId) => ().pure)).getOrElse {

val closeEvictedStatements =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted here b/c this is needed in both the get and insert cases below (because get can cause eviction of gen1 in case where target entry is in gen1).


}

override def command[A](cmd: skunk.Command[A], ty: Typer): F[StatementId] = {

def describeExchange(span: Span[F]): F[(StatementId => F[Unit], F[Unit])] =
OptionT(cache.commandCache.get(cmd)).as(((_: StatementId) => ().pure, ().pure)).getOrElse {
OptionT(cache.commandCache.get(cmd)).as(((_: StatementId) => ().pure, cache.commandCache.clearEvicted.void)).getOrElse {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to clear evicted cached entries for command cache here (and query cache below) just to ensure we don't leak entries. Unlike statements, we don't need to take any action on the evicted entries -- just ensure they are removed from the eviction space in the underlying SemispaceCache.

@mpilquist
Copy link
Member

mpilquist commented Jan 9, 2025

Discussed this some today with @matthughes. This PR as-is has a significant flaw -- if a client prepares a statement and holds on to it, then that statement is evicted from the parse cache and subsequently closed on the server, any subsequent use of the evicted statement results in an exception. @matthughes demonstrated this in 8799d74.

Another issue with this PR is that users can directly manipulate the ParseCache. For example, users can call clear or clearEvicted and resultantly leak statements (the statements wouldn't get closed on the server).

Options to fix:

  1. When binding/executing, check if the StatementId referenced by a Protocol.PreparedQuery is still in the parse cache. If not, then prepare it again and use the resulting statement id instead. We'd need to propagate the replacement statement id back up to the Protocol.PreparedQuery object, which is messy.
  2. Provide an API that allows preparing a statement without caching. Statements prepared this way will never be evicted & closed (until session cleanup) and hence are always safe to use. We rely on the fact that most statements are used immediately and not held on to otherwise.
  3. Make the ParseCache unbounded to avoid all of these eviction issues entirely. Maybe then also make the resource variants (e.g. prepareR) skip the cache and ensure the statement is cleaned up as part of resource cleanup.
  4. Remove prepared statement caching altogether. This would mean going back to the resource based APIs and would revert the work done in Idea: Statement Caching #496 and Implement per-session parsed statement cache #728. Probably the worst option here.

I'm leaning towards (3). This would entail making the parse cache a special unbounded cache with no support for user-land manipulation (instead of an instance of StatementCache). Or if we do support user-land eviction, then we'd need to make those operations issue close commands to the server. Either way, this isn't a StatementCache anymore.

@mpilquist
Copy link
Member

mpilquist commented Jan 11, 2025

I restored the logic @matthughes implemented originally, where any evicted prepared statements aren't closed until the recycling of the session they were prepared in. This avoids the risk of a prepared statement being unusable due to eviction during a session.

I also updated the parse cache to ensure manually cleared entries are still evicted at end of session.

I also modified prepareR to skip caching altogether and close the prepared statement at resource finalization.

Ready for review.

@matthughes
Copy link
Contributor Author

LGTM

@mpilquist mpilquist merged commit 0ab5ca9 into typelevel:main Jan 13, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants