Optimize mnesia cache querying #625
QLC queries via mnesia-based caches that use `{traverse, {select, MatchSpec}}` in any shape or form cause the QLC query to be executed in two parts: the `mnesia:table` call runs the entire table through the selected match specification, and then a plain Erlang list comprehension filters the results. Per the discussion in #622, this is how such a query looks under `qlc:info/1`:

    qlc:q([
        Guild ||
        {GuildId, Guild} <-
            mnesia:table(nostrum_guilds,
                         [{n_objects, 100},
                          {lock, read} |
                          {traverse,
                           {select,
                            [{{'_', '$1', '$2'}, [], [{{'$1', '$2'}}]}]}}]),
        GuildId =:= RequestedGuildId
    ])

The issue is that neither QLC nor mnesia can cleanly optimize it here: Mnesia does not know about the condition specified in QLC, and QLC's optimization to use a `lookup_fun` is knocked out by the fact that it can't reach into the traverse call to detect the reordered columns. It might be possible to implement this in QLC itself, given it is smart enough to figure out when to use the lookup function based on the indices, together with some cooperation from Mnesia itself. Obviously, this behaviour would lead to unacceptable performance.

This commit introduces an optimization that allows guild, member and presence cache implementations to export a `query_handle/1` function that accepts a match specification guard, that is, the "middle part" of a match specification. The guard determines which rows are selected. Do note, however, that this is still unable to perform a complete optimization of lookups of single records: it will still traverse the table, but in the native ETS code.
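To make the "middle part" terminology concrete, here is a minimal sketch that uses a plain ETS table as a stand-in for the mnesia-backed cache. The module name `guard_sketch`, the table name, and the `match_spec/1` helper are hypothetical and not part of nostrum. The sketch only illustrates how a caller-supplied guard slots between the match head and the result body.

```erlang
-module(guard_sketch).
-export([demo/0]).

%% Hypothetical helper (not nostrum's API): builds a match specification
%% that ignores the record tag, reorders the columns to {GuildId, Guild},
%% and applies the caller's guard list as the "middle part".
match_spec(Guards) ->
    [{{'_', '$1', '$2'}, Guards, [{{'$1', '$2'}}]}].

demo() ->
    %% Records look like {Tag, GuildId, Guild}; the key is element 2,
    %% mirroring the shape of a mnesia record.
    Tab = ets:new(guild_sketch, [set, {keypos, 2}]),
    true = ets:insert(Tab, [{guild, Id, {guild_data, Id}} || Id <- lists:seq(1, 5)]),
    %% An empty guard list selects every row.
    All = ets:select(Tab, match_spec([])),
    %% A guard on '$1' filters rows inside the ETS select itself,
    %% but the table is still traversed.
    One = ets:select(Tab, match_spec([{'==', '$1', 3}])),
    true = ets:delete(Tab),
    {length(All), One}.
```

Calling `guard_sketch:demo()` returns the row count for the unguarded select together with the single row that passes the guard.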
I don't think this improves the performance on large tables much at all
qlc:q([
Guild ||
{GuildId, Guild} <-
mnesia:table(nostrum_guilds,
[{n_objects, 100},
{lock, read} |
{traverse,
{select,
[{{'_', '$1', '$2'},
[{'==', '$1', {const, 12234}}],
[{{'$1', '$2'}}]}]}}]),
GuildId =:= RequestedGuildId
])
I'll inject a few thousand fake guilds and see how it performs locally in this PR vs
Okay this is weird, tested with 5000 guilds injected into the table and
iex(8)> func = fn ->
...(8)> start = :erlang.monotonic_time()
...(8)> Nostrum.Cache.GuildCache.get(guild.id)
...(8)> :erlang.monotonic_time() - start
...(8)> end
is usually taking 40-ish milliseconds on
But an almost empty table takes under 1 millisecond on
The report said it took closer to 400ms at 4000 guilds 🤔
There was no significant difference at 5000 guilds when using this PR
@Th3-M4jor can you share your benchmarking script? @atlas-oc thanks for the valuable info from the graph. To be honest, that looks as expected. Our table type is a set, which gives us O(1) access on reads :-)
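The difference between a keyed read and a guarded traversal can be measured with a rough sketch like the following, again on a plain ETS `set` table rather than the real mnesia cache. The module name `lookup_vs_scan` and the table layout are assumptions made for illustration. Timings depend on the machine and on `N`, so none are shown here.

```erlang
-module(lookup_vs_scan).
-export([run/1]).

%% Sketch only: compares a direct key lookup with a match specification
%% whose key is constrained in the guard rather than bound in the head.
%% Assumes N >= 2 so that the probed key exists.
run(N) ->
    Tab = ets:new(bench_sketch, [set, {keypos, 2}]),
    true = ets:insert(Tab, [{guild, Id, Id * 2} || Id <- lists:seq(1, N)]),
    Key = N div 2,
    %% Direct read by key on a set table.
    {LookupUs, [{guild, Key, V1}]} = timer:tc(ets, lookup, [Tab, Key]),
    %% Same row via a guard: the key is not bound in the match head,
    %% so the whole table is traversed inside ETS.
    ScanSpec = [{{'_', '$1', '$2'}, [{'==', '$1', Key}], ['$2']}],
    {ScanUs, [V2]} = timer:tc(ets, select, [Tab, ScanSpec]),
    true = ets:delete(Tab),
    #{lookup_us => LookupUs, scan_us => ScanUs, same_result => V1 =:= V2}.
```

Running `lookup_vs_scan:run(5000)` and then repeating with larger values of `N` should give a feel for how the scan time grows with table size while the lookup time stays roughly flat.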
Speaking of benchmarks, it might be nice to have that in a benchmark file, if you are interested in adding it. See
Opened #627 to add a basic benchmark for
I may update it with one for
Thanks for the PR. I've noticed something else, namely that in theory QLC supports what we're doing. I've dug through the source code and it seems capable of translating QLC filters into match specifications directly, without us having to supply one here. POC:
diff --git a/lib/nostrum/cache/guild_cache/mnesia.ex b/lib/nostrum/cache/guild_cache/mnesia.ex
index 04f27955..80bc6dbb 100644
--- a/lib/nostrum/cache/guild_cache/mnesia.ex
+++ b/lib/nostrum/cache/guild_cache/mnesia.ex
@@ -291,8 +291,9 @@ if Code.ensure_loaded?(:mnesia) do
@doc "Get a QLC handle for the guild cache."
@spec query_handle :: :qlc.query_handle()
def query_handle do
- ms = [{{:_, :"$1", :"$2"}, [], [{{:"$1", :"$2"}}]}]
- :mnesia.table(@table_name, {:traverse, {:select, ms}})
+ #ms = [{{:_, :"$1", :"$2"}, [], [{{:"$1", :"$2"}}]}]
+ qh = :mnesia.table(@table_name, {:traverse, :select})
+ :nostrum_guild_cache_mnesia_qlc.transform_table_records(qh)
end
@impl GuildCache
diff --git a/src/nostrum_guild_cache_mnesia_qlc.erl b/src/nostrum_guild_cache_mnesia_qlc.erl
new file mode 100644
index 00000000..bea90c7c
--- /dev/null
+++ b/src/nostrum_guild_cache_mnesia_qlc.erl
@@ -0,0 +1,8 @@
+-module(nostrum_guild_cache_mnesia_qlc).
+-export([transform_table_records/1]).
+
+-include_lib("stdlib/include/qlc.hrl").
+
+
+transform_table_records(QH) ->
+ qlc:q([{GuildId, Guild} || {_RecordTag, GuildId, Guild} <- QH]).
Yields query:
iex(4)> IO.puts(:qlc.info(:nostrum_guild_cache_qlc.get(123, Nostrum.Cache.GuildCache.Mnesia)))
qlc:q([
Guild ||
{GuildId, Guild} <-
mnesia:table(nostrum_guilds,
[{traverse,
{select,
[{{'_', '$1', '$2'},
[true],
[{{'$1', '$2'}}]}]}},
{n_objects, 100},
{lock, read} |
{traverse, select}]),
GuildId =:= RequestedGuildId
])
Notice that we didn't specify an explicit matchspec but QLC got the message. That seems to stem from https://github.com/erlang/otp/blob/0b33c5a4c0a647cf7df6d50c8768b574c48752fd/lib/stdlib/src/qlc_pt.erl#L1140. I think it boils down to this issue I reported last year: erlang/otp#7268
diff --git a/lib/nostrum/cache/guild_cache/mnesia.ex b/lib/nostrum/cache/guild_cache/mnesia.ex
index 80bc6dbb..0ad7a855 100644
--- a/lib/nostrum/cache/guild_cache/mnesia.ex
+++ b/lib/nostrum/cache/guild_cache/mnesia.ex
@@ -293,7 +293,7 @@ if Code.ensure_loaded?(:mnesia) do
def query_handle do
#ms = [{{:_, :"$1", :"$2"}, [], [{{:"$1", :"$2"}}]}]
qh = :mnesia.table(@table_name, {:traverse, :select})
- :nostrum_guild_cache_mnesia_qlc.transform_table_records(qh)
+ qh
end
@impl GuildCache
which effectively removes the intermediate query handle and allows it to directly hook to the
iex(2)> IO.puts(:qlc.info(:nostrum_guild_cache_qlc.get(123, Nostrum.Cache.GuildCache.Mnesia)))
mnesia:table(nostrum_guilds,
[{traverse,
{select,
[{{'$1', '$2'}, [{'=:=', '$1', {const, 123}}], ['$2']}]}},
{n_objects, 100},
{lock, read} |
{traverse, select}])
(Note this query is "wrong" because it will compare the atom in the first column, not the guild ID, to
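The behaviour described above, where QLC pushes a filter down into the table once the comprehension hooks directly to it, can be explored in isolation with a sketch like this one. It uses `ets:table/1` instead of `mnesia:table/2` to avoid schema setup, and the module and table names are hypothetical. The rows are stored as `{GuildId, Guild}` with the key in the first column, so no column reordering stands between QLC and the table.

```erlang
-module(qlc_sketch).
-export([demo/0]).

%% Required so that qlc:q/1 is rewritten by the QLC parse transform.
-include_lib("stdlib/include/qlc.hrl").

demo() ->
    %% Key is element 1, so the filter on GuildId refers to the key column.
    Tab = ets:new(qlc_sketch_tab, [set]),
    true = ets:insert(Tab, [{Id, {guild_data, Id}} || Id <- lists:seq(1, 5)]),
    Wanted = 3,
    QH = qlc:q([Guild || {GuildId, Guild} <- ets:table(Tab),
                         GuildId =:= Wanted]),
    %% qlc:info/1 returns the evaluation plan as a string; its exact text
    %% varies between OTP releases, so it is returned rather than asserted.
    Plan = qlc:info(QH),
    Result = qlc:e(QH),
    true = ets:delete(Tab),
    {Plan, Result}.
```

Printing the first element of the returned tuple with `io:put_chars/1` shows what QLC decided to do with the filter. Comparing that plan with one for a handle wrapped in an extra `qlc:q/1` that reorders columns is a cheap way to reproduce the difference discussed in this thread.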
So, I've been digging through the QLC source code.
Parse transform
At compile time, the
Don't even try to understand the code of the parse transform. I have no idea what is going on there...
Evaluation
With our list comprehension rewritten to a record with magic functions, this is something that we can pass to
My proposed optimization would basically traverse down into any nested included
Problem
The issue is that this is much harder than it sounds, because pretty much everything in the qlc record is represented as a function that appears to be hooked together with the state machine in various ways. Even the match specification requires calling a function with some
fun(size) -> fun(1) -> 2; (_) -> undefined end;
(template) -> fun(_, _) -> [] end;
(constants) -> fun(_) -> no_column_fun end;
(equal_constants) ->
fun(1) -> fun(1) -> {values, [2], {all, [2]}};
(_) -> false
end;
(_) -> no_column_fun
end;
(n_leading_constant_columns) -> fun(1) -> 1; (_) -> 0 end;
(constant_columns) -> fun(1) -> "\001"; (_) -> [] end;
(match_specs) ->
fun(1) -> {[{{'$1', '$2'}, [{'==', '$1', 2}], ['$2']}], all};
(_) -> undefined end;
(_) -> undefined
Some of these are somewhat self-explanatory and then you have the question of when you have other variables than
Conclusion
Honestly, after some discussion with @jb3, we mostly conclude that QLC is pretty cool tech and a pretty cool idea, but it comes with invisible issues and a codebase that is very hard to digest (as also evident from the lack of upstream contributions to it). We've invested a lot of time into using it, testing it, and - sometimes - fighting it, and I'm sure given enough time one of us could manage to fix this issue, but I also think that time would be better spent building something new, maybe supported by an EEP to integrate it into Erlang (think Elixir's
We'll therefore sunset the QLC cache implementations and revert back to requiring individual functions for our cache accesses.