-
-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up filtered queries (but how?) #716
Comments
Please let me know in case I misunderstood something, my experience with the codebase is rather limited |
Thanks for bringing this up. You're right, as soon as filters are involved, summaries need to be recomputed, instead of being fetched from "cache" (aka. the At first sight, your suggestion to simply just fetch those heartbeats, that are relevant for the filtering (e.g. If you take a look at the aggregation logic, you'll find that, in order to get total coding time for label
With When only fetching heartbeats for label foo, you'll end up with a total coding time of So instead of "give me all heartbeats for 'foo'", your query would have to be something like "give me all heartbeats for 'foo' and everything in between each two of those that are farther than apart" or something - which, of course, would be hard to implement as an indexed query either. So while I definitely see the problem here, I can't think of a straightforward solution right now. |
Ah, yeah that explains why this wasn't implemented on a DB level in the first place. A classic case of the user thinking an improvement is easy, but failing to understand the complexity behind it :^) In theory, I guess we could move the current calculation into an SQL query. Especially Postgres should have support for these complex computations. I do imagine that a vendor-agnostic implementation would be rather challenging though. The performance would probably increase, yes, but I don't think the difference is major enough to warrant the complicated change. My next course of action, should I get around to it, would be to try profiling Wakapi on my local machine, connected to the remote database, to find potential bottlenecks. It'll probably just be my slow I/O, but there may be some potential left. |
I attempted to implement (a previous version of) the calculation logic entirely in SQL before (see here). However, as you correctly pointed out, it would require individual queries per supported database system, as the query can't be done in standard SQL. We could implement a MySQL- and Postgres-specific query and fall back to "application-side" aggregation for all other databases. But I'm a bit reluctant, because it would increase complexity by a lot and every change to it would then have to be done three times. When I find a spare moment, I might give it a try, let's see... |
Most promising refactoring in order to speed up filtered queries would probably be to make |
Today, I tried looking up my coding time for a specific label. Unfortunately, the requests time out with a HTTP 502 before completing.
Sadly, my host systems are pretty slow (old server CPU & RAM) and thus take ages to compute summaries. Checking routes/summaries, it appears as if all heartbeats are fetched no matter what filters are applied, which leads me to believe that filtering is done either in the template or somewhere in the frontend - at least not on a database level. My user has about 350 000 heartbeats, filtering them already took long enough before but it has reached the breaking point today.
Since these tables can utilize indexes for queryable fields on a database level and are overall faster than application level filters, I'd appreciate if we could brainstorm ideas on how to improve this current situation.
The text was updated successfully, but these errors were encountered: