Very slow to launch and display more than 200 tasks #109

Open
nekohayo opened this issue May 24, 2015 · 16 comments
Labels
enhancement · performance (Issues affecting the speed and responsiveness of the application) · priority:high · reproducible-in-git (Issues that affect the current dev version)

Comments

@nekohayo
Member

Essentially, https://bugs.launchpad.net/gtg/+bug/587307 still affects the current (0.3.1) version. If this was solved, there would be no need for issue #43.

@sametmax

+1

@mkorvas

mkorvas commented Jun 22, 2017

After a crash, GTG has been eating 100% CPU for a whopping 40 minutes of launching so far, with no indication of how much longer it's going to take. I normally don't stop GTG for exactly this reason: by the time it has relaunched, half of my working day is gone. It must be doing something really wasteful; my tasks file has some 750 <task> elements, really not enough to justify this much CPU time.

@mkorvas

mkorvas commented Jun 22, 2017

(Not sure why GitHub says that I unassigned @izidormatusov; I didn't mean to do that.)

@diegogangl
Contributor

Of course, this needs to be profiled, but IMO Python + XML = performance poison. I bet most of that time is spent on disk I/O and sloooowly looping over all the XML tags. There are probably some quick fixes to improve performance a little, but the real deal would be switching to SQLite or something like that.

@ksamuel

ksamuel commented Dec 18, 2019

Not at all: with lxml, most of the XML processing is done in C. As for I/O, no language is ever the bottleneck there; even the slowest one is 100 times faster than disk access.

Anyway, there is no reason for either of those to ever be an issue, because you are not supposed to work on the XML directly: load it into memory, translate it into an optimal data structure, work only on that, and dump the result back to disk as XML only when necessary, in a separate thread.
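For illustration, here is a minimal sketch of that load-once / work-in-memory / save-in-a-thread approach. The element and attribute names (task, id, title) are simplified assumptions for the example, not GTG's actual file format:

import threading
from lxml import etree

def load_tasks(path):
    # Parse the XML once (lxml does the heavy lifting in C) into a plain dict.
    tree = etree.parse(path)
    return {
        t.get("id"): {"title": t.findtext("title", default="")}
        for t in tree.iterfind(".//task")
    }

def save_tasks_async(path, tasks):
    # Serialize the in-memory structure back to XML off the main/UI thread.
    def _write():
        root = etree.Element("tasks")
        for task_id, data in tasks.items():
            el = etree.SubElement(root, "task", id=task_id)
            etree.SubElement(el, "title").text = data["title"]
        etree.ElementTree(root).write(path, encoding="utf-8",
                                      xml_declaration=True, pretty_print=True)
    threading.Thread(target=_write, daemon=True).start()

# Normal operation only ever touches the dict, never the XML:
# tasks = load_tasks("gtg_tasks.xml")
# tasks["42"]["title"] = "Renamed task"
# save_tasks_async("gtg_tasks.xml", tasks)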

All in all, the biggest problem of GTG is not raw performance, although I'm pretty sure the code is suboptimal given the small loads it handles and how slow it is (this smells like a quadratic algorithm to me, probably from working directly on the XML data).

No, the problem is that it does the work in the same thread as the UI, so any slowdown freezes the interface and makes it feel sluggish.

Five years ago I would probably have dug into the code and provided a patch, but I'm no longer using GTG: I switched to Dynalist, a proprietary alternative. However, if someone wishes to make it fast, follow what I wrote above; there's an 80% chance those are the problems.

@ploum
Contributor

ploum commented Dec 18, 2019 via email

@diegogangl
Contributor

My bad guys, I got this mixed up with #43, which involves reading/writing the XML.
So yeah, besides profiling and finding that possible bug, moving to lxml would surely improve performance.

@diegogangl
Contributor

I profiled this with cProfile; it looks like the main culprit is in liblarch. I don't really know much about liblarch, but it seems to be the __sort_func being called every time a row is inserted into the model.

We should revisit this for 0.5

Here's the prof in case anyone wants to take a stab:
all_tags.zip

(you can open it in snakeviz)
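For anyone who wants to reproduce this kind of profile, a minimal recipe looks roughly like this (main() stands in for whatever slow code path you want to measure, and the output file name is just an example):

import cProfile
import pstats

profiler = cProfile.Profile()
profiler.enable()
main()                                 # placeholder for the slow code path
profiler.disable()

profiler.dump_stats("all_tags.prof")   # open the .prof file with snakeviz
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)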

@nekohayo
Member Author

https://github.com/getting-things-gnome/liblarch/blob/master/liblarch_gtk/__init__.py#L340 ?

My 2¢ / hunch, without actually studying the code:

  • If sorting gets called on every operation, that indeed sounds wasteful. Short of optimizing the sorting function itself, I suppose the app would need a way to inhibit sorting and "trigger sorting when the time is right" (maybe based on a signal plus a resettable timer; see the sketch after this list). I'm just guessing here, but I suspect that sorting only once the view/model has been populated, unless you can do cheap sorted insertion, is much more efficient than sorting the whole model on every insertion (if that's what it does currently).
  • Additionally, I guess the sorting function(s) could be reviewed to see if smarter algorithms can be used. If we could somehow "let Python handle the sorting" instead of whatever naïve sort we might have in place (I haven't checked, I'm just guessing) then there might be a massive gain there too.
    • Python's built-in sorting function is ludicrously fast. It uses Timsort, an adaptive hybrid of merge sort and insertion sort, which runs in O(n log n) time.
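To make the first bullet concrete, here is a rough sketch of the "resettable timer" idea, coalescing a burst of insertions into a single sort pass. The DeferredSorter class and the view.resort() method are hypothetical stand-ins for illustration, not existing GTG/liblarch API:

from gi.repository import GLib

class DeferredSorter:
    def __init__(self, view, delay_ms=200):
        self._view = view
        self._delay_ms = delay_ms
        self._source_id = None

    def request_sort(self):
        # Called on every insertion; only the last request in a burst fires.
        if self._source_id is not None:
            GLib.source_remove(self._source_id)   # reset the timer
        self._source_id = GLib.timeout_add(self._delay_ms, self._do_sort)

    def _do_sort(self):
        self._view.resort()        # one sort for the whole batch of insertions
        self._source_id = None
        return GLib.SOURCE_REMOVE  # one-shot timeout, don't repeat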

@nekohayo nekohayo added the performance Issues affecting the speed and responsiveness of the application label Apr 17, 2020
@nekohayo
Member Author

I did a quick test today: I clicked the "All tasks" view (while I had a tag selected), and GTG (re)loads all the tasks when you do that. That's when I realized that something definitely seems to make it single-threaded; otherwise I don't see why it would use only one of the CPU cores at a time, as the graph during that period shows:

(screenshot: CPU usage graph showing only one core busy during the operation)

@ploum previously said:

The process to read the XML file was really optimised and in its own thread. The whole thing prompted a huge rewrite of liblarch to be multi-thread safe, something which was really hard because GTK was not multi-thread safe.
But this could be a simple bug preventing threads from operating in parallel due to a lock somewhere.

I don't know if the problem is in liblarch itself or not. For what it's worth, there are a few mentions of "threading" in:

...but nowhere else.

If someone were to figure out a way to ensure liblarch+GTG use all CPU cores when searching & displaying tasks, then it would be blazing fast. Even without lxml, you would then divide the waiting times by a factor of 2, 4, 8 or more, depending on the user's machine.

  • Currently, with 833 tasks, GTG git (to become 0.4) takes ~21 seconds to launch.
  • Once launched, if you switch to another tag and then click "All tasks" to show the 833 tasks again, it takes ~11-12 seconds to do so (this surprises me; I would actually have expected the two numbers to be the same).
  • If it was using 4 CPU cores instead of 1 on my machine, it would probably manage to launch in ~5 seconds and switch between tags in at most 2 seconds; that's very significant for perceived speed, responsiveness and usefulness.

lxml might yield immense gains and we should definitely switch to it if it upholds that promise (I think @diegogangl already secretly started a branch for that), but even then, unless full multithreading comes "for free" with the switch to lxml, I don't think spending some effort on this would be a waste. Unless we find out that, after switching to lxml, GTG can start up in less than 5 seconds with 8000 tasks even on a single CPU ;)
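As a sketch of what "doing the work off the UI thread" could look like, the heavy filtering would run in a worker thread and only the final UI update would be pushed back to the GTK main loop via GLib.idle_add. The helper names below are hypothetical, not GTG's real code, and note that because of Python's GIL this keeps the UI responsive but does not by itself use multiple cores for CPU-bound work (true multi-core processing would need multiprocessing or similar):

import threading
from gi.repository import GLib

def filter_tasks(tasks, predicate):
    # Stand-in for GTG's real (CPU-heavy) filtering logic.
    return [task for task in tasks if predicate(task)]

def refresh_view_async(tasks, predicate, populate_view):
    # Run the filtering in a worker thread, then update the UI from the main loop.
    def worker():
        visible = filter_tasks(tasks, predicate)
        # GTK widgets must only be touched from the main loop:
        GLib.idle_add(populate_view, visible)
    threading.Thread(target=worker, daemon=True).start()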

@nekohayo nekohayo changed the title Very slow to display ~250-500 tasks Very slow to display more than 200 tasks Apr 30, 2020
@nekohayo nekohayo changed the title Very slow to display more than 200 tasks Very slow to launch and display more than 200 tasks Apr 30, 2020
@nekohayo nekohayo added the reproducible-in-git Issues that affect the current dev version label Apr 30, 2020
@nekohayo
Member Author

nekohayo commented Jul 8, 2020

In case someone reading this wants to learn about some basic performance profiling techniques I mentioned in issue #402, see https://fortintam.com/blog/profiling-specto-and-whole-python-applications-in-general/ and other posts in https://fortintam.com/blog/tag/profiling/

@Neui
Contributor

Neui commented Oct 29, 2020

I once tried to tackle this problem since it was also getting annoying for me. Documenting it here before I forget.
In my testing a startup takes around 10 seconds for like 600 open tasks. I/O doesn't seem to be the culprit.

I used Python's built-in profiling facilities and flameprof to get a rough overview of what is happening.

A lot of time is spent (re)filtering liblarch nodes in __update_node, which ends up emitting callbacks that insert data into the GTK tree model, which also sorts it (I think I once read it uses some kind of tree, but I haven't confirmed that yet).
There is also an "update task" step which sets values; I can't remember why those are there (I think tags are being added as we go? It was too long ago).

@jaesivsm
Contributor

Hello there. Unaware that the discussion was already taking place here, I had opened an issue in liblarch, which I'm closing to keep the findings focused here.

TL;DR:
  • Why doesn't __update_node work on its own (from the root, deleting what needs to be deleted)? Should we change that?
  • Could/should we have some task/tag versioning logic to limit tree node updates?
  • Why are Tasks pushing stuff into the queue?
  • Noob question: what's a gray task?
What I found

OK, so I did some more profiling myself, and here's what I found (or what I'm unsure about):

  • refilter (whose calls might themselves be optimizable, maybe) browses the children and calls __update_node, which is recursive. I tried changing the logic to call __update_node less, and it works quite well (guesstimate: it cuts the refilter call time by at least 5×), but every parent task stays gray, so I'm guessing I must be missing something. I don't understand why calling __update_node on the root_id would not work; are root nodes a different breed?
diff --git a/liblarch/filteredtree.py b/liblarch/filteredtree.py
index 0b65bf5..3b31576 100644
--- a/liblarch/filteredtree.py
+++ b/liblarch/filteredtree.py
@@ -324,16 +327,9 @@ class FilteredTree(object):
 
         # Build tree again
         root_node = self.tree.get_root()
-        queue = root_node.get_children()
-
-        while queue != []:
-            node_id = queue.pop(0)
-            # FIXME: decide which is the best direction
-            self.__update_node(node_id, direction="both")
-
-            node = self.tree.get_node(node_id)
-            for child_id in node.get_children():
-                queue.append(child_id)
+        self.__update_node(self.root_id, direction="both")
+        for child_id in self.tree.get_root().get_children():
+            self.__update_node(child_id, direction="both")
 
     def __is_displayed(self, node_id):
         """ Should be node displayed regardless of its current status? """
  • about __update_node optimization:
    • lots of __update_node calls end up with an update callback. It seems like update is the default (else: action = "update"). Keeping a version for each task (a hash, a sequence number or whatever) and not triggering an update callback when nothing has actually changed might also be the way to go (see the sketch after this list; I'm not sure of the implications on the GTK side, as I'm not knowledgeable at all about that lib).
    • the first few __update_node calls also remove a lot of the tree (calls to the deleted callbacks). And with a huge data set, I'm seeing a lot of tasks being removed before being re-added. The tree is also wiped clean at the start of a refilter, which might be overkill.
  • Something I don't understand that might be the cause of a lot of useless processing: there are two ways in which GTG's internal data is bound to something in GTK:
    • the liblarch tree, updated through things like __update_node
    • the GTG.core.task.Task.modified method, which is called on each and every task modification
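As a rough illustration of the versioning idea from the bullets above (a content hash is one cheap way to detect that nothing visible changed), here is a hypothetical sketch; task_signature() and the fields it hashes are made up and don't match liblarch's actual update path:

import hashlib

_last_signatures = {}

def task_signature(task):
    # Cheap fingerprint of the fields that affect the displayed row.
    payload = "\0".join((task["title"], task["status"], ",".join(task["tags"])))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()

def maybe_emit_update(task_id, task, emit_update_callback):
    # Only emit the (expensive) GTK update callback when the task really changed.
    sig = task_signature(task)
    if _last_signatures.get(task_id) == sig:
        return  # nothing visible changed; skip the callback entirely
    _last_signatures[task_id] = sig
    emit_update_callback(task_id)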

@nekohayo
Member Author

nekohayo commented Jan 3, 2021

My initial testing with @jaesivsm's branch in PR #530 showed various operations, including startup, having their duration cut roughly in half on the heavy "Bryce" sample dataset (./launch.sh -s "bryce"). A 2× speedup is nothing to sneeze at! But I don't know how it would behave with my own personal data (I suspect performance may vary depending on the dataset, especially if yours has a lot of closed tasks).

  • I have posted some benchmark observations as a comment on PR 530, but they may need to be measured again once a fix is provided for the issue I found there.
  • Arguably it would be a bit annoying for switching between the top tabs to no longer be instantaneous (i.e. the tabs vs. tags tradeoff), but I guess I can live with that for now; in theory, however, the tab population should be done asynchronously in the background, to avoid locking up the application "at the last minute" when the user clicks those tabs...
  • Even without nice async background loading of the "other tabs", the performance gains from @jaesivsm's branch will probably be a welcome relief until we have a real fix for the core problem; when that happens, it'll be an extra boost on top!
  • If we can somehow figure out a big optimization in liblarch, the "async tab loading in the background" trick might not be strictly necessary, provided the loading is fast enough (i.e. less than a second for a thousand tasks).
  • If this whole thing could somehow be made multi-threaded (as I pointed out in an earlier comment above), we could either further divide the time needed to process the tasks for each operation, or process all the tabs at once (one per CPU thread on machines that have at least 2-3 cores/threads). Multi-threading would be brute-forcing our way through with hardware rather than optimizing the software, but even with all possible optimizations in place, multithreaded processing would still be nice to have.

@nekohayo
Member Author

nekohayo commented Aug 22, 2022

This would hopefully be fixed by the big core rewrite in #553. What I'm about to say here is not exactly new information, but it's probably worth noting "just in case":

  • The XML file handling has been optimized (with lxml etc.) so it's really fast, and it has absolutely no bearing on performance, because, as I observed again today with 0.6...
  • ...the number of tasks shown in the UI is what kills performance. On startup, with GTG set to the "Actionable" view mode (so that it displays less, as per the optimizations made possible in 0.6 by @jaesivsm's code) and my 2000+ task dataset, the difference between having the majority (maybe half?) of my tasks displayed on startup vs. "deferring" many hundreds of them into the future (e.g. bumping them to the weekend) is the difference between a 48-second startup and a 19-second (or less) startup on my machine. It really is related to the GTK GUI, the liblarch tree building, or both. Hopefully @diegogangl's big core rewrite bypasses/eliminates the whole problem.

@diegogangl
Contributor

@nekohayo do you consider this still to be the case? I'm testing with a 780+ task file and it loads reasonably fast (considering I still have an HDD :) )
