I used the LDBC-SNB dataset with a Scale Factor of 1 and ran the following test statement:
match (n0:Place)<-[r1:personIsLocatedIn]-(n1:Person)-[r2]->(n2)-[r3]->(n3:Place) return count(n3)
I ran this five times in both TuGraph and Neo4j. The average time was 96.86s in TuGraph and 6.89s in Neo4j.
I added DB_Hit statistics for each operator in TuGraph. Here are the profile results for TuGraph and Neo4j.
TuGraph:
Neo4j:
The total DB_Hit is 297445717 in TuGraph and 5749972 in Neo4j. The execution plans show that TuGraph expands directly from n0 all the way to n3. Since n2 has a very large number of rows, the number of edges expanded from n2 to n3 is very large as well.
Neo4j, however, uses its Cost-Based Optimizer to split the path into two parts, each with far fewer DB_Hits. It needs a final NodeHashJoin to combine the two parts, but the overall time is still far lower than in TuGraph.
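To make the difference between the two plan shapes concrete, here is a minimal Python sketch on a made-up toy graph (not LDBC data). It is not how either database is implemented; the function names and the graph are hypothetical. It only shows that counting by nested expansion and counting by "split at n2, then join on n2" give the same result, assuming the graph has no self-loops, so the relationship-uniqueness rule cannot be violated across the two halves.

```python
from collections import defaultdict

# Hypothetical toy graph (not LDBC data): node labels and (source, type, target) edges.
LABELS = {"p1": "Person", "p2": "Person", "c1": "Place",
          "c2": "Place", "m1": "Post", "m2": "Post"}
EDGES = [
    ("p1", "personIsLocatedIn", "c1"),
    ("p2", "personIsLocatedIn", "c2"),
    ("p1", "knows", "p2"),
    ("p1", "likes", "m1"),
    ("p2", "likes", "m1"),
    ("p2", "likes", "m2"),
    ("m1", "postIsLocatedIn", "c1"),
    ("m2", "postIsLocatedIn", "c2"),
]


def out_edges():
    """Index edge ids by their source node."""
    index = defaultdict(list)
    for eid, (src, _, _) in enumerate(EDGES):
        index[src].append(eid)
    return index


def is_r1(edge):
    """True if the edge matches (n1:Person)-[:personIsLocatedIn]->(n0:Place)."""
    n1, etype, n0 = edge
    return (etype == "personIsLocatedIn"
            and LABELS[n1] == "Person" and LABELS[n0] == "Place")


def count_by_full_expand():
    """Expand n0 <- n1 -> n2 -> n3 in one nested pass (single-direction plan)."""
    out = out_edges()
    total = 0
    for r1, edge in enumerate(EDGES):
        if not is_r1(edge):
            continue
        for r2 in out[edge[0]]:
            if r2 == r1:  # relationships in one pattern must be distinct
                continue
            n2 = EDGES[r2][2]
            for r3 in out[n2]:  # repeated once per path that reaches n2
                if r3 not in (r1, r2) and LABELS[EDGES[r3][2]] == "Place":
                    total += 1
    return total


def count_by_split_and_join():
    """Count each half of the path per n2, then join the halves on n2."""
    out = out_edges()
    left = defaultdict(int)   # n2 -> number of (n0, n1, n2) prefixes
    for r1, edge in enumerate(EDGES):
        if is_r1(edge):
            for r2 in out[edge[0]]:
                if r2 != r1:
                    left[EDGES[r2][2]] += 1
    right = defaultdict(int)  # n2 -> number of edges into a Place
    for n2, _, n3 in EDGES:
        if LABELS[n3] == "Place":
            right[n2] += 1
    # Hash-join step: multiply the per-n2 counts instead of re-expanding.
    return sum(count * right[n2] for n2, count in left.items())


print(count_by_full_expand(), count_by_split_and_join())
```

In the nested version the n2 to n3 edges are visited again for every path that arrives at the same n2, while the split version visits each of them once. That is the effect the DB_Hit totals above suggest, although the real plans involve more than this sketch captures.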
So I wanted to ask: is there a better way to optimize this execution plan in TuGraph today, or will the TuGraph team consider implementing this kind of optimization in the future?
Nice profile! TuGraph-DB currently supports only RBO (rule-based optimization), not CBO (cost-based optimization), so it is hard to implement this kind of path split during expansion without statistics. The team is working on a new, more advanced optimizer, which will hopefully be open-sourced next year. Contributions are welcome if you are interested in this feature.