Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Improve Dijkstra performance #1

Open
machosec opened this issue Dec 5, 2018 · 3 comments
Open

Feature Request: Improve Dijkstra performance #1

machosec opened this issue Dec 5, 2018 · 3 comments

Comments

@machosec
Copy link

machosec commented Dec 5, 2018

Hi @dirkjanm,
The current implementation of the Dijkstra algorithm is not optimal, which is noticeable when dealing with larger graphs. As you mentioned in the wiki, the ShortestPath algorithm is faster than Dijkstra but the performance differences can be greatly reduced with few optimizations.

Neo4j stores relationship's properties values on disk which mean a lot of unnecessary IO overhead which dramatically slowing the process of graph traversal. Because the weight is constant for each ACL, it's better to calculate the cost based on the relationship type (name) which is in-memory (assuming the entire graph can be loaded to RAM and cache is warmed up). You can achieve this by tweaking the neo4j-graph-algorithms plugin to use constant weight values for each ACL instead of looking at the relationship's property. This will also spare the creation of additional properties.

@dirkjanm
Copy link
Contributor

dirkjanm commented Dec 5, 2018

Thanks for thinking along with this :) Do you have an example of how and where this can be configured? Note that the weight is not constant for each ACL, it depends on the start and end node as well, see the code here: https://github.com/fox-it/aclpwn.py/blob/master/aclpwn/database.py#L78
I haven't come along something like this in the exposed cypher procedures. Also the default uses the REST API Dijkstra algorithm, which doesn't use the neo4j-graph-algorithms plugin.

@machosec
Copy link
Author

machosec commented Dec 5, 2018

Because Neo4j builtin Dijsktra capabilities are disappointing, you would probably have to write a user procedure that implements Dijkstra with relationship name as cost. This issue made others create custom plugins who modified the implementation to their liking.
This link is a good place to start: https://grokbase.com/t/gg/neo4j/156knxv03a/performance-difference-between-dijkstra-and-allshortestpath
With user procedures, you could also evaluate the weight based on the start/end node and make it REST compatible.

@MrAnde7son
Copy link

MrAnde7son commented Dec 10, 2018

Hi,
I wrote a code sample that uses a Map object for Dijkstra's CostEvaluator instead of relationship's distance property. Please note that it uses some of apoc's helpers function.
If you're not familiar with Neo4j's Procedures, you may refer to link.

import java.util.Map;
import java.util.stream.Stream;

import org.neo4j.graphdb.*;
import org.neo4j.logging.Log;
import org.neo4j.procedure.*;
import org.neo4j.graphalgo.WeightedPath;
import org.neo4j.graphalgo.GraphAlgoFactory;
import org.neo4j.graphalgo.PathFinder;
import org.neo4j.helpers.collection.Pair;

import static apoc.util.Util.toDouble;
import apoc.path.RelationshipTypeAndDirections;
import apoc.result.WeightedPathResult;

public class Procedure {

    @Context
    public Log log;

    @Procedure(value = "util.findWeightedPath")
    @Description("CALL util.findPathWithConfig(startNode, endNode, 'KNOWS|<WORKS_WITH|IS_MANAGER_OF>', {'KNOWS': 5}, defaultValue, numberOfWantedResults) YIELD path," +
            " weight")
    public Stream<WeightedPathResult> findWeightedPath(
            @Name("startNode") Node startNode,
            @Name("endNode") Node endNode,
            @Name("relationshipTypesAndDirections") String relTypesAndDirs,
            @Name("relationshipsWeights") Map<String, Double> relsWeights,
            @Name(value = "defaultWeight", defaultValue = "NaN") double defaultWeight,
            @Name(value = "numberOfWantedPaths", defaultValue = "1") long numberOfWantedPaths) {
        PathFinder<WeightedPath> algo = GraphAlgoFactory.dijkstra(
                buildPathExpander(relTypesAndDirs),
                (relationship, direction) -> toDouble(relsWeights.getOrDefault(relationship, defaultWeight)),
                (int)numberOfWantedPaths
        );
        return WeightedPathResult.streamWeightedPathResult(startNode, endNode, algo);
    }
	
	public static Map<String, Double> relsWeights = new HashMap<String, Double>() {
		{
			put("KNOWS", 1.0);
			put("WORKS_WITH", 10.0);
			put("IS_MANAGER_OF", 100.0);
		}
    };

	@Procedure(value = "util.findPath")
	@Description("CALL util.findPath(startNode, endNode, numberOfWantedResults) YIELD path, weight")
	public Stream<WeightedPathResult> searchPath(@Name("startNode") Node startNode,
													   @Name("endNode") Node endNode,
													   @Name(value = "numberOfWantedPaths", defaultValue = "1") long numberOfWantedPaths) {
		return this.findPathWithConfig(startNode, endNode, '>', relsWeights, 1, numberOfWantedPaths);
	}


    private static PathExpander<Object> buildPathExpander(String relationshipsAndDirections) {
        PathExpanderBuilder builder = PathExpanderBuilder.empty();
        for (Pair<RelationshipType, Direction> pair : RelationshipTypeAndDirections
                .parse(relationshipsAndDirections)) {
            if (pair.first() == null) {
                if (pair.other() == null) {
                    builder = PathExpanderBuilder.allTypesAndDirections();
                } else {
                    builder = PathExpanderBuilder.allTypes(pair.other());
                }
            } else {
                if (pair.other() == null) {
                    builder = builder.add(pair.first());
                } else {
                    builder = builder.add(pair.first(), pair.other());
                }
            }
        }
        return builder.build();
    }
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants