Force atlas fix to accept renumbered graphs #3759

Closed · 7 commits
3 changes: 3 additions & 0 deletions datasets/README.md
@@ -9,13 +9,16 @@ This directory contains small public datasets in `mtx` and `csv` format used by
| karate | 34 | 156 | No | No |
| dolphins | 62 | 318 | No | No |
| netscience | 1,589 | 5,484 | No | Yes |
| dining_prefs | 26 | 52 | Yes | Yes |

**karate** : The graph "karate" contains the network of friendships between the 34 members of a karate club at a US university, as described by Wayne Zachary in 1977.

**dolphins** : The graph dolphins contains an undirected social network of frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand, as compiled by Lusseau et al. (2003).

**netscience** : The graph netscience contains a coauthorship network of scientists working on network theory and experiment, as compiled by M. Newman in May 2006.

**dining_prefs** : The graph dining_prefs contains dining-partner preferences among 26 people, a classic social network dataset from J. L. Moreno, The Sociometry Reader, The Free Press, Glencoe, Illinois (1960), p. 35.



### Modified datasets
52 changes: 52 additions & 0 deletions datasets/dining_prefs.csv
@@ -0,0 +1,52 @@
Ada Cora 1
Ada Louise 2
Cora Ada 1
Cora Jean 2
Louise Marion 1
Louise Lena 2
Jean Helen 1
Jean Robin 2
Helen Jean 1
Helen Eva 2
Martha Marion 2
Martha Anna 1
Alice Martha 2
Alice Eva 1
Robin Helen 2
Robin Eva 1
Marion Martha 1
Marion Frances 2
Maxine Eva 2
Maxine Adele 1
Lena Louise 2
Lena Marion 1
Hazel Hilda 1
Hazel Anna 2
Hilda Hazel 2
Hilda Betty 1
Frances Marion 2
Frances Eva 1
Eva Marion 2
Eva Maxine 1
Ruth Hilda 2
Ruth Jane 1
Edna Adele 2
Edna Mary 1
Adele Marion 2
Adele Frances 1
Jane Adele 1
Jane Mary 2
Anna Maxine 1
Anna Lena 2
Mary Edna 1
Mary Jane 2
Betty Hilda 2
Betty Edna 1
Ella Helen 2
Ella Ellen 1
Ellen Edna 2
Ellen Anna 1
Laura Eva 1
Laura Edna 2
Irene Hilda 1
Irene Ellen 2
54 changes: 54 additions & 0 deletions datasets/dining_prefs.mtx
@@ -0,0 +1,54 @@
%%MatrixMarket matrix coordinate integer general
26 26 52
1 2 1
1 3 2
2 1 1
2 4 2
3 9 1
3 11 2
4 5 1
4 8 2
5 4 1
5 15 2
6 9 2
6 20 1
7 6 2
7 15 1
8 5 2
8 15 1
9 6 1
9 14 2
10 15 2
10 18 1
11 3 2
11 9 1
12 13 1
12 20 2
13 12 2
13 22 1
14 9 2
14 15 1
15 9 2
15 10 1
16 13 2
16 19 1
17 18 2
17 21 1
18 9 2
18 14 1
19 18 1
19 21 2
20 10 1
20 11 2
21 17 1
21 19 2
22 13 2
22 17 1
23 5 2
23 24 1
24 17 2
24 20 1
25 15 1
25 17 2
26 13 1
26 24 2
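
The integer IDs in `dining_prefs.mtx` appear to correspond to the names in `dining_prefs.csv`, numbered from 1 in the order each name first occurs in the source column. The sketch below, in plain Python, rebuilds that mapping and checks it against three rows. `SOURCE_ORDER`, `NAME_TO_ID`, and `to_mtx_row` are illustrative names and not part of cuGraph.

```python
# Names in the order they first appear as a source in dining_prefs.csv.
SOURCE_ORDER = [
    "Ada", "Cora", "Louise", "Jean", "Helen", "Martha", "Alice", "Robin",
    "Marion", "Maxine", "Lena", "Hazel", "Hilda", "Frances", "Eva", "Ruth",
    "Edna", "Adele", "Jane", "Anna", "Mary", "Betty", "Ella", "Ellen",
    "Laura", "Irene",
]

# Matrix Market indices are 1-based, so numbering starts at 1.
NAME_TO_ID = {name: i for i, name in enumerate(SOURCE_ORDER, start=1)}


def to_mtx_row(csv_row):
    """Convert one 'src dst wgt' CSV row into the matching .mtx row."""
    src, dst, wgt = csv_row.split()
    return f"{NAME_TO_ID[src]} {NAME_TO_ID[dst]} {wgt}"


print(to_mtx_row("Ada Cora 1"))       # 1 2 1
print(to_mtx_row("Louise Marion 1"))  # 3 9 1
print(to_mtx_row("Irene Ellen 2"))    # 26 24 2
```

Every one of the 26 people appears at least once as a source, so the source column alone is enough to number all vertices.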
7 changes: 7 additions & 0 deletions docs/cugraph/source/references/datasets.md
@@ -2,20 +2,27 @@

karate
- W. W. Zachary, *An information flow model for conflict and fission in small groups*, Journal of Anthropological Research 33, 452-473 (1977).

dining_prefs
- J. L. Moreno, *The Sociometry Reader*, The Free Press, Glencoe, Illinois, p. 35 (1960).

dolphins
- D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson,
*The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations*,
Behavioral Ecology and Sociobiology 54, 396-405 (2003).

netscience
- M. E. J. Newman,
*Finding community structure in networks using the eigenvectors of matrices*,
Preprint physics/0605087 (2006).

email-Eu-core
- Hao Yin, Austin R. Benson, Jure Leskovec, and David F. Gleich.
*Local Higher-order Graph Clustering.*
In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017.
- J. Leskovec, J. Kleinberg and C. Faloutsos.
*Graph Evolution: Densification and Shrinking Diameters*.
ACM Transactions on Knowledge Discovery from Data (ACM TKDD), 1(1), 2007. http://www.cs.cmu.edu/~jure/pubs/powergrowth-tkdd.pdf

polbooks
- V. Krebs, unpublished, http://www.orgnet.com/.
2 changes: 1 addition & 1 deletion python/cugraph/cugraph/experimental/datasets/__init__.py
@@ -24,7 +24,7 @@


meta_path = Path(__file__).parent / "metadata"

dining_prefs = Dataset(meta_path / "dining_prefs.yaml")
karate = Dataset(meta_path / "karate.yaml")
karate_data = Dataset(meta_path / "karate_data.yaml")
karate_undirected = Dataset(meta_path / "karate_undirected.yaml")
23 changes: 23 additions & 0 deletions python/cugraph/cugraph/experimental/datasets/metadata/dining_prefs.yaml
@@ -0,0 +1,23 @@
name: dining_prefs
file_type: .csv
author: J.L. Moreno
url: https://data.rapids.ai/cugraph/datasets/dining_prefs.csv
refs:
  J. L. Moreno (1960). The Sociometry Reader. The Free Press, Glencoe, Illinois, pg.35
delim: " "
header: None
col_names:
  - src
  - dst
  - wgt
col_types:
  - string
  - string
  - int
has_loop: false
is_directed: true
is_multigraph: false
is_symmetric: false
number_of_edges: 52
number_of_nodes: 26
number_of_lines: 52
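
Flags such as `has_loop` and `is_symmetric` describe properties that can be checked directly against the edge list. The following is a minimal sketch in plain Python, assuming `is_symmetric` means that every edge has a matching reverse edge. The helper names are hypothetical, and the sample is the first six rows of `dining_prefs.csv` with the weights dropped.

```python
def has_self_loop(edges):
    """True if any edge starts and ends at the same vertex."""
    return any(src == dst for src, dst in edges)


def is_symmetric(edges):
    """True if every edge (u, v) has a matching reverse edge (v, u)."""
    edge_set = set(edges)
    return all((dst, src) in edge_set for src, dst in edge_set)


# First six rows of dining_prefs.csv, weights dropped.
sample = [
    ("Ada", "Cora"), ("Ada", "Louise"), ("Cora", "Ada"),
    ("Cora", "Jean"), ("Louise", "Marion"), ("Louise", "Lena"),
]

print(has_self_loop(sample))  # False
print(is_symmetric(sample))   # False: Ada -> Louise has no Louise -> Ada
```

Because each person contributes exactly two rows and both of Louise's choices are in the sample, the missing Louise -> Ada edge is also missing from the full file.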
8 changes: 5 additions & 3 deletions python/cugraph/cugraph/layout/force_atlas2_wrapper.pyx
@@ -56,9 +56,11 @@ def force_atlas2(input_graph,
    if not input_graph.edgelist:
        input_graph.view_edge_list()

    # FIXME: This implementation assumes that the number of vertices
    # is the max vertex ID + 1 which is not always the case.
    num_verts = input_graph.nodes().max() + 1
    # Renumbered graphs: take the vertex count from the internal IDs in the renumber map.
    if input_graph.is_renumbered():
        num_verts = input_graph.renumber_map.df_internal_to_external['id'].max() + 1
    else:
        num_verts = input_graph.nodes().max() + 1
    num_edges = len(input_graph.edgelist.edgelist_df['src'])

    cdef GraphCOOView[int,int,float] graph_float
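
The vertex count in this hunk determines how many layout positions are allocated. For a graph that has not been renumbered it is taken as the largest vertex ID plus one, which only works for integer IDs, while a renumbered graph has contiguous internal IDs starting at zero. The sketch below shows the distinction in plain Python, with lists standing in for the cuDF columns. `count_vertices` and its arguments are hypothetical and not part of cuGraph.

```python
def count_vertices(vertex_ids, internal_ids=None):
    """Return how many layout slots are needed to index every vertex.

    vertex_ids:   external vertex IDs as supplied by the user.
    internal_ids: contiguous 0..n-1 IDs, present only after renumbering.
    """
    if internal_ids is not None:
        # Renumbered graph: size from the internal IDs.
        return max(internal_ids) + 1
    # Not renumbered: assumes integer IDs, sized by the largest one.
    return max(vertex_ids) + 1


names = ["Ada", "Cora", "Louise"]
internal = list(range(len(names)))  # stand-in for the renumber map's 'id' column

print(count_vertices(names, internal))  # 3
print(count_vertices([0, 1, 5]))        # 6
```

Calling `count_vertices(names)` without the internal IDs raises a `TypeError`, because a string cannot be added to an integer. That is the kind of failure string-labelled graphs would hit without the renumbered branch.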
24 changes: 20 additions & 4 deletions python/cugraph/cugraph/tests/layout/test_force_atlas2.py
@@ -19,7 +19,13 @@
from cugraph.internals import GraphBasedDimRedCallback
from sklearn.manifold import trustworthiness
import scipy.io
from cugraph.experimental.datasets import karate, polbooks, dolphins, netscience
from cugraph.experimental.datasets import (
    karate,
    polbooks,
    dolphins,
    netscience,
    dining_prefs,
)

# Temporarily suppress warnings till networkX fixes deprecation warnings
# (Using or importing the ABCs from 'collections' instead of from
@@ -43,11 +49,15 @@ def cugraph_call(
    strong_gravity_mode,
    gravity,
    callback=None,
    renumber=False,
):

    G = cugraph.Graph()
    # Renumber only when the vertex IDs are not already integers (e.g. string names).
    renumber = cu_M["src"].dtype.kind not in "iu" or cu_M["dst"].dtype.kind not in "iu"
    G.from_cudf_edgelist(
        cu_M, source="src", destination="dst", edge_attr="wgt", renumber=False
        cu_M, source="src", destination="dst", edge_attr="wgt", renumber=renumber
    )

    t1 = time.time()
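
Whether to renumber depends on the element type of the vertex columns, not on the column object itself. An expression of the form `column is not int` compares object identity with the `int` type and is true for any column. The following plain-Python sketch uses lists as stand-ins for the cuDF columns, and `needs_renumber` is a hypothetical helper. With a real cuDF column the analogous check would presumably inspect the column's `dtype`.

```python
def needs_renumber(src, dst):
    """True when either vertex column contains a non-integer ID."""
    return not all(isinstance(v, int) for v in list(src) + list(dst))


names_src, names_dst = ["Ada", "Cora"], ["Cora", "Jean"]
ints_src, ints_dst = [0, 1], [1, 2]

print(needs_renumber(names_src, names_dst))  # True
print(needs_renumber(ints_src, ints_dst))    # False

# Identity comparison against the type says nothing about the elements:
print(ints_src is not int)                   # True, even for integer IDs
```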
@@ -72,7 +82,13 @@ def cugraph_call(
    return pos


DATASETS = [(karate, 0.70), (polbooks, 0.75), (dolphins, 0.66), (netscience, 0.66)]
DATASETS = [
    (karate, 0.70),
    (polbooks, 0.75),
    (dolphins, 0.66),
    (netscience, 0.66),
    (dining_prefs, 0.50),
]


MAX_ITERATIONS = [500]