Adds benchmarks for nx-cugraph #3854

Merged

@rlratzel rlratzel commented Sep 8, 2023

closes rapidsai/graph_dl#299

This PR adds new benchmarks for nx-cugraph which can be used to compare performance for NetworkX with and without the cugraph backend.

These benchmarks depend on pytest, pytest-benchmark, networkx>=3.0, cugraph (only for the Dataset APIs), and nx-cugraph.

Some results can be seen here: [two result images attached to the original PR]

Other changes:

  • black is now run on all python sources under benchmarks, which resulted in several format-only changes.
  • Several other updates to versions, etc. were applied to various .yaml files in order to resolve CI errors.
  • Merged in changes from jnke2016/branch-23.10_increase-timeout to get changes needed to fix dask CI failure.

rlratzel and others added 22 commits August 24, 2023 17:10
…s to specify a number of nodes for checking paths, sets is_directed metadata on dataset objs, some refactoring and cleanup.
A [PR](rapidsai#3002) updating the vertex pair column names was merged a few releases ago; however, a few docstrings were not updated.
This PR updates the docstrings for Jaccard and Sorensen.

Authors:
  - Joseph Nke (https://github.com/jnke2016)

Approvers:
  - Alex Barghi (https://github.com/alexbarghi-nv)

URL: rapidsai#3817
This PR replaces the `copy_prs` functionality from the `ops-bot` with the new dedicated `copy-pr-bot` GitHub application.

Thorough documentation for the new `copy-pr-bot` application can be viewed below.

- https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/

**Important**: `copy-pr-bot` enforces signed commits. If an organization member opens a PR that contains unsigned commits, it will be deemed untrusted and therefore require an `/ok to test` comment. See the GitHub docs [here](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification) for information on how to set up commit signing.

Any time a PR is deemed untrusted, it will receive a comment that looks like this: rapidsai/ci-imgs#63 (comment).

Every subsequent commit on an untrusted PR will require an additional `/ok to test` comment.

Any existing PRs that have unsigned commits after this change is merged will require an `/ok to test` comment for each subsequent commit _or_ the PR can be rebased to include signed commits as mentioned in the docs below:
https://docs.gha-runners.nvidia.com/cpr/contributors.

This information is all included on the documentation page linked above.

_I've skipped CI on this PR since it's not a change that is tested._

[skip ci]
This PR is on top of the changes from rapidsai#3831.

Temporarily disables single-GPU "MG" tests in CI until rapidsai#3790 is closed.
This will unblock CI for PRs unrelated to the issue in rapidsai#3790 at the risk of removed coverage for MG code paths. Hopefully nightly MG testing will minimize the risk.
A followup PR will be submitted that re-enables the tests and must be merged prior to 23.10 burndown.

Authors:
  - Naim (https://github.com/naimnv)
  - Rick Ratzel (https://github.com/rlratzel)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#3833
…3813)

Closing rapidsai#3801

I also submitted a minimum reproducer to the slack thrust channel.

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Naim (https://github.com/naimnv)
  - Joseph Nke (https://github.com/jnke2016)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: rapidsai#3813
The Python API now leverages the C API for both [betweenness](rapidsai#2971) and [edge betweenness centrality](rapidsai#3672), so the legacy code is no longer used anywhere. This PR cleans up the C++ API.

closes rapidsai#2651
closes rapidsai#3272

Authors:
  - Joseph Nke (https://github.com/jnke2016)
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: rapidsai#3829
Add pygraphistry to oss list ("(please post an issue if you have a project to add to this list)")

Authors:
  - https://github.com/lmeyerov
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: rapidsai#3826
See: rapidsai#3773

Possible follow-up tasks:
- Update to use threshold parameter exposed from C++ (rapidsai#3792)
- Add `max_level` argument to networkx implementation
  - ~Or, add `max_level` as extra `cugraph_nx`-specific argument~ (**done**)
- Update PLC to handle empty graphs gracefully (rapidsai#3804)
- Update PLC to handle directed graphs
- Add `louvain_partitions` (needs added to PLC)
  - https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.community.louvain.louvain_partitions.html

This is passing many networkx tests. I don't have this as draft, b/c it's usable (and, I would argue, mergeable) as is.

Authors:
  - Erik Welch (https://github.com/eriknw)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3803
Ensures that batches are renumbered starting from the starting batch id rather than 0.
Adds an appropriate failing test, which passes with the change.
Closes rapidsai#3819
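
The renumbering behavior described above can be pictured with a small sketch. This is not the cuGraph implementation; the function name `renumber_batches` and its `start_batch_id` parameter are hypothetical and only illustrate what "starting from the starting batch id rather than 0" means.

```python
def renumber_batches(batch_ids, start_batch_id=0):
    """Map each distinct batch id to a consecutive id beginning at start_batch_id.

    Hypothetical helper for illustration only; not part of cuGraph.
    """
    mapping = {}
    renumbered = []
    for batch_id in batch_ids:
        if batch_id not in mapping:
            # The first unseen batch gets start_batch_id, the next one +1, and so on.
            mapping[batch_id] = start_batch_id + len(mapping)
        renumbered.append(mapping[batch_id])
    return renumbered


# Renumbering from 0 would give [0, 0, 1, 2, 1]; starting from 4 keeps the offset.
print(renumber_batches([7, 7, 9, 12, 9], start_batch_id=4))  # [4, 4, 5, 6, 5]
```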

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Vibhu Jawa (https://github.com/VibhuJawa)
  - Brad Rees (https://github.com/BradReesWork)

URL: rapidsai#3823
…apidsai#3809)

This PR makes a handful of changes aimed at simplifying the CI pipeline for building wheels as a precursor to switching RAPIDS nightlies to using proper alpha versions:
- Inlines apply_wheel_modifications.sh in build_wheel.sh. Now that the build doesn't rely excessively on logic in shared workflows, there's no real benefit to having a separate script (previously apply_wheel_modification.sh was a special script that the shared workflow knew to execute i.e. it was a hook into an externally controlled workflow).
- Consolidates the textual replacements using for loops and makes the replacements more targeted by only modifying the Python package being built in a given script. For instance, python/cugraph/pyproject.toml is no longer overwritten when building pylibcugraph.
- Modifies dependency specs for RAPIDS packages to include a `>=0.0.0a0` component. This is the key change that will allow alpha dependencies to be discovered. dask-cuda is the canary here because we already upload alphas of it, so the installation of cugraph in the test job should pull the latest dask-cuda alpha now without requiring direct installation from git.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Brad Rees (https://github.com/BradReesWork)

URL: rapidsai#3809
The `uniform_neighbor_sample` code is becoming increasingly difficult to maintain.  This PR removes all the options that were deprecated in the previous release, and also deprecates the `with_edge_properties` option, which will be replaced by returning whatever properties are in the graph in the next release.

This PR also resolves a FIXME by allowing `fanout_vals` to be a `cupy.ndarray`, `numpy.ndarray`, or `cudf.Series`.

Closes rapidsai#3698

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3816
rapidsai/raft#1746 added static targets for RAFT that can be used directly now instead of building RAFT static explicitly

Authors:
  - Divye Gala (https://github.com/divyegala)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: rapidsai#3842
…apidsai#3846)

Add a property getter for batch size.  Requested by JoC.

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3846
We decided to rename `cugraph-nx` to `nx-cugraph` to follow (and help establish) conventions for names of networkx backends. See: networkx/networkx#6883

This PR was created from the following commands:
```sh
mv notebooks/ ../notebooks-bak
find * -type f -print0 | xargs -0 sed -i 's/cugraph_nx/nx_cugraph/g'
find * -type f -print0 | xargs -0 sed -i 's/cugraph-nx/nx-cugraph/g'
git mv ./conda/recipes/cugraph-nx ./conda/recipes/nx-cugraph
git mv ./python/cugraph-nx ./python/nx-cugraph
git mv ./python/nx-cugraph/cugraph_nx ./python/nx-cugraph/nx_cugraph
mv ../notebooks-bak/ notebooks
```
(a more reliable bash script would ensure the destination of `git mv` does not exist yet, b/c if the destination is a directory, it will happily--and incorrectly--move the target _into_ the directory)
```sh
# Make sure everything got renamed correctly
git grep -i 'cugraph.nx'
find . -iname '*cugraph*nx*' -print
```
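
The caveat above about `git mv` and existing destinations could be handled with a small guard. The following is a minimal sketch, assuming a wrapper named `safe_git_mv` (a hypothetical name, not something used in this PR) that refuses to run when the destination already exists.

```python
import subprocess
from pathlib import Path


def safe_git_mv(src: str, dst: str) -> None:
    """Run `git mv src dst`, but refuse to proceed if dst already exists.

    Hypothetical wrapper for illustration only.
    """
    if Path(dst).exists():
        # Without this check, `git mv` would move src *into* an existing directory.
        raise FileExistsError(f"destination already exists: {dst}")
    subprocess.run(["git", "mv", src, dst], check=True)
```

Calling `safe_git_mv("./python/cugraph-nx", "./python/nx-cugraph")` would then fail loudly on a second run instead of nesting one directory inside the other.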
Should we remove `cugraph-nx` nightlies once this is merged?

Authors:
  - Erik Welch (https://github.com/eriknw)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#3840
This PR migrates SAGEConv and RGCNConv to cugraph-pyg, in preparation for removing these models from upstream.
`pylibcugraphops` now becomes a dependency of cugraph-pyg.

Authors:
  - Tingyu Wang (https://github.com/tingyu66)

Approvers:
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: rapidsai#3763
The threshold parameter (referred to as `epsilon` in most of the centrality measures) is used to define when to stop the iterative steps of Louvain.  Once the modularity increase for an iteration of Louvain is smaller than the threshold we will stop that iteration and start coarsening the graph.

This parameter was hard-coded in the initial C++ implementation of Louvain.  This PR exposes this parameter through the C++, C API, PLC and Python layers.

The PR also renames the python parameter `max_iter` to be `max_level`, which is more appropriate semantically.
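
The stopping rule described above can be sketched in a few lines. This is a toy model, not the cuGraph Louvain code: the function name, its arguments, and the modularity values are all made up for illustration, and no claim is made about the library's default threshold.

```python
def passes_until_stop(initial_modularity, modularity_per_pass, threshold):
    """Count the passes run at one level before the modularity gain falls below threshold.

    Hypothetical helper: modularity_per_pass stands in for the modularity
    measured after each iteration of a real Louvain level.
    """
    current = initial_modularity
    passes = 0
    for new_modularity in modularity_per_pass:
        passes += 1
        gain = new_modularity - current
        current = new_modularity
        if gain < threshold:
            # Gain too small: stop iterating at this level and coarsen the graph.
            break
    return passes, current


# With a loose threshold the level stops after the third pass.
print(passes_until_stop(0.0, [0.30, 0.38, 0.3805, 0.3806], threshold=1e-3))  # (3, 0.3805)
```

A smaller threshold lets the same level run more passes before coarsening, which is the trade-off that exposing the parameter makes tunable.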

Closes rapidsai#3791

Authors:
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Seunghwa Kang (https://github.com/seunghwak)
  - Naim (https://github.com/naimnv)
  - Joseph Nke (https://github.com/jnke2016)
  - Rick Ratzel (https://github.com/rlratzel)

URL: rapidsai#3792
When calling `client.has_what()`, which returns the keys of the data held in each worker's memory, those keys used to be returned as strings, but a recent change in `dask` changed their type to tuples.
 
From `{worker_ip_address: ("('from-delayed-190587f1b2318dc54d5f92a79e59b71a', 0)", "('from-delayed-190587f1b2318dc54d5f92a79e59b71a', 1)")}` to `{worker_ip_address: (('from-delayed-c3d92b2cc9948634e82a0b2b62453a6c', 0), ('from-delayed-c3d92b2cc9948634e82a0b2b62453a6c', 1))}`
 
When mapping workers to persisted data in the function `get_persisted_df_worker_map`, an assumption about the type of those keys was made thereby breaking our MG tests.

This PR removes that assumption.
Closes rapidsai#3834
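
One way to avoid depending on the key type is to normalize keys before using them. The sketch below is not the change made to `get_persisted_df_worker_map`; `normalize_key` is a hypothetical helper that accepts either of the two forms shown above.

```python
import ast


def normalize_key(key):
    """Return a dask key as a tuple, whether it arrives as a tuple or as its string repr.

    Hypothetical helper for illustration only.
    """
    if isinstance(key, tuple):
        return key
    if isinstance(key, str):
        try:
            parsed = ast.literal_eval(key)
        except (ValueError, SyntaxError):
            return key  # an ordinary string key, not a stringified tuple
        return parsed if isinstance(parsed, tuple) else key
    return key


old_style = "('from-delayed-190587f1b2318dc54d5f92a79e59b71a', 0)"
new_style = ("from-delayed-c3d92b2cc9948634e82a0b2b62453a6c", 0)
print(normalize_key(old_style))  # ('from-delayed-190587f1b2318dc54d5f92a79e59b71a', 0)
print(normalize_key(new_style))  # ('from-delayed-c3d92b2cc9948634e82a0b2b62453a6c', 0)
```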

Authors:
  - Joseph Nke (https://github.com/jnke2016)
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Alex Barghi (https://github.com/alexbarghi-nv)

URL: rapidsai#3835
Closes rapidsai#3820

This PR adds simple getter methods to the `dataset` class, which allow users to easily get information about datasets without needing to access the `metadata` dict or look in the directory.

```python
from cugraph.datasets import karate

# users now call
karate.number_of_nodes()

# instead of
karate.metadata['number_of_nodes']
```
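
For readers without cuGraph installed, the getter pattern can be sketched with a stand-in class. Everything here is hypothetical: the `Dataset` class below is not `cugraph.datasets`, the `is_directed` getter is an assumption, and the metadata values are illustrative.

```python
class Dataset:
    """Hypothetical stand-in for a dataset object that carries a metadata dict."""

    def __init__(self, metadata):
        self.metadata = metadata

    def number_of_nodes(self):
        # Thin getter so callers do not need to know the metadata key.
        return self.metadata["number_of_nodes"]

    def is_directed(self):
        # Assumed getter; shown only to illustrate the same pattern.
        return self.metadata["is_directed"]


karate_like = Dataset({"number_of_nodes": 34, "is_directed": False})
print(karate_like.number_of_nodes())  # 34
```

Keeping the getters as thin wrappers means the `metadata` layout can change later without breaking callers.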

Authors:
  - ralph (https://github.com/nv-rliu)

Approvers:
  - Alex Barghi (https://github.com/alexbarghi-nv)

URL: rapidsai#3821
Applies same changes for the same reasons as cuDF PR rapidsai/cudf#14067 to cuGraph.

Authors:
  - Rick Ratzel (https://github.com/rlratzel)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#3853
@rlratzel rlratzel added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Sep 8, 2023
@rlratzel rlratzel added this to the 23.10 milestone Sep 8, 2023
@rlratzel rlratzel self-assigned this Sep 8, 2023
copy-pr-bot bot commented Sep 8, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@trxcllnt

This comment was marked as duplicate.

@trxcllnt
Collaborator

/ok to test

alexbarghi-nv and others added 9 commits September 30, 2023 23:35
Created based on code from @dongxuy04 

Adds support for `WholeGraph` `WholeMemory` in the cuGraph `FeatureStore` class.  This enables both DGL and PyG to take advantage of distributed feature store functionality.

Adds `pylibwholegraph` as a testing dependency so the feature store can be tested.  Adds appropriate SG and MG tests.

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Brad Rees (https://github.com/BradReesWork)
  - Vibhu Jawa (https://github.com/VibhuJawa)

URL: rapidsai#3874
…MFG creation (rapidsai#3887)

Allow cugraph-dgl dataloader to consume sampled outputs from BulkSampler in CSC format.

Authors:
  - Tingyu Wang (https://github.com/tingyu66)
  - Seunghwa Kang (https://github.com/seunghwak)
  - Alex Barghi (https://github.com/alexbarghi-nv)

Approvers:
  - Seunghwa Kang (https://github.com/seunghwak)
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Vibhu Jawa (https://github.com/VibhuJawa)

URL: rapidsai#3887
@rlratzel
Contributor Author

rlratzel commented Oct 6, 2023

/ok to test

@rapids-bot rapids-bot bot merged commit 2c1626f into rapidsai:branch-23.12 Oct 6, 2023
72 checks passed