K-truss benchmarking #4593

ChuckHastings · 2024-08-06T04:20:32Z

Benchmark k-truss and compare to other existing implementations.

jnke2016 · 2025-02-04T15:51:03Z

In 24.06, we updated our K-Truss implementation by moving from cuHornet to an implementation matching our primitive and the performance improvement was several order of magnitude faster.

As a summary, our earlier implementation run several iteration of our primitive per_v_pair_dst_nbr_intersection to compute the number of triangles incident to every edge in the graph. This can be inefficient if there is only a few subset of the edges to invalidate which will still require to run per_v_pair_dst_nbr_intersection on the entire edgelist and that primitive is not well optimized for K-Truss.
In our new implementation, we do not compute the number of triangles incident to every edge but instead identify weak edges to be deleted and remove their contribution on the affected triangles. This optimization improves our performance for different graph size and k values as it can be seeing in the figure below. These benchmark numbers were obtained by running cuGraph 25.02 on an H100 machine.

On the memory side we are currently limited by the number of edge we can process in nbr_intersection on large graph though we have an internal flag that enables us to perform this operation in chunk. By reducing the intersection size, we can process 2 to 4x larger graph. A study of the algorithm unveiled that there is no need to store the result of nbr_intersection and instead directly update the edge property value (in this case triangle count). This will both significantly reduce our peak memory usage and also improve our runtime (since we will no longer need to read the intersection result stored in CSR format to retrieve and update the edge property)

ChuckHastings · 2025-02-04T16:07:28Z

This is a great start. But I think we should also show comparison to other implementations (I saw nx, hornet, arkuda in a quick internet search, but any we can find).

jeaton32 · 2025-02-04T16:08:47Z

These are all single-gpu so far? Which GPU is being used?

jnke2016 · 2025-02-04T16:20:50Z

This is a great start. But I think we should also show comparison to other implementations (I saw nx, hornet, arkuda in a quick internet search, but any we can find).

Correct. This issue will be updated incrementally. In the first part of this k-truss benchmarking issue, I am portraying how the new optimization improved our runtime from our previous one which was already faster than the cuHornet based implementation.

The two fastest implementation in the literature on single GPU are KTrussExplorer and a linear algebraic based implementation which we are targeting to outperform.

jnke2016 · 2025-02-04T16:22:17Z

These are all single-gpu so far

Yes I do also have single node multi GPU numbers which will be added shortly.

Which GPU is being used

NVIDIA H100 80GB HBM3

jnke2016 · 2025-02-10T19:50:49Z

Runtime breakdown on a single GPU for a graph generated at scale 22 and edge factor 16 for different values of k.

ChuckHastings assigned jnke2016 Aug 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

K-truss benchmarking #4593

K-truss benchmarking #4593

ChuckHastings commented Aug 6, 2024 •

edited by jnke2016

Loading

jnke2016 commented Feb 4, 2025 •

edited

Loading

ChuckHastings commented Feb 4, 2025

jeaton32 commented Feb 4, 2025

jnke2016 commented Feb 4, 2025 •

edited

Loading

jnke2016 commented Feb 4, 2025

jnke2016 commented Feb 10, 2025 •

edited

Loading

K-truss benchmarking #4593

K-truss benchmarking #4593

Comments

ChuckHastings commented Aug 6, 2024 • edited by jnke2016 Loading

jnke2016 commented Feb 4, 2025 • edited Loading

ChuckHastings commented Feb 4, 2025

jeaton32 commented Feb 4, 2025

jnke2016 commented Feb 4, 2025 • edited Loading

jnke2016 commented Feb 4, 2025

jnke2016 commented Feb 10, 2025 • edited Loading

ChuckHastings commented Aug 6, 2024 •

edited by jnke2016

Loading

jnke2016 commented Feb 4, 2025 •

edited

Loading

jnke2016 commented Feb 4, 2025 •

edited

Loading

jnke2016 commented Feb 10, 2025 •

edited

Loading