-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvements in is_isomorphic
algorithm
#288
Conversation
A benchmark: import time
import retworkx as rx
def permute(g, seed=None):
nodes = list(g.node_indexes())
edges = list(g.edge_list())
if seed:
np.random.seed(seed)
np.random.shuffle(nodes)
edges = map(lambda e: (nodes[e[0]], nodes[e[1]]), g.edge_list())
edges = list(edges)
if isinstance(g, rx.PyDiGraph):
res = rx.PyDiGraph()
else:
res = rx.PyGraph()
res.add_nodes_from([None for _ in range(len(nodes))])
res.add_edges_from_no_data(edges)
return res
n = 10000
degrees = [10, 15, 50, 100]
total = 0
for deg in degrees:
p = 2 * deg / (n - 1)
g_a = rx.directed_gnp_random_graph(n, p, seed=42)
g_b = permute(g_a, seed=4242)
start = time.time()
res = rx.is_isomorphic(g_a, g_b);
assert res
stop = time.time()
dt = stop - start
total += dt
print(total)
|
Pull Request Test Coverage Report for Build 808147690
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just skimmed the code so far (I'll do an in depth review later) but besides the doc build failing can you also add a release note for the new default_order
kwarg on the is_isomorphic
function?
I'm also wondering if there is any backwards compatibility concern here. I haven't had a chance to read the vf2++ paper yet, but does using the heuristic matching order expose us to any risk of breaking users? I would assume not and this is fine then because we definitely want to default to using it because of the much faster performance.
There is also is_isomorphic_node_match()
do we want to expose default order there too?
@mtreinish Sure. release note added and kwarg No, i don't think there is any concern for backwards compatibility here. The algorithm will still work as expected. I should note however that there might be cases that the order based on ids will actually run faster since we are only talking about a heuristic order. This was one of the reasons i decided to add |
So I was curious how the heuristic works on qiskit's
Which produced:
I'm thinking we probably shouldn't change the default order if it's going to have such a large impact on qiskit's performance. |
in this example, you are "cheating" since you are comparing identical circuits and the order based on node ids is the "optimal". in fact the only work that I guess a more realistic example would be: import retworkx as rx
import numpy as np
import time
from qiskit.circuit.random import random_circuit
from qiskit.converters import circuit_to_dag
def permute(graph):
num_nodes = len(graph)
data = list(graph.nodes())
edges = list(graph.edge_list())
nodes = np.random.permutation(num_nodes)
edges = map(lambda edge: (nodes[edge[0]], nodes[edge[1]]),
graph.edge_list())
edges = list(edges)
if isinstance(graph, rx.PyDiGraph):
res = rx.PyDiGraph()
else:
res = rx.PyGraph()
pdata = [None for _ in range(num_nodes)]
for i, ni in enumerate(nodes):
pdata[ni] = data[i]
res.add_nodes_from(pdata)
res.add_edges_from_no_data(edges)
return res
circuit_a = circuit_to_dag(random_circuit(15, 4096, seed=42))
graph_a = circuit_a._multi_graph
graph_a_perm = permute(graph_a)
start = time.time()
res = rx.is_isomorphic(graph_a, graph_a, id_order=False)
assert res
stop = time.time()
print(stop - start)
---
1.999 sec
start = time.time()
res = rx.is_isomorphic(graph_a, graph_a, id_order=True)
assert res
stop = time.time()
print(stop - start)
---
1.026 sec
start = time.time()
res = rx.is_isomorphic(graph_a, graph_a_perm, id_order=False)
assert res
stop = time.time()
print(stop - start)
---
1.927 sec
start = time.time()
res = rx.is_isomorphic(graph_a, graph_a_perm, id_order=True)
assert res
stop = time.time()
print(stop - start)
---
No termination even after 20mins |
@mtreinish After some experiment, i guess you are right and keeping the default order is indeed the right path for qiskit perfmornace. I guess the reason is that we are restricted to a small subset of all the possible permutation of nodes if two quantum circuits have isomorphic dagcircuit representations and matching the nodes according to their ids order is faster. Outside of qiskit, i believe this PR makes sense. |
Yeah, I agree with that. I think the path forward here is for this first PR we just add the option to change the order now but don't switch to the heuristic by default for the first release. This will give us a chance to adjust terra to explicitly say it wants the id order. Then we can deprecate and change the default order to use the heuristic. For the non-qiskit case the best we can do for this first PR is add a detailed docstring about setting the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great addition to the library! It's good to have one of the best algorithms that can compete with LEMON and other graph libraries.
I requested changes mostly to modify the name of Graph<Ty>
and to see if we can avoid copy and pasting those for loops in try_match
.
Lastly, with regards to the performance regression: keep in mind the authors of the VF2++ paper never tested it on directed graphs or DAGs. So for our specific case in Qiskit, the difference between VF2 and VF2++ won't be as impressive as for random graphs.
To avoid a regression in Qiskit, we should default to no heuristic. Later we can explicit disable the heuristic in Qiskit's code and enable the heuristic by default on our side.
@IvanIsCoding Thanks for the review! |
src/isomorphism.rs
Outdated
// repeatedly bring largest element in front. | ||
for i in 0..vd.len() { | ||
let (index, &item) = vd[i..] | ||
.par_iter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious did you benchmark this? I'm a bit worried that using a parallel iterator here will actually make this slower, especially as i
approaches vd.len()
. Parallel iterators aren't always faster, especially for smaller iterators where the overhead often is higher than the execution time. So we should be tactical about where we leverage rayon and not just use it everywhere because we can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mtreinish I support benchmarking it, but I can give some insight on why I suggested Georgios to parallelize it.
That loop looks like it could be one of the most expensive for sparse graphs. It does a procedure similar to Selection Sort, and it is quadratic on vd.len()
(and from what I understood vd
contains all graph nodes at a given BFS level). So it makes sense to parallelize the inner loop that is executed multiple times, because for most of the iterations it will be an expensive loop.
For the benchmark, I suggest running it against a sparse graph and comparing it. The simplest one that comes to my mind is a Star Graph.
Ideally, we should do like in all or other functions and decide if we parallelize the loop based on the vector size:
for i in 0..vd.len() {
let (index, &item) = if vd.len() - i >= parallel_threshold {
// parallel loop
}
else {
// fall back to sequential loop
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after some benchmarks on complete, star and gnp graphs, parallel iterator is a bit slower. i tried to set a parallel_threshold
but no improvements either. so i switched back to serial iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks for doing this, just a couple small doc formatting issues and a question about the type for the new argument in the rust code. But other than that I think this is good to go
Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the fast update.
Introduced changes:
R_out, R_in, R_new
) based on the original VF2 paper.