Improve PolynomialSequence.connected_components() #35518
Couldn't you use the variable indices rather than the variables themselves? That does require adding some code to polynomials that do not provide such a method, but it is straightforward given the code of […]
The code above builds the set […]
DS = DisjointSet(self.variables())
L = [] # to avoid calling twice f.variables()
for f in self:
for i,c in enumerate(C):
if len(set(f.variables()).difference(c)) == 0:
var = f.variables()
This is not worse than what it was before, but this calls `self.variables()` and later `f.variables()` for each `f` in `self`.
Actually, I do one call less to `self.variables()`. You forgot the calls that were done in `connection_graph`.
The proposal of @videlec is certainly doable but much more involved and cannot be done in the same file. I don't have enough knowledge in this part of the code to do it.
We can definitely postpone the usage of variable indices rather than variables themselves.
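For readers following this thread, here is a minimal plain-Python sketch of what the index-based idea could look like. It is not Sage code: each polynomial is modeled as a tuple of integer variable indices, standing in for whatever a hypothetical index-returning method would provide, and `components_by_index` is an illustrative name only.

```python
def find(parent, i):
    """Return the representative of index i, compressing the path on the way."""
    root = i
    while parent[root] != root:
        root = parent[root]
    while parent[i] != root:
        parent[i], i = root, parent[i]
    return root


def components_by_index(polys, nvars):
    """Group polynomial positions by connected component.

    polys: list of tuples of variable indices in range(nvars)
           (assumption: a stand-in for real polynomials).
    Returns a list of lists of positions into polys; polynomials
    without variables are skipped in this sketch.
    """
    parent = list(range(nvars))
    for indices in polys:
        if not indices:
            continue
        root = find(parent, indices[0])
        for j in indices[1:]:
            # Attach the component of j to the component of the first variable.
            parent[find(parent, j)] = root
    groups = {}
    for position, indices in enumerate(polys):
        if indices:
            groups.setdefault(find(parent, indices[0]), []).append(position)
    return list(groups.values())


# Polynomials 0 and 2 share variable 1, so they end up together.
print(components_by_index([(0, 1), (2, 3), (1, 4), (5,)], 6))
```

Working on integers means the union-find can live in a flat list rather than a structure keyed by variable objects, which is the efficiency argument made above.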
I made a more concrete proposal for an interface at #35523. Comments welcome.
# Use a union-find data structure to encode relationships between
# variables, i.e., that they belong to a same polynomial
from sage.sets.disjoint_set import DisjointSet
DS = DisjointSet(self.variables())
- DS = DisjointSet(self.variables())
+ vs = [f.variables() for f in self]
+ DS = DisjointSet(set().union(*vs))
Rather than `self.variables()`, I think one should use `self.universe().variables()`. The point is that the vertices should be all variables in the underlying polynomial ring (and not just the ones that appear in some polynomial in `self`). Doing this change will also be better in terms of efficiency.
Actually, `self.universe().variables()` is only available for boolean polynomial rings. One should use `self.universe().gens()` for other polynomial rings (and possibly fix this discrepancy between boolean and other polynomial rings).
# variables, i.e., that they belong to a same polynomial
from sage.sets.disjoint_set import DisjointSet
DS = DisjointSet(self.variables())
L = [] # to avoid calling twice f.variables()
for f in self:
- for f in self:
+ for f, var in zip(self, vs):
for f in self:
for i,c in enumerate(C):
if len(set(f.variables()).difference(c)) == 0:
var = f.variables()
- var = f.variables()
This is the best I came up with:
It seems 30% faster than yours, mostly because it avoids calling […]. I wonder what is the need of those […]. Note that the code is completely deterministic even removing the inner […]. As for the outer […]
Something like this:
Simplify the code; also, the previous code has to iterate over variables of the sequence twice (this is really bad before sagemath#35510) Moreover, now we add a clique between the variables of each polynomial, so it agrees with the description (the code before used to add just a spanning tree of the clique -- a star). This makes this method a little bit slower for the purposes of `connected_components()` (for which adding a star is equivalent). However, sagemath#35518 will rewrite `connected_components()` without using `connection_graph()` so this is not really a problem.
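The claim that a star and a clique are equivalent for the purposes of `connected_components()` can be checked on a toy example. The sketch below is plain Python, not Sage: polynomials are modeled as tuples of variable names, and `star_edges`, `clique_edges` and `components` are illustrative helpers that do not exist in the code base.

```python
from itertools import combinations


def star_edges(variables):
    """Edges joining the first variable to every other one (a spanning star)."""
    if not variables:
        return []
    first, *rest = variables
    return [(first, v) for v in rest]


def clique_edges(variables):
    """Edges joining every pair of variables (a clique)."""
    return list(combinations(variables, 2))


def components(vertices, edges):
    """Connected components of an undirected graph, as a set of frozensets."""
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    result = set()
    for v in vertices:
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adjacency[u] - comp)
        seen |= comp
        result.add(frozenset(comp))
    return result


# Toy system: each tuple lists the variables of one polynomial.
polys = [("x", "y", "z"), ("z", "w"), ("a", "b")]
vertices = sorted(set().union(*polys))
star = [e for p in polys for e in star_edges(p)]
clique = [e for p in polys for e in clique_edges(p)]

# Same components either way; the clique just has more edges.
print(components(vertices, star) == components(vertices, clique))
print(len(star), len(clique))
```

The star needs one edge fewer than the number of variables per polynomial, while the clique grows quadratically, which is where the slowdown mentioned above comes from.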
I followed the proposal of @tornaria but I let the final […]
IMO what's important is that the function is deterministic, not that it sorts. The way I proposed, the order of the output is determined by the order of the input. The doctest in line 79 can be fixed by
--- a/src/sage/rings/polynomial/multi_polynomial_sequence.py
+++ b/src/sage/rings/polynomial/multi_polynomial_sequence.py
@@ -76,7 +76,7 @@ pair and study it::
We separate the system in independent subsystems::
- sage: C = Sequence(r2).connected_components(); C
+ sage: C = Sequence(sorted(r2)).connected_components(); C
[[w213 + k113 + x111 + x112 + x113,
w212 + k112 + x110 + x111 + x112 + 1,
w211 + k111 + x110 + x111 + x113 + 1,
It may be preferable just not to sort here but fix the output in the doctest to match whatever comes out. If you look at the output of […]
Now we don't sort anymore in […]
I had to look this up: "Changed in version 3.7: Dictionary order is guaranteed to be insertion order." (https://docs.python.org/3/library/stdtypes.html#dictionary-view-objects)
May I suggest the following tests to check the order of the output:
Also, my preference would be to not sort in line 79 and change the output; the sequences will then be in the same order as in the output of line 43, so it's arguably more natural. But whatever you prefer is ok.
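To illustrate the insertion-order guarantee quoted above (this is not the set of tests proposed in the comment, just a plain-Python sketch with a hypothetical helper `group_in_input_order`): grouping items through a `dict` yields groups in the order in which each key is first seen, and items within a group keep their input order, so no sorting is needed for the result to be deterministic.

```python
def group_in_input_order(items, key):
    """Group items by key(item), keeping first-seen order of the groups."""
    groups = {}
    for item in items:
        # dict preserves insertion order (guaranteed since Python 3.7).
        groups.setdefault(key(item), []).append(item)
    return list(groups.values())


items = ["b1", "a1", "b2", "c1", "a2"]
print(group_in_input_order(items, key=lambda s: s[0]))
# → [['b1', 'b2'], ['a1', 'a2'], ['c1']]
```

Reordering the input reorders the output accordingly, which is the behaviour a doctest on output order would pin down.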
I did the proposed changes.
Simplify the code; also, the previous code has to iterate over variables of the sequence twice (this is really bad before #35510). This affects mainly the method `connected_components()`.

EDIT: Moreover, now we add a clique between the variables of each polynomial, so it agrees with the description (the code before used to add just a spanning tree of the clique -- a star). This makes this method a little bit slower for the purposes of `connected_components()` (for which adding a star is equivalent). However, #35518 will rewrite `connected_components()` without using `connection_graph()` so this is not really a problem.

### 📝 Checklist
- [x] The title is concise, informative, and self-explanatory.
- [x] The description explains in detail what this PR is about.
- [x] New tests added to check that a clique is added for the variables in each polynomial.

### Dependencies
- #35511

URL: #35512
Reported by: Gonzalo Tornaría
Reviewer(s): David Coudert, Gonzalo Tornaría, Vincent Delecroix
As discussed in #35512, we propose an implementation of the method `PolynomialSequence.connected_components()` that avoids the effective construction of the graph connecting variables that appear in the same polynomial.
📚 Description
We use a union-find data structure to encode the connected components.
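A minimal sketch of the approach in plain Python, for illustration only. It does not use Sage: the small `DisjointSet` class below is a stand-in written for this example (not `sage.sets.disjoint_set.DisjointSet`), and each polynomial is modeled as a hypothetical `(label, variables)` pair.

```python
class DisjointSet:
    """Tiny union-find over hashable elements (illustrative stand-in)."""

    def __init__(self, elements):
        self._parent = {e: e for e in elements}

    def find(self, e):
        """Return the representative of e, with path compression."""
        root = e
        while self._parent[root] != root:
            root = self._parent[root]
        while self._parent[e] != root:
            self._parent[e], e = root, self._parent[e]
        return root

    def union(self, a, b):
        """Merge the sets containing a and b."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self._parent[rb] = ra


def connected_components(polys):
    """Group polynomial labels whose variable sets are transitively linked.

    polys: list of (label, variables) pairs (assumed toy model).
    Polynomials without variables are skipped in this sketch.
    """
    # Collect the variables of each polynomial once and reuse them.
    vs = [tuple(variables) for _, variables in polys]
    ds = DisjointSet(set().union(*vs))
    for var in vs:
        # Linking every variable to the first one is enough (a star).
        for v in var[1:]:
            ds.union(var[0], v)
    # Group by representative; dict order follows the input order.
    groups = {}
    for (label, _), var in zip(polys, vs):
        if var:
            groups.setdefault(ds.find(var[0]), []).append(label)
    return list(groups.values())


polys = [("f1", ("x", "y")), ("f2", ("z",)), ("f3", ("y", "w"))]
print(connected_components(polys))
# → [['f1', 'f3'], ['f2']]
```

No graph object is built at any point: the union-find only records which variables belong together, and the final pass reads off one group of polynomials per representative.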
📝 Checklist
⌛ Dependencies
See discussion in #35512