Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Equal hashes for non-isomorphic bipartite graphs with edge labels #33255

Closed
maxale opened this issue Jan 31, 2022 · 45 comments · Fixed by #35146
Closed

Equal hashes for non-isomorphic bipartite graphs with edge labels #33255

maxale opened this issue Jan 31, 2022 · 45 comments · Fixed by #35146

Comments

@maxale
Copy link
Contributor

maxale commented Jan 31, 2022

The following code illustrates the problem that two bipartite graph B1 and B2 (both canonically labeled) are claimed to be equal and have equal hashes, while they are NOT isomorphic, let alone equal, as labeled graphs.

When labels are ignored, these graphs are isomorphic, and so the problem may be caused by equality testing / hash() function somehow ignoring edge labels.

B1 = BipartiteGraph( [(0, 11, 2), (0, 12, 1), (0, 14, 1), (0, 16, 1), (1, 10, 1), (1, 13, 1), (1, 14, 2), (1, 16, 1), (2, 10, 2), (2, 11, 1), (2, 15, 1), (2, 16, 1), (3, 10, 1), (3, 12, 1), (3, 16, 2), (3, 17, 1), (4, 11, 1), (4, 13, 1), (4, 15, 2), (4, 17, 1), (5, 9, 2), (5, 14, 1), (5, 15, 1), (5, 17, 1), (6, 9, 1), (6, 13, 2), (6, 16, 1), (6, 17, 1), (7, 9, 1), (7, 10, 1), (7, 11, 1), (7, 12, 2), (7, 14, 1), (8, 9, 1), (8, 12, 1), (8, 13, 1), (8, 15, 1), (8, 17, 2)], immutable=True )

B2 = BipartiteGraph( [(0, 11, 1), (0, 12, 1), (0, 14, 2), (0, 16, 1), (1, 10, 2), (1, 13, 1), (1, 14, 1), (1, 16, 1), (2, 10, 1), (2, 11, 2), (2, 15, 1), (2, 16, 1), (3, 10, 1), (3, 12, 1), (3, 16, 2), (3, 17, 1), (4, 11, 1), (4, 13, 1), (4, 15, 2), (4, 17, 1), (5, 9, 2), (5, 14, 1), (5, 15, 1), (5, 17, 1), (6, 9, 1), (6, 13, 2), (6, 16, 1), (6, 17, 1), (7, 9, 1), (7, 10, 1), (7, 11, 1), (7, 12, 2), (7, 14, 1), (8, 9, 1), (8, 12, 1), (8, 13, 1), (8, 15, 1), (8, 17, 2)], immutable=True )

print('Same hashes:', hash(B1) == hash(B2) )
print('Isomorphic:', B1.is_isomorphic( B2, edge_labels=True ) )
print('Equal:', B1 == B2)

Component: graph theory

Issue created by migration from https://trac.sagemath.org/ticket/33255

@maxale maxale added this to the sage-9.6 milestone Jan 31, 2022
@maxale

This comment has been minimized.

@DaveWitteMorris
Copy link
Member

comment:4

The docstring for Graph.__eq__ says: "... labels of arrows/edges are taken into account if and only if the graphs are considered weighted ...". This means that if you want equality testing to consider the edge labels (i.e., if you want the graphs to be considered as labelled graphs), then you need to make the graphs weighted.

sage: B1 = BipartiteGraph( [(0, 11, 2), (0, 12, 1), (0, 14, 1), (0, 16, 1), (1, 10, 1), (1, 13, 1), (1, 14, 2), (1, 16, 1), (2, 10, 2), (2, 11, 1), (2, 15, 1), (2, 16, 1), (3, 10, 1), (3, 12, 1), (3, 16, 2), (3, 17, 1), (4, 11, 1), (4, 13, 1), (4, 15, 2), (4, 17, 1), (5, 9, 2), (5, 14, 1), (5, 15, 1), (5, 17, 1), (6, 9, 1), (6, 13, 2), (6, 16, 1), (6, 17, 1), (7, 9, 1), (7, 10, 1), (7, 11, 1), (7, 12, 2), (7, 14, 1), (8, 9, 1), (8, 12, 1), (8, 13, 1), (8, 15, 1), (8, 17, 2)], immutable=True, weighted=True )

sage: B2 = BipartiteGraph( [(0, 11, 1), (0, 12, 1), (0, 14, 2), (0, 16, 1), (1, 10, 2), (1, 13, 1), (1, 14, 1), (1, 16, 1), (2, 10, 1), (2, 11, 2), (2, 15, 1), (2, 16, 1), (3, 10, 1), (3, 12, 1), (3, 16, 2), (3, 17, 1), (4, 11, 1), (4, 13, 1), (4, 15, 2), (4, 17, 1), (5, 9, 2), (5, 14, 1), (5, 15, 1), (5, 17, 1), (6, 9, 1), (6, 13, 2), (6, 16, 1), (6, 17, 1), (7, 9, 1), (7, 10, 1), (7, 11, 1), (7, 12, 2), (7, 14, 1), (8, 9, 1), (8, 12, 1), (8, 13, 1), (8, 15, 1), (8, 17, 2)], immutable=True, weighted=True )

sage: print('Same hashes:', hash(B1) == hash(B2) )
sage: print('Isomorphic:', B1.is_isomorphic( B2, edge_labels=True ) )
sage: print('Equal:', B1 == B2)

Same hashes: False
Isomorphic: False
Equal: False

So this is not a bug in the method, but I can see that this could be unexpected behavior.

Perhaps a warning should be given when edge labels are put on an unweighted graph, or perhaps the documentation should be clarified.

@slel
Copy link
Member

slel commented Jan 31, 2022

comment:5

Simpler examples:

sage: a = [(0, 2, 1), (0, 3, 2), (1, 3, 1)]
sage: b = [(0, 2, 2), (0, 3, 1), (1, 3, 2)]

sage: A = BipartiteGraph(a, immutable=True)
sage: B = BipartiteGraph(b, immutable=True)

sage: print(f'Same hashes: {hash(A) == hash(B)}\n'
....:       f'Isomorphic: {A.is_isomorphic(B, edge_labels=True)}\n'
....:       f'Equal: {A == B}')
Same hashes: True
Isomorphic: False
Equal: True

sage: C = BipartiteGraph(a, immutable=True, weighted=True)
sage: D = BipartiteGraph(b, immutable=True, weighted=True)

sage: print(f'Same hashes: {hash(C) == hash(D)}\n'
....:       f'Isomorphic: {C.is_isomorphic(D, edge_labels=True)}\n'
....:       f'Equal: {C == D}')
Same hashes: False
Isomorphic: False
Equal: False

@slel

This comment has been minimized.

@dcoudert
Copy link
Contributor

comment:6

The manipulation of edge labels is not simple since labels might not be hashable. For instance, a label can be a Graph or a dictionary.
In general, I prefer when the decision to consider labels is explicit (an argument set to True or False).
Do you have any proposal to improve the documentation ?

@maxale
Copy link
Contributor Author

maxale commented Jan 31, 2022

comment:7

A related issue is the name of this option. What is called weighted= in BipartiteGraph() in many other methods (like .is_isomorphic(), .canonical_label(), .automorphism_group() etc.) is called edge_labels=.

Should there be consistency in the naming of this option?

@maxale
Copy link
Contributor Author

maxale commented Jan 31, 2022

comment:8

Documentation says:

weighted – boolean (default: None); whether graph thinks of itself as weighted or not. See self.weighted()

So, the default value is not True or False, but None. I'd suggest that when weighted=None (ie., not specified), it should guessed from the input data. That is, if labels are present in the given data, BipartiteGraph() should assume weighted=True, otherwise it's weighted=False.

@enjeck
Copy link
Contributor

enjeck commented Apr 12, 2022

comment:9

I want to clarify what fix is expected for this ticket:

  • Either replace edge_labels with weighted or vice versa for consistency
  • When weighted=None, assign True or False to weighted after checking the input data for labels. For example, if the input is an edge list, just check if at least one edge is a triple (and the third item is not None).
    Is this right?

@maxale
Copy link
Contributor Author

maxale commented Apr 12, 2022

comment:10

Replying to @enjeck:

I want to clarify what fix is expected for this ticket:

  • Either replace edge_labels with weighted or vice versa for consistency
  • When weighted=None, assign True or False to weighted after checking the input data for labels. For example, if the input is an edge list, just check if at least one edge is a triple (and the third item is not None).
    Is this right?

Yes, this would be great!

@dcoudert
Copy link
Contributor

comment:11

I disagree with the proposed change of behavior of parameter weighted, and observe that it may affect the entire graph library and other modules using graphs.

Parameter weighted is an old parameter that is used in some algorithm to specify the default behavior. That is, do computations in a weighted graph instead of unweighted graph. This is for instance the case for min_spanning_tree.
We have a long term objective of unifying the use of weights #13112. I'm not sure that parameter weighted is well taken into account by all methods that can use edge weights. So, I usually prefer to explicitly set by_weight=True to avoid confusion.

May be others have a different opinion ?

@mkoeppe mkoeppe modified the milestones: sage-9.6, sage-9.7 May 3, 2022
@mkoeppe mkoeppe modified the milestones: sage-9.7, sage-9.8 Sep 19, 2022
@Bruno-TT
Copy link
Mannequin

Bruno-TT mannequin commented Oct 31, 2022

Branch: u/gh-Bruno-TT/bipartite-hashes

@Bruno-TT
Copy link
Mannequin

Bruno-TT mannequin commented Oct 31, 2022

Commit: 6d31aad

@Bruno-TT
Copy link
Mannequin

Bruno-TT mannequin commented Oct 31, 2022

comment:14

There was never a unanimous consensus on how to handle the weight/labelling logic, so I've implemented a suggestion.

My implementation adds an optional boolean parameter hash_labels to the class constructor, which if left empty will default to None and warn the user upon hash() invocation, and then include labels in the hash.

Let me know what you guys think.

(I'm new to the project so please feel free to give criticism, I'm assuming I've somehow misused trac).

@Bruno-TT Bruno-TT mannequin added the s: needs review label Oct 31, 2022
@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Oct 31, 2022

Branch pushed to git repo; I updated commit sha1. New commits:

e49602dforgot to actually put something in the commit lol

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Oct 31, 2022

Changed commit from 6d31aad to e49602d

@dcoudert
Copy link
Contributor

dcoudert commented Jan 8, 2023

Changed branch from u/gh-Bruno-TT/bipartite-hashes to public/graphs/33255

@dcoudert
Copy link
Contributor

dcoudert commented Jan 8, 2023

Changed commit from c379a13 to 634ca63

@dcoudert
Copy link
Contributor

dcoudert commented Jan 8, 2023

comment:32

I did several corrections / improvements. It currently passes tests, but I may have missed some cases.

I'll check the copy method asap. It's not obvious how to endup with a clean solution.


New commits:

569ec3ctrac #33255: merged with 9.8.beta6
837a8c7trac #33255: review commit
634ca63trac #33255: corrections

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 14, 2023

Changed commit from 634ca63 to e993d2f

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 14, 2023

Branch pushed to git repo; I updated commit sha1. New commits:

e993d2ftrac #33255: better copy

@dcoudert
Copy link
Contributor

comment:34

This should do the job.

Roughly, if parameter hash_labels is None, we have the same behavior than for the weighted parameters. If the parameter is set, we create a (im)mutable copy of the graph depending on the value of parameter immutable.

In another ticket, we can change the default behavior of returning self when asking for a copy of an immutable graph without changing parameters.

@mkoeppe mkoeppe removed this from the sage-9.8 milestone Jan 29, 2023
dcoudert added a commit to dcoudert/sage that referenced this issue Feb 15, 2023
dcoudert added a commit to dcoudert/sage that referenced this issue Feb 15, 2023
dcoudert added a commit to dcoudert/sage that referenced this issue Feb 15, 2023
dcoudert added a commit to dcoudert/sage that referenced this issue Feb 15, 2023
dcoudert added a commit to dcoudert/sage that referenced this issue Feb 15, 2023
dcoudert added a commit to dcoudert/sage that referenced this issue Feb 15, 2023
dcoudert added a commit to dcoudert/sage that referenced this issue Feb 15, 2023
@dcoudert
Copy link
Contributor

Ready for review.

@mkoeppe
Copy link
Contributor

mkoeppe commented Feb 16, 2023

Removed branch from issue description; replaced by PR #35146.

@mkoeppe
Copy link
Contributor

mkoeppe commented Feb 16, 2023

Removed the status label – there's no need to have it on both PR and Issue.

vbraun pushed a commit that referenced this issue Mar 26, 2023
…labels

    
Fixes #33255
    
URL: #35146
Reported by: David Coudert
Reviewer(s): roed314
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants