-
Notifications
You must be signed in to change notification settings - Fork 567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BNode value not random enough #185
Comments
Uhm, why not just use UUIDs? |
Sound suggestion, IMO. I've covered this ground in a comment to issue #195 because using uuid4 BNode ids has the potential of supporting some of the set-theoretic graph operations. uuid/4 was introduced in Python 2.5, so it has to be mimicked for 2.4 I set up a uuid solution (leaving the existing BNode private API undisturbed) and exercised it in a couple of tests, setting them to fail if the issue was solved: https://github.com/RDFLib/rdflib/blob/master/test/test_issue200.py#L90 (issue 200 was the old gcode issue #) So far, I haven't merged Bertrand's pull request because:
I do think the argument for a switch to uuid4 is fairly compelling. |
Looking at this again I'd vote for UUIDs - do we have a source for a 2.4 compatible uuid faker? |
There's a putative implementation here: https://github.com/RDFLib/rdflib/blob/master/test/test_issue200.py#L17 where I created a solution for this issue using the tests as proof-of-concept. All seems to be working according to the CI build, I think it's now just a matter of propagating the implementation back into the RDFLib source. |
I have just closed my pull request... my proposed fix was bcroq/rdflib@6e8aac9 |
Thank you for the proposed fix, Bertrand. It's great to see people pitching in. BNode ids are declared to conform to NCName specs. Unadorned uuid4() won't the cut the mustard as a valid NCName due to its regrettable habit of randomly starting with an integer:
My naive trial of a "urn:uuid:" prefix was also not valid:
Comments in BNode.new assert:
However, the NCName reference given in https://github.com/RDFLib/rdflib/blob/master/rdflib/namespace.py#L364 (as the implementation spec for RDFLib's is_ncname() test) no longer seems to describe the same specification as used in RDFLib. Is this worth looking at in more detail? In the end, I used a deeply unimaginative "_" (underbar) prefix. |
Fixed in commit #03c77ea0af |
bcq.nng@gmail.com, 2011-12-23T16:13:30.000Z
When rdflib is used in a application that fork the current Python process, for exemple when using flup.server.*_fork, BNode's value generation in these processes:
This means that the parent and child processes will all get the same BNode values.
I fixed this issue using
but there might be a cleaner solution.
The text was updated successfully, but these errors were encountered: