-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Base58 instead of Base32 and Remove SCID shortening #79
Conversation
@bj-ms -- thanks! Please add DCO sign-off to the PR. See DCO - Developer Certificate of Origin - https://github.com/apps/dco. Instructions for fixing the commit are on the "Details" beside the failed check. |
bd9f54c
to
b86c422
Compare
…ring the scid generation; added changelog; added author Signed-off-by: Michel Sahli <michel.sahli@bj.admin.ch>
spec/specification.md
Outdated
@@ -620,7 +620,7 @@ first [[ref: DID log entry]] and is a portion of the hash of the DID's inception | |||
To generate the required [[ref: SCID]] for a `did:tdw` DID, the DID Controller | |||
**MUST** execute the following function: | |||
|
|||
`left(base32_lower(hash(JCS(preliminary log entry with placeholders))), <length>)` | |||
`base58(hash(JCS(preliminary log entry with placeholders)))` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we want the flexibility/future-proofing to support different hash algorithms with backwards compatibility, I think we should use multihash (at least) and perhaps multibase here, so that a verifier can detect the hash (and encoding) algorithm(s) used. The spec. will dictate the allowed hashes to generate, but a verifier should be flexible in receiving hashes.
Do we need multibase as well as multihash?
If we use multihash, the length of the (untruncated) SCID (and entry hashes) will be 46 characters -- reference. So the SCID is up 18 from the previous 28, and we eliminate messing with the truncation. I think that is fine -- especially once we get into PQ public keys :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a minor security benefit to adding the hash algorithm, as it then could not be swapped in a replacement log file. That would still require producing a hash collision though, which is very unlikely.
Another option would be to add a version byte to the hash, fixing the generation algorithm version that was used. We're not currently using multihash so it would be a new dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m less worried about the security as the interop. Defense coding and future-proofing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use multibase and multihash and remove the hash
parameter. Write normative statements saying which hash and base algs are allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m good with that. We definitely want to constrain what a DID Controller is allowed to use (hence the normative statements) and allow a resolver to discover the DID Controller's choices. I do think we keep the spec version and use that as an indicator to the resolver. The spec. version dictates the options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So SCID calculation becomes:
SCID = multibase(<base>,multihash(<hash>, JCS(input with placeholders)))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep version is fine I just want to remove hash
from the parameters as that does the same job as multihash is.. other than the minor security benefit andrew mentions above which I don't see as a concern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see multihash, but I'd rather not use multibase. We don't have a need to support multiple encodings and it's more work for verifiers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I'm fine with that. Also fine with not making the multihash change and just changing to base58btc in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d like to see multi-hash. I can see a time where we add, for example, sha3-256 (as we briefly did), while still allowing sha-256 (since it is needed for long lasting DIDs). Multihash would be important.
Multibase less so. There are not going to be massive breakthroughs in efficiency or security related to encoding schemes :-).
spec/specification.md
Outdated
Its output is the lower case of the Base32 encoded string of its input. | ||
5. `left` extracts the `<length>` number of characters from the string input. | ||
1. `<length>` **MUST** be at least 28 characters. | ||
4. `base58` is an implementation of the [[ref: base58]] function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use multihash, I think we wind up using base58btc? Or is it still base58?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The (expired) draft that is referenced is the Bitcoin version of base58, aka base58-btc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there another spec we should be referencing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should reference the VC controller document.
https://www.w3.org/TR/controller-document/#multibase-0
https://www.w3.org/TR/controller-document/#multihash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side question @brianorwhatever — What is “Controller Document”? There is no description of why it is and what it is for. AFAICS, it is much like DIDs, and I heard Manu said it started with the DID Spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's essentially where the normative statements for multibase/multihash had to be moved to in the VCWG as the group couldn't get consensus to have them in VCDM and there is a shared requirement between DI/JOSE.. I think basically a DID Document is a controller document
spec/specification.md
Outdated
scid = 28+( lower-base32 ) | ||
lower-base32 = [2-7a-z] | ||
scid = ( base58 ) | ||
base58 = [1-9A-Za-z] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could be a bit more specific with [1-9A-HJ-NP-Za-km-z]
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or leave it off and leave it to the spec.
@bj-ms — are you good with updating your PR to reflect the feedback? I think yours is close, but there will be a few more places to address. Happy to do that as a follow up PR. Let me know how you want to proceed. |
I'm good with that, I will make the following modifications:
|
…n identifier and their sizes. Signed-off-by: Michel Sahli <michel.sahli@bj.admin.ch>
Signed-off-by: Michel Sahli <michel.sahli@bj.admin.ch>
Signed-off-by: Michel Sahli <michel.sahli@bj.admin.ch>
Signed-off-by: Michel Sahli <michel.sahli@bj.admin.ch>
spec/definitions.md
Outdated
digits 2-7. | ||
~ Applies [[spec:draft-msporny-base58-03]] to convert | ||
data to a `base58` encoding. Data encoded as | ||
base58 consists of a string of characters containing only the letters A-Z, a-z and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
base58 consists of 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
This purposely leaves off some characters that might be misread by humans.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest we don’t repeat the data here — just reference the spec.
Signed-off-by: Michel Sahli <michel.sahli@bj.admin.ch>
I am happy with the changes to base58 and skipping the trimming. Just two additional inputs.
|
The first issue is not a concern, as the input will always be a hash, which will be (for now) 32 bytes, perhaps longer if/when other hash algorithms are needed. Never more than 100 characters. I’m not sure about a double hash as I don’t know how to calculate collision probability and how that translates into “time to calculate” in typical conditions (including Quantum conditions). AFAIK, the attacker must generate a new log entry with a key they control and take over the web server publishing the file. The hashing controls the probability of the first part of that. |
There are a couple of very minor clarifications I'd like to make to this (most of which are a bit tangential), but not worth holding up the merge. I think we have all agreed on the content. Let's get this merged. |
The double hashing could be worth exploring if it increases the preimage resistance, although the difference would be pretty minor. Bitcoin uses it but without any real explanation of the purpose, IIRC. It would be easy to support without adding additional hash functions, but I'm not sure how it would compare to sha512/256 in strength. |
As part of the discussion and security concerns in https://github.com/bcgov/trustdidweb/issues/75, I propose to change Base32 to Base 58 and remove the shortening of the SCID during it's generation.
I also changed the version, changelog and added myself as an author (feel free to comment on a possible rollback on that part).