Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Base58 instead of Base32 and Remove SCID shortening #79

Merged
merged 6 commits into from
Jul 26, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions spec/definitions.md
Original file line number Diff line number Diff line change
Expand Up @@ -78,12 +78,12 @@ when forced to change (such as when an organization is acquired by another,
resulting in a change of domain names) and when changing DID hosting service
providers.

[[def: base32_lower]]
[[def: base58]]

~ Applies [[spec:rfc4648]] to convert
data to a `base32` encoding, and then lower cases the result. Data encoded as
base32 consists of a string of characters containing only the letters A-Z and
digits 2-7.
~ Applies [[spec:draft-msporny-base58-03]] to convert
data to a `base58` encoding. Data encoded as
base58 consists of a string of characters containing only the letters A-Z, a-z and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

base58 consists of 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz This purposely leaves off some characters that might be misread by humans.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest we don’t repeat the data here — just reference the spec.

digits 1-9.

[[def: Linked-VP, Linked Verifiable Presentation]]

Expand Down
3 changes: 2 additions & 1 deletion spec/header.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ Trust DID Web - `did:tdw`

**Specification Status:** DRAFT

**Specification Version:** 0.2 (see [Changelog](#did-tdw-version-changelog))
**Specification Version:** 0.3 (see [Changelog](#did-tdw-version-changelog))

**Latest Draft:**
[https://github.com/bcgov/trustdidweb](https://github.com/bcgov/trustdidweb)
Expand All @@ -13,6 +13,7 @@ Trust DID Web - `did:tdw`
~ [John Jordan, BC Gov](https://github.com/jljordan42)
~ [Andrew Whitehead](https://github.com/andrewwhitehead)
~ [Brian Richter](https://github.com/brianorwhatever)
~ [Michel Sahli](https://github.com/bj-ms)

**Participate:**
~ [GitHub repo](https://github.com/bcgov/trustdidweb)
Expand Down
30 changes: 14 additions & 16 deletions spec/specification.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,8 +47,8 @@ here the full ABNF of those elements from that RFC would inevitably be wrong.
tdw-did = "did:tdw:" scid ":" domain-segment 1+( "." domain-segment ) *( ":" path-segment )
domain-segment = ; A part of a domain name as defined in RFC3986, such as "example" and "com" in "example.com"
path-segment= ; A part of a URL path as defined in RFC3986, such as "path", "to", "folder" in "path/to/folder"
scid = 28+( lower-base32 )
lower-base32 = [2-7a-z]
scid = ( base58 )
base58 = [1-9A-Za-z]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could be a bit more specific with [1-9A-HJ-NP-Za-km-z] !

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or leave it off and leave it to the spec.

```

The ABNF for a `did:tdw` is almost identical to that of [[ref: did:web]], with changes only to
Expand Down Expand Up @@ -620,7 +620,7 @@ first [[ref: DID log entry]] and is a portion of the hash of the DID's inception
To generate the required [[ref: SCID]] for a `did:tdw` DID, the DID Controller
**MUST** execute the following function:

`left(base32_lower(hash(JCS(preliminary log entry with placeholders))), <length>)`
`base58(hash(JCS(preliminary log entry with placeholders)))`
Copy link
Collaborator

@swcurran swcurran Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we want the flexibility/future-proofing to support different hash algorithms with backwards compatibility, I think we should use multihash (at least) and perhaps multibase here, so that a verifier can detect the hash (and encoding) algorithm(s) used. The spec. will dictate the allowed hashes to generate, but a verifier should be flexible in receiving hashes.

Do we need multibase as well as multihash?

If we use multihash, the length of the (untruncated) SCID (and entry hashes) will be 46 characters -- reference. So the SCID is up 18 from the previous 28, and we eliminate messing with the truncation. I think that is fine -- especially once we get into PQ public keys :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a minor security benefit to adding the hash algorithm, as it then could not be swapped in a replacement log file. That would still require producing a hash collision though, which is very unlikely.

Another option would be to add a version byte to the hash, fixing the generation algorithm version that was used. We're not currently using multihash so it would be a new dependency.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m less worried about the security as the interop. Defense coding and future-proofing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use multibase and multihash and remove the hash parameter. Write normative statements saying which hash and base algs are allowed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m good with that. We definitely want to constrain what a DID Controller is allowed to use (hence the normative statements) and allow a resolver to discover the DID Controller's choices. I do think we keep the spec version and use that as an indicator to the resolver. The spec. version dictates the options.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So SCID calculation becomes:

SCID = multibase(<base>,multihash(<hash>, JCS(input with placeholders)))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep version is fine I just want to remove hash from the parameters as that does the same job as multihash is.. other than the minor security benefit andrew mentions above which I don't see as a concern

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see multihash, but I'd rather not use multibase. We don't have a need to support multiple encodings and it's more work for verifiers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I'm fine with that. Also fine with not making the multihash change and just changing to base58btc in this PR

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d like to see multi-hash. I can see a time where we add, for example, sha3-256 (as we briefly did), while still allowing sha-256 (since it is needed for long lasting DIDs). Multihash would be important.

Multibase less so. There are not going to be massive breakthroughs in efficiency or security related to encoding schemes :-).


Where:

Expand All @@ -643,10 +643,8 @@ Where:
3. `hash` is the hash algorithm enumerated in the `hash` item in the [[ref:
parameters]], or if none is specified, the default hash algorithm defined in
this specification. Its output is the hash of its input.
4. `base32_lower` is an implementation of the [[ref: base32_lower]] function.
Its output is the lower case of the Base32 encoded string of its input.
5. `left` extracts the `<length>` number of characters from the string input.
1. `<length>` **MUST** be at least 28 characters.
4. `base58` is an implementation of the [[ref: base58]] function.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use multihash, I think we wind up using base58btc? Or is it still base58?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The (expired) draft that is referenced is the Bitcoin version of base58, aka base58-btc.

Copy link
Collaborator

@swcurran swcurran Jul 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there another spec we should be referencing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side question @brianorwhatever — What is “Controller Document”? There is no description of why it is and what it is for. AFAICS, it is much like DIDs, and I heard Manu said it started with the DID Spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's essentially where the normative statements for multibase/multihash had to be moved to in the VCWG as the group couldn't get consensus to have them in VCDM and there is a shared requirement between DI/JOSE.. I think basically a DID Document is a controller document

Its output is the Base58 encoded string of its input.

##### Verify SCID

Expand Down Expand Up @@ -680,7 +678,7 @@ previous log entry.
##### Generate Entry Hash

To generate the required hash for a `did:tdw` DID entry, the DID Controller
**MUST** execute the process `base32_lower(hash(JCS(entry)))` given a
**MUST** execute the process `base58(hash(JCS(entry)))` given a
preliminary log entry as the string `entry`, where:

1. `JCS` is an implementation of the [[ref: JSON Canonicalization Scheme]]
Expand All @@ -689,8 +687,8 @@ preliminary log entry as the string `entry`, where:
2. `hash` is the hash algorithm enumerated in the `hash` item in the [[ref:
parameters]], or if none is specified, the default hash algorithm defined in
this specification. Its output is the hash of its input.
3. `base32_lower` is an implementation of the [[ref: base32_lower]] function.
Its output is the lower case of the Base32 encoded string of its input.
3. `base58` is an implementation of the [[ref: base58]] function.
Its output is the Base58 encoded string of its input.

The following is an example of a preliminary log entry that is processed to
produce an entry hash. As this is a first entry in a DID Log, the input
Expand Down Expand Up @@ -720,16 +718,16 @@ Resolver **MUST** execute the following process:
3. Set the first item of the entry to the `versionId` (first item) of the
previous log entry. If this is the first entry in the log, set the value to
the `1-<scid>` where `<scid>` if the SCID of the DID.
4. Calculate the hash string as `base32_lower(hash(JCS(entry)))`, where:
4. Calculate the hash string as `base58(hash(JCS(entry)))`, where:
1. `entry` is the data from the previous step.
2. `JCS` is an implementation of the [[ref: JSON Canonicalization Scheme]]
([[spec:rfc8785]]). Its output is a canonicalized representation of its
input.
3. `hash` is the hash algorithm enumerated in the `hash` item in the [[ref:
parameters]], or if none is specified, the default hash algorithm defined in
this specification. Its output is the hash of its input.
4. `base32_lower` as defined by the [[ref: base32_lower]] function. Its
output is the lower case of the Base32 encoded string of the input hash.
4. `base58` as defined by the [[ref: base58]] function. Its
output is the Base58 encoded string of the input hash.
5. Verify that the calculated value matches the extracted value from Step 1. If
not, terminate the resolution process with an error.

Expand Down Expand Up @@ -832,14 +830,14 @@ authorization key.
1. Generate a new key pair.
2. Generate a [[ref: multikeys]] representation of the public key of the new key
pair.
3. Calculate the hash string as `base32_lower(hash(multikey))`, where:
3. Calculate the hash string as `base58(hash(multikey))`, where:
1. `multikey` is the [[ref: multikey]] representation of the public key.
2. ``hash` is the most recent hash algorithm enumerated in the `hash` item in
the [[ref: parameters]], or if none is specified, the default hash
algorithm defined in this specification. Its output is the hash of its
input.
3. `base32_lower` as defined by the [[ref: base32_lower]] function. Its
output is the lower case of the Base32 encoded string of the input hash.
3. `base58` as defined by the [[ref: base58]] function. Its
output is the Base58 encoded string of the input hash.
4. Insert the calculated hash into the `nextKeyHashes` array being built up within
the [[ref: parameters]] item.
5. The generated key pair **SHOULD** be safely stored so that it can be used in
Expand Down
3 changes: 3 additions & 0 deletions spec/version.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,9 @@

The following lists the substantive changes in each version of the specification.

- Version 0.3
- Change base32 encoding with base58, as it offers a better expansion rate.
- Remove the step to extract part of the base58 result during the generation of the [[ref: SCID]].
- Version 0.2
- Changes the location of the [[ref: SCID]] in the DID to always be the first
component after the DID Method prefix -- `did:tdw:<scid>:...`.
Expand Down