From 67c6edb39e072b963fe713b532d837299584e50c Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Wed, 5 Sep 2018 09:35:07 +0100
Subject: [PATCH 01/16] MSC 1640 Proposal: Change Event IDs to Hashes

---
 proposals/1640-event-id-as-hashes.md | 50 ++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)
 create mode 100644 proposals/1640-event-id-as-hashes.md

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
new file mode 100644
index 00000000000..80bb8380fe8
--- /dev/null
+++ b/proposals/1640-event-id-as-hashes.md
@@ -0,0 +1,50 @@
+# Changing Event IDs to be Hashes
+
+## Motivation
+
+Having event IDs separate from the hashes leads to issues when a server
+receives multiple events with the same event ID but different hashes.
+While APIs could be changed to better support dealing with this
+situation, it is easier and nicer to simply drop the idea of a separate
+event ID entirely.
+
+## Identifier Format
+
+Currently hashes in JSON include the hash name, allowing servers to
+choose which hash functions to use. The idea here was to allow a gradual
+change between hash functions without the need to globally coordinate
+shifting from one hash function to another.
+
+However now that room versions exist, changing hash functions can be
+achieved by bumping the room version. Using this method would allow
+using a simple string as the event ID rather than a full structure,
+significantly easing their usage.
+
+One side effect of this would be that there would be no indication about
+which hash function was actually used, and it would need to be inferred
+from the room version. To aid debuggability it may be worth encoding the
+hash function into the ID format.
+
+## Protocol Changes
+
+The `auth_events` and `prev_events` fields on an event need to be
+changed from a list of tuples to a list of strings, i.e. remove the old
+event ID and simply have the list of hashes.
+
+The auth rules also need to change:
+
+- The event no longer needs to be signed by the domain of the event ID
+  (but still needs to be signed by the sender’s domain)
+
+- In redactions we currently allow them if the domain of the redaction
+  event ID matches the domain of the event ID its redacting. This
+  allows self redaction for servers, but would no longer be possible
+  and there isn’t an obvious way round it.
+
+## Open Questions
+
+1. Format of new ID, specifically whether it should encode the hash
+   function used to aid debuggability.
+
+2. How to change the auth rules to keep allowing self redactions.
+

From 4c595fd0d1d24acafaaeafd9a4112ee5901927fe Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Tue, 2 Oct 2018 14:35:46 +0100
Subject: [PATCH 02/16] Update with conclusions of open questions

---
 proposals/1640-event-id-as-hashes.md | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index 80bb8380fe8..a923ce5f51f 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -25,6 +25,9 @@ which hash function was actually used, and it would need to be inferred
 from the room version. To aid debuggability it may be worth encoding the
 hash function into the ID format.
 
+**Conclusion:** Don't encode the hash function, since the hash will depend on
+the version specific redaction algorithm anyway.
+
 ## Protocol Changes
 
 The `auth_events` and `prev_events` fields on an event need to be
@@ -40,11 +43,6 @@ The auth rules also need to change:
   event ID matches the domain of the event ID its redacting. This
   allows self redaction for servers, but would no longer be possible
   and there isn’t an obvious way round it.
-
-## Open Questions
-
-1. Format of new ID, specifically whether it should encode the hash
-   function used to aid debuggability.
-
-2. How to change the auth rules to keep allowing self redactions.
-
+  The only practical suggestion to this is to accept the redactions and only
+  check if we should redact the target event once we received the target
+  event.

From 9a87a8040f2e23fc55266ab6d722b7333e643ad5 Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Tue, 22 Jan 2019 13:57:09 +0000
Subject: [PATCH 03/16] Fixup paragraph

---
 proposals/1640-event-id-as-hashes.md | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index a923ce5f51f..5d35883b89d 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -39,10 +39,9 @@ The auth rules also need to change:
 - The event no longer needs to be signed by the domain of the event ID
   (but still needs to be signed by the sender’s domain)
 
-- In redactions we currently allow them if the domain of the redaction
-  event ID matches the domain of the event ID its redacting. This
-  allows self redaction for servers, but would no longer be possible
-  and there isn’t an obvious way round it.
-  The only practical suggestion to this is to accept the redactions and only
-  check if we should redact the target event once we received the target
-  event.
+- We currently allow redactions if the domain of the redaction event ID
+  matches the domain of the event ID its redacting. This allows self redaction
+  for servers, but would no longer be possible and there isn’t an obvious way
+  round it. The only practical suggestion to this is to accept the redactions
+  and only check if we should redact the target event once we received the
+  target event.

From aebdf2e93f61166f1922ec0ab5168b3475a592e5 Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Tue, 22 Jan 2019 14:08:30 +0000
Subject: [PATCH 04/16] Fixup and expand

---
 proposals/1640-event-id-as-hashes.md | 59 +++++++++++++++++-----------
 1 file changed, 37 insertions(+), 22 deletions(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index 5d35883b89d..b90f6c34887 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -2,42 +2,57 @@
 
 ## Motivation
 
-Having event IDs separate from the hashes leads to issues when a server
-receives multiple events with the same event ID but different hashes.
-While APIs could be changed to better support dealing with this
-situation, it is easier and nicer to simply drop the idea of a separate
-event ID entirely.
+Having event IDs separate from the hashes leads to issues when a server receives
+multiple events with the same event ID but different reference hashes. While
+APIs could be changed to better support dealing with this situation, it is
+easier and nicer to simply drop the idea of a separate event ID entirely, and
+instead use the reference hash of an event as its ID.
 
 ## Identifier Format
 
-Currently hashes in JSON include the hash name, allowing servers to
-choose which hash functions to use. The idea here was to allow a gradual
-change between hash functions without the need to globally coordinate
-shifting from one hash function to another.
+Currently hashes in our event format include the hash name, allowing servers to
+choose which hash functions to use. The idea here was to allow a gradual change
+between hash functions without the need to globally coordinate shifting from one
+hash function to another.
 
-However now that room versions exist, changing hash functions can be
-achieved by bumping the room version. Using this method would allow
-using a simple string as the event ID rather than a full structure,
-significantly easing their usage.
+However now that room versions exist, changing hash functions can be achieved by
+bumping the room version. Using this method would allow using a simple string as
+the event ID rather than a full structure, significantly easing their usage.
 
-One side effect of this would be that there would be no indication about
-which hash function was actually used, and it would need to be inferred
-from the room version. To aid debuggability it may be worth encoding the
-hash function into the ID format.
+One side effect of this would be that there would be no indication about which
+hash function was actually used, and it would need to be inferred from the room
+version. To aid debuggability it may be worth encoding the hash function into
+the ID format.
 
 **Conclusion:** Don't encode the hash function, since the hash will depend on
 the version specific redaction algorithm anyway.
 
+The proposal is therefore that the event IDs are a base 64 encoded `sha256` hash
+prefixed with `$` (to aid distinguishing different types of identifiers). For
+example, an event ID might be: `$CD66HAED5npg6074c6pDtLKalHjVfYb2q4Q3LZgrW6o`.
+
+The hash is calculated in the same way as previous event reference hashes were,
+which is:
+
+1. Redact the event
+2. Remove `signatures` field from the event
+3. Serialize the event to canonical JSON
+4. Compute the hash of the JSON bytes
+
+Event IDs will no longer be included as part of the event, and so must be
+calculated by servers receiving the event.
+
+
 ## Protocol Changes
 
-The `auth_events` and `prev_events` fields on an event need to be
-changed from a list of tuples to a list of strings, i.e. remove the old
-event ID and simply have the list of hashes.
+The `auth_events` and `prev_events` fields on an event need to be changed from a
+list of tuples to a list of strings, i.e. remove the old event ID and simply
+have the list of hashes.
 
 The auth rules also need to change:
 
-- The event no longer needs to be signed by the domain of the event ID
-  (but still needs to be signed by the sender’s domain)
+- The event no longer needs to be signed by the domain of the event ID (but
+  still needs to be signed by the sender’s domain)
 
 - We currently allow redactions if the domain of the redaction event ID
   matches the domain of the event ID its redacting. This allows self redaction

From 5e75c29ae0a9c6ae273d10ea39a35f31269287e9 Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Tue, 22 Jan 2019 14:50:50 +0000
Subject: [PATCH 05/16] Spell out changes to event format and APIs

---
 proposals/1640-event-id-as-hashes.md | 45 ++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index b90f6c34887..23f4d1c6e6f 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -42,6 +42,51 @@ which is:
 Event IDs will no longer be included as part of the event, and so must be
 calculated by servers receiving the event.
 
+### Changes in APIs
+
+All APIs that accept event IDs must accept event IDs of the new format
+
+
+### Changes to Event Formats
+
+As well as changing the format of event IDs, we also change the format of the
+`auth_events` and `prev_events` keys to simply be lists of event IDs (rather
+than being lists of tuples).
+
+A full event would therefore look something like (note that this is just an
+illustrative example, and that the hashes are not correct):
+
+```json
+{
+    "auth_events": [
+        "$5hdALbO+xIhzcLTxCkspx5uqry9wO8322h/OI9ApnHE",
+        "$Ga0DBIICBsWIZbN292ATv8fTHIGGimwjb++w+zcHLRo",
+        "$zc4ip/DpPI9FZVLM1wN9RLqN19vuVBURmIqAohZ1HXg",
+    ],
+    "content": {
+        "body": "Here is the message content",
+        "msgtype": "m.message"
+    },
+    "depth": 6,
+    "hashes": {
+        "sha256": "M6/LmcMMJKc1AZnNHsuzmf0PfwladVGK2Xbz+sUTN9k"
+    },
+    "origin": "localhost:8800",
+    "origin_server_ts": 1548094046693,
+    "prev_events": [
+        "$MoOzCuB/sacqHAvgBNOLICiGLZqGT4zB16MSFOuiO0s",
+    ],
+    "room_id": "!eBrhCHJWOgqrOizwwW:localhost:8800",
+    "sender": "@anon-20190121_180719-33:localhost:8800",
+    "signatures": {
+        "localhost:8800": {
+            "ed25519:a_iIHH": "N7hwZjvHyH6r811ebZ4wwLzofKhJuIAtrQzaD3NZbf4WQNijXl5Z2BNB047aWIQCS1JyFOQKPVom4et0q9UOAA"
+        }
+    },
+    "type": "m.room.message"
+}
+```
+
 
 ## Protocol Changes
 

From cb51a11225304426ec77195ccd6d4ca0b2e73772 Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Tue, 22 Jan 2019 17:02:16 +0000
Subject: [PATCH 06/16] Expand changes to api section

---
 proposals/1640-event-id-as-hashes.md | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index 23f4d1c6e6f..9bbc1db05cd 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -42,12 +42,8 @@ which is:
 Event IDs will no longer be included as part of the event, and so must be
 calculated by servers receiving the event.
 
-### Changes in APIs
 
-All APIs that accept event IDs must accept event IDs of the new format
-
-
-### Changes to Event Formats
+## Changes to Event Formats
 
 As well as changing the format of event IDs, we also change the format of the
 `auth_events` and `prev_events` keys to simply be lists of event IDs (rather
@@ -87,6 +83,20 @@ illustrative example, and that the hashes are not correct):
 }
 ```
 
+## Changes to existing APIs
+
+All APIs that accept event IDs must accept event IDs of the new format.
+
+For S2S API, whenever a server needs to parse an event they must either already
+no the room version *or* be told. There are separate MSCs to update APIs where
+necessary.
+
+For C2S API, the only change clients will see are that the event IDs have
+changed format, but clients should already be treating event IDs as opaque
+strings. Note that the `auth_events` and `prev_events` fields aren't sent to
+clients, and so the changes proposed above won't effect clients. Servers must
+add the `event_id` when sending the event to clients, however.
+
 
 ## Protocol Changes
 

From 54d9a7e4ec58a44ce497b42da8c339e36b66d922 Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Tue, 22 Jan 2019 17:04:45 +0000
Subject: [PATCH 07/16] Grammar

---
 proposals/1640-event-id-as-hashes.md | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index 9bbc1db05cd..6c9d11e6f95 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -85,17 +85,20 @@ illustrative example, and that the hashes are not correct):
 
 ## Changes to existing APIs
 
-All APIs that accept event IDs must accept event IDs of the new format.
+All APIs that accept event IDs must accept event IDs in the new format.
 
-For S2S API, whenever a server needs to parse an event they must either already
-no the room version *or* be told. There are separate MSCs to update APIs where
+For S2S API, whenever a server needs to parse an event from a request or
+response they must either already know the room version *or* be told the room
+version in the request/response. There are separate MSCs to update APIs where
 necessary.
 
-For C2S API, the only change clients will see are that the event IDs have
-changed format, but clients should already be treating event IDs as opaque
-strings. Note that the `auth_events` and `prev_events` fields aren't sent to
-clients, and so the changes proposed above won't effect clients. Servers must
-add the `event_id` when sending the event to clients, however.
+For C2S API, the only change clients will see is that the event IDs have changed
+format. Clients should already be treating event IDs as opaque strings, so no
+changes should be required. Servers must add the `event_id` when sending the
+event to clients, however.
+
+Note that the `auth_events` and `prev_events` fields aren't sent to clients, and
+so the changes proposed above won't effect clients.
 
 
 ## Protocol Changes

From 25115c4fb109d631091224644f8c681dc9e2062c Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Tue, 22 Jan 2019 17:05:50 +0000
Subject: [PATCH 08/16] Make the sentence make sense

---
 proposals/1640-event-id-as-hashes.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index 6c9d11e6f95..8f27c7802aa 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -46,8 +46,8 @@ calculated by servers receiving the event.
 ## Changes to Event Formats
 
 As well as changing the format of event IDs, we also change the format of the
-`auth_events` and `prev_events` keys to simply be lists of event IDs (rather
-than being lists of tuples).
+`auth_events` and `prev_events` keys in events to simply be lists of event IDs
+(rather than being lists of tuples).
 
 A full event would therefore look something like (note that this is just an
 illustrative example, and that the hashes are not correct):

From 74f496b91f255421c5e009050a263878222bfcf6 Mon Sep 17 00:00:00 2001
From: Hubert Chathi
Date: Wed, 23 Jan 2019 09:47:39 +0000
Subject: [PATCH 09/16] Update proposals/1640-event-id-as-hashes.md

Co-Authored-By: erikjohnston
---
 proposals/1640-event-id-as-hashes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index 8f27c7802aa..09273aeaa96 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -98,7 +98,7 @@ changes should be required. Servers must add the `event_id` when sending the
 event to clients, however.
 
 Note that the `auth_events` and `prev_events` fields aren't sent to clients, and
-so the changes proposed above won't effect clients.
+so the changes proposed above won't affect clients.
 
 
 ## Protocol Changes

From 41b1855de60728334ea237646d03bd4dc9d665d6 Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Wed, 23 Jan 2019 10:00:54 +0000
Subject: [PATCH 10/16] Fixup wording of redactions

---
 proposals/1640-event-id-as-hashes.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index 09273aeaa96..c4738827683 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -113,8 +113,9 @@ The auth rules also need to change:
   still needs to be signed by the sender’s domain)
 
 - We currently allow redactions if the domain of the redaction event ID
-  matches the domain of the event ID its redacting. This allows self redaction
-  for servers, but would no longer be possible and there isn’t an obvious way
-  round it. The only practical suggestion to this is to accept the redactions
-  and only check if we should redact the target event once we received the
-  target event.
+  matches the domain of the event ID its redacting; which allows self
+  redaction. This check is removed and redaction events always accepted.
+  Instead, the redaction event only takes effect and is sent down to clients
+  if/when the original event is received, and the domain of the events'
+  senders match. (While this is clearly suboptimal, it is the only practical
+  suggestion)

From 7d5a051574d0327b44d32366872a95cf3273d9fe Mon Sep 17 00:00:00 2001
From: Hubert Chathi
Date: Thu, 24 Jan 2019 16:25:05 +0000
Subject: [PATCH 11/16] Update proposals/1640-event-id-as-hashes.md

Co-Authored-By: erikjohnston
---
 proposals/1640-event-id-as-hashes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index c4738827683..e52b18eacc7 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -114,7 +114,7 @@ The auth rules also need to change:
 
 - We currently allow redactions if the domain of the redaction event ID
   matches the domain of the event ID its redacting; which allows self
-  redaction. This check is removed and redaction events always accepted.
+  redaction. This check is removed and redaction events are always accepted.
   Instead, the redaction event only takes effect and is sent down to clients
   if/when the original event is received, and the domain of the events'
   senders match. (While this is clearly suboptimal, it is the only practical

From 583179a3a86fd6ed3eac0df25dbd517ed0900595 Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Thu, 24 Jan 2019 16:27:59 +0000
Subject: [PATCH 12/16] Clarify we're using unpadded base64

---
 proposals/1640-event-id-as-hashes.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1640-event-id-as-hashes.md
index e52b18eacc7..797c0cd5a10 100644
--- a/proposals/1640-event-id-as-hashes.md
+++ b/proposals/1640-event-id-as-hashes.md
@@ -27,7 +27,9 @@ the ID format.
 **Conclusion:** Don't encode the hash function, since the hash will depend on
 the version specific redaction algorithm anyway.
 
-The proposal is therefore that the event IDs are a base 64 encoded `sha256` hash
+The proposal is therefore that the event IDs are a sha256 hash, encoded using
+[unpadded
+Base64](https://matrix.org/docs/spec/appendices.html#unpadded-base64), and
 prefixed with `$` (to aid distinguishing different types of identifiers). For
 example, an event ID might be: `$CD66HAED5npg6074c6pDtLKalHjVfYb2q4Q3LZgrW6o`.
 
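
Reviewer note on patches 04 and 12: the four hashing steps plus the unpadded Base64 encoding can be sketched roughly as below. This is a minimal illustration, not the reference implementation. `compute_event_id` and `canonical_json` are hypothetical names, the `redact` argument stands in for the room version's redaction algorithm (not defined here), and `canonical_json` only approximates Matrix canonical JSON with sorted keys and compact separators. The standard Base64 alphabet is assumed, since the example IDs in this series contain `+` and `/`.

```python
import base64
import hashlib
import json


def canonical_json(value):
    """Approximate canonical JSON: sorted keys, no extra whitespace, UTF-8."""
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def compute_event_id(event, redact):
    """Return the `$`-prefixed ID for an event dict.

    `redact` is a caller-supplied (hypothetical) function implementing the
    room version's redaction algorithm.
    """
    stripped = dict(redact(event))             # 1. redact the event
    stripped.pop("signatures", None)           # 2. remove `signatures`
    payload = canonical_json(stripped)         # 3. serialize to canonical JSON
    digest = hashlib.sha256(payload).digest()  # 4. hash the JSON bytes
    # Unpadded Base64: encode, then drop the trailing "=" padding.
    encoded = base64.b64encode(digest).decode("ascii").rstrip("=")
    return "$" + encoded
```

Because `signatures` is dropped before hashing, two copies of an event that differ only in their signatures should map to the same ID, which is the property the Motivation section relies on.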

From f9a85260284bcb366f5e2efa454b3ef4a3d48a7a Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Thu, 24 Jan 2019 17:16:24 +0000
Subject: [PATCH 13/16] Move MSC number

---
 .../{1640-event-id-as-hashes.md => 1659-event-id-as-hashes.md} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename proposals/{1640-event-id-as-hashes.md => 1659-event-id-as-hashes.md} (100%)

diff --git a/proposals/1640-event-id-as-hashes.md b/proposals/1659-event-id-as-hashes.md
similarity index 100%
rename from proposals/1640-event-id-as-hashes.md
rename to proposals/1659-event-id-as-hashes.md

From ad96ae0768ede7ea688af720e37db4e8336f9f2e Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Thu, 24 Jan 2019 18:45:49 +0000
Subject: [PATCH 14/16] Explicitly state this will be a new v3 room version

---
 proposals/1659-event-id-as-hashes.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/proposals/1659-event-id-as-hashes.md b/proposals/1659-event-id-as-hashes.md
index 797c0cd5a10..e4c1628f1b2 100644
--- a/proposals/1659-event-id-as-hashes.md
+++ b/proposals/1659-event-id-as-hashes.md
@@ -121,3 +121,9 @@ The auth rules also need to change:
   if/when the original event is received, and the domain of the events'
   senders match. (While this is clearly suboptimal, it is the only practical
   suggestion)
+
+
+## Room Version
+
+There will be a new room version v3 that is the same as v2 except uses the new
+event format proposed above.

From af34773dadca9140390d2bbce705261dcf90ed1b Mon Sep 17 00:00:00 2001
From: Neil Johnson
Date: Tue, 29 Jan 2019 17:33:03 +0000
Subject: [PATCH 15/16] Update 1659-event-id-as-hashes.md

---
 proposals/1659-event-id-as-hashes.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/proposals/1659-event-id-as-hashes.md b/proposals/1659-event-id-as-hashes.md
index e4c1628f1b2..2be86f0585c 100644
--- a/proposals/1659-event-id-as-hashes.md
+++ b/proposals/1659-event-id-as-hashes.md
@@ -126,4 +126,5 @@ The auth rules also need to change:
 ## Room Version
 
 There will be a new room version v3 that is the same as v2 except uses the new
-event format proposed above.
+event format proposed above. v3 will be marked as 'stable' as defined in [MSC1804](https://github.com/matrix-org/matrix-doc/blob/travis/msc/room-version-client-advertising/proposals/1804-advertising-capable-room-versions.md)
+

From 66dc7714f419ca78efef9ff78b5b04457b6d9b40 Mon Sep 17 00:00:00 2001
From: Kitsune Ral
Date: Wed, 30 Jan 2019 16:58:50 +0000
Subject: [PATCH 16/16] Update proposals/1659-event-id-as-hashes.md

Co-Authored-By: erikjohnston
---
 proposals/1659-event-id-as-hashes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/1659-event-id-as-hashes.md b/proposals/1659-event-id-as-hashes.md
index 2be86f0585c..1187d95e693 100644
--- a/proposals/1659-event-id-as-hashes.md
+++ b/proposals/1659-event-id-as-hashes.md
@@ -115,7 +115,7 @@ The auth rules also need to change:
   still needs to be signed by the sender’s domain)
 
 - We currently allow redactions if the domain of the redaction event ID
-  matches the domain of the event ID it is redacting; which allows self
+  matches the domain of the event ID it is redacting; which allows self
   redaction. This check is removed and redaction events are always accepted.
   Instead, the redaction event only takes effect and is sent down to clients
   if/when the original event is received, and the domain of the events'
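
Reviewer note on the final redaction wording: the deferred check described in the auth-rule bullet could look something like the sketch below. It assumes events are plain dicts with a `sender` user ID, and `server_name` and `redaction_takes_effect` are hypothetical helper names. It covers only the sender-domain comparison stated in the bullet; any other authorization checks a server performs are out of scope here.

```python
def server_name(user_id):
    """Return the domain part of a user ID such as `@alice:example.org`."""
    # Split on the first colon only, so `localhost:8800` keeps its port.
    return user_id.split(":", 1)[1]


def redaction_takes_effect(redaction, target):
    """Decide whether an already-accepted redaction should be applied.

    `target` is the event being redacted, or None if it has not arrived yet.
    """
    if target is None:
        # The redaction was accepted, but there is nothing to check it
        # against yet; the caller should retry once the target is received.
        return False
    return server_name(redaction["sender"]) == server_name(target["sender"])
```

A server following this approach would store the redaction on receipt and call the check again when the target event shows up, only then redacting it and sending the redaction down to clients.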