MSC1703: encrypting recovery keys for online megolm backups #1703
# Proposal for storing an encrypted recovery key on the server to aid recovery of megolm key backups

## Problem

[MSC1219](https://github.com/matrix-org/matrix-doc/issues/1219) proposes an API
for optionally storing encrypted megolm keys on your homeserver, so if a user
loses all their devices, they can still recover their history. The megolm keys
are encrypted to a Curve25519 public key whose private key only the end user
has.

However, there are usability concerns about users having to store their
Curve25519 recovery private key in a secure manner. Casual users are likely to
be scared away by having to file away a relatively long (e.g. 10 word)
generated recovery key.

We would like to give the user the option to access their key backup using a
passphrase in addition to their recovery key. We can take inspiration from
Apple’s [FileVault 2](https://hal.inria.fr/hal-01460615/document), where Apple
stores encrypted copies of your FileVault AES key on your hard disk, encrypted
by your UNIX account password, or from keeping a passphrased SSH private key on
a server for convenience.

## Proposed solution

Three solutions are given here (two of which are viable, one included for
completeness), varying in the implications of the user changing their
passphrase.

Option 1 has been chosen, on the basis that we do not require the user to
be able to change their passphrase without also changing their recovery key.

### Recovery Key

In all options below, the process for generating a recovery key from a byte
string, b, is as follows:
* Prepend the two bytes 0x8B, 0x01 to the byte string b
* Compute a parity byte by XORing all bytes of the resulting string (i.e. prefix
  + `byte string`)
* Append the parity byte to the prefix + b
* base58 encode the resulting byte string with alphabet
  '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'.
* Format the resulting ASCII string into groups of 4 characters separated by
  spaces.

Review comment: magic bytes are magical. Any reason for choosing these values?

Review comment: this explodes into a bullet list in the rendered version. also why is [...]

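The encoding steps listed above can be sketched in Python as follows. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, and the handling of leading zero bytes in `base58_encode` follows the common Bitcoin-style convention, which the proposal does not specify (it never triggers here because of the 0x8B prefix).

```python
from functools import reduce

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
PREFIX = bytes([0x8B, 0x01])


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58 by treating them as a big-endian integer."""
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        n, rem = divmod(n, 58)
        digits.append(B58_ALPHABET[rem])
    # Assumption: each leading zero byte becomes a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + "".join(reversed(digits))


def base58_decode(text: str) -> bytes:
    """Inverse of base58_encode."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")


def xor_all(data: bytes) -> int:
    """XOR every byte of data together."""
    return reduce(lambda a, b: a ^ b, data, 0)


def encode_recovery_key(b: bytes) -> str:
    payload = PREFIX + b                    # prepend 0x8B, 0x01
    payload += bytes([xor_all(payload)])    # append the parity byte
    encoded = base58_encode(payload)
    # Groups of 4 characters separated by spaces.
    return " ".join(encoded[i:i + 4] for i in range(0, len(encoded), 4))


def decode_recovery_key(text: str) -> bytes:
    raw = base58_decode("".join(text.split()))
    if raw[:2] != PREFIX:
        raise ValueError("unexpected prefix")
    # XOR over prefix + b + parity byte is zero when the key is intact.
    if xor_all(raw) != 0:
        raise ValueError("parity check failed")
    return raw[2:-1]
```

For a 32-byte input the encoded payload is 35 bytes, which base58-encodes to 48 characters, i.e. 12 groups of 4.
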
### Option 1

The user provides a passphrase, P. The client generates the backup encryption
private key, K<sup>-1</sup>, by running PBKDF on this passphrase. The PBKDF
parameters (salt and iteration count) are stored in the auth_data of the key
backup under 'private_key_salt' and 'private_key_iterations' keys, respectively:

```json
{
    [...]
    "private_key_salt": "MmMsAlty",
    "private_key_iterations": 100000
}
```

Review comment: why are these called K<sup>-1</sup> and K rather than, say, Kpriv and Kpub? (eg, https://git.matrix.org/git/olm/about/docs/olm.rst uses Kprivate and Kpublic)

The backup public encryption key, K, is determined by running the curve25519
function on K<sup>-1</sup> with basepoint {9}. The recovery key is then
generated by encoding K<sup>-1</sup> as above.

Review comment: It took me a long time to figure out wtf this was talking about, and in particular what the [...] Honestly I think the description of how Curve25519 derives the public key from the private key is out of place here and only makes me wonder if we are doing something which is different from the normal use of Curve25519. Can we just say "K<sup>-1</sup> is used as a Curve25519 private key" and leave the derivation of the public key implied?

Review comment: I'd find it helpful if the recovery key got an identifier, like "R" or sth

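As a rough sketch of the Option 1 derivation, the snippet below derives K<sup>-1</sup> from a passphrase and then computes K with the X25519 function on basepoint 9. Several things here are assumptions rather than part of the proposal: the text only says "PBKDF", so PBKDF2-HMAC-SHA-512 with a 32-byte output is a guess; the salt is taken as a UTF-8 string as in the JSON example; and the function names are hypothetical. The ladder follows the algorithm in RFC 7748 but is not constant-time, so a real client would use a vetted library instead.

```python
import hashlib

FIELD_PRIME = 2**255 - 19                 # Curve25519 field prime
A24 = 121665                              # curve constant from RFC 7748
BASEPOINT = (9).to_bytes(32, "little")    # the "{9}" basepoint


def derive_private_key(passphrase: str, salt: str, iterations: int) -> bytes:
    """Derive 32 bytes for K^-1 (assumed: PBKDF2-HMAC-SHA-512)."""
    return hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt.encode("utf-8"),
        iterations, dklen=32,
    )


def x25519(scalar: bytes, u: bytes) -> bytes:
    """X25519 via the RFC 7748 Montgomery ladder (illustrative only)."""
    k = int.from_bytes(scalar, "little")
    k &= (1 << 255) - 8       # clamp: clear the low 3 bits and bit 255
    k |= 1 << 254             # clamp: set bit 254
    x1 = int.from_bytes(u, "little") & ((1 << 255) - 1)
    x2, z2, x3, z3 = 1, 0, x1, 1
    swap = 0
    for t in range(254, -1, -1):
        bit = (k >> t) & 1
        if swap ^ bit:
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % FIELD_PRIME, (x2 - z2) % FIELD_PRIME
        aa, bb = a * a % FIELD_PRIME, b * b % FIELD_PRIME
        e = (aa - bb) % FIELD_PRIME
        da = (x3 - z3) * a % FIELD_PRIME
        cb = (x3 + z3) * b % FIELD_PRIME
        x3 = (da + cb) ** 2 % FIELD_PRIME
        z3 = x1 * (da - cb) ** 2 % FIELD_PRIME
        x2 = aa * bb % FIELD_PRIME
        z2 = e * (aa + A24 * e) % FIELD_PRIME
    if swap:
        x2, z2 = x3, z3
    result = x2 * pow(z2, FIELD_PRIME - 2, FIELD_PRIME) % FIELD_PRIME
    return result.to_bytes(32, "little")


def public_key(private_key: bytes) -> bytes:
    """Compute K from K^-1."""
    return x25519(private_key, BASEPOINT)
```

With these helpers, `public_key(derive_private_key(P, salt, iterations))` gives the backup public key for passphrase P, and the same 32 private-key bytes are what would be encoded into the recovery key.
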
To change the passphrase, a client creates a completely new backup version,
performing the steps above with the new passphrase. The client then re-encrypts
all session keys and uploads them to the new backup. The user will always get
a new recovery key whenever they change their passphrase.

Review comment: As per #1538 (comment), I suggest s/ version//, particularly given you refer to it as a "new backup" on the next line.

Review comment: tbh I think 'will always' confuses more than it adds.

In this option, the recovery key is generated directly from the passphrase
using PBKDF. This means the ciphertext of the backed up keys is more vulnerable
to dictionary attacks. Option 2b attempts to offer a mitigation against this.

### Option 2a

The backup encryption private key, K<sup>-1</sup>, is generated by a secure
random number generator. A private key, K<sup>-1</sup><sub>p</sub>, is generated
by running PBKDF on the passphrase. K<sup>-1</sup><sub>p</sub>' is generated by
XORing K<sup>-1</sup> with K<sup>-1</sup><sub>p</sub>.
K<sup>-1</sup><sub>p</sub>' is stored on the along with the key backup in the
`private_key` object above. The recovery key is generated by encoding
K<sup>-1</sup> as above.

Review comment: what do we actually do with K<sup>-1</sup><sub>p</sub> (or K<sup>-1</sup><sub>p</sub>') other than store them along with the backup? presumably one of them gets used to encrypt the backup?

Review comment: stored on the ... ?

Review comment: sorry, in which [...]

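The XOR construction in Option 2a can be illustrated with a short sketch. The proposal does not spell out the recovery step; the version below assumes it is simply the inverse XOR, which follows from the definition of K<sup>-1</sup><sub>p</sub>'. The function names, the byte-string salt, and the choice of PBKDF2-HMAC-SHA-512 are all assumptions made for illustration.

```python
import hashlib
import secrets


def derive_passphrase_key(passphrase: str, salt: bytes, iterations: int) -> bytes:
    # Assumption: PBKDF2-HMAC-SHA-512 with 32 bytes of output.
    return hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


def xor_bytes(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_counterpart(backup_key: bytes, passphrase_key: bytes) -> bytes:
    """K^-1_p' = K^-1 XOR K^-1_p; this value is stored with the backup."""
    return xor_bytes(backup_key, passphrase_key)


def recover_backup_key(counterpart: bytes, passphrase_key: bytes) -> bytes:
    """Inferred recovery step: K^-1 = K^-1_p' XOR K^-1_p."""
    return xor_bytes(counterpart, passphrase_key)


def new_backup_key() -> bytes:
    """K^-1 comes from a secure random number generator."""
    return secrets.token_bytes(32)
```

A client that knows the passphrase re-derives K<sup>-1</sup><sub>p</sub> from the stored salt and iteration count, then calls `recover_backup_key` on the stored counterpart.
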
To change the passphrase, the client generates the new
K<sup>-1</sup><sub>p</sub> from the new passphrase then computes a new
K<sup>-1</sup><sub>p</sub>'. It then updates the backup information with this
new K<sup>-1</sup><sub>p</sub>'.

Review comment: could we have 'q' rather than 'new p'?

Review comment: It would be nice to spell out how the keys are recovered

This would require the API to support updating the metadata stored with a
backup (or the key parameters to be stored elsewhere, eg. in account data).

This option, however, allows the server to obtain K<sup>-1</sup> by obtaining
any one of the user's previous passphrases, assuming it keeps copies of the
previous versions of the key parameters. This option is therefore not viable,
but included for completeness.

Review comment: OH MAN NOW YOU TELL ME c'mon already. put this up-front.

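To make the weakness concrete, here is a small demonstration under the same illustrative assumptions as before (PBKDF2-HMAC-SHA-512, hypothetical names, and a deliberately low iteration count so the demo runs quickly). Because K<sup>-1</sup> stays the same across passphrase changes, a server that kept the old counterpart and later learns the old passphrase can still compute the current key.

```python
import hashlib
import secrets


def kdf(passphrase: str, salt: bytes, iterations: int = 1000) -> bytes:
    # Assumption: PBKDF2-HMAC-SHA-512; iteration count lowered for the demo.
    return hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# The backup key K^-1 is generated once and is NOT regenerated in option 2a.
backup_key = secrets.token_bytes(32)

# Original passphrase: the server stores the salt and the counterpart.
old_salt = secrets.token_bytes(16)
old_counterpart = xor_bytes(backup_key, kdf("old passphrase", old_salt))

# The user changes passphrase; only the stored counterpart is replaced.
new_salt = secrets.token_bytes(16)
new_counterpart = xor_bytes(backup_key, kdf("new passphrase", new_salt))

# A server that retained the old parameters and learns the OLD passphrase
# recovers the CURRENT backup key.
leaked_key = xor_bytes(old_counterpart, kdf("old passphrase", old_salt))
```

Here `leaked_key` equals `backup_key` even though the passphrase has since been changed.
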
### Option 2b

A variant on option 2a is to regenerate K<sup>-1</sup> when the passphrase is
changed, meaning the recovery does change when the passphrase is changed,
making it identical feature-wise to option 1 and without the problem of any
previous passphrase being sufficient to obtain K<sup>-1</sup>. It differs,
however, in that K<sup>-1</sup> is generated randomly and therefore not
vulnerable to dictionary attacks. However, K<sup>-1</sup><sub>p</sub> is still
vulnerable to dictionary attacks and is stored in the same place with the same
protection, and, if compromised, gives access to K<sup>-1</sup>. This option
therefore offers no significant security benefit over option 1.

Review comment: s/recovery/recovery key/

### Option 3

The backup encryption private key, K<sup>-1</sup>, and a private,
passphrase-derived key, K<sup>-1</sup><sub>p</sub> are generated as above.The
passphrase key counterpart, K<sup>-1</sup><sub>p</sub>', is also generated as
above from the K<sup>-1</sup> XOR K<sup>-1</sup><sub>p</sub>. Another private
key, K<sup>-1</sup><sub>r</sub> is generated also by a secure random number
generator and encoded to give the recovery key as above.
K<sup>-1</sup><sub>r</sub>' is generated by XORing K<sup>-1</sup><sub>r</sub>
with K<sup>-1</sup>. Both K<sup>-1</sup><sub>p</sub>' and
K<sup>-1</sup><sub>r</sub>' are stored in the `private_key` in the backup under
keys `passphrase_counterpart` and `recovery_key_counterpart` respectively.

Review comment: s/./. /

Review comment: there are a bunch of formulae here which are imprecisely and inconsistently described. How about giving each calculation its own line? the passphrase key counterpart, K<sup>-1</sup><sub>p</sub>', is calculated as: [...]

Review comment: how are all these keys actually used for encryption?

To change the passphrase, the client starts a new backup version as in option 1
(generating a new K<sup>-1</sup>), but additionally computes a new
K<sup>-1</sup><sub>r</sub>' by XORing K<sup>-1</sup><sub>r</sub> with the new
K<sup>-1</sup>. This refreshes all keys, but allows the user to keep the same
recovery key for their backup, on the assumption that the recovery key itself
has not been compromised. If it has, the client generates a new backup with a
completely fresh recovery key instead.

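The Option 3 bookkeeping, including a passphrase change that keeps the same recovery key, might look like the sketch below. It is only an illustration: the recovery steps are inferred as the inverse XOR, the helper names are hypothetical, the counterparts are kept as raw bytes because the proposal does not say how they are serialised, and PBKDF2-HMAC-SHA-512 with a low iteration count is an assumption made so the demo runs quickly.

```python
import hashlib
import secrets


def kdf(passphrase: str, salt: bytes, iterations: int = 1000) -> bytes:
    # Assumption: PBKDF2-HMAC-SHA-512; iteration count lowered for the demo.
    return hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


def xor_bytes(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def build_private_key(backup_key: bytes, passphrase_key: bytes,
                      recovery_secret: bytes) -> dict:
    """Build the `private_key` object stored with the backup."""
    return {
        "passphrase_counterpart": xor_bytes(backup_key, passphrase_key),
        "recovery_key_counterpart": xor_bytes(backup_key, recovery_secret),
    }


def recover_with_passphrase(private_key: dict, passphrase_key: bytes) -> bytes:
    return xor_bytes(private_key["passphrase_counterpart"], passphrase_key)


def recover_with_recovery_secret(private_key: dict, recovery_secret: bytes) -> bytes:
    return xor_bytes(private_key["recovery_key_counterpart"], recovery_secret)


# Initial backup: random K^-1 and random K^-1_r (encoded as the recovery key).
recovery_secret = secrets.token_bytes(32)
salt_1 = secrets.token_bytes(16)
backup_key_1 = secrets.token_bytes(32)
private_key_1 = build_private_key(
    backup_key_1, kdf("first passphrase", salt_1), recovery_secret
)

# Passphrase change: new backup version with a new K^-1, same K^-1_r.
salt_2 = secrets.token_bytes(16)
backup_key_2 = secrets.token_bytes(32)
private_key_2 = build_private_key(
    backup_key_2, kdf("second passphrase", salt_2), recovery_secret
)
```

After the change, the unchanged recovery secret still unlocks the new backup key through the new `recovery_key_counterpart`, while the old passphrase no longer does.
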
## Security considerations

The proposal above is vulnerable to a malicious server admin performing a
dictionary attack against the encrypted passphrases stored on their server to
access history. (It's worth bearing in mind that the server admin can also
always hijack its users' accounts; the thing stopping them from
impersonating their users is E2E device verification.)

Review comment: surely only option 1 is vuln to dict attack?

Review comment: and 2b?

Review comment: (and possibly 3; I haven't grokked how it works yet)

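The shape of such a dictionary attack can be sketched as below. This is a simplified illustration, not a description of any real tool: the `matches` callback is a hypothetical stand-in for whatever check the attacker can perform with server-side data (for Option 1, deriving the public key from the candidate private key and comparing it with the backup's public key), and PBKDF2-HMAC-SHA-512 is an assumption since the proposal only says "PBKDF".

```python
import hashlib
from typing import Callable, Iterable, Optional


def derive_key(passphrase: str, salt: bytes, iterations: int) -> bytes:
    # Assumption: PBKDF2-HMAC-SHA-512 with 32 bytes of output.
    return hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


def dictionary_attack(
    candidates: Iterable[str],
    salt: bytes,
    iterations: int,
    matches: Callable[[bytes], bool],
) -> Optional[str]:
    """Return the first candidate passphrase whose derived key passes the check.

    The salt and iteration count are stored with the backup, so the server
    admin has everything needed to run the KDF for each guess.
    """
    for candidate in candidates:
        if matches(derive_key(candidate, salt, iterations)):
            return candidate
    return None


def find_weak_passphrase_demo() -> Optional[str]:
    """Toy run: the 'check' compares against a known target key directly."""
    salt, iterations = b"MmMsAlty", 1000
    target = derive_key("letmein", salt, iterations)
    wordlist = ["password", "123456", "letmein", "qwerty"]
    return dictionary_attack(wordlist, salt, iterations, lambda k: k == target)
```

The cost of the attack scales with the iteration count times the size of the wordlist, so a higher `private_key_iterations` value slows each guess but does not protect a passphrase that appears in the attacker's list.
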
## Possible extensions

In future, we could consider supporting authenticating users for login based on
their encrypted passphrase, meaning that users only have to remember one
password for their Matrix account rather than a login password and a
history-access passphrase. However, this of course exposes the user's whole
E2E history to the risk of dictionary attacks by public attackers (i.e. not
just server admins), keysniffer-at-login attacks or clients which are lazy
about storing account passwords securely. There's also a risk that because
login passwords are much more commonly entered than history passwords, they
might encourage users to choose a weaker password. It's unclear whether this
reduction in security-in-depth is worth the UX benefits of a single master
password, so we suggest checking how this proposal goes first (given in general
we expect key recovery to happen by cross-verifying devices at login rather
than by entering a recovery key or passphrase).

## See also:

Notes from discussing this IRL are at
https://docs.google.com/document/d/11fF1rbX5eTkrfxXRS8UhpW5sBENOCydYlLWzB8X1IuU/edit

Review comment: could the file be renamed so that it matches the MSC number?

Review comment: please?