Skip to content

Latest commit

 

History

History
451 lines (329 loc) · 20.5 KB

4161-crypto-terminology.md

File metadata and controls

451 lines (329 loc) · 20.5 KB

MSC4161: Crypto terminology for non-technical users

Background

Matrix makes use of advanced cryptographic techniques to provide secure messaging. These techniques often involve precise and detailed language that is unfamiliar to non-technical users.

This document provides a list of concepts and explanations that are intended to be suitable for use in Matrix clients that are aimed at non-technical users.

Ideally, encryption in Matrix should be entirely invisible to end-users (much as WhatsApp or Signal users are not exposed to encryption specifics). This initiative is referred to as "Invisible Cryptography" and is tracked as:

  • MSC4153 - Exclude non-cross-signed devices,
  • MSC4048 - Authenticated key backup,
  • MSC4147 - Including device keys with Olm-encrypted events, and
  • MSC4161 - this document

Why is this important?

Use of common terminology should help further these goals:

  • to reduce confusion: many members of the community are confused by the crypto features in Matrix clients, and the profusion of different words for the same thing makes it much worse. By reducing the number of words, and carefully choosing good words, we hope to develop a common language which makes Matrix easier to understand, and easier to explain.

  • to ease migration: one of the key features of Matrix for end-users is the choice of clients, meaning no-one is locked in to a particular piece of software. If each client uses conflicting terminology, it becomes much more difficult to move to a different client, which works against the user's ability to migrate.

This proposal uses "SHOULD" language rather than "MUST", because there are many good reasons why a particular client might choose different wording. In particular, different clients may have very different audiences who communicate in different ways and understand different metaphors. This proposal hopes to nudge client developers towards consistency, but never at the cost of their unique relationship with their users.

Outcomes

We hope that Matrix client developers will like the terms and wording we provide, and adapt their user interfaces and documentation to use them. (If this MSC is accepted, Element will use it as a reference for English wording in its clients.)

Where concepts and terms exactly match existing terms in the Matrix spec, we propose changing the spec to use the terms from this document. Where they do not match, we are very comfortable with different words being used in the spec, given it is a highly technical document, as opposed to a client user interface.

We hope that this MSC will:

  • Cause small changes in the spec (as described in the previous paragraph), and
  • Become an appendix in the spec, with a description that makes clear that the intended audience is different from most of the spec, meaning different words are used from the main spec body.

Clients may, of course, choose to use different language. Some clients may deliberately choose to use more technical language, to suit the profiles of their users. This document is aimed at clients targeting non-technical users.

Where changes are made in the spec, we suggest that notes be added mentioning the old name, as in this example.

Proposal

When communicating about cryptography with non-technical users, we propose using the following terms and concepts.

When referring to concepts outlined in this document in their user interface, clients SHOULD use the language specified, except where their own users are known to understand different terms more easily. When making such exceptions, clients SHOULD document how they deviate from this document, and why.

Devices

Note: this section depends on MSC4153 ("Exclude non-cross-signed devices"), which specifies clients should avoid sending and receiving encryption info with devices that are not cross-signed by their owner ("insecure" devices in our terminology). While MSC4153 remains unmerged, the parts of this section relating to insecure devices should be considered non-normative.

Instances of a client are called 'devices' (not 'sessions'). Aligned with MSC4153, we take it as granted that all devices taking part in encryption have been cross-signed by the user who owns them, and we call these devices.

Devices which have published cryptographic keys (thus being visible as "cryptographic devices" to other users) but which have not been cross-signed are considered an error state, primarily to be encountered during the transition to MSC4153 and/or due to buggy/incomplete/outdated clients. These devices are referred to as not secure or insecure and their existence is considered a serious and dangerous error condition, similar to an invalid TLS certificate.

"This device is not secure. Please verify it to continue."

"Ignoring 5 messages that were sent from a device that is not secure."

"Confirm it's you" (when asking to verify a device during login)

⚠️ Avoid saying "secure device". All devices are considered secure by default; the user doesn't typically need to worry about the fact that insecure devices are a thing, given they should only ever occur in error (or transitional) scenarios.

⚠️ Avoid saying "trusted device" or "verified device". Devices are not users, and it is helpful to use different language for users vs. devices. (However, we do use the verb "verify" to describe how to make a device secure. By using the same verb, we help users understand the confusing fact that verifying devices and verifying users are similar processes, but with different outcomes.)

⚠️ Avoid using "cross-signing", which requires a deeper knowledge of cryptography to understand.

⚠️ Avoid mentioning "device keys" - a device is just secure or not.

⚠️ Avoid "session" to mean device. Device better describes what most users encounter, and is more commonly used in other messaging apps.

Logging out

In contrast to some other services, logging out (or signing out) of a Matrix device is quite a significant operation: it means the encryption data on this device is lost, and if you log out of all devices you will need to use your recovery key to re-establish your identity and regain access to your old messages.

If using a trusted physical device, the right choice for a user may well be not to log out, but simply to close the app or browser and re-open it later. This preserves their identity and their access to message history.

"Are you sure you want to log out?"

"If you log out of all devices, you will lose access to message history and will need to reset your identity."

Verified user

When you verify a user they become verified. This means that you have cryptographic proof that no-one is listening in on your conversations. (You need this if you suspect someone in a room may be using a malicious homeserver.)

In many contexts, most users are not verified: verification is a manual step (scanning a QR code or comparing emojis). (In future, verification will probably become more common thanks to MSC2882 Transitive Trust or something similar). When an unverified user resets their identity, we should warn the user, so they are aware of the change.

If Alice is verified with Bob, and then Alice's identity changes (i.e. Alice resets their master cross-signing key) then this is very important to Bob: Bob verified Alice because they care about proof that no-one is listening, and now someone could be. Bob can choose to withdraw verification (i.e. "demote" Alice from being verified), or re-verify with Alice. Until Bob does one or the other, Bob's communication with Alice should contain a prominent and serious warning that Alice's verified identity has changed.

"This user is verified."

"WARNING: Bob's verified identity has changed!"

"You verified this user's identity, but it has changed. Please choose to re-verify them or withdraw verification."

⚠️ Avoid using "cross-signing", which requires a deeper understanding of cryptography to understand.

⚠️ Avoid using "trust on first use (TOFU)", which is a colloquial name for noting the identity of users who are not verified so that we can notify the user if it changes. (This is a kind of "light" form of verification where we assume that the first identity we can see is trusted.)

⚠️ Avoid confusing verification of users with verification of devices: the mechanism is similar but the purpose is different. Devices must be verified to make them secure, but users can optionally be verified to ensure no-one is listening in or tampering with communications.

⚠️ Avoid talking about "mismatch" or "verification mismatch" which is very jargony - it is the identity which is mismatched, not the verification process. Just say "Bob's verified identity has changed".

⚠️ Where possible, avoid talking about "cryptographic identity" which is very jargony. In many contexts, just the word "identity" is sufficient: the dictionary definition of identity meaning that someone is who they claim they are, not someone else. The fact we confirm identity cryptographically is usually irrelevant to the user.

Identity

A user's identity is proof of who they are, and, if you have verified them, proof that you have a secure communication channel with them. Your own identity proves who you are, and gives you access to key storage.

Technical note: we use "identity" here to describe a collection of keys: the master signing key, user signing key, device signing key, key storage key and others.

Your identity allows you to be identified by other users, and also allows you to access key storage and therefore see message history. This identity may be stored on the server by using recovery. The recovery key is not part of your identity, but allows you to re-establish your identity if you lose all your devices.

When a non-verified user resets their identity: "Warning: Alice's identity has changed."

Longer explanation: This can happen if the user lost all their devices and the recovery key, but it can also be a sign of someone taking over the account. To be sure, please verify their identity by going to their profile.

When a verified user resets their identity: "WARNING: Bob's verified identity has changed!"

(During login, at the "Confirm it's you" stage):

"If you don't have any other device and you have lost your recovery key, you can create a new identity. (Warning: you will lose access to your old messages!)" button text (in red or similar): "Reset my identity"

"Are you sure you want to reset your identity? You will lose access to your message history."

⚠️ Avoid saying "master key" - this is an implementation detail.

⚠️ Avoid saying "Alice reset their encryption" - the reason that Alice's identity changed could be due to attack rather than because they reset their encryption (plus "encryption" is jargony).

Message key

A message key is used to decrypt a message. The metaphor is that messages are "locked in a box" by encrypting them, and "unlocked" by decrypting them.

"Store message keys on the server."

⚠️ Avoid saying "key" without a previous word saying what type of key it is.

⚠️ Avoid using "room key". These keys are used to decrypt messages, not rooms.

Note: this clashes with the term "message key" in the double ratchet. Since the double ratchet algorithm is for a very different audience, we think that this is not a problem.

Unable to decrypt

When we have an encrypted message but no message key to decrypt it, we are unable to decrypt it.

When we expect the key to arrive, we are waiting for this message.

"Waiting for this message" with a button: "learn more" that explains that the message key for this message has not yet been received, but that we expect it to arrive shortly. Further detail may be provided, for instance explaining that connectivity issues between the sender's homeserver and our own can cause key delivery delays.

When the user does not have the message key for a permanent and well-understood reason, for example if it was sent before they joined the room, we say you don't have access to this message.

"You don't have access to this message" e.g. if it was sent before the user entered the room, or the user does not have key storage set up.

Message history

Your message history is a record of every message you have received or sent, and is particularly used to describe messages that are stored on the server rather than your device(s). Where messages are encrypted, the message keys are required to be able to read them, so "message history" includes those keys, which are held in key storage.

Key storage

Key storage means message keys that are kept on the server, so that they can be shared with the user's other devices (including new devices added in the future).

The keys inside key storage are themselves encrypted, so that the server operator is not able to access them and read your messages.

In the spec, key storage is referred to as server-side key backup.

"Allow key storage"

"Key storage holds the keys that allow you to read your message history."

"Message history is unavailable because key storage is disabled."

⚠️ Avoid using "key backup" to talk about storing message keys: this is too easily confused with exporting keys or messages to an external system. Key storage is for day-to-day use (reading message history), not a redundant store for disaster recovery.

⚠️ Avoid talking about more keys: "the backup key is stored in the secret storage, and this allows us to decrypt the messages keys from key backup". Instead, we simply say that both identity and message keys are stored in key storage.

Recovery

Recovery is useful when a user loses all their devices (or logs out of them all).

If recovery is enabled, the user's identity is saved on the server, allowing them to recover it if they lose all their devices. This in turn allows them to recover their key storage and see message history. To recover their identity the user must enter the recovery key.

The server is not able to read or manipulate the saved identity, because it is encrypted using the recovery key.

If a user loses their recovery key, they may reset their identity. Unless they have old devices, they will not be able to access old encrypted messages because the new identity does not have access to the old key storage.

A recovery key (or recovery code) is a way of re-establishing your identity if you lose all your devices. This in turn allows you to access key storage, and therefore see message history. If you re-establish your identity instead of resetting it, other users won't see "Alice's identity has changed" messages, and you will be able to read your message history, even if you logged out everywhere or lost your devices.

A recovery passphrase is an easier-to-remember way of accessing the recovery key and has the same purpose as the recovery key.

In the spec, recovery is referred to as secret storage, or "4S".

"Write down your recovery key in a safe place"

"If you lose access to your devices and your recovery key, you will need to reset your identity, meaning you will lose all your message history"

⚠️ Avoid "4S" or "quad-S" - these are not descriptive terms.

⚠️ Avoid using "security key", "security code", "master key". A recovery key allows "unlocking" the key storage, which is a "box" that is on the server, containing your identity and message keys. It is used to recover the situation if you lose access to your devices. None of these other terms express this concept so clearly.

⚠️ Remember that users may have historically been trained to refer to these concepts as "security key" or "security passphrase", and so user interfaces should provide a way for users to be educated on the terminology change (e.g. a tooltip or help link): e.g. "Your recovery key may also have been referred to as a security key in the past"

⚠️ Be aware that old versions of the spec use "recovery key" to refer to the private half of the backup encryption key, which is different from the usage here. The recovery key described in this section is referred to in the spec as the secret storage key.

Losing the recovery key

If the user loses their recovery key, they no longer have a way to recover their identity.

If the user still has a secure device, then that device has its own copy of the identity information, so they can change recovery key without losing their identity, meaning other users will not see "Alice's identity has changed", and they will be able to continue using key storage to access message history.

Note: users should be encouraged to change their recovery key if they have forgotten their recovery key, because they are in a precarious position - if they lose access to their device, they will be forced to reset their identity and lose message history.

If the user does not have a device, or all their devices are insecure, then they will need to reset their identity, meaning other users see "Alice's identity has changed", and they lose access to their old key storage, meaning they cannot read message history.

"If you lose your recovery key you can generate a new one if you are signed in elsewhere"

⚠️ Distinguish between "Reset identity" and "Change recovery key" - these are very different actions: resetting identity is destructive, whereas changing recovery key from a device that holds the full identity information is benign.

Potential issues

Lots of existing clients use a whole variety of different terminology, and many users are familiar with different terms. Nevertheless we believe that working together to agree on a common language is the only way to address this issue over time.

Alternatives

Device vs. Session

There is debate over the use of the word "device" to identify an instance of a client. Objections to "device" include:

  • Multiple apps on the same physical device will be listed as separate devices, which may cause confusion.
  • Logging out and in on the same physical device will result in a new "device" being created.
  • Some applications, especially on Web, use "session" for this concept.

The most popular alternative is "session". Objections to "session" include:

  • It is an unfamiliar word for non-technical users: they have no metaphor to work with to understand it.
  • It has multiple existing alternative meanings within Matrix.

"Device" was chosen in the proposal because:

  • It is familiar from similar messaging apps.
  • It has a clear meaning in everyday speech, giving users a stepping-stone towards understanding what it means in this context.
  • For novice users, it corresponds well with the everyday meaning: when they first engage with Matrix, they will use one "device" per physical device.
  • The extension to think of multiple virtual "devices" on a physical device is simple and familiar from other applications.
  • Messaging apps are increasingly used on mobile devices, especially as the first point of contact, and "device" is commonly used in mobile apps.
  • The spec uses "device" for precisely this concept, which is a bonus.

Further work

Several other concepts might benefit from similar treatment. Within cryptography, "device dehydration" is a prime candidate. Outside cryptography, many other terms could be agreed, including "export chat" (particularly in contrast to "export message keys").

Security considerations

In order for good security practices to work, users need to understand the implications of their actions, so this MSC should be reviewed by security experts to ensure it is not misleading.

Dependencies

None

Credits

Written by Andy Balaam, Aaron Thornburgh and Patrick Maier as part of our work for Element. Richard van der Hoff, Matthew Hodgson and Denis Kasak contributed many improvements before the first draft was published.