Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIP007 - Per-session subkey #42

Closed
riobard opened this issue Feb 10, 2017 · 36 comments
Closed

SIP007 - Per-session subkey #42

riobard opened this issue Feb 10, 2017 · 36 comments

Comments

@riobard
Copy link
Contributor

riobard commented Feb 10, 2017

Background

As discussed in #36 and #40, currently Shadowsocks has a fundamental design flaw of potential key-nonce pair reuse. Specifically, we are using a long-term M-bit pre-shared key and N-bit random nonces, resulting in a (M+N)-bit key-nonce pair containing only N-bit randomness, where N is usually insufficient (N < 128).

#36 proposed to deprecate ciphers where N <= 96, which unfortunately eliminated many good ciphers like Chacha20 and Chacha20-Poly1305 where N=64. However the problem still exists, albeit at slightly lower probability. Even when N=96, there are practical concerns about the safety margin. Needless to say, we are operating against best practices and security recommendations.

#40 identified the design flaw and paved the way to this SIP.

Proposal

I propose to introduce a per-session subkey derived from the pre-shared master key using HKDF, and use the subkey to encrypt/decrypt in both stream ciphers and AEAD ciphers. Essentially it means we are moving from (M+N)-bit (PSK, nonce) pair to (M+N)-bit (HKDF(PSK, salt), nonce) pair. Because HKDF is a PRF, the new construction significantly expands the amount of randomness (from N to at least M where M is much greater than N), thus correcting the previously mentioned design flaw.

Additionally, because the pre-shared key is usually generated from a human-chosen text password of insufficient entropy, the result is not very strong. HKDF gives us the benefit of producing cryptographically strong derived keys even if the input master key is weak.

Details

Assuming we already have a user-supplied pre-shared master key PSK.

Function HKDF_SHA1 is a HKDF constructed using SHA1 hash. Its signature is

HKDF_SHA1(secret_key, salt, info)

The "info" string argument allows us to bind the derived subkey to a specific application context.

For stream ciphers, the revised encryption scheme is:

  1. Pick a random R-bit salt (R = max(128, len(SK)))
  2. Derive subkey SK = HKDF_SHA1(PSK, salt, "ss-subkey")
  3. Derive initialization vector IV = HKDF_SHA1(PSK, salt, "ss-iv")
  4. Send salt
  5. Send StreamEncrypt(SK, IV, payload)

Note that even with the above changes, the old approach using stream ciphers is still not secure. We encourage users to switch to AEAD ciphers as soon as possible.

For compatibility reasons, we will probably NOT adding per-session key to stream ciphers.

For AEAD ciphers, the revised encryption scheme is:

  1. Pick a random R-bit salt (R = max(128, len(SK)))
  2. Derive subkey SK = HKDF_SHA1(PSK, salt, "ss-subkey")
  3. Send salt
  4. For each chunk, encrypt and authenticate payload using SK with a counting nonce (starting from 0 and increment by 1 after each use)
  5. Send encrypted chunk
@madeye
Copy link
Contributor

madeye commented Feb 10, 2017

LGTM. +1.

I suggest to follow the naming convention skey-cipher when adding support for this SIP, i.e. skey-chacha20 and skey-chacha20-poly1305.

@riobard
Copy link
Contributor Author

riobard commented Feb 10, 2017

@wongsyrone HKDF has two phases: extract and expand, which allows the input keying material to be weak. My understanding is that crypto_generichash_blake2b_salt_personal has only the expand phase, therefore it's not as good. However you could construct HKDF_blake2b, which would be similar to HKDF_SHA1. SHA1 has the advantage of being more common with mature lib support.

The protocol mostly stays the same. This SIP only changes how the cipher key is generated.

@madeye I'm wondering if we should amend SIP004 to force the new design at least for AEAD ciphers since it's pretty new and AEAD ciphers are more vulnerable to nonce reuse. For stream ciphers, we definitely need the prefix.

@madeye
Copy link
Contributor

madeye commented Feb 10, 2017

@riobard I'm fine with it. AEAD is still in pre-release stage, so I think we are safe now.

@madeye
Copy link
Contributor

madeye commented Feb 10, 2017

@madeye
Copy link
Contributor

madeye commented Feb 10, 2017

@wongsyrone I think you mean "not that random" salt in broken clients? Yes, it's still a problem.

@riobard
Copy link
Contributor Author

riobard commented Feb 10, 2017

Alright, I guess the right plan is to amend SIP004 with SIP007 and rewrite AEAD ciphers to use the new construction and provide additional stream ciphers.

I'm not familiar with C, but in other higher-level languages like Go and Python, using HKDF is very simple. You can pick your own hash function. SHA1 is just a conservative choice.

@riobard
Copy link
Contributor Author

riobard commented Feb 10, 2017

Updated the proposal to include guidance on the length of random salt.

@madeye
Copy link
Contributor

madeye commented Feb 10, 2017

Actually it's an open pull request for mbedtls. Mbed-TLS/mbedtls#542

After reviewing the code, I think the implementation is very solid.

@madeye
Copy link
Contributor

madeye commented Feb 10, 2017

It depends. If we don't have better library for HKDF, I'll directly implement it in shadowsocks, like what we are doing for PBKDF currently.

@riobard
Copy link
Contributor Author

riobard commented Feb 11, 2017

I have implemented AEAD ciphers with subkey here https://github.com/riobard/go-shadowsocks2/tree/sip007

@riobard
Copy link
Contributor Author

riobard commented Feb 11, 2017

I applied similar treatment to derive a new subkey for each UDP packet with nonce setting to all zero.

Also, I think it's pointless to retrofit the old stream ciphers with the session key. If people want better security, they should use AEAD ciphers anyway. Less compatibility headache.

@madeye
Copy link
Contributor

madeye commented Feb 11, 2017

@riobard OK. Let's only apply this on AEAD.

@riobard
Copy link
Contributor Author

riobard commented Feb 11, 2017

@madeye Could you please test the interoperability once you finish the changes in shadowsocks-libev? I think I've implemented this SIP correctly, but without another reference point, it's not guaranteed. Let me know if there's any problem.

@madeye
Copy link
Contributor

madeye commented Feb 11, 2017

Thanks. I'm working on it.

@hellofwy
Copy link

Don't remove original stream ciphers, not everyone think they need the extra security.
Some people care performance ( Encryption/Decryption Speed ) more.
Learning a lot from discussions in recent issues, thank you all.

PS:
Really hope someone can pickup the ShadowVPN project.

@riobard
Copy link
Contributor Author

riobard commented Feb 11, 2017

The new AEAD ciphers actually perform very well on modern devices. On my machines, AES-128-GCM gives me about 2x the throughput of AES-128-CTR and AES-128-CFB.

I've actually written my own implementation of ShadowVPN (with some experimental changes) in Go. However, UDP traffic is heavily throttled on many ISPs, making it less viable for widespread adoption. That's why I'm back to Shadowsocks.

@riobard riobard changed the title SIP007 Per-session subkey SIP007 - Per-session subkey Feb 11, 2017
@hellofwy
Copy link

One TCP connection with one tunnelled TCP connection would be better. And from my currently knowledge, no VPN supports this .

AES-128-GCM is good, but I'm using AES-256-CFB now which is a silly choice made year ago before knowing more. And ChaCha20 with authentication is a little slower from the benchmark in the ss-libev issue #1173.

Current Shadowsocks client on Windows is just a proxy front end, not very convenient in some case. And my router doesn't have TPROXY module for UDP redirection. This is my first motive to see how ShadowVPN works.

And I think UDP with an auto BBR-like congestion control method is great and may help uploading on OSes which don't support BBR. KCP has too many parameters which need some tweak time. UDP may working better in some ISPs. But I don't have the capability to implement this.

@riobard
Copy link
Contributor Author

riobard commented Feb 11, 2017

TCP over TCP is bad for performance.

KCP basically moves the congestion control from kernel to userspace, making it easier for people to experiment. However this empowerment comes with great danger that selfish people would abuse available bandwidth and we all suffer as a result. That's why fairness is a core concern in TCP.

@madeye
Copy link
Contributor

madeye commented Feb 12, 2017

@riobard
Copy link
Contributor Author

riobard commented Feb 12, 2017

Great!

@hellofwy
Copy link

Fairness protocols in subscriber sides just save ISP's efforts to improve router design.
So temporal unfair usage will rebalance in some time.

@riobard
Copy link
Contributor Author

riobard commented Feb 12, 2017

I'm afraid the Internet doesn't work that way.

The only thing to happen for sure is even worse QoS on all traffic.

@Mygod
Copy link
Contributor

Mygod commented Feb 13, 2017

I suggest to follow the naming convention skey-cipher when adding support for this SIP, i.e. skey-chacha20 and skey-chacha20-poly1305.

I wonder what convention.

@madeye
Copy link
Contributor

madeye commented Feb 13, 2017

@Mygod Never mind. We don't need to change any cipher name now. All AEAD ciphers are forced to use per-session subkey.

@Mygod
Copy link
Contributor

Mygod commented Feb 13, 2017

Hmm okay. However has the downside of this proposal been discussed? This proposal effectively introduces the risk of skey collisions and leaves a large space of nonce unused for most (if not all) of the time.

@riobard
Copy link
Contributor Author

riobard commented Feb 13, 2017

With random salt at minimal 128-bit, the probability of session key collision is negligible.

And you can always use 192 or 256-bit key (which will in turn use 192/256-bit salt).

@netheril96
Copy link

OK, so the server and client both use the same derived key, and with the same nounce? That is a key-nounce reuse vulnerability. If I don't get it wrong, the reuse is also present in old design, but as we are devising a brand new protocol here, it should fixed. The server and client should have different subkey by having different "info" argument.

@madeye
Copy link
Contributor

madeye commented Feb 20, 2017

@netheril96 The two directions of one TCP stream are totally independent. It means their salts and subkeys are also generated from different random bytes.

@netheril96
Copy link

@madeye Is that also the case in the old protocol? Please update the documentation on the spec to clarify this, because I read it several times, and found no evidence of independent IVs in the design.

@madeye
Copy link
Contributor

madeye commented Feb 20, 2017

@netheril96

The first packet of a shadowsocks TCP connection sent either from server-side or client-side must contains the randomly generated IV that used for the encryption.

https://shadowsocks.org/en/spec/protocol.html#TCP

@madeye
Copy link
Contributor

madeye commented Feb 24, 2017

@dszhengyu
Copy link

dszhengyu commented Sep 7, 2017

According to the spec https://shadowsocks.org/en/spec/AEAD-Ciphers.html
the length and the data fields are encrypted separately.
So it seems more like doing AE TWICE rather than AEAD, because we haven't encrypted any associated data.

@riobard
Copy link
Contributor Author

riobard commented Sep 7, 2017

Yes, because we do not have any associated data (yet).

@arthurkiller
Copy link

arthurkiller commented Sep 7, 2017

Exactly,
As I see, ss-protocol cut TCP streaming data into several wrapped data block, and then put them into stream. So the associated data is the data block u gonna to wrap I think.

@fortuna
Copy link
Contributor

fortuna commented Aug 2, 2019

Question from 2019: do we need to worry about the fact the key derivation uses SHA1, given that it's known to be broken?

@fortuna
Copy link
Contributor

fortuna commented Aug 2, 2019

Well, it seems I found my answer: https://crypto.stackexchange.com/questions/26510/why-is-hmac-sha1-still-considered-secure

HMAC-SHA1 seems to still be secure, which is what we seem to use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants