PHP: Implement TLS 1.2 to decrypt https:// and ssl:// traffic and translate it into fetch() (#1926)

Enables HTTPS requests from PHP via `file_get_contents()`, curl, and all other networking mechanisms.

This PR effectively performs a MITM attack on the PHP instance to decrypt the outbound traffic, run the request using `fetch()`, and then provide an encrypted response, all as if PHP were talking directly to the right server.

## How is it implemented?

Emscripten can be configured to stream all network traffic through a WebSocket. `@php-wasm/node` and `wp-now` use that to access the internet via a local WebSocket->TCP proxy, but the in-browser version of WordPress Playground exposes no such proxy.

This PR ships a "fake" WebSocket class. Instead of starting a `ws://` connection, it translates the raw HTTP/HTTPS bytes into a `fetch()` call.

In case of HTTP, the raw request bytes are parsed into a Request object with a body stream, which is then passed to `fetch()`. Then, as the response status, headers, and body arrive, they're stream-encoded as raw response bytes and exposed as incoming WebSocket data.

In case of HTTPS, the raw bytes are first piped through a custom TCPConnection class as follows:

1. We generate a self-signed CA certificate and tell PHP to trust it using the `openssl.cafile` PHP.ini option.
2. We create a domain-specific child certificate and sign it with the CA private key.
3. We start accepting raw encrypted bytes, process them as structured TLS records, and perform the TLS handshake.
4. An encrypted tunnel is established:
   * TLSConnection decrypts the encrypted outbound data sent by PHP
   * TLSConnection encrypts the unencrypted inbound data fed back to PHP

From there, the plaintext data is handled by the same HTTP<->fetch() machinery described in the previous paragraph.

## Implementation details

This PR ships:

* PHP.wasm bindings to pipe the outbound bytes through a `WebSocket <-> TLS <-> fetch()` pipeline.
* A subset of TLS 1.2 protocol implementation (parts of [RFC 5246](https://datatracker.ietf.org/doc/html/rfc5246), [RFC 6066](https://datatracker.ietf.org/doc/html/rfc6066.html), [RFC 4492](https://datatracker.ietf.org/doc/html/rfc4492#section-5.4), [RFC 8446](https://www.iana.org/go/rfc8446), [RFC 6070](https://www.ietf.org/rfc/rfc6070.txt))
* SSL certificate generator supporting CA certs and signed certs

### TLS 1.2

* Parses all TLS record types: handshakes, alerts, application data.
* Performs the full TLS handshake required for ECDH encryption including the necessary TLS 1.2 extensions.
* Correctly encrypts and decrypts all the post-handshake data.
* Uses `window.crypto` for encryption.
* Only supports the `TLS1_CK_ECDHE_RSA_WITH_AES_128_GCM_SHA256` mode.
* Doesn't support multiple `ChangeCipherSpec` messages.

### SSL certificate generator

* CA certificate is generated at WASM boot (if networking is enabled)
* Host-specific certificate is generated at every request and signed with the CA private key
* Certificates are created using a custom [ASN.1/DER](https://letsencrypt.org/docs/a-warm-welcome-to-asn1-and-der/) encoder and a PEM exporter shipped in this PR
* Only RSA 2048 with SHA-256 is supported today

## Avenues explored but not pursued

This work supersedes #1093 where `node-forge` was used. Here's why I'm moving to a custom TLS implementation:

* `node-forge` runs everything synchronously and ships a lot of code. `window.crypto` is async, faster, bundles less code, and is more convenient than `node-forge`.
* With `node-forge`, every error made me question fundamentals like the RSA implementation. With `window.crypto`, I feel confident assuming that encryption, hashing, signing etc. are implemented correctly.
* `node-forge` doesn't support TLS 1.3.
Neither does this PR, but after implementing TLS 1.2 I think adding TLS 1.3 support would be reasonably easy.

## Testing instructions

Go to the URL below and confirm you see "Hello-dolly.zip downloaded from https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip has this many bytes: int(1887)"

```
http://localhost:5400/website-server/?php=8.0&wp=6.6&networking=yes&language=&multisite=no&random=f1qv1twpssr#%7B%22landingPage%22:%22/network-test.php%22,%22preferredVersions%22:%7B%22php%22:%228.0%22,%22wp%22:%22latest%22%7D,%22phpExtensionBundles%22:%5B%22kitchen-sink%22%5D,%22steps%22:%5B%7B%22step%22:%22writeFile%22,%22path%22:%22/wordpress/network-test.php%22,%22data%22:%22%3C?php%20echo%20'Hello-dolly.zip%20downloaded%20from%20https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip%20has%20this%20many%20bytes:%20';%20var_dump(strlen(file_get_contents('https://downloads.wordpress.org/plugin/hello-dolly.1.7.3.zip')));%22%7D%5D%7D
```

From there, you can change the URL in the `file_get_contents()` call to fetch a different file, a file served without CORS headers, an invalid URL, etc. Confirm that PHP does something sensible each time, e.g. displays the length or displays an error message. It should never just hang.

Also, confirm the newly added CI tests work as expected.

## Remaining work

- [x] Add a solid unit and E2E test suite, especially for:
  - [x] Streaming: bytes, pause, more bytes
  - [x] `fetch()` exceptions
  - [x] Slow servers
  - [x] POST requests
- [x] Add abundant docstrings to explain what's happening at each stage
- [x] Core work
  - [x] Be more strict in `httpRequestToFetch` about HTTP (plaintext) vs HTTPS (go through TLS) vs other protocols (reject connection). For example, check ports, pay attention to parsing errors, etc.
  - [x] Rebuild all the in-browser PHP.wasm versions
  - [x] Don't run any of this code when networking is disabled
  - [x] Continue using the custom handler for the Requests library to enable direct `fetch()` calls without the encrypt->decrypt->encrypt->decrypt overhead.
- [x] Clean it up

## Follow up work

* Caching (perhaps as a follow-up)
* Ship a precomputed CA cert and private key
* Memoize host-specific certificates

CC @brandonpayton @bgrgicak @dmsnell @mho22
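
For readers unfamiliar with the "structured TLS records" step mentioned above, here is a minimal sketch of how incoming bytes could be split into records. It is not the code shipped in this PR: the names `ContentType`, `TLSRecord`, and `TLSRecordReader` are hypothetical and chosen only for illustration. It relies on the record layout from RFC 5246: a 5-byte header (1-byte content type, 2-byte protocol version, 2-byte big-endian length) followed by the fragment. Bytes may arrive in arbitrary chunks, so incomplete records are buffered until the rest arrives.

```typescript
// Hypothetical names; a sketch of TLS record framing, not this PR's code.
// Content type values come from RFC 5246, section 6.2.1.
const ContentType = {
  ChangeCipherSpec: 20,
  Alert: 21,
  Handshake: 22,
  ApplicationData: 23,
} as const;

interface TLSRecord {
  type: number;
  version: { major: number; minor: number };
  fragment: Uint8Array;
}

class TLSRecordReader {
  // Bytes received so far that do not yet form a complete record.
  private buffer = new Uint8Array(0);

  /** Appends a chunk and returns every record that is now complete. */
  push(chunk: Uint8Array): TLSRecord[] {
    const merged = new Uint8Array(this.buffer.length + chunk.length);
    merged.set(this.buffer, 0);
    merged.set(chunk, this.buffer.length);
    this.buffer = merged;

    const records: TLSRecord[] = [];
    // Header: type (1 byte), version (2 bytes), length (2 bytes, big-endian).
    while (this.buffer.length >= 5) {
      const length = (this.buffer[3] << 8) | this.buffer[4];
      if (this.buffer.length < 5 + length) {
        break; // The fragment is incomplete; wait for more bytes.
      }
      records.push({
        type: this.buffer[0],
        version: { major: this.buffer[1], minor: this.buffer[2] },
        fragment: this.buffer.slice(5, 5 + length),
      });
      this.buffer = this.buffer.slice(5 + length);
    }
    return records;
  }
}
```

Calling `push()` with the first half of a handshake record returns an empty array; a later call that delivers the remaining bytes returns the completed record, plus any further records contained in the same chunk. A real implementation would also need to enforce the maximum record length and handle encrypted fragments, which this sketch leaves out.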