Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ThreadSanitizer crashes on indirect use of stack-referenced SystemRandomNumberGenerator #66099

Open
gwynne opened this issue May 24, 2023 · 4 comments
Labels
arm64 Architecture: arm64 (aarch64) — any 64-bit ARM bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. TSan For issues in the Thread Sanitizer itself

Comments

@gwynne
Copy link
Contributor

gwynne commented May 24, 2023

Description

When Thread Sanitizer is enabled, an apparent miscompile takes place in NIOWebSocket.WebSocketMaskingKey.random(), causing a write to an invalid address on the second invocation of the method on macOS and on the first invocation thereof on Linux, in all current and development Swift versions (see below for specifics). The issue does not appear when no sanitizer is enabled, nor with --sanitize=address.

Steps to reproduce

This is the most minimal reproducer I could find (simplified post-merge of apple/swift-nio#2433):

  1. git clone https://github.com/apple/swift-nio.git && cd swift-nio
  2. swift test --sanitize=thread --filter=testRandomMaskingKeyIsNotAlwaysZeroWithDefaultGenerator

Expected behavior
The tests pass.

Environment
Failure observed against:

  • macOS 13.4 (22F66), Xcode 14.3 (Build version 14E222b), Swift version swift-driver version: 1.75.2 Apple Swift version 5.8 (swiftlang-5.8.0.124.2 clang-1403.0.22.11.100)\nTarget: arm64-apple-macosx13.0
  • swift:5.8-jammy Docker image, Swift version swift-5.8-RELEASE
  • swiftlang/swift:nightly-5.9-jammy Docker image, Swift version swift-5.9-DEVELOPMENT-SNAPSHOT-2023-05-22-a
  • swiftlang/swift:nightly-main-jammy Docker image, Swift version swift-DEVELOPMENT-SNAPSHOT-2023-05-20-a

Additional Context
Output from the crash in each test environment is attached (note: these were made before the new test was added to NIO and thus refer to the tweaked versions of the existing tests - the logs are otherwise identical using the new tests):
tsan-crash-macos.txt
tsan-crash-swift-5.8-jammy.txt
tsan-crash-swift-nightly-5.9-jammy.txt
tsan-crash-swift-nightly-main-jammy.txt

@gwynne gwynne added bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. triage needed This issue needs more specific labels labels May 24, 2023
gwynne added a commit to vapor/websocket-kit that referenced this issue May 24, 2023
gwynne added a commit to vapor/websocket-kit that referenced this issue May 24, 2023
@weissi
Copy link
Contributor

weissi commented May 24, 2023

Apple folk, this is tracked as rdar://109594242 and seems to reproduce on Apple Silicon only on both macOS & Linux.

Thanks very much for to Vapor's websocket-kit and @gwynne for reporting and reducing this bug so much, even contributing a new SwiftNIO patch which runs into this. Fab work!

@weissi weissi added arm64 Architecture: arm64 (aarch64) — any 64-bit ARM TSan For issues in the Thread Sanitizer itself labels May 24, 2023
@gwynne
Copy link
Contributor Author

gwynne commented Nov 21, 2023

This is still happening in the latest 5.9.1 (most recently encountered via https://github.com/apple/swift-nio/blob/main/Sources/NIOWebSocket/NIOWebSocketClientUpgrader.swift#L157-L160)

@gwynne
Copy link
Contributor Author

gwynne commented Nov 21, 2023

I tinkered with NIOWebSocketClientUpgrader. randomRequestKey() a little, and found out that:

  1. @inlinable's presence or absence makes no difference whatsoever - I tried removing it, replacing it with @inline(__always), and specifying both; nothing changed.
  2. Adding @_transparent makes the crash go away (presumably due to it forcing inlining even in a debug build).

@gwynne
Copy link
Contributor Author

gwynne commented Nov 21, 2023

Additional information:

The crash no longer seems to occur on macOS. Environment is macOS 14.1.1 (23B81), Xcode 15.0.1 (15A507), Swift swift-driver version: 1.87.1 Apple Swift version 5.9 (swiftlang-5.9.0.128.108 clang-1500.0.40.1).

The crash also no longer appears on Linux under standard swift test conditions. However, it does still appear on Linux when --parallel is in effect - and it is still a SIGSEGV identical to the original report, not just some previously unnoticed threading issue. The following Linux environments are affected, with --parallel required in all cases (although even that now fails evoke the crash on macOS):

  • swift:5.9-jammy image on aarch64, Swift tag swift-5.9.1-RELEASE
  • swift:5.9-jammy image on amd64, Swift tag swift-5.9.1-RELEASE
  • swiftlang/swift:nightly-5.10-jammy image on aarch64, Swift tag swift-5.10-DEVELOPMENT-SNAPSHOT-2023-11-15-a
  • swiftlang/swift:nightly-main-jammy image on aarch64, Swift tag swift-DEVELOPMENT-SNAPSHOT-2023-11-17-a

Notably, the error on the Intel image (with Rosetta emulation enabled) is different; rather than crashing with a segfault mid-test, ThreadSanitizer immediately fails with the message FATAL: ThreadSanitizer: unexpected memory mapping 0x800000000000-0x800000023000. The address range is always the same, even on different images and across restarts of Docker. This message appears REGARDLESS of the use of --parallel!

Edit: Perhaps unsurprisingly, the address range in question appears to be allocated to Docker's revised Rosetta support; this entry appears in the /proc/<pid>/maps of every process in the container:

800000000000-800000023000 r--p 00000000 00:28 2                          /rosetta/rosetta

Environment: Docker Desktop for Mac, version 4.25.1 (128006), Linux kernel 6.4.16-linuxkit

@hborla hborla removed the triage needed This issue needs more specific labels label Apr 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arm64 Architecture: arm64 (aarch64) — any 64-bit ARM bug A deviation from expected or documented behavior. Also: expected but undesirable behavior. TSan For issues in the Thread Sanitizer itself
Projects
None yet
Development

No branches or pull requests

3 participants