generator: Cache the lld binary after extracting it from the LLVM archive #149

Merged: euanh merged 2 commits into swiftlang:main from the host-lld-unpack-speed branch on Nov 14, 2024

euanh commented Nov 13, 2024

Extracting lld from the llvm archive takes about 40s on my machine, and
after the binary has been extracted a 5.5GB directory is left behind in the
Artifacts directory.

This change uses tar filters to reduce the number of files unpacked from
the archive and caches the resulting lld binary.
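As a rough illustration of the filtering idea (not the generator's actual code), the sketch below builds a small stand-in archive and then unpacks a single member from it. The archive name, its layout, and the file names are placeholders chosen for the example; the real LLVM archive is laid out differently.

```shell
#!/bin/sh
set -eu

# Build a stand-in archive with the layout <top-level dir>/{bin,lib}/...
# These names are hypothetical, not the real LLVM archive contents.
work=$(mktemp -d)
mkdir -p "$work/src/llvm-demo/bin" "$work/src/llvm-demo/lib" "$work/out"
echo "fake lld"   > "$work/src/llvm-demo/bin/lld"
echo "fake clang" > "$work/src/llvm-demo/bin/clang"
echo "fake lib"   > "$work/src/llvm-demo/lib/libLLVM.a"
tar -czf "$work/llvm-demo.tar.gz" -C "$work/src" llvm-demo

# Naming a member after the archive makes tar skip every other entry, and
# --strip-components=2 drops the leading "llvm-demo/bin/" so the file lands
# directly in the output directory.
tar -xzf "$work/llvm-demo.tar.gz" -C "$work/out" \
    --strip-components=2 llvm-demo/bin/lld

# Only the requested file has been written to disk.
ls "$work/out"
```

Because nothing else is written to disk, both the unpack time and the leftover directory size shrink, although tar still has to read through the compressed archive to find the member.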

Excluding the time to download the archives, building an SDK from
Debian packages before lld has been cached takes about 57s on an
M3 MacBook Air. A subsequent build with lld cached takes about
17s.
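The difference between the two timings comes from skipping the unpack step on a cache hit. A minimal sketch of that pattern follows, assuming a hypothetical `cached_extract` helper and a cache directory keyed by a checksum of the archive plus the member path. The generator has its own caching mechanism, which this does not reproduce, and a real cache would likely use a stronger hash than `cksum`.

```shell
#!/bin/sh
set -eu

# cached_extract ARCHIVE MEMBER CACHE_DIR   (hypothetical helper)
# Prints the path of the cached copy of MEMBER, unpacking it only on a miss.
# The key combines the archive checksum and the member path, so a different
# archive or a different member gets its own cache entry.
cached_extract() {
    archive=$1 member=$2 cache_dir=$3
    key=$(printf '%s %s' "$(cksum < "$archive")" "$member" | cksum | cut -d' ' -f1)
    cached="$cache_dir/$key"
    if [ ! -f "$cached" ]; then
        echo "cache miss: unpacking $member" >&2
        mkdir -p "$cache_dir"
        # -O sends the member to stdout; write to a temporary name first so
        # an interrupted unpack never leaves a truncated cache entry behind.
        tar -xOf "$archive" "$member" > "$cached.tmp"
        mv "$cached.tmp" "$cached"
    fi
    echo "$cached"
}

# Demo with a stand-in archive (file names are placeholders).
work=$(mktemp -d)
mkdir -p "$work/src/bin"
echo "fake lld" > "$work/src/bin/lld"
tar -cf "$work/demo.tar" -C "$work/src" bin

first=$(cached_extract "$work/demo.tar" bin/lld "$work/cache")   # unpacks
second=$(cached_extract "$work/demo.tar" bin/lld "$work/cache")  # reuses the cached file
echo "$first"
echo "$second"
```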

The SDKs generated by the old and new methods are identical.

This PR adds CacheKey conformance to Array, as suggested by
@yingguqing in #106. Adding this conformance to Range was not
necessary in this case.

euanh commented Nov 13, 2024

@swift-ci test

euanh changed the title from "Host lld unpack speed" to "generator: Cache the lld binary after extracting it from the LLVM archive" Nov 13, 2024
@euanh euanh force-pushed the host-lld-unpack-speed branch from 9851d28 to 2bcfdd6 Compare November 13, 2024 15:31

euanh commented Nov 13, 2024

@swift-ci test

euanh added the labels "performance" (Speeding up SDK generation) and "no functional change" Nov 13, 2024

euanh commented Nov 13, 2024

We won't need to download LLVM for Swift 6.0 and later because the open source toolchain package from swift.org already includes ld.lld.
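To make the version cut-off concrete, here is a small sketch of such a check. The `needs_llvm_download` function is hypothetical and only inspects the major version of a release string; it does not reflect how the generator itself decides.

```shell
#!/bin/sh

# needs_llvm_download SWIFT_VERSION   (hypothetical helper)
# Succeeds (exit status 0) when the Swift major version is below 6, i.e.
# when ld.lld would still have to come from a separate LLVM archive.
needs_llvm_download() {
    major=${1%%.*}   # "6.0.2-RELEASE" -> "6"
    [ "$major" -lt 6 ]
}

for v in 5.10.1-RELEASE 6.0.2-RELEASE; do
    if needs_llvm_download "$v"; then
        echo "$v: download LLVM to obtain ld.lld"
    else
        echo "$v: use ld.lld from the swift.org toolchain"
    fi
done
```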

euanh marked this pull request as ready for review November 13, 2024 15:45
euanh requested a review from MaxDesiatov as a code owner November 13, 2024 15:45
MaxDesiatov (Contributor) left a comment

LGTM modulo some nits

This is needed to cache arrays of [FilePath.Component], so a subsequent commit
can cache the slow extraction of the `ld.lld` binary from the `llvm` tar archive.

This problem was also reported by @yingguqing in swiftlang#106, although Range conformance
does not seem to be required to cache lld.

Suggested-by: @yingguqing
…hive

Extracting `lld` from the `llvm` archive takes about 40s on my
machine, and after the binary has been extracted a 5.5GB directory
is left in the `Artifacts` directory.

This change uses `tar` filters to reduce the number of files unpacked from
the archive and caches the resulting `lld` binary.

Excluding the time to download the archives, building an SDK from
Debian packages before `lld` has been cached takes about 57s on an
M3 MacBook Air. A subsequent build with `lld` cached takes about
17s.

The SDKs generated by the old and new methods are identical.
@euanh euanh force-pushed the host-lld-unpack-speed branch from 2bcfdd6 to 17bb372 Compare November 13, 2024 17:46

euanh commented Nov 13, 2024

@swift-ci test

@euanh euanh merged commit e07ecaa into swiftlang:main Nov 14, 2024
3 checks passed
@euanh euanh deleted the host-lld-unpack-speed branch November 14, 2024 08:43
euanh added a commit to euanh/swift-sdk-generator that referenced this pull request Nov 28, 2024
From Swift 6.0 onwards the swift.org toolchain includes LLD, so it is not
necessary to install it separately from an LLVM archive. The LLD
provided by the swift.org toolchain is also a multiarch binary, so
the SDK will run on both x86_64 and aarch64 macOS hosts.

This change slightly reduces the size of a Swift 6.0 SDK, because
these SDKs currently include two copies of LLD: one from the swift.org
toolchain bundled in the SDK and one from LLVM. LLD is typically
installed as a single binary with multiple symlinks pointing to it,
representing different 'personalities' such as `ld.lld`, `wasm-ld` and
so on. SDK generator copies the LLVM `lld` binary in as `ld.lld`,
overwriting one of the symlinks:

    -rwxr-xr-x   1 euanh  staff   108M 14 Nov  2023 ld.lld
    lrwxr-xr-x   1 euanh  staff     3B 24 Oct 15:11 ld64.lld -> lld
    -rwxr-xr-x   1 euanh  staff   265M 24 Oct 19:28 lld
    lrwxr-xr-x   1 euanh  staff     3B 24 Oct 15:11 lld-link -> lld
    lrwxr-xr-x   1 euanh  staff     3B 24 Oct 15:11 wasm-ld -> lld

After this PR, we only have one copy of the `lld` binary:

    lrwxr-xr-x   1 euanh  staff     3B 24 Oct 15:11 ld.lld -> lld
    lrwxr-xr-x   1 euanh  staff     3B 24 Oct 15:11 ld64.lld -> lld
    -rwxr-xr-x   1 euanh  staff   265M 24 Oct 19:28 lld
    lrwxr-xr-x   1 euanh  staff     3B 24 Oct 15:11 lld-link -> lld
    lrwxr-xr-x   1 euanh  staff     3B 24 Oct 15:11 wasm-ld -> lld
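
The single-binary-plus-symlinks layout in the listings can be reproduced with a toy example. In the sketch below, a small shell script stands in for the real `lld` binary and simply reports the name it was invoked under, which is the information a multi-personality driver dispatches on. The script contents and the temporary directory are illustrative assumptions, not part of the generator.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Stand-in for the real binary: report the name used to invoke it.
cat > lld <<'EOF'
#!/bin/sh
echo "invoked as $(basename "$0")"
EOF
chmod +x lld

# One real file, several names pointing at it.
for name in ld.lld ld64.lld lld-link wasm-ld; do
    ln -s lld "$name"
done

./ld.lld     # prints "invoked as ld.lld"
./wasm-ld    # prints "invoked as wasm-ld"

# Replacing a symlink with a copy, as in the first listing, leaves a second
# full-size regular file behind: two regular files and three symlinks.
rm ld.lld
cp lld ld.lld
find . -type f | wc -l
find . -type l | wc -l
```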

This change does not significantly affect the time to build a new
SDK with a warm cache (PR swiftlang#149). However, the very first build with
a cold cache improves considerably, because the LLVM archive does
not need to be downloaded and unpacked.

For example, two tests run one after the other on an M3 MacBook Air
with a fast network connection:

    rm -rf Artifacts Bundles
    swift run swift-sdk-generator make-linux-sdk --swift-version 6.0.2-RELEASE

`main`: 1m21s (2.9GB in `Artifacts` directory)
this PR: 39s (2.1GB in `Artifacts` directory)
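
A comparison of this kind could be scripted roughly as follows. The `measure` helper is hypothetical: it records wall-clock seconds around a command and then reports the size of a directory. The demo command at the end is a placeholder so the sketch runs without a Swift toolchain installed.

```shell
#!/bin/sh
set -eu

# measure LABEL DIR COMMAND...   (hypothetical helper)
# Runs COMMAND, then prints the elapsed seconds and the size of DIR.
measure() {
    label=$1 dir=$2
    shift 2
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    size_kb=0
    if [ -d "$dir" ]; then
        size_kb=$(du -sk "$dir" | cut -f1)
    fi
    echo "$label: $((end - start))s, ${size_kb}KB in $dir"
}

# Placeholder workload standing in for an SDK build.
work=$(mktemp -d)
out=$(measure demo "$work/Artifacts" \
    sh -c 'mkdir -p "$1" && echo data > "$1/file"' sh "$work/Artifacts")
echo "$out"
```

For the real comparison, the same helper would wrap the generator invocation after clearing the caches:

    rm -rf Artifacts Bundles
    measure "this PR" Artifacts swift run swift-sdk-generator make-linux-sdk --swift-version 6.0.2-RELEASE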
euanh added a commit that referenced this pull request Nov 28, 2024