Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop mergeDiagnosticJson having @CacheableTransform; speed up build #1693

Merged
merged 5 commits into from
Sep 12, 2024

Conversation

CRogers
Copy link
Contributor

@CRogers CRogers commented Sep 12, 2024

Before this PR

There's a task mergeDiagnosticJson, which gathers up all the different possible diagnostics listed in json files from libraries and combines them together into the SLS manifest.

It does this by using an artifact transform, which has been listed as build cacheable using @CacheableTransform. We use remote build caches in builds, so this will, for each jar in the runtime classpath, attempt to load the result of extracting a single file from this jar from the remote build cache.

The runtime of actually executing extracting a single small file at a known path from a zip is milliseconds. Doing a network request to the build cache is likely 100s of milliseconds. Since there are many jars, we're making many build cache requests which all take significantly longer than just doing the operation locally. We don't even get to avoid downloading the input jars by using the build cache - they still need to exist to have their hash calculated for cache keys.

The result is that on internal projects this is a common sight:

mergeDiagnosticJson

and we can check a build scan to see how bad this actually is: 372 cacheable transforms executed, taking a whopping 53 secs!

Screenshot 2024-09-12 at 17 25 36

After this PR

==COMMIT_MSG==
Do not use the remote build cache for mergeDiagnosticJson; it takes far longer than just doing the operation locally. Avoids spamming output with "Requesting from remote build cache" too.
==COMMIT_MSG==

This speeds it up hugely: 372 non-cacheable transforms executed, taking under 3 seconds!

Screenshot 2024-09-12 at 17 25 43

Possible downsides?

@changelog-app
Copy link

changelog-app bot commented Sep 12, 2024

Generate changelog in changelog/@unreleased

What do the change types mean?
  • feature: A new feature of the service.
  • improvement: An incremental improvement in the functionality or operation of the service.
  • fix: Remedies the incorrect behaviour of a component of the service in a backwards-compatible way.
  • break: Has the potential to break consumers of this service's API, inclusive of both Palantir services
    and external consumers of the service's API (e.g. customer-written software or integrations).
  • deprecation: Advertises the intention to remove service functionality without any change to the
    operation of the service itself.
  • manualTask: Requires the possibility of manual intervention (running a script, eyeballing configuration,
    performing database surgery, ...) at the time of upgrade for it to succeed.
  • migration: A fully automatic upgrade migration task with no engineer input required.

Note: only one type should be chosen.

How are new versions calculated?
  • ❗The break and manual task changelog types will result in a major release!
  • 🐛 The fix changelog type will result in a minor release in most cases, and a patch release version for patch branches. This behaviour is configurable in autorelease.
  • ✨ All others will result in a minor version release.

Type

  • Feature
  • Improvement
  • Fix
  • Break
  • Deprecation
  • Manual task
  • Migration

Description

Do not use the remote build cache for mergeDiagnosticJson; it takes far longer than just doing the operation locally. Avoids spamming output with "Requesting from remote build cache" too.

Check the box to generate changelog(s)

  • Generate changelog entry

…m:palantir/sls-packaging into callumr/stop-merge-diagnostic-build-cache
@felixdesouza
Copy link
Contributor

The savings!!! Nice find!

@CRogers
Copy link
Contributor Author

CRogers commented Sep 12, 2024

I think some of these transforms managed to run in parallel with other tasks; but the overall build time still improved from 2mins 7secs -> 1min 45secs, so about 20 secs overall build time improvement, which is still great.

@felixdesouza
Copy link
Contributor

I think some of these transforms managed to run in parallel with other tasks; but the overall build time still improved from 2mins 7secs -> 1min 45secs, so about 20 secs overall build time improvement, which is still great.

That is strange though, I would have expected artifact transforms (cached or not) to be treated as a separate entity allowing it to run in parallel. One day I will understand Gradle's execution model 😢

@CRogers
Copy link
Contributor Author

CRogers commented Sep 12, 2024

I suspect there's a limited number of execution threads and the transform is consuming a huge number of them because there are so many files. It certainly seems like that with the big block of Requesting from the remote build cache you see.

@CRogers
Copy link
Contributor Author

CRogers commented Sep 12, 2024

Any other comments or is this good to merge?

@felixdesouza
Copy link
Contributor

I commented https://github.com/palantir/sls-packaging/pull/1693/files#r1757240550 but not sure if you saw it.

@bulldozer-bot bulldozer-bot bot merged commit 6bf6358 into develop Sep 12, 2024
5 checks passed
@bulldozer-bot bulldozer-bot bot deleted the callumr/stop-merge-diagnostic-build-cache branch September 12, 2024 17:18
@autorelease3
Copy link

autorelease3 bot commented Sep 12, 2024

Released 7.66.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants