Added ability to run linux workflows on large runners #6273

rashidnhm · 2024-09-13T20:15:18Z

Context:

Currently, CI gets congested when a large number of pull requests are updated simultaneously. This pull request gives PRs an escape hatch: they can use large runners, which draw from a different queue, so that CI jobs are picked up sooner.

Description of the Change:

This pull request adds two new features:

Ability to add the urgent label to any pull request and switch it over to large runners
Automatic swap of rc branch to large runner
- This assumes the rc branch is of the format vX.Y.Z-rcN

Large runners are only slightly more powerful than standard runners, but they can be spawned at a much higher volume. This is because we pay per minute for these runners, whereas standard runners are included in our GitHub plan.

If a PR needs CI to run without waiting for a runner, add the urgent label to the pull request.
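As a rough illustration of the selection rule described above, here is a minimal Python sketch. It is not the code in determine-workflow-runner.yml. The runner name `large-ubuntu-runner`, the function names, and the exact regular expression are assumptions made for illustration. Only the urgent label and the vX.Y.Z-rcN branch format come from this description.

```python
import re

# Assumed pattern for release-candidate branches of the form vX.Y.Z-rcN.
RC_BRANCH_PATTERN = re.compile(r"v\d+\.\d+\.\d+-rc\d+")

# Hypothetical runner names; the real large runner group name is not given here.
STANDARD_RUNNER = "ubuntu-latest"
LARGE_RUNNER = "large-ubuntu-runner"


def is_rc_branch(branch: str) -> bool:
    """Return True if the whole branch name matches vX.Y.Z-rcN."""
    return RC_BRANCH_PATTERN.fullmatch(branch) is not None


def choose_runner(labels: list[str], branch: str) -> str:
    """Pick the large runner for urgent PRs or rc branches, else the standard one."""
    if "urgent" in labels or is_rc_branch(branch):
        return LARGE_RUNNER
    return STANDARD_RUNNER
```

For example, `choose_runner(["urgent"], "my-feature")` and `choose_runner([], "v0.39.0-rc0")` both select the large runner, while `choose_runner([], "master")` stays on the standard one.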

Important Note:

This only affects jobs that run on pull_request and use ubuntu runners.
This change is already in place in lightning and catalyst.
- sc-66351 auto onboard large runenrs pennylane-lightning#774
- Update determine runner workflow to use large runners if the branch name matches the rc branching format catalyst#846

Benefits:
Ability to leverage large runners so that jobs are picked up by a runner quickly.

Possible Drawbacks:
Although we control the pool size of large runners, it is still possible to saturate it.

Related GitHub Issues:
None. sc-73711 (https://app.shortcut.com/xanaduai/story/73711/update-pennylane-ci-to-use-large-runner-group)

github-actions · 2024-09-13T20:15:34Z

Hello. You may have forgotten to update the changelog!
Please edit doc/releases/changelog-dev.md with:

A one-to-two sentence description of the change. You may include a small working example for new features.
A link back to this PR.
Your name (or GitHub username) in the contributors section.

.github/workflows/determine-workflow-runner.yml

.github/workflows/format.yml

.github/workflows/tests.yml

mudit2812

Looks quite good! Just a couple of questions about where it might make sense to not use large runners.

rashidnhm · 2024-09-13T20:51:23Z

There is also a syntax error with the runs-on expression, I will fix it up Monday morning!

codecov · 2024-09-16T14:27:03Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.40%. Comparing base (056bb92) to head (51754c7).
Report is 326 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6273      +/-   ##
==========================================
- Coverage   99.71%   99.40%   -0.32%     
==========================================
  Files         447      447              
  Lines       42418    42418              
==========================================
- Hits        42299    42164     -135     
- Misses        119      254     +135

mudit2812

Are the test runtimes expected to stay the same even while using the large runners? I don't see noticeable improvements in the runtimes based on the most recent CI run, especially in the core tests. Additionally, looks like the jax tests are quite imbalanced when using the large runners. We might need different durations.json files for the case where large runners are used.

If that's expected, once the conversation about using large runners with the docs and format workflows is resolved, I'm happy to approve.
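To illustrate why stale timing data could unbalance the test groups, here is a minimal Python sketch. It assumes durations.json maps test names to recorded runtimes in seconds and that tests are split into groups using those numbers. The greedy strategy below is a common balancing heuristic chosen for illustration, not necessarily what the CI uses, and the test names and timings are made up.

```python
import heapq


def split_by_duration(durations: dict[str, float], n_groups: int) -> list[list[str]]:
    """Greedily assign the longest remaining test to the least-loaded group."""
    heap = [(0.0, i) for i in range(n_groups)]  # (total seconds, group index)
    groups: list[list[str]] = [[] for _ in range(n_groups)]
    for test, seconds in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
        total, idx = heapq.heappop(heap)
        groups[idx].append(test)
        heapq.heappush(heap, (total + seconds, idx))
    return groups


def group_totals(groups: list[list[str]], durations: dict[str, float]) -> list[float]:
    """Total runtime of each group under a given set of durations."""
    return [sum(durations[t] for t in group) for group in groups]


# Hypothetical timings recorded on standard runners.
recorded = {"test_a": 10.0, "test_b": 10.0, "test_c": 5.0, "test_d": 5.0}
# Hypothetical timings on a different runner type, where some tests speed up.
actual = {"test_a": 4.0, "test_b": 10.0, "test_c": 2.0, "test_d": 5.0}

groups = split_by_duration(recorded, n_groups=2)
print(group_totals(groups, recorded))  # balanced under the recorded timings
print(group_totals(groups, actual))    # can be lopsided under the new timings
```

Groups that are even under the recorded timings can end up uneven once per-test runtimes shift, which is the motivation for keeping a separate durations file per runner type.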

mudit2812

Probably a good idea to unconditionally set the upload.yml action to use the large runners as well.

Alex-Preciado · 2024-10-15T20:31:15Z

Automatic swap of rc branch to large runner

Hey @rashidnhm, how big is the large runners pool? I don't know a lot about the specifics but I’m concerned that having all codebases use the large runners pool during feature freeze (when multiple rc branches will be created in the ecosystem) might lead to the exact situation we’re trying to avoid—competing for resources. I’m worried this could severely impact Lightning during an already busy period for example.

rashidnhm · 2024-10-15T20:39:15Z

Great question Alex!

The standard pool shared across the PL org is 60 runners; this is what GitHub provides as standard for our billing plan. The large runner pool is already set to 150, which is 2.5x the capacity of the standard pool. If we are worried about congestion, we can scale the pool up all the way to 1000, more than enough for PL needs!

Since we pay per build minute on large runners, there is a lot of flexibility with scaling.

We can also increase the pool size just during feature freeze and then scale back down after release.
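The pool sizes quoted above can be turned into a quick back-of-the-envelope check. The sketch below uses the figures from this comment (60, 150, 1000). The number of runner jobs per PR is a purely hypothetical value picked for illustration, since the thread does not state it.

```python
STANDARD_POOL = 60      # standard runners shared across the org (from the comment)
LARGE_POOL = 150        # current large runner pool size (from the comment)
LARGE_POOL_MAX = 1000   # upper bound the pool can be scaled to (from the comment)

JOBS_PER_PR = 30        # hypothetical: concurrent runner jobs one PR update triggers


def capacity_ratio(pool_size: int, baseline: int) -> float:
    """How many times larger one pool is than another."""
    return pool_size / baseline


def concurrent_prs(pool_size: int, jobs_per_pr: int) -> int:
    """How many PRs can run all their jobs at once without queueing."""
    return pool_size // jobs_per_pr


print(capacity_ratio(LARGE_POOL, STANDARD_POOL))    # 2.5
print(concurrent_prs(STANDARD_POOL, JOBS_PER_PR))   # 2
print(concurrent_prs(LARGE_POOL, JOBS_PER_PR))      # 5
print(concurrent_prs(LARGE_POOL_MAX, JOBS_PER_PR))  # 33
```

Under that assumed job count, only a couple of PRs saturate the standard pool, which matches the congestion described in the PR context.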

Alex-Preciado · 2024-10-15T21:30:43Z

Nice, this is music to my ears 🚀 ... Thank you so much for the details, @rashidnhm!!

mudit2812

Thanks Rashid. Just a couple of comments, but very close to approval ready :)

.github/workflows/determine-workflow-runner.yml

.github/workflows/upload.yml

.github/workflows/determine-workflow-runner.yml

mudit2812

Thanks! There's still one unresolved conversation, but that won't impact how we use the workflow, so approving :)

PietropaoloFrisoni

Thank you @rashidnhm, it seems very good to me!

I just have a non-blocking observation, but I don't see any problem at this stage

.github/workflows/upload.yml

mudit2812 · 2024-10-18T14:07:08Z

Since @rashidnhm is away today, I'm going to merge this once all checks pass.

Added ability to run linux workflows on large runners

9b38435

Trigger CI

8125f8b

rashidnhm added the urgent ("Mark a pull request as high priority") label Sep 13, 2024

mudit2812 reviewed Sep 13, 2024

.github/workflows/determine-workflow-runner.yml (outdated, resolved)

mudit2812 reviewed Sep 13, 2024

.github/workflows/format.yml (resolved)

mudit2812 reviewed Sep 13, 2024

.github/workflows/tests.yml (outdated, resolved)

mudit2812 reviewed Sep 13, 2024

Move determine_runner job to interface-unit-test workflow (have it run once vs many times)

b030d4a

rashidnhm added 2 commits September 16, 2024 10:34

Merge branch 'master' into sc-73711-onboard-to-large-runners

c2c1ede

Remove usage of large runners on scheduled job

18fd83e

mudit2812 reviewed Sep 17, 2024

mudit2812 reviewed Oct 9, 2024

rashidnhm added 9 commits October 16, 2024 13:41

Merge branch 'master' of github.com:PennyLaneAI/pennylane into sc-73711-onboard-to-large-runners

a615108

Fix duplicate yaml key issue

3ee1894

try new input

b5b9a43

try new input

b097f3a

try new input

b4250f8

Chain

f4cb6fa

On-board upload.yml to use large runner

acc640c

Merge branch 'master' of github.com:PennyLaneAI/pennylane into sc-73711-onboard-to-large-runners

c02242e

Remove benchmark workflow

289a73b

rashidnhm requested review from mudit2812 and PietropaoloFrisoni October 17, 2024 15:22

mudit2812 reviewed Oct 17, 2024

.github/workflows/determine-workflow-runner.yml (resolved)

.github/workflows/determine-workflow-runner.yml (resolved)

.github/workflows/upload.yml (outdated, resolved)

Fix issue in upload.yml

fcdb907

rashidnhm requested a review from mudit2812 October 17, 2024 17:37

mudit2812 reviewed Oct 17, 2024

.github/workflows/upload.yml (outdated, resolved)

mudit2812 reviewed Oct 17, 2024

.github/workflows/determine-workflow-runner.yml (resolved)

mudit2812 approved these changes Oct 17, 2024

rashidnhm and others added 3 commits October 17, 2024 14:07

Update .github/workflows/upload.yml

9f95f3e

Co-authored-by: Mudit Pandey <mudit.pandey@xanadu.ai>

Use large runner if PR base is rc branch

c571182

Merge branch 'sc-73711-onboard-to-large-runners' of github.com:PennyLaneAI/pennylane into sc-73711-onboard-to-large-runners

d6f5b25

PietropaoloFrisoni approved these changes Oct 17, 2024

.github/workflows/upload.yml (resolved)

Merge branch 'master' into sc-73711-onboard-to-large-runners

51754c7

mudit2812 enabled auto-merge (squash) October 18, 2024 14:07

albi3ro disabled auto-merge October 18, 2024 14:34

albi3ro merged commit 1bc346a into master Oct 18, 2024
39 of 40 checks passed

albi3ro deleted the sc-73711-onboard-to-large-runners branch October 18, 2024 14:34
