Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Publish docs as github artifacts during CI #112407

Merged
merged 1 commit into from
Jun 17, 2023

Conversation

tgross35
Copy link
Contributor

@tgross35 tgross35 commented Jun 7, 2023

Discussed here: https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Building.20docs.20for.20PR.20CI

The goal is to make docs available for download after CI runs on PRs, for easy review of API changes.

Notes:

You can find the results at the bottom of the CI "summary" page:

image

@rustbot
Copy link
Collaborator

rustbot commented Jun 7, 2023

r? @Mark-Simulacrum

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. labels Jun 7, 2023
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@tgross35

This comment was marked as resolved.

@rust-log-analyzer

This comment has been minimized.

@tgross35 tgross35 force-pushed the ci-docs-publish branch 2 times, most recently from 1b1cada to d2471bc Compare June 8, 2023 03:35
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@tgross35 tgross35 force-pushed the ci-docs-publish branch 2 times, most recently from 06c1b68 to f452065 Compare June 8, 2023 04:53
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@tgross35 tgross35 marked this pull request as ready for review June 8, 2023 07:26
@tgross35
Copy link
Contributor Author

tgross35 commented Jun 8, 2023

Okay, results of the last CI run look good.

@rustbot ready

@Mark-Simulacrum
Copy link
Member

One question here is how much storage we expect to use for these artifacts. At 36MB/PR push, and using the last 7 days as an estimate of how many PRs we have (195) I think we can expect somewhere in the range of 10-100 GB of github storage used for this presuming PRs are pushed to somewhere between 10-100 times. Ideally I think we would limit PRs to creating a single artifact (i.e., if you push again, the old docs are replaced with new ones), which would greatly decrease the potential for accidentally using a lot of storage with frequent pushes. How feasible is that?

(Storage wise we may also want to use zstd for compression. Testing locally that gets me a 20MB vs. 36MB blob.) It also looks like the artifact is a zip file of a tarball. Is that indirection necessary? Can we avoid the zip archive?

I think another aspect is discoverability. Right now digging up the artifact is pretty painful - there's no UI surface area for it that I can see. Maybe we can have a custom check in triagebot or elsewhere that adds a dedicated clickable check to PR CI that links to a rendered copy of those docs somewhere? Seems like that ought to be feasible...

Why are we naming the artifact with a SHA hash? Is that trying to avoid some kind of overwriting?

@Mark-Simulacrum
Copy link
Member

It looks like there are some references in various github docs about "github pages preview" -- I'll see if we can talk with GitHub about maybe using that for this.

@Mark-Simulacrum
Copy link
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 10, 2023
@pitaj
Copy link
Contributor

pitaj commented Jun 10, 2023

For rendering, I was thinking of having a site, something like

pr-docs.rust-lang.org/rust-lang/rust/pull/123456/docs/std

When that page is first opened, it pulls the artifact from actions and extracts the contents. Then it will just serve the docs normally.

It's on demand so there's no wasted space or energy if nobody uses it for a given PR. After a certain time period of no requests, the docs for a given PR will be deleted.

I suppose we could just do the whole thing by checking out with git and running rustdoc on the files ourselves, but then we'd have to worry about sandboxing and the rustdoc run would be extra work.

I do wonder if there's a heuristic we can use to avoid uploading a docs artifact when no docs changes happened (or maybe even when the changes are trivial). I think I may have mentioned before - we could choose to only upload the parts of docs that were changed.

@tgross35
Copy link
Contributor Author

tgross35 commented Jun 10, 2023

One question here is how much storage we expect to use for these artifacts. At 36MB/PR push, and using the last 7 days as an estimate of how many PRs we have (195) I think we can expect somewhere in the range of 10-100 GB of github storage used for this presuming PRs are pushed to somewhere between 10-100 times. Ideally I think we would limit PRs to creating a single artifact (i.e., if you push again, the old docs are replaced with new ones), which would greatly decrease the potential for accidentally using a lot of storage with frequent pushes. How feasible is that?

This does sound reasonable but I have looked and don't know of any way to do this. I think you could expire them manually but it would be nice if there is an automated way (maybe a nice thing to ask Github?)

(Storage wise we may also want to use zstd for compression. Testing locally that gets me a 20MB vs. 36MB blob.) It also looks like the artifact is a zip file of a tarball. Is that indirection necessary? Can we avoid the zip archive?

I've updated the PR to use zstd (L12 gets it down further to 14MB) and 10 -> 7 5 day retention

The double compression thing is really goofy, but there doesn't seem to be a way around it actions/upload-artifact#39

I think another aspect is discoverability. Right now digging up the artifact is pretty painful - there's no UI surface area for it that I can see. Maybe we can have a custom check in triagebot or elsewhere that adds a dedicated clickable check to PR CI that links to a rendered copy of those docs somewhere? Seems like that ought to be feasible...

It looks like there are some references in various github docs about "github pages preview" -- I'll see if we can talk with GitHub about maybe using that for this.

Agreed, I am curious what they suggest. I think as-is could be a MVP if those other things take time, but having some way to directly render would be best.

Why are we naming the artifact with a SHA hash? Is that trying to avoid some kind of overwriting?

Just to be able to identify the commit if you download >1, I don't believe the artifacts overwrite each other at all.

@tgross35
Copy link
Contributor Author

For rendering, I was thinking of having a site, something like

pr-docs.rust-lang.org/rust-lang/rust/pull/123456/docs/std

When that page is first opened, it pulls the artifact from actions and extracts the contents. Then it will just serve the docs normally.

That is kind of what I originally had in mind. If that site existed, we could just upload the files directly to S3 or something via actions so the site can be "dumb" (not sure how API keys would work here though).

I'm not opposed to doing something like this if it's desirable, but I need to know what would be possible and where exactly to go.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 10, 2023
@tgross35
Copy link
Contributor Author

Looks like viewing is a well-known request actions/upload-artifact#14

This PR saves library docs as github artifacts so they can be easily
viewed for review.

Discussed in <https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Building.20docs.20for.20PR.20CI>
@tgross35
Copy link
Contributor Author

@Mark-Simulacrum do you consider this specific PR blocked until the conversation with github (I believe you said 3 weeksish, but I don't really expect them to have any solutions) and/or us setting up an infra-based viewing solution like @pitaj suggested? Or would it be suitable to merge this as-is and improve the UX later? (for which I would open an issue)

I decreased the retention time further to 5 days. The compression and retention reduction should drop storage costs to 27% of the values you estimated in #112407 (comment) (not sure how exactly you did the estimate but based on percentages alone, usage should drop from 10100GB to 2.727GB)

@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Jun 13, 2023

No, I don't think so. I was going to wait when that conversation was today (as originally planned, we had to move it out), but I think we'll likely move ahead with this PR before then - I just need to sit down and review it. My hope is to do that before the weekend, but if not that's when I do a pass over all PRs open + marked as ready for review.

@tgross35
Copy link
Contributor Author

Thanks for clarifying - I wasn't intending to pressure a review, just curious if the labels were accurate or if I should direct my effort somewhere

@Mark-Simulacrum
Copy link
Member

@bors r+

Let's go ahead with this as-is, and we can come back and update further if there's any problems encountered in practice.

@bors
Copy link
Contributor

bors commented Jun 17, 2023

📌 Commit 696b0dd has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 17, 2023
@bors
Copy link
Contributor

bors commented Jun 17, 2023

⌛ Testing commit 696b0dd with merge 1d7d824...

@bors
Copy link
Contributor

bors commented Jun 17, 2023

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 1d7d824 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jun 17, 2023
@bors bors merged commit 1d7d824 into rust-lang:master Jun 17, 2023
@rustbot rustbot added this to the 1.72.0 milestone Jun 17, 2023
@tgross35 tgross35 deleted the ci-docs-publish branch June 17, 2023 06:28
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (1d7d824): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 656.882s -> 656.42s (-0.07%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testsuite Area: The testsuite used to check the correctness of rustc merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants