Support for Treeless clones #1152
Comments
This feature would be very helpful indeed. |
Thanks for filing this!
Actually, shallow clones are "great" until you start trying to use git; then they can turn into disasters. This is the most time-consuming shallow clone disaster I found: Even So the (newer?) |
@derrickstolee your (excellent!) blog post and work on this are now 1.5 years old. It's largely wasted effort if the official and universal way to clone on GitHub still can't use these optimizations. |
One thing to be really careful about is the fact that fetches from treeless clones can be very strange if there is a
The main reason to use a treeless clone over a shallow clone is if you need the commit history for something. For example, Git Credential Manager uses full clones because its build determines the version number from the commit history. This example could use treeless clones instead, saving a lot of effort. |
When looking into migrating a build pipeline to GitHub I was certain there would be some way to customize the checkout in order to get a treeless clone, since I first read about this magic on the GitHub blog a few years ago. To my surprise this was not the case. Even worse, there are barely any checkout options, and efforts to implement improvements such as sparse checkouts have gone completely unanswered despite plenty of demand from the community. On top of that, if I want to use reusable workflows I seem to be forced to opt in to the official checkout action, so I can't even efficiently replace it with my own checkout logic that is optimized for speed. Treeless clones are a game changer and can drastically improve performance on larger repositories. Please consider supporting this. |
No one is asking to change the default behavior. This issue is only about making the new git feature available.
... and it would also save GitHub a lot of CPU and network cycles - hence $$$. It's really unexpected to see a GitHub employee doing all the amazing work in https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/ and upstreaming it all, which makes it available out of the box for every git user... except for GitHub! :-) |
Commenting to cast my support for this feature request. I have a workflow to build a documentation site where I need to use the git commit log to find the last modified date of a page (

/edit Looks like this is now possible(!)

Treeless clone:

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          filter: tree:0
          fetch-depth: 0 # (no history limit)

Blobless clone (which is what I needed):

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          filter: blob:none
          fetch-depth: 0 # (no history limit)
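For anyone with the same "last modified date" use case, here is a rough sketch of how the blobless checkout could be followed by a date lookup. The workflow name, job name, and the `docs/index.md` path are made up for illustration. The lookup step is only one plausible way to read the date from the log, not necessarily what the commenter's site generator does.

```yaml
name: docs            # hypothetical workflow name
on: push
jobs:
  build:              # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Checkout (blobless, full history)
        uses: actions/checkout@v4
        with:
          filter: blob:none   # commits and trees for all history, blobs on demand
          fetch-depth: 0      # no history limit
      - name: Show last modified date of a page
        run: |
          # docs/index.md is a placeholder path; %cI prints the committer
          # date of the newest commit touching it in strict ISO 8601 format.
          git log -1 --format=%cI -- docs/index.md
```

A path-limited `git log` compares trees between commits, so it should not need to download any blobs. That is likely why the blobless filter fits this case better than `tree:0`.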
Indeed #1396 was merged last week. Testing it now. EDIT: when measuring pay attention to this:
But the |
I got a few numbers from the https://github.com/thesofproject/linux/actions/runs/6451238988/job/17511599536?pr=4622 test run and a couple of similar ones. This is cloning the Linux kernel repo. Obviously, this sort of end-to-end timing depends on a gazillion other parameters like the current workload and network traffic, so the numbers below are only orders of magnitude, not accurate numbers. Also, performance is HIGHLY dependent on your particular git repo, and I suspect the size of the Linux kernel is way above average - which also makes its performance extreme and interesting. => Do your own testing and measurements.
So a treeless clone is 2x-3x slower than a shallow clone but this is still an amazing and incredibly useful speed-up because:
Treeless could probably be faster if the action supported the default fetch behavior with respect to tags: "By default, any tag that points into the histories being fetched is also fetched; the effect is to fetch tags that point at branches that you are interested in." (from: https://git-scm.com/docs/git-fetch) Unfortunately, fetching tags in this action is an "all or nothing" boolean (#579). I'm assuming many people interested by Note a shallow AND treeless clone took about 1min 10s |
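The "shallow AND treeless" combination mentioned above was not shown as a config. A guess at what it looks like, assuming it simply combines the `filter` and `fetch-depth` inputs quoted earlier in this thread (the workflow and job names are placeholders):

```yaml
name: shallow-treeless   # hypothetical workflow name
on: push
jobs:
  checkout-test:         # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Checkout (shallow and treeless)
        uses: actions/checkout@v4
        with:
          filter: tree:0    # do not download trees (or blobs) up front
          fetch-depth: 1    # only the tip commit, no history
```

This gives up the commit history that a treeless clone with `fetch-depth: 0` keeps, so it only makes sense when the history is not needed. Timings will vary per repo, as noted above.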
There is already fetch-depth: 1 to retrieve only the latest commit and working tree, which is great.
However, for my particular CI project using Actions, we use the git tags to track version information, and the commit messages to generate changelogs. Seems like the feature "treeless clones" would be ideal for our situation.
I couldn't figure out how to make this work with actions/checkout@v3, so I assume that support for it would need to be added.
https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/
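For the tags-plus-changelog use case described here, a minimal sketch using the `filter` input that the comments above report as available in `actions/checkout@v4`. The workflow and job names are invented. It assumes the repo has at least one tag reachable from HEAD and that tags come along with `fetch-depth: 0`. It is an illustration, not a tested recipe.

```yaml
name: release-notes      # hypothetical workflow name
on: push
jobs:
  changelog:             # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - name: Checkout (treeless, full history)
        uses: actions/checkout@v4
        with:
          filter: tree:0    # all commits, trees and blobs fetched on demand
          fetch-depth: 0    # no history limit
      - name: Version and changelog from history
        run: |
          # Most recent tag reachable from HEAD; this fails if there are no tags.
          last_tag="$(git describe --tags --abbrev=0)"
          echo "version: ${last_tag}"
          # Commit subjects since that tag; this walks commits only.
          git log --pretty='format:- %s' "${last_tag}..HEAD"
```

Both commands only read commit and tag objects, which a treeless clone already has locally, so they should not trigger extra downloads.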