Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gps list performance when have a lot of branches #275

Open
drewdeponte opened this issue Nov 17, 2023 · 7 comments
Open

gps list performance when have a lot of branches #275

drewdeponte opened this issue Nov 17, 2023 · 7 comments

Comments

@drewdeponte
Copy link
Owner

There was a thread within the slack channel about this and I wanted to move the conversation here so that it could live longer and we could continue the exploration over time. The following is the Slack transcrip that started this conversation.

Ali Caglayan 12:12 PM
@Drew I am now back. I've started using the dev version of gps in my workflow. I've noticed a regression in perf for gps ls however. Here are some detailed statistics:

Dev version:
Benchmark 1: gps ls
  Time (mean ± σ):     907.6 ms ±  28.5 ms    [User: 866.3 ms, System: 41.1 ms]
  Range (min … max):   875.1 ms … 953.9 ms    10 runs
6.9.0:
Benchmark 1: gps ls
  Time (mean ± σ):      69.5 ms ±   1.5 ms    [User: 64.3 ms, System: 5.0 ms]
  Range (min … max):    67.4 ms …  74.5 ms    42 runs

12:13
The stack I have has 13 patches for reference.

Drew 12:16 PM
What might be interesting is I don’t think the performance characteristics are bound to the stack size as much anymore. Now they are likely bound to the number of branches you have. Because it has to go and investigate each branch, looking at each commit, to build up its state of the world.

It would probably be worth adding some timing into ls so we can see where the time is actually being eaten up.

Ali Caglayan
12:17 PM
I wonder if there are any tools that let us take a chrome trace of a rust program in debug mode.

Drew
12:18 PM
Not sure about perf analysis in Rust really. We will have to investigate
12:19
This might be useful. https://nnethercote.github.io/perf-book/profiling.html

Ali Caglayan
12:19 PM
https://github.com/flamegraph-rs/flamegraph
12:19
Yes, that list mentions the flamegraph I found

gps v6.9
Image

gps 7.0
Image

12:28
What's all this stuff about merging bases?

Drew
12:34 PM
Pretty sure that is how we determine where to stop the reading of commits for a particular branch.

Ali Caglayan
12:35 PM
I'll see if this tool is able to produce some more useful results with timing annotations. its more than a 10x regression which is pretty noticeable so I wonder what is going on.

Drew
12:35 PM
Flame Graphs just outline the things called frequently correct. So the darkest red is the most frequently called. I need a refresher on how to read Flame Graphs, it has been a while

Ali Caglayan
12:36 PM
I'm not sure the colors actually mean anything here. I'm mostly visually comparing the two. You can see the resemblence of the last part of the dev flamegraph to the old one. there is all sorts of stuff going on before however. hmm I wonder if this is because I left the debug flag on

Drew
12:39 PM
One thing to try is that I know from previously doing performance stuff with Rust is that you will see a big performance gains by running a cargo build --release build vs the debug

Ali Caglayan
12:43 PM
yeah it was in release mode, takes around 1s in release mode, and around 1.7s in debug, compared to 60ms in 6.9

Drew
5:39 PM
@Ali Caglayan
how many local branches do you have?

Drew
6:01 PM
Looking at the flame graph it seems to show that the state_computation::get_list_patch_info is what is taking up 83.04% of the time, and out of that it looks the state_computation::get_patch_info_collection is consuming 72.14% of that time. Then out of that time it looks like git2::repo::Repository::merge_base is taking 44.38% and commit_diff_patch_id is taking 26.76%.

I would guess you have something near 403 branches based on that there are 403 samples of merge_base or maybe some amount related to that. Maybe half of that if it does two of those per local branch.

I am not sure how we could eliminate the need to figure out the merge base or how to eliminate the need to process every local branch. I will continue to ponder

Ali Caglayan
7:52 PM
$ git branch -r | wc -l
1128

Drew
7:54 PM
Not remotes. Just local branches. We only look at upstream branches of the local branches

Ali Caglayan
7:56 PM
hmm how do I count those
$ git branch --all | wc -l
1514
so is it the difference?

Drew
10:07 PM
sorry, I am back now. you should be able to count the local branches with git branch | wc -l.
But yeah the difference is probably also right, 1514-1128 = 386

Drew
10:18 PM
Here is the current process. Basically, it goes through every local branch. For each one it tries to find the common ancestor between it and the current patch stack. This is the git2::repo::Repository::merge_base. It does this so that it knows where to stop the commit traversal and which commit should be considered part of that branch and which shouldn't. For each of those commits it builds the state information from them, this includes a number of things. But the end result of this process is a map keyed by ps-id to all the associated state information related to that patch. This basically includes 0 or more branches and then for each branch we have the state information.

I am open to other ideas for gaining/computing state, which might scale better.

But at the moment, I haven't thought of any others.

One idea I thought about earlier on was that we could name a patch series branch something like ps/series/<name> and then we could use the naming convention to be able to explicitly inspect only those branches. That could be a potential scoping mechanism for state building. However, I really feel like part of the magic with this new version is that it is all just normal Git and this is just a different way of looking at it.

So, I don't really want to lose that.

Out of curiosity what is the reason for having so many branches?

Ali Caglayan
5:02 AM
Alot of these branches are just stale and can be deleted. It comes from a long time developing and reviewing. I wonder if its possible to restrict the set of branches to non-stale ones so that we don't spend time computing state about branches we will immediately reject.

I could of course delete my stale branches, but that would be a workaround to the performance issues we are observing. I think the scaling issue is not so good and we need to think of a good fix.

I agree that naming branches specially would be a step back from the new ideas we are trying to develop.

Ali Caglayan
5:11 AM
I wonder if it would be worth dumping this hash table into a cache in .git/.git-ps? That way we wouldn't have to recompute it so often.

The perf of libgit2 is an issue noticed by similar tools also: https://github.com/arxanas/git-branchless/blob/ec0d27427ab7a505d4109e4588e356d6a18da2fe/src/git/repo.rs#L726-L836
In conclusion:
restricting the branches we consider
speeding up merge_base or using an equivalent
should be enough to speed this up again.

Drew
6:42 AM
Yeah, don’t nuke your branches just yet so we can have at least a test scenario.
👍
1

6:43
One crazy idea would be to shell out to git proper using git merge-base and see if that is significantly faster. I know we would obviously being paying the spin up cost of another process. But it is possible that, that could still be significantly better than libgit2 implementation.

What tool are you using to generate the flamegraphs?

Ali Caglayan
6:45 AM
https://github.com/flamegraph-rs/flamegraph

Drew
6:50 AM
are you using time gps ls to time a run of it? or are you using something else?

Ali Caglayan
6:50 AM
I'm using a tool called hyperfine

Drew
6:51 AM
what incantation are you using, just trying to make sure when I am comparing things, we are comparing apples to apples

Ali Caglayan
6:52 AM
I just run
hyperfine "gps ls", swapping gps for the abs dir to cargo release build
Looks like this:

[ali@allosaurus:~/dune]$ hyperfine "~/repos/git-ps-rs/target/release/gps ls"
Benchmark 1: ~/repos/git-ps-rs/target/release/gps ls
  Time (mean ± σ):     971.2 ms ±  25.3 ms    [User: 927.7 ms, System: 41.3 ms]
  Range (min … max):   934.9 ms … 1001.2 ms    10 runs


[ali@allosaurus:~/dune]$ hyperfine "gps ls"
Benchmark 1: gps ls
  Time (mean ± σ):      73.3 ms ±   2.4 ms    [User: 68.3 ms, System: 4.9 ms]
  Range (min … max):    69.5 ms …  78.3 ms    40 runs

Drew
6:58 AM
ok, we should probably do hyperfine --warmup 20 -N "gps ls"
that should get us more accurate readings, if you could run it with those new options and paste it here that would probably be a good base line. I will whip up a change to switch to git merge-base

Ali Caglayan
7:01 AM
It didn't help much it seems:

[ali@allosaurus:~/dune]$ hyperfine "~/repos/git-ps-rs/target/release/gps ls" --warmup 20
Benchmark 1: ~/repos/git-ps-rs/target/release/gps ls
  Time (mean ± σ):      1.035 s ±  0.032 s    [User: 0.986 s, System: 0.049 s]
  Range (min … max):    0.999 s …  1.083 s    10 runs


[ali@allosaurus:~/dune]$ hyperfine "~/repos/git-ps-rs/target/release/gps ls" --warmup 20
Benchmark 1: ~/repos/git-ps-rs/target/release/gps ls
  Time (mean ± σ):      1.049 s ±  0.020 s    [User: 0.999 s, System: 0.050 s]
  Range (min … max):    0.993 s …  1.061 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.


[ali@allosaurus:~/dune]$ hyperfine "gps ls" --warmup 20
Benchmark 1: gps ls
  Time (mean ± σ):      74.9 ms ±   2.3 ms    [User: 69.6 ms, System: 5.3 ms]
  Range (min … max):    70.3 ms …  78.8 ms    38 runs

7:02
but now we can probably eliminate filesystem cache as a difference. I wonder if IO is the bottleneck in the git2 implementation. it would make sense for such operations to be cached as the files need to only be read once. I'm not sure how git2 achieves this

Drew
7:33 AM
Ok, so the externally running git merge-base has seemed to make things faster I think, but I will push it up and you can pull it down and try it, actually, hmm, hold please...

Drew
7:45 AM
I think using the external merge-base is actually 2x slower

Drew
7:56 AM
Ok, I pushed it up. If you pull the latest and then build using the --features ext_common_ancestor option it will enable the external common ancestor functionality and you can test it out, it seems slower for me but hard to tell because I only have a couple branches

Drew
8:03 AM
@Ali Caglayan
the other thing I was thinking was that I am not sure exactly what libgit2 or git proper for that matter do when you do a git merge-base of two shas that don't have a common ancestor. It seems like the only options would be exhaustively traverse the tree to find that out. Or, have some sort of threshold where it would give up after said threshold of commits. So if you have a lot of branches that aren't related/based on your current patch stack's base then gps ls will naturally be slower I would imagine because it would have to traverse a lot more.

Ali Caglayan
8:27 AM
I'll check it out later, thanks.

Drew
9:24 AM
@Ali Caglayan another thing I was thinking about in relation to getting an understanding for performance would be to compare gps ls to git log —all. As they are probably the closest in concept to what they are doing

Ali Caglayan
11:43 AM
It appears to be much slower. hyperfine results coming in a sec

[ali@allosaurus:~/dune]$ hyperfine "~/repos/git-ps-rs/target/release/gps ls" --warmup 20
Benchmark 1: ~/repos/git-ps-rs/target/release/gps ls
  Time (mean ± σ):      2.705 s ±  0.019 s    [User: 1.421 s, System: 1.297 s]
  Range (min … max):    2.681 s …  2.732 s    10 runs


[ali@allosaurus:~/dune]$ hyperfine "~/repos/git-ps-rs/target/release/gps ls" --warmup 20
Benchmark 1: ~/repos/git-ps-rs/target/release/gps ls
  Time (mean ± σ):      1.023 s ±  0.034 s    [User: 0.974 s, System: 0.048 s]
  Range (min … max):    0.991 s …  1.089 s    10 runs

the top is with the feature enabled. This is probably not surprising if we are invoking git for every branch

Drew
11:44 AM
Yeah that makes sense. Just wasn’t sure how bad the libgit2 implementation was.

Ali Caglayan
11:44 AM
I think maybe a better optimization would be to filter the branch list first.

Drew
11:45 AM
Yeah we could filter it first. But by what criteria? Or do we make it configurable somehow

Ali Caglayan
11:45 AM
When we do git branch there is a way to filter out merged branches. we will need to work out how that works and do the libgit2 equivalent

Drew
11:46 AM
Yeah we can try and figure that out. Are we sure we wouldn’t want to see branches that have been merged? Wonder how it determines which are merged. It might be just as slow

Ali Caglayan
11:47 AM
Doesn't this usually mean the branch (and the patches in it) have been integrated? In that case, what is it good for?

Drew
11:48 AM
Yeah that should mean that they have been integrated. Yeah I am not sure what the use case for keeping the branches around is. I don’t keep them around in my workflows

Ali Caglayan
11:49 AM
Yeah I am only keeping them around by accident but its a shame that it affects perf so drastically

Drew
11:49 AM
Yeah

Ali Caglayan
11:49 AM
I think the git command is git branch --no-merged

Drew
11:49 AM
I will look at git2 and see if there is an easy way to figure that out

Ali Caglayan
11:50 AM
It's pretty fast:

[ali@allosaurus:~/dune]$ hyperfine "git branch --no-merged" --warmup 20
Benchmark 1: git branch --no-merged
  Time (mean ± σ):       8.4 ms ±   0.4 ms    [User: 6.3 ms, System: 2.1 ms]
  Range (min … max):     7.8 ms …  11.0 ms    268 runs


[ali@allosaurus:~/dune]$ hyperfine "git branch" --warmup 20
Benchmark 1: git branch
  Time (mean ± σ):       2.0 ms ±   0.2 ms    [User: 1.1 ms, System: 1.0 ms]
  Range (min … max):     1.7 ms …   2.9 ms    588 runs

Drew
11:50 AM
Maybe get some timings on running that vs just running git branch. Still no where near the ballpark of the merge-base cost. So it probably isn’t using that to check

Ali Caglayan
11:51 AM
How do we get the branches currently?

Drew
11:52 AM
Does that option significantly change the number of branches that come back for you?

Ali Caglayan
11:53 AM
huh I guess not

[ali@allosaurus:~/dune]$ git branch | wc -l
386

[ali@allosaurus:~/dune]$ git branch --no-merged | wc -l
378

maybe its not so straightforward to filter then, if we really wanted to check it is merged, I guess we would do a merge for each branch. I wonder if there is a better way to do what we want without having to iterate branches. As I understand, we are doing that in order to find any branches a sha might be in right? is there a way to query branches for a commit directly? that ought to be way more efficient --no-merged compares to the currently checked out branch btw for instance:

[ali@allosaurus:~/dune]$ git branch --contains 6b27c4b4dc6c2654ddd4b93471a5fbf23af501d0
* main3

[ali@allosaurus:~/dune]$ git branch --contains 583c82e525725fa4e6174ce1b1cbd0c32f6f56f8
* main3

[ali@allosaurus:~/dune]$ git branch --contains 583c82e525725fa4e6174ce1b1cbd0c32f6f56f8 -r
  origin/HEAD -> origin/main
  origin/main

11:58
this is better than iterating branches I think. even if libgit2 doesn't support it, and ext call would be much faster than what is currently happening on large scales. not sure for low branch counts tho. ideally a libgit2 solution for the same thing is what we want

Ali Caglayan
12:05 PM
ah but this only considers exact matches, hmmmm

Drew
12:12 PM
We also count the commits in the branch. And we identify ps-ids in those commits and commits without ps-ids. But yeah we can’t query based on sha

Ali Caglayan
12:18 PM
ok what about this. We can diff each branch with main (before stack) to determine if it is stale or not before considering it? my thinking is that this should be cheaper than a merge which is what libgit2 is doing

Drew
12:23 PM
a merge has to first do a merge-base so doing a merge won't likely help performance (edited)

Ali Caglayan
12:23 PM
right I meant merge-base before

Drew
12:24 PM
gotcha, doing a diff I am not sure actually. That we may have to try out out. I don't know that it will equate to what you are conceptually thinking though. because the diff gets all the files at one side, all the files at the other side and diffs them. which isn't an indication of that branch having been merged. as changes could have happened since the merging of that and it still reporting a diff

Ali Caglayan
12:26 PM
oh yes that's a good point

Drew
12:26 PM
Lets take a step back. you have a ton of branches. that for some reason aren't identified as being merged in to the branch you are checked out on. maybe understanding why that is will help us

Ali Caglayan
12:29 PM
I think git branch --no-merged is just checking if the sha of that branch appears in the current branch history. so if a branch was rebased when been added to main it won't appear to be merged. here are the results in git-ps-rs where I don't have a lot of branches

[ali@allosaurus:~/repos/git-ps-rs]$ hyperfine "target/release/gps ls" --warmup 20
Benchmark 1: target/release/gps ls
  Time (mean ± σ):      37.1 ms ±   0.7 ms    [User: 34.2 ms, System: 2.7 ms]
  Range (min … max):    35.9 ms …  40.6 ms    76 runs


[ali@allosaurus:~/repos/git-ps-rs]$ hyperfine "gps ls" --warmup 20
Benchmark 1: gps ls
  Time (mean ± σ):      30.7 ms ±   0.6 ms    [User: 28.7 ms, System: 1.9 ms]
  Range (min … max):    29.5 ms …  32.8 ms    92 runs

12:32
that seems expected to me. There is a 7ms slowdown because we are checking various other branches.

Drew
12:34 PM
Yeah. I haven’t noticed a performance hit with this in my workflows but I generally have only a few local branches at a time

Ali Caglayan
12:35 PM
Just created 100 branches
12:35

[ali@allosaurus:~/repos/git-ps-rs]$ n=100; for ((i=1; i<=n; i++)); do git branch "test_branch_$i"; done

[ali@allosaurus:~/repos/git-ps-rs]$ hyperfine "gps ls" --warmup 20
Benchmark 1: gps ls
  Time (mean ± σ):      30.8 ms ±   0.9 ms    [User: 28.7 ms, System: 2.0 ms]
  Range (min … max):    29.4 ms …  33.7 ms    90 runs


[ali@allosaurus:~/repos/git-ps-rs]$ hyperfine "target/release/gps ls" --warmup 20
Benchmark 1: target/release/gps ls
  Time (mean ± σ):      53.3 ms ±   1.7 ms    [User: 47.0 ms, System: 6.1 ms]
  Range (min … max):    51.1 ms …  57.2 ms    56 runs

12:36
200:

[ali@allosaurus:~/repos/git-ps-rs]$ hyperfine "target/release/gps ls" --warmup 20
Benchmark 1: target/release/gps ls
  Time (mean ± σ):      68.5 ms ±   1.9 ms    [User: 58.8 ms, System: 9.5 ms]
  Range (min … max):    65.8 ms …  72.9 ms    43 runs

12:36
1000:

[ali@allosaurus:~/repos/git-ps-rs]$ hyperfine "target/release/gps ls" --warmup 20
Benchmark 1: target/release/gps ls
  Time (mean ± σ):     199.2 ms ±   5.3 ms    [User: 159.7 ms, System: 38.8 ms]
  Range (min … max):   190.3 ms … 206.9 ms    14 runs

12:36
these are all trivial branches hence the merge-base call is probably quick

Drew
12:37 PM
Yeah

Ali Caglayan
12:37 PM
the output of gps ls is quite large

[ali@allosaurus:~/repos/git-ps-rs]$ target/release/gps ls
main tracking origin/main [ahead 1]
0    d09f67a nix: add flake                                     ( foo l* ps/rr/nix__add_flake l*r* test_branch_1 l test_branch_10 l test_branch_100 l test_branch_1000 l test_branch_101 l test_branch_102 l test_branch_103 l test_branch_104 l test_branch_105 l test_branch_106 l test_branch_107 l test_branch_108 l test_branch_109 l test_branch_11 l test_branch_110 l test_branch_111 l test_branch_112 l test_branch_113 l test_branch_114 l test_branch_115 l test_branch_116 l test_branch_117 l test_branch_118 l test_branch_119 l test_branch_12 l test_branch_120 l test_branch_121 l test_branch_122 l test_branch_123 l test_branch_124 l test_branch_125 l test_branch_126 l test_branch_127 l test_branch_128 l test_branch_129 l test_branch_13 l test_branch_130 l test_branch_131 l test_branch_132 l test_branch_133 l test_branch_134 l test_branch_135 l test_branch_136 l test_branch_137 l test_branch_138 l test_branch_139 l test_branch_14 l test_branch_140 l test_branch_141 l test_branch_142 l test_branch_143 l test_branch_144 l test_branch_145 l test_branch_146 l test_branch_147 l test_branch_148 l test_branch_149 l test_branch_15 l test_branch_150 l test_branch_151 l test_branch_152 l test_branch_153 l test_branch_154 l test_branch_155 l test_branch_156 l test_branch_157 l test_branch_158 l test_branch_159 l test_branch_16 l test_branch_160 l test_branch_161 l test_branch_162 l test_branch_163 l test_branch_164 l test_

but thats a separate issue

Drew
12:38 PM
lol yeah

Ali Caglayan
12:39 PM
here is 10,000:

[ali@allosaurus:~/repos/git-ps-rs]$ hyperfine "target/release/gps ls" --warmup 20
Benchmark 1: target/release/gps ls
  Time (mean ± σ):      1.832 s ±  0.032 s    [User: 1.453 s, System: 0.374 s]
  Range (min … max):    1.794 s …  1.864 s    10 runs

Drew
12:40 PM
I assume those branches are directly on the patch and don’t have additional commits

Ali Caglayan
12:40 PM
yep

Drew
12:43 PM
maybe we should have it use a naming convention like ps/series/ but also support a --all option or something that will have it look at everything. then we could filter branches down to ps/series/ prefix before doing the merge-base. maybe we also have a config option for it as well

Ali Caglayan
12:48 PM
I'll collect some more data. I think we need to ponder on the naming convention a bit more

Drew
12:56 PM
yeah, Ok, I have another crazy idea

Ali Caglayan
12:59 PM
I did hyperfine "target/release/gps ls" --warmup 20 -P N 100 1000 --prepare 'for ((i=1; i<={N}; i++)); do git branch "test_branch_$i"; done || true' --cleanup 'git branch -D $(git branch | grep 'test_branch_' | xargs)' -D 50
which generated a bunch of data I won't bother sharing here, but I plotted it like this:

Image

Drew
12:59 PM
looks pretty linear

Ali Caglayan
1:00 PM
yeah so I don't imagine we can improve that further by doing anything clever. I guess going back to improving merge-base is the better solution

Drew
1:03 PM
ok, now time this on your repo with all the branches, git --no-pager log --all --graph --abbrev-commit --pretty=oneline. that is doing a complete walk of the entirety of your Git tree from all references

Ali Caglayan
1:04 PM

[ali@allosaurus:~/dune]$ hyperfine "git --no-pager log --all --graph --abbrev-commit --pretty=oneline" --warmup 20
Benchmark 1: git --no-pager log --all --graph --abbrev-commit --pretty=oneline
  Time (mean ± σ):     133.9 ms ±   4.3 ms    [User: 127.1 ms, System: 6.9 ms]
  Range (min … max):   128.1 ms … 140.8 ms    22 runs

1:04
I can do a plot like before too if you like

Drew
1:04 PM
that would be useful. but can you run that in your real repo with the 3xx branches, and then also time the gps ls in there so we can compare

Ali Caglayan
1:05 PM
that was the 3xx branches repo

Drew
1:06 PM
what was the 7.0.0 gps ls in that repo?

Ali Caglayan
1:06 PM
1000ms

Drew
1:06 PM
yeah, so that is a significant difference. what is interesting about that is that it is doing a full walk of the entire git tree. in that git log command. maybe we should be building the entire tree. instead of trying to use merge-base to only iterate over the commits in each branch

Ali Caglayan
1:09 PM
how will that scale in large repos?

Drew
1:10 PM
🤷 but seems like it will scale the same way that the git log I provided above would scale. as I would probably be modeling the state building off of how git log works

Ali Caglayan
1:11 PM
Here is a plot of that command btw

Image

1:11
quite noisy but basically linear, and much better than gps ls

Drew
1:12 PM
yeah, now just to think through how I could actually use that. still need to know the same things. but the question is can I find them out by simply walking all of the refs in completion. I basically need a data structure that would allow me to quickly identify if I have previously walked a sha. because that would let me determine if I need to short circuit the walking of a ref. as well as it would tell me if that sha was a common ancestor

Ali Caglayan
1:18 PM
I don't quite follow

Drew
1:20 PM
Right now the process is this. For each ref (branch) do the following.

  • find the merge-base
  • walk the commits from the head of the ref to the merge base, collecting information

Drew
1:24 PM
Possible alternate based on git log. walk all the commits in the currently checked out branch from the headto the beginning of time, collecting info (shas that have been walked). For each ref (branch) do the following. walk from the head of the ref, until I hit a sha that was already walked in the currently checked out branch, that sha is the common ancestor.

Drew
1:31 PM
So I think the big question is if there is a data structure we can use to store the number whatever the mainline number of shas is and facilitate querying against that to find if one has already been seen

Ali Caglayan
1:31 PM
what is the mainline number of shas?

Drew
1:32 PM
depends on how large your repository is

Ali Caglayan
1:32 PM
I mean is it just all the shas?

Drew
1:32 PM
yeah, git --no-pager log --graph --abbrev-commit --pretty=oneline main | wc -l, will give you an idea of the count in your repository

Ali Caglayan
1:33 PM
yh 15k

Drew
1:34 PM
so, yeah we would likely want to find something that supports say 50k shas, and being able to very quickly answer the question of if that data structure contains a specific sha. Vec performance alone might be the best thing. it often is

Ali Caglayan
1:35 PM
the biggest repo I have is nixpkgs and that is 974192. that's likely the biggest repo on github too

Drew
1:36 PM
I wonder if there are some smarts we could do in terms of shas as well. for example ordering them lexigraphical. Does that help?

Ali Caglayan
1:37 PM
well we know what we are looking for, could we binary search for hits?

Drew
1:37 PM
may be to costly to order on the insert

Ali Caglayan
1:38 PM
wait am I confused. nevermind actually

Drew
1:38 PM
what is the quickest easiest way to test this idea to see if it should be further explored. maybe we build a small little rust example. maybe you could generate a list of shas from those repos. so we could just define as static Vec in a sample program. and just have this sample cli take in a sha. and just try and find it in the Vec. and then we can time that. and see how long that takes, to determine if we think it will be a substantial change

Ali Caglayan
1:41 PM
might as well do the largest

Drew
1:41 PM
yeah

Ali Caglayan
1:42 PM
ouch its 500MB

Drew
1:42 PM
lol, for just the list of shas?

Ali Caglayan
1:42 PM
yeah lol

Drew
1:42 PM
oh, well

Ali Caglayan
1:42 PM
lets see if it compresses good

Drew
1:42 PM
lets do it anyways

Ali Caglayan
1:43 PM
did you want me to share it?

Drew
1:43 PM
yeah

Ali Caglayan
1:43 PM
ok not bad 18MB
Gzip

Drew
1:44 PM
oh no, that is a lot more than just the shas

Ali Caglayan
1:46 PM
that was git --no-pager log --graph --abbrev-commit --pretty=oneline master > all_shas_of_nixpkgs. we want no graph i guess

Drew
1:47 PM
yeah, one min and I will have the command we want, git --no-pager log --abbrev-commit --pretty=format:%H main

Ali Caglayan
1:49 PM
that's better

all_shas_of_nixpkgs
Plain Text

Drew
1:51 PM
yeah, so 532054 shas

Ali Caglayan
1:52 PM
that's more like it

Drew
3:28 PM
@Ali Caglayan

✘ hyperfine --warmup 20 "./target/release/test_find_common_ancestor linear bc5843f8d1cb623f7cfe30df32293f884287a332"                                                                  103m main 81d82fa ✗
Benchmark 1: ./target/release/test_find_common_ancestor linear bc5843f8d1cb623f7cfe30df32293f884287a332
  Time (mean ± σ):      53.2 ms ±   1.1 ms    [User: 41.1 ms, System: 9.7 ms]
  Range (min … max):    50.6 ms …  56.5 ms    55 runs


------------------------------------------------------------
~/code/uptech/git-patch-stack/test_find_common_ancestor
✔ hyperfine --warmup 20 "./target/release/test_find_common_ancestor binary bc5843f8d1cb623f7cfe30df32293f884287a332"                                                                  104m main 81d82fa ✗
Benchmark 1: ./target/release/test_find_common_ancestor binary bc5843f8d1cb623f7cfe30df32293f884287a332
  Time (mean ± σ):     248.2 ms ±   5.8 ms    [User: 229.9 ms, System: 15.4 ms]
  Range (min … max):   240.9 ms … 256.5 ms    12 runs

3:29
that is testing with a sha that is 300k deep in the vector, both linear and binary

Ali Caglayan
3:30 PM
which end do you start the lnear search from? or does that not matter?

Drew
3:33 PM
The linear starts at the beginning unsorted. Just based on the order they came in the file

Ali Caglayan
3:34 PM
right, i'm wondering if they are ordered by time. it might be more likely if the commits are recent to get a hit is what I was thinking

Drew
3:34 PM
Yeah most likely so 300k is an older commit

Ali Caglayan
3:35 PM
if you reverse the vector we should see a different result for the linear. that would explain its speed w.r.t binary

Drew
3:35 PM
For this one case sure. But that isn’t the point. The point I think is that it is roughly 50ms to answer the question is this commit sha a common ancestor using this approach. If we were to implement this approach we would have to go to every ref and walk each commit from that ref asking that question until we found one or exhausted the commits in that branch. Each one of those checks would cost ~50ms. At least in a repository of this size

Ali Caglayan
3:38 PM
and we do this for each branch?

Drew
3:38 PM
Yep

Ali Caglayan
3:38 PM
ouch, but maybe this is the best we can do. for now

Drew
3:39 PM
Yep so I think this strategy is dead in the water unless we figure out a better data structure

Ali Caglayan
3:39 PM
if you can implement this approach I can bench it on the examples I had. I still think its an improvement over the current

Drew
3:40 PM
Really. I would imagine it would be significantly worse than the current approach.

Ali Caglayan
3:40 PM
shall I share the shas of my project and see how they compare? they should be around 8ms I would reckon

Drew
3:40 PM
Sure

Ali Caglayan
3:41 PM
all_shas_of_dune

Drew
3:49 PM
@Ali Caglayan

✔ hyperfine --warmup 20 "./target/release/test_find_common_ancestor linear a43d56e4803c17479d3c1d161f017e3ef488d684"                                                                  125m main 81d82fa ✗
Benchmark 1: ./target/release/test_find_common_ancestor linear a43d56e4803c17479d3c1d161f017e3ef488d684
  Time (mean ± σ):       2.5 ms ±   0.8 ms    [User: 1.2 ms, System: 0.6 ms]
  Range (min … max):     1.1 ms …   7.2 ms    539 runs

  Warning: Command took less than 5 ms to complete. Note that the results might be inaccurate because hyperfine can not calibrate the shell startup time much more precise than this limit. You can try to use the `-N`/`--shell=none` option to disable the shell completely.


------------------------------------------------------------
~/code/uptech/git-patch-stack/test_find_common_ancestor
✔ hyperfine --warmup 20 "./target/release/test_find_common_ancestor binary a43d56e4803c17479d3c1d161f017e3ef488d684"                                                                  125m main 81d82fa ✗
Benchmark 1: ./target/release/test_find_common_ancestor binary a43d56e4803c17479d3c1d161f017e3ef488d684
  Time (mean ± σ):       4.3 ms ±   0.8 ms    [User: 3.0 ms, System: 0.6 ms]
  Range (min … max):     2.8 ms …   8.1 ms    371 runs

  Warning: Command took less than 5 ms to complete. Note that the results might be inaccurate because hyperfine can not calibrate the shell startup time much more precise than this limit. You can try to use the `-N`/`--shell=none` option to disable the shell completely.

3:50
that is using a sha that is 6000 shas deep, basically half the size

Ali Caglayan
3:50 PM
ok so if we times this by 400 we should expect a 2x speedup over the current situation

Drew
3:51 PM
yeah so if we knew on average the number of commits on a branch were and then the number of branches we could multiple it by those two values and get a rough idea

Ali Caglayan
3:51 PM
That seems worth it in my opinion. We could hide this behind a feature flag like you did before

Drew
3:52 PM
if each branch say had 1 commit on it

Ali Caglayan
3:52 PM
ideally we would remove all of those when we actually release

Drew
3:52 PM
each branch would cost 2 searches for a common ancestor

Ali Caglayan
3:53 PM
hmm so possibly 800ms. still better than 1000

Drew
3:53 PM
2 (common anc searches) * 400 (branches) * 7.2 (search cost) = 5,760ms = 5s (edited)

Ali Caglayan
3:53 PM
where did you get 7.2? oh as an upper bound. I think 2.5 is probably better to use

Drew
3:54 PM
yeah, from the max

Ali Caglayan
3:54 PM
still bad tho

Drew
3:55 PM
yeah using the 2.5 we are looking at 2s. isn't that worse than the current implementation

Ali Caglayan
3:55 PM
I think we need a good datastructure like you said. yes its worse

Drew
3:56 PM
it feels like you could do something smart with sha

Ali Caglayan
3:56 PM
yeah i was thinking about putting it in a hash table with some traversal info

Drew
3:57 PM
hash tables are netoriouslly slower than vecs. We can try it though

Ali Caglayan
for lookups aren't they both O(1)?

Drew
4:08 PM
Hash implementation

✔ hyperfine --warmup 20 "./target/release/test_find_common_ancestor hash a43d56e4803c17479d3c1d161f017e3ef488d684"                                                                    145m main 81d82fa ✗
Benchmark 1: ./target/release/test_find_common_ancestor hash a43d56e4803c17479d3c1d161f017e3ef488d684
  Time (mean ± σ):       4.1 ms ±   1.2 ms    [User: 2.5 ms, System: 0.8 ms]
  Range (min … max):     1.8 ms …  10.5 ms    329 runs

  Warning: Command took less than 5 ms to complete. Note that the results might be inaccurate because hyperfine can not calibrate the shell startup time much more precise than this limit. You can try to use the `-N`/`--shell=none` option to disable the shell completely.

4:13
seems worse than the linear

Drew
7:01 PM
I implemented a custom algorithm based on sorting and short circuiting and it took about ~3ms

Ali Caglayan
9:57 AM
@Drew
Excellent! Did you push?

Drew
9:58 AM
No because 3ms is not good. We had 2.5ms with linear. But 2.5 when Mathed out is still worse than the current implementation. I think that approach is flawed also as it is robbing Peter to save Paul. Because it is largely impacted by the total commits in mainline. vs the current implementation which is largely impacted by the number of local branches. Commits in main is ever growing. Local branches at least in general are cleaned up or can be cleaned up. So I think we are back to the drawing board a bit in terms of this. The only thought we have had thus far that seems it would improve performance of the ls command maybe is to filter branches somehow before building state

Ali Caglayan
10:11 AM
Ah I misunderstood then. I would also imagine doing something faster than merge-base in libgit2 would help. From my vague reading around, I understood that's what other users of the library were doing. I think it would be good to understand what exactly git's merge-base and libgit2's merge-base are doing. That way we can be a bit leaner if there are things we don't need.

Drew
10:15 AM
Where did you read that? The link you provided before wasn't about merge-base it was about other functionality

Ali Caglayan
10:16 AM
I think it was libgit2/libgit2#6036. Maybe I was misremembering tho, this seems to say that using merge-base is a workaround for something else being slow.

Drew
10:17 AM
That is about merge and the Index though. merge-base is just looking at the tree of commits and their relationships. it isn't trying to do a merge or anything like that.

Ali Caglayan
10:18 AM
in that case we can't really improve on it much. I'm still curious how the implementations in git and libgit2 compare. We tried the external git command but obviously that is dominated by exec

Drew
10:19 AM
yeah, you can go look at the implementations. what I believe they do is that they traverse the git commit tree from the two provided points/refs. and simply walk down them until it finds commit sha that exists in both paths (edited) . that is probably a simplified view of things, but it should be the basis of it

Ali Caglayan
10:22 AM
Yes that is probably what they both do (I hope)

I have an idea. There is a way to attach metadata to branches. git branch --edit-description. we could use this to mark branches as belonging to a patch series and allowing us to restrict our search. I'm thinking it might make sense to have a command that "cherry-picks" a branch and marks it as a patch series. That way if you have a branch "feature-a" that you would like to include on your stack, you do "gps add branch feature-a" and it will rebase the commits on the stack and edit the metadata for the branch to say it is a patch series. Then when you gps ls we check the metadata of all the branches and only display the ones that have been marked as a patch series. On the other hand, when you want to rr a patch series you do gps rr 1-3 and that will create a branch. That branch should be marked with the metadata also. The disadvantage of this approach is that we cannot share this metadata AFAIK. So there is no way to preserve the patch series data. but sharing patches should be possible and ideally these could be rr'd as patch series again.

@drewdeponte drewdeponte converted this from a draft issue Nov 17, 2023
@drewdeponte
Copy link
Owner Author

@Alizter Yeah the not being able to share and simply not being able to see the branches associated to patches is a big issue for me.

Maybe we should take a different stance, maybe we just let the user know that they have a ton of branches and that they are impacting performance and they should clean them up.

The other thinking that I had is that ideally you want to see all the branches you are interested in and not the ones you aren't. Us defining that criteria is going to be hard as I don't think it is the same for everyone. So maybe we need to facilitate the user customizing that criteria.

In terms of filtering branches, doing it by naming scheme is shared. Which makes me think that is a better approach than the git branch --edit-description if we at some point decide to go that direction.

@drewdeponte drewdeponte moved this from Triage to Discussion in git-ps-rs Nov 17, 2023
@Alizter
Copy link
Contributor

Alizter commented Nov 17, 2023

I see the naming scheme as a big loss compared to the current status quo as being able to seamlessly integrate the branched workflow allows for the removal of all friction gps causes. Take for example when patches have their description changed, patches are ordered etc. With the current branch interaction this is better, but scales poorly as the discussion above shows.

@drewdeponte
Copy link
Owner Author

So I think at this point there are onlya few real paths of exploration that we have come up with.

  1. figure out a way to make merge-base substantially faster
  2. apply some sort of filtering of branches prior to the state computation
  3. come up with some other way to do state computation in general that is more performant (I am not aware of anything off hand)

@Alizter
Copy link
Contributor

Alizter commented Nov 17, 2023

One idea for (3) is to cache the state computation in a database, that way we would only need to recompute things that have actually changed. This is a little more sophisticated than anything we currently have however.

@drewdeponte
Copy link
Owner Author

drewdeponte commented Nov 17, 2023

Caching is tricky because we have to implement change detections somehow.

If we grabbed all the branches and their head sha and stored them. We might be able to check that cheaply to know if things changes or didn't with respect to local branches. But the 2nd step of state computation is looking at the upstream branches as well. So we would have to cache those as well.

Then if the shas are different for a particular branch or its upstream branch then we recompute state for it.

Something like this could maybe improve follow up runs. But we would have to pay the cost up front initially.

I do like the idea that it is a true cache and that it can be blown away. I also like that we should be able to do change detection pretty easily I think.

@Alizter
Copy link
Contributor

Alizter commented Nov 20, 2023

In the end, this only affects users with many local branches in which case we could document that cleaning these up is a way to improve performance. We discussed extensively how to improve the performance here but were unsuccessful in coming up with a good solution. I no longer think this is something that is blocking and therefore we should come back to this in the future.

@drewdeponte
Copy link
Owner Author

I agree this isn't a blocker for 7.0.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants