Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Two copies of the image references parser included #16205

Closed
alexlarsson opened this issue Oct 17, 2022 · 20 comments
Closed

Two copies of the image references parser included #16205

alexlarsson opened this issue Oct 17, 2022 · 20 comments
Labels
locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. stale-issue

Comments

@alexlarsson
Copy link
Contributor

alexlarsson commented Oct 17, 2022

Podman vendors the github.com/containers/image/v5/docker/reference, with the README.md saying:

This is a copy of github.com/docker/distribution/reference as of commit 3226863cbcba6dbc2f6c83a37b28126c934af3f8,
except that ParseAnyReferenceWithSet has been removed to drop the dependency on github.com/docker/distribution/digestset.

However, it also vendors the above mentioned github.com/docker/distribution/reference, including the "digestset" dependency that was apparently problematic.

This is not great, because both of these libraries are very heavy during initialization. Running podman with GODEBUG=inittrace=1 produces this:

init github.com/containers/image/v5/docker/reference @1.1 ms, 1.8 ms clock, 1341880 bytes, 11453 allocs
init github.com/docker/distribution/reference @5.1 ms, 1.8 ms clock, 1341816 bytes, 11452 allocs

I.e. each instance creates 1.3 MB of compiled regexps.

rhatdan added a commit to rhatdan/podman that referenced this issue Oct 17, 2022
containers/image/v5/docker/reference has same content, but
less overhead.

Partial fix for: containers#16205

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
rhatdan added a commit to rhatdan/podman that referenced this issue Oct 17, 2022
containers/image/v5/docker/reference has same content, but
less overhead.

Partial fix for: containers#16205

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
@rhatdan rhatdan mentioned this issue Oct 17, 2022
@mtrmac
Copy link
Collaborator

mtrmac commented Oct 17, 2022

@alexlarsson I guess eventually we should catch up with (unreleased) https://github.com/distribution/distribution/commit/1c89ce5fc1cec85e0726f598bccc12b873d04de3 . .

So far I’ve been preferring to keep c/image/v5/docker/reference as similar to the other one as possible, but we can diverge with a good reason. (I’m not inclined to think that < 1.8ms is a good reason, but I’m open to changing my mind on that one. Is it really noticeable in the total?)

@alexlarsson
Copy link
Contributor Author

2msec is not a huge problem in general, but the use of 1.3 meg of ram completely unnecessary seems not great.

Also, I should note that the actual code is pretty slow. Loading the shortname alias file is very slow, and the majority of that time is spent in reference.Parse(). So, rewriting this to use a hand-written parser instead of some humongous regexps might be a good idea in general.

@mtrmac
Copy link
Collaborator

mtrmac commented Oct 18, 2022

  • Isn’t it 1.3 MB allocated, not necessarily used at the same time? Quite a bit of that is regexp.MustCompile(…).String(), where the regexp is temporary.
  • The various complied regexps are a part of the public API, which we don’t want to break.
  • It’s not really just reference.Parse; a lot of the individual API points use a regexp for validation. I guess that would make for a bigger win?
  • A hand-written parser can certainly be written but it’s a quite a nightmare to debug and maintain. I guess we can live with that for this relatively unchanging subpackage.

For a POC, https://github.com/mtrmac/image/tree/parse-manual :

goos: darwin
goarch: amd64
pkg: github.com/containers/image/v5/docker/reference
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkParseRegexp-12    	  638163	      1820 ns/op	     473 B/op	       6 allocs/op
BenchmarkParseManual-12    	 4587074	       245.7 ns/op	     240 B/op	       2 allocs/op
PASS

which is relatively a huge improvement — OTOH 2 µs should still be dwarfed by network latency in most actual uses, and it took so many iterations to make at least the test pass, and I have very little confidence in correctness.

I guess that’s … a week of work to add tests and make this actually reliable?

@alexlarsson
Copy link
Contributor Author

Yeah, i guess not all the 1.3 MB are kept, but still, lots of churn.

For the reference.Parse() part, I'm just going of what I was seeing in profiles. The thing is that on rhel/centos, the /etc/containers/registries.conf.d/001-rhel-shornames.conf file is 300KB, and contains 3200 aliases, and for each we call reference.Parse() which is taking a lot of time during startup.

@alexlarsson
Copy link
Contributor Author

By lot of time, i was seeing 63msec, although 20msec of those was in the toml parser, but the rest was mainly in the reference parser.

@alexlarsson
Copy link
Contributor Author

For the API issue, maybe one could use something like https://github.com/charlievieth/reonce to lazily initialize the regexps?

@alexlarsson
Copy link
Contributor Author

Also, It seems a bit weird to do the regexp construction at runtime, using those helper functions. I understand that it makes things easier to write/maintain, but could maybe have that code as is and run at buildtime to just generate the final regexp strings.

@vrothberg
Copy link
Member

vrothberg commented Oct 19, 2022

Just taking over the note from another issue that parsing short names is only an issue when using short names. No cycles are wasted parsing aliases when using a FQN. A week worth of work may very well be more effective tackling issues that pop up when using FQNs; at least in the automotive context. Yet, I am all in favor of optimizing the reference package.

@mtrmac
Copy link
Collaborator

mtrmac commented Oct 19, 2022

For the API issue, maybe one could use something like https://github.com/charlievieth/reonce to lazily initialize the regexps?

AFAICS that would still be an API break.

@mtrmac
Copy link
Collaborator

mtrmac commented Oct 19, 2022

Also, It seems a bit weird to do the regexp construction at runtime, using those helper functions. I understand that it makes things easier to write/maintain, but could maybe have that code as is and run at buildtime to just generate the final regexp strings.

That linked upstream commit doesn’t quite do that, but at least it only concatenates strings. Sure, that could be improved further to move to build time.


Ideally, I’d like to see a generator that turns the regexes into parsers similar to the test branch above. That parser really is a rat nest of bug opportunities.

(Or, well — alternatively, I’d just like the regex engine to be fast. It’s understandable that processing a regexp representation can be slower than hand-crafted code (although not strictly obvious, e.g. character class testing can be done with bitmaps that programmers aren’t going to prepare manually), but 7 times does seem quite excessive.)

@github-actions
Copy link

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Copy link
Member

rhatdan commented Nov 21, 2022

@alexlarsson @mtrmac @vrothberg What should we do with this one?

@mtrmac
Copy link
Collaborator

mtrmac commented Nov 21, 2022

Keep the docker one as is, unless someone wants to sign up to forking the repo and maintaining the fork over time.

WRT the c/image one, I can imagine updating it to the current distribution/distribution implementation which uses fewer regexes, or replacing the parsers entirely, but I’m not going to work on either short-term.

rhatdan added a commit to rhatdan/image that referenced this issue Nov 21, 2022
Improvements on containers/podman#16205

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
rhatdan added a commit to rhatdan/image that referenced this issue Nov 21, 2022
Improvements on containers/podman#16205

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
@alexlarsson
Copy link
Contributor Author

I'm not sure what the best thing to do is here, because I understand not wanting to spend time on essentially useless things like this.

However, we're currently looking at the performance of podman run, and as can be see with GODEBUG=inittrace=1 the initialization of the two reference libs is taking around 5 msec where to total initialization time is 24msec. That is almost 20% of the time. Using a single copy would at least make it half that.

@mtrmac
Copy link
Collaborator

mtrmac commented Nov 24, 2022

I’m not at all saying this work is useless, just that I have to prioritize and spending a lot of time on this is currently, short-term, not high enough vs. data corruption and business-requested features. Others might have more time.

containers/image#1730 is working on reducing the overhead of one of the copies.

WRT the other one … either way we are forking something (docker/distribution/reference the implementation or docker/docker the caller). Using a replace directive in Podman’s go.mod would be a possible way to make a fork without actually having a separate repo… but I don’t think that allows saying ”for docker/distribution/reference, use c/image/docker/reference from the version that is being used for other purposes”.

@vrothberg
Copy link
Member

vrothberg commented Nov 25, 2022

@mtrmac, I can carve out some time next week. @alexlarsson brought up the idea of c/image directly using the docker/reference regex where possible. I didn't explore it further but found the idea pragmatic and efficient. WDYT?

Update: That means c/image/docker/reference using the docker/reference regex.

@mtrmac
Copy link
Collaborator

mtrmac commented Nov 25, 2022

@mtrmac, I can carve out some time next week

Thanks!

c/image directly using the docker/reference regex where possible

In general, that is somewhat worrisome; the reference parser is a critical dependency of signing, and with the Go vendoring model, a completely unrelated dependency of a larger project can trigger an update of the docker/distribution subpackage that changes behavior in unanticipated ways.

And in particular, containers/image#1730 (comment) is an immediate example of this concern. We can’t just switch to those regexes without a detailed investigation of the impact, and possible changes to other code. (I.e. no matter what we do about this issue, it’s not a small dependency change; it’s at least one of a non-trivial implementation effort or a non-trivial investigation effort.)

@vrothberg
Copy link
Member

I opened containers/image#1733 as an alternative.

@github-actions
Copy link

A friendly reminder that this issue had no activity for 30 days.

@alexlarsson
Copy link
Contributor Author

I think containers/image#1733 is as good as we're likely to get without major work. Lets close this.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 6, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 6, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. stale-issue
Projects
None yet
Development

No branches or pull requests

4 participants