-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Manifest commands RFC #8
Conversation
Repo supports branches (this is what the tip manifest of the AOSP uses), tags (via revision="refs/tags/xxx", this is what all Android releases use), SHAs (this is what pinned manifests generated by The ability to support SHAs is important: this is how you can use a manifest to freeze and distribute an entire configuration of all the Git trees to others, including users. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ulfalizer, thanks and sorry for missing the discussion in the zephyr tree.
I think the biggest questions are around sync, as might be expected.
Comments below.
west/cmd/git.py
Outdated
@@ -0,0 +1,255 @@ | |||
# Copyright (c) 2018 Open Source Foundries Limited. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Processed it as copy-paste. Glad I didn't go into law. :D
I'll put Nordic Semiconductor there instead.
west/cmd/git.py
Outdated
def _git_helper(project, cmd, extra_args, cwd, capture_output): | ||
# Runs a git command. | ||
# | ||
# cwd: Directory to switch to first (None = current directory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nit, but it'd be nice for these to be in the same order as the function signature, and to document the project
argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops... will fix
west/cmd/git.py
Outdated
|
||
pipe = subprocess.PIPE if capture_output else None | ||
|
||
return subprocess.run(('git', *args, *extra_args), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- We unfortunately can't use
subprocess.run()
yet. Zephyr's minimum Python version is 3.4, and it was added in 3.5. - Might be nice to add checking for Git with
shutil.which
and erroring out like cmake.py'srun_build
to the TODO list
west/cmd/git.py
Outdated
|
||
'''west git commands''' | ||
|
||
import argparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused
west/cmd/git.py
Outdated
import collections | ||
import os | ||
import subprocess | ||
import sys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused
(Mind running this through flake8 when taking it out of RFC? It can be annoying but it's of course nice to have a style and also catches real bugs, so I think it'd be nice for west to run cleanly through it.)
west/cmd/git.py
Outdated
# Load project information from manifest | ||
|
||
with open(manifest_filename) as f: | ||
manifest = yaml.load(f)['manifest'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, it's better to always use yaml.safe_load
instead of yaml.load
unless the unsafe version is explicitly desired. This is especially important here, since the manifest is coming from the Internet.
See https://pyyaml.org/wiki/PyYAMLDocumentation for details; tl;dr is yaml.load
allows arbitrary code execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hadn't noticed it. Will fix.
|
||
# TODO: Error checking (if _git(...).returncode != 0: ...) | ||
|
||
for project in projects: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a TODO to handle projects that have been removed as well.
(I guess this means persisting a representation of the current projects somewhere and comparing it to the updated manifest that west sync
pulls down.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should happen if they have been removed? Might be rude to just remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Projects removed from manifests should be removed from working copies. Otherwise there's old cruft around which no longer matters.
What to do with any local branches, though? Perhaps repo keeps git repositories and objects under .repo
, and symlinks to those from working trees, precisely to handle this case without deleting local work. No symlinks on Windows, though.
Maybe error out and ask for a re-run with a --force
option if there are repositories with local branches that would be removed, as a starting point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also be mad if it removed any non-checked-in files I happened to have in there. "Lemme remove this stuff you no longer need for you" is a bit dangerous...
Personally, I'd be fine if they were just left behind, but we'll see.
if not os.path.exists(project.path): | ||
_git_top(project, 'clone -b (branch) (url)/(name) (path)') | ||
else: | ||
# Fetch first to make sure the project's branch exists. It |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the thinking here seems to be that we can assume that local branches with the same names track remote branches named in the manifest. As you've noticed, this doesn't play well if the manifest names SHAs (or I guess tags) in its revision fields.
That kind of makes the way repo behaves (as I understand it) seem at least not crazy:
- if detached head, synchronize to latest manifest version
- if there is a local branch checked out, rebase on top of the new manifest branch
Updating a detached HEAD to another detached HEAD seems fine.
I've said before that I think rebase by default is counter-intuitive and not a good idea for West, though -- in my experience teaching people Repo, this is one of the most common frustrations and stumbling blocks, to the point that I usually teach repo sync -d
first.
If we want to avoid that for West, I think these are the potentially reasonable options if there is a local branch:
- act like git pull (fetch + merge)
- act like git fetch (fetch only)
- scream and die unless a
--rebase
or--local-branch-strategy={rebase,detach,merge}
or so flag was given - always fetch and detach HEAD (unless overridden by a flag)
Am I missing any?
I'm leaning against 1, because it encourages people to submit bad pull requests with upstream merges inside, and 2, because west sync
ought to, well, synchronize with upstream whenever possible.
The other two seem potentially OK to me.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- act like git pull (fetch + merge)
Would need to check out the branch first in that case (and there would have to
be a local branch). The current version does that, and then switches back to
the previous position. That feels a bit wonky though, but maybe it could work
alright in practice.
- act like git fetch (fetch only)
Would want some way to rebase local work on top of new upstream stuff, so would
probably need additional commands in that case. Wonder if that might be too
complex.
- scream and die unless a --rebase or --local-branch-strategy={rebase,detach,merge} or so flag was given
- always fetch and detach HEAD (unless overridden by a flag)
IMO, it's nice if sync (by default at least) brings in upstream work while also
preserving local changes. I suspect that's what most people would expect.
Keeping a local branch gets a bit tricky when you can have SHAs. I wonder if
there might be gotchas related to changing the revision in the manifest file
too (e.g. to a different branch).
Always just doing a git rebase <revision>
is conceptually simplest at least
(and will work with a detached HEAD as well).
The implementation gets a bit tricky though, because e.g. git status
won't
know what the upstream is, and so won't tell you stuff like how many commits
behind you are. Gotta implement that stuff manually I think. Would have that
problem with an SHA regardless though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only "robust" way I can think of to do it with a non-detached HEAD would be to have a single local branch (e.g. master
), and updating it against the revision from the manifest (ignoring the branch name there).
That'd give you e.g. git status
for free, but only if the revision from the manifest is a branch name (plus you'd have to update which branch the local branch tracks if the manifest is changed to use a different branch), so might have to implement that stuff manually anyway.
When the revision is an SHA, I don't think the local branch would give you much, beyond avoiding people having to see "detached HEAD" when they run git status
manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... thought of another thing that might work. Wonder if it's too magic though.
Could keep a ref (branch) that's always updated to point to the revision specified in the manifest, and that the user isn't expected to update (could put it in upstream/revision
or the like, to make it clearer that it's "internal", though it'd be nice to explain it still).
That branch could then be set as the upstream of the local branch, and git status
and the like would automatically give sensible information (might get confusing for git push
and the like though).
Could be a horrible idea for reasons I haven't thought of yet though. It's something that's not a detached HEAD at least. :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems kinda neat in that you will always have an easy reference for the upstream revision at least, without having to go to the manifest.
Could have west fetch
update the upstream/revision
refs, and west sync
rebase the local branches on top of them, if you wanted to split it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- act like git pull (fetch + merge)
Would need to check out the branch first in that case (and there would have to
be a local branch).
Right -- note that this list was prefixed with (emphasis added): "I
think these are the potentially reasonable options if there is a
local branch". Since we're discussing what to do in case of a local
branch (instead of detached HEAD), it's a safe assumption that there
is one :).
The current version does that, and then switches back to
the previous position. That feels a bit wonky though, but maybe it could work
alright in practice.
Sorry, I don't follow what you're saying. I'm thinking this option 1
would be something like:
git fetch <manifest-remote>/<manifest-revision>
git merge FETCH_HEAD
That doesn't require switching back to a previous position.
Note that I'm against this option as it encourages invalid pull
request submissions (i.e. PRs which include "merge upstream into my
feature branch" commits).
- act like git fetch (fetch only)
Would want some way to rebase local work on top of new upstream stuff, so would
probably need additional commands in that case. Wonder if that might be too
complex.
Sorry, I don't follow this either. Why would we need a separate
rebase command, when there is already git rebase
?
We haven't discussed the "submit a pull request" or "fetch a pull
request" commands, but updating local branches to rebase against
upstream might better belong as part of those.
- scream and die unless a --rebase or --local-branch-strategy={rebase,detach,merge} or so flag was given
- always fetch and detach HEAD (unless overridden by a flag)
IMO, it's nice if sync (by default at least) brings in upstream work while also
preserving local changes. I suspect that's what most people would expect.
Keeping a local branch gets a bit tricky when you can have SHAs. I wonder if
there might be gotchas related to changing the revision in the manifest file
too (e.g. to a different branch).
Always just doing a git rebase is conceptually simplest at least
(and will work with a detached HEAD as well).
I'm confused about what you're saying here. Do you think I've missed any
options? Do you lean towards or against any of the four I've listed
above?
The implementation gets a bit tricky though, because e.g. git status won't
know what the upstream is, and so won't tell you stuff like how many commits
behind you are. Gotta implement that stuff manually I think. Would have that
problem with an SHA regardless though...
Do you want to play around with repo start
, repo sync
, and repo status
to get a baseline idea of how it works there? That might help
as a starting point, just seeing how at least one other tool works.
The way repo start
works is that it creates a local branch that
tracks <manifest-remote>/<manifest-revision>
. So repo start test <project>
will create a test
branch with the following in the
project's .git/config:
[branch "test"]
remote = osf-dev
merge = refs/heads/master
Then repo sync
will rebase on top of there. And you also always know
the remote, so you can do things like:
$ repo info -o <project>
Manifest branch: master
Manifest merge branch: refs/heads/master
Manifest groups: all,-notdefault
----------------------------
Projects Overview
<project>
* test ( 1 commit, Mon Aug 13 13:13:35 2018 -0500)
- 5136aa26 add test commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't require switching back to a previous position.
I was assuming that you had some local branch representing the revision at the latest sync (and possibly containing some local changes as well). I think you'd want to update that one whenever you sync
, even if you aren't currently on it, and you can't rebase a branch without being on it.
The switching back was just to return to whatever branch (or detached HEAD state) the repository was on previously, after rebasing the "revision-tracking" branch.
If the revision-tracking branch (upstream/revision
or the like) is just a direct reference to the upstream revision and not meant to hold any local changes, it simplifies things though. Could just do git update-ref upstream/revision <revision>
or the like to update it.
Sorry, I don't follow this either. Why would we need a separate
rebase command, when there is already git rebase?
Because I think you'd want some way to rebase all projects to the latest revision, without having to manually rebase each project.
Internally, it could just be an alias for west forall rebase upstream/revision
, if a command like that gets added. That'd in turn run git rebase upstream/revision
in each project.
We haven't discussed the "submit a pull request" or "fetch a pull
request" commands, but updating local branches to rebase against
upstream might better belong as part of those.
I wonder if you'd already have all the information you need from upstream/revision
there. Would probably have to tinker a bit to figure out if there's something missing though.
I'm confused about what you're saying here. Do you think I've missed any
options? Do you lean towards or against any of the four I've listed
above?
I like rebase-by-default personally, so something like (1) is what I prefer. I wouldn't expect a sync operation to throw away my local changes (or scream about them), just bring them up-to-date against the latest version from upstream.
With the approach I'm thinking of right now, west sync
would actually do something like this though:
git fetch
new changes from the remote- Update the ref
upstream/revision
(or whatever a good name would be) to the revision specified in the manifest (which could be a branch or an SHA) - Rebase the currently checked-out branch (or detached HEAD state) against
upstream/revision
To start some new work, you could do git checkout -b fix-xyz upstream/revision
in each repository (could have the start
command do that for each repository, for example).
The way repo start works is that it creates a local branch that
tracks <manifest-remote>/<manifest-revision>. So repo start test will create a test branch with the following in the
project's .git/config:
Wonder if it does something similar to the above in the case where is an SHA. Don't think you can track an SHA. Feels weird at least.
Or do you mean "tracks" just in the sense that repo
knows to update the local branch against it?
In my version (which might be bad for reasons I haven't thought of yet), the branch would explicitly track the upstream/revision
branch. That way, git
itself would give sensible information when you run git status
("You are x commits behind upstream/revision"), and it'd be pretty easy to understand how the west commands operate internally.
try: | ||
pykwalify.core.Core( | ||
source_file=manifest_filename, schema_files=[schema_filename] | ||
).validate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the pykwalify sources: they use safe_load, so this should be OK.
for mp in manifest['projects']: | ||
# Fill in any missing fields in 'mp' with values from the 'defaults' | ||
# dictionary | ||
for key, val in manifest['defaults'].items(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems safe for now, but note that the manifest-wide defaults might contain keys which do not apply to projects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought it was defined as defaults for projects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's defaults for the manifest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the manifest was a project description. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Repo lets you manage default values that are related to the behavior of the tool itself there. For example: <default sync-j="4"/>
uses 4 concurrent TCP connections while syncing. I imagine we may end up with use cases that are similar eventually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah, might have to change the approach at that point, if they get added to the defaults
dict.
west/cmd/git.py
Outdated
_git(project, 'status', extra_args=user_args) | ||
|
||
|
||
def _add_common_git_flags(parser_adder, command): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, also: it would be nice to be able to take a nargs='*'
argument specifying the list of projects to synchronize, diff against, or get status for.
Supporting either the name of the project in the manifest or its locally checked out path would be most convenient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, would be nice to have later
Fixed most of the code complaints. |
This RFC adds manifest parsing and three basic commands (sync, diff, and status). More error checking needs to be added. This is mostly to get some feedback on the approach. There are some cases that turn tricky if you always keep a local branch to avoid a detached HEAD. I'm wondering if 'revision' is supposed to always point to a branch (as opposed to e.g. a SHA). SHAs would be more flexible, but make it even trickier to keep a local branch. I've written a bit in zephyrproject-rtos/zephyr#6770 as well. Signed-off-by: Ulf Magnusson <Ulf.Magnusson@nordicsemi.no>
Messed-up the branch name when I created this one. New PR is at #11. |
This RFC adds manifest parsing and three basic commands (sync, diff, and
status).
More error checking needs to be added. This is mostly to get some
feedback on the approach. There are some cases that turn tricky if you
always keep a local branch to avoid a detached HEAD.
I'm wondering if 'revision' is supposed to always point to a branch (as
opposed to e.g. a SHA). SHAs would be more flexible, but make it even
trickier to keep a local branch.
I've written a bit in
zephyrproject-rtos/zephyr#6770 as well.
This is a quick-and-dirty version of the local branch strategy.
Signed-off-by: Ulf Magnusson Ulf.Magnusson@nordicsemi.no