Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull request merge script #3263

Merged
merged 1 commit into from
Jan 16, 2021
Merged

Conversation

HebaruSan
Copy link
Member

Motivation

CKAN's process for merging pull requests has some very specific rules about how things should be formatted:

$ git checkout master
$ git merge someuser/somebranch --no-commit --no-ff # This merges your PR, but doesn't commit it.
$ vim CHANGELOG.md                                  # Or whatever your favourite editor is, if not vim.
$ git add CHANGELOG.md
$ git commit -m 'Merge #<PR-Num> <Description>'     # This commits the merge, *and* your changes to CHANGELOG.md
$ git push

Doing it manually requires a lot of fiddly copying and pasting and editing to reorder things, add parentheses, etc. This may sometimes discourage team members from attempting this flow.

Changes

Now a new bin/ckan-merge-pr.py script is created to automate the steps from the wiki, using the pull request data from the GitHub API to generate the descriptive strings. The script assumes you're running it from your CKAN working folder. The one required paramer is the number of the PR to merge, and the GITHUB_TOKEN environment variable can be set to avoid rate limiting errors:

export GITHUB=TOKEN='Your GitHub token'
bin/ckan-merge-pr.py 3250

The script does not attempt to set up everything as it should be, since chaos sometimes reigns in developer working copies, and you don't want robots guessing and messing with that; rather it is assumed that the user will get the repo into a usable state before running. Validation performed:

  • The master branch is checked out and is the current HEAD
  • The master branch is up to date
  • The PR's branch is checked out locally under the same name as it has on the remote
  • The PR's branch is up to date
  • At least one reviewer has approved the changes

If everything looks OK, the PR's branch will be merged into master, and an entry will be added to the changelog file. The script does not attempt to set the [Core]/[GUI]/etc. tag or guess whether it's a feature or a bug fix, so the user's git-configured editor will be launched to edit the changelog and perform this remaining manual work of setting the tag and moving the entry to the correct section. After the user saves and exits the editor, the changelog will be staged and a merge commit will be created with the appropriate commit message. The script does not perform the final push, so the user can take a look at the commit history and make sure everything is OK.

Also the other Python scripts in bin that have a shebang are now executable.

@HebaruSan HebaruSan added Enhancement New features or functionality Pull request Infrastructure Issues affecting everything around CKAN (the GitHub repos, build process, CI, ...) labels Jan 11, 2021
@HebaruSan HebaruSan requested a review from DasSkelett January 11, 2021 22:59
@DasSkelett
Copy link
Member

There's a requirements.txt in the bin/ folder. Should we add the new requirements to it?

Also do you think it would be possible to change the script so that no local branch is required? I'm often working with a detached HEAD and don't have a local copy of the PR branch, only if I need to test something bigger or I'm planning to commit/propose any changes.

bin/ckan-merge-pr.py Outdated Show resolved Hide resolved
@HebaruSan HebaruSan force-pushed the feature/merge-script branch from ce2798e to be51b0d Compare January 16, 2021 20:52
@HebaruSan
Copy link
Member Author

There's a requirements.txt in the bin/ folder. Should we add the new requirements to it?

Sure, no harm in doing so. I also took a closer look at the other files in bin, and I think they're all no longer in use and would no longer work anyway. They depend on things like having multiple repos (pre repo merge) or files generated during build that we don't have (build-tag in the root folder, I've never seen that). So I git rm'd them.

Also do you think it would be possible to change the script so that no local branch is required?

The problem I had with this was in translating the reference name we get from GitHub into a remote name that git can use in your local working copy. There is no guarantee that you even have the remote added or fetched and there are no guarantees about what you named it (DasSkelett, dasskelett, das-skelett, ds, etc.). So I had to compromise and require the local branch to exist under the same name as the remote branch, which then allows us to check the upstream remote of the local branch. This also forces me to make sure that my local branch matches the remote, to prevent #3198 from happening again.

@HebaruSan
Copy link
Member Author

... we might be able to compare PullRequest.head.repo.clone_url to Repo.remotes[].urls[0]...

@DasSkelett
Copy link
Member

DasSkelett commented Jan 16, 2021

I got it working by changing a bunch of the local_branch usages to pr.head.sha (i.e. for merge_base(), merge_tree() and r.index.commit()).

--- bin/ckan-merge-pr.py        2021-01-16 22:17:11.754527492 +0100
+++ bin/ckan-merge-pr-changed.py        2021-01-16 22:18:27.323449811 +0100
@@ -32,26 +32,11 @@
     if r.head.commit.hexsha != remote_master.commit.hexsha:
         print(f'master branch is not up to date!')
         sys.exit(ExitStatus.failure)
-    # Make sure PR's branch exists
-    local_branch = pr.head.ref
-    branch = r.heads[local_branch]
+    pr_head_sha = pr.head.sha
     # Fetch remote to make sure we aren't missing changes
-    tracking = branch.tracking_branch()
-    if not tracking:
-        print(f'Upstream {pr.head.label} missing!')
-        sys.exit(ExitStatus.failure)
-    remote = r.remotes[tracking.remote_name]
-    if remote.name != master_remote.name:
-        print(f'Fetching {remote.name}...')
-        remote.fetch()
-    # Make sure branch is identical to its upstream
-    if branch.commit.hexsha != pr.head.sha:
-        print(f'Branch {local_branch} not up to date!')
-        # Bail and let the user figure it out, no way we can safely cover all possibilities here
-        sys.exit(ExitStatus.failure)
     # Get reviewers from PR
     reviewers = [rvw.user.login for rvw in pr.get_reviews() if rvw.state == 'APPROVED']
-    if not reviewers:
+    if False and not reviewers:
         print(f'PR #{pr_num} is not approved!')
         sys.exit(ExitStatus.failure)
     # Get title from PR
@@ -59,8 +44,8 @@
     # Get author from PR
     author = pr.user.login
     # Merge the branch with no-commit and no-ff
-    base = r.merge_base(branch, r.head)
-    r.index.merge_tree(branch, base=base)
+    base = r.merge_base(pr_head_sha, r.head)
+    r.index.merge_tree(pr_head_sha, base=base)
     # Update the working copy
     r.index.checkout(force=True)
     # Print line to add to CHANGELOG.md at top of file, user needs to move it to the right spot
@@ -77,7 +62,7 @@
     r.index.add([changelog_path.as_posix()])
     # Commit
     r.index.commit(f'Merge #{pr_num} {pr_title}',
-                   parent_commits=(r.head.commit, branch.commit))
+                   parent_commits=(r.head.commit, r.commit(pr_head_sha)))
 
     # Don't push, let the user inspect and decide
     sys.exit(ExitStatus.success)

@HebaruSan
Copy link
Member Author

But your .git won't have the commit if you haven't added and fetched the remote, right? Is it possible to catch that before we try to merge it, so the error message might make sense?

@DasSkelett
Copy link
Member

Yeah, just realized that I'm missing some fetching there.
I think we can catch it by trying r.commit(pr_head_sha) first, which throws the following if it doesn't know the hash:

gitdb.exc.BadName: Ref '123456' did not resolve to an object

Maybe we can still do some auto-fetch though...

@HebaruSan
Copy link
Member Author

Yeah looks like we might be able to fetch the clone_url without even assigning it a remote name, that should get the sha where it needs to be...

@HebaruSan
Copy link
Member Author

How about something like this?

diff --git a/bin/ckan-merge-pr.py b/bin/ckan-merge-pr.py
index 56c55b8c..8be0f753 100755
--- a/bin/ckan-merge-pr.py
+++ b/bin/ckan-merge-pr.py
@@ -32,23 +32,6 @@ def merge_pr(repo_path: str, token: str, pr_num: int) -> None:
     if r.head.commit.hexsha != remote_master.commit.hexsha:
         print(f'master branch is not up to date!')
         sys.exit(ExitStatus.failure)
-    # Make sure PR's branch exists
-    local_branch = pr.head.ref
-    branch = r.heads[local_branch]
-    # Fetch remote to make sure we aren't missing changes
-    tracking = branch.tracking_branch()
-    if not tracking:
-        print(f'Upstream {pr.head.label} missing!')
-        sys.exit(ExitStatus.failure)
-    remote = r.remotes[tracking.remote_name]
-    if remote.name != master_remote.name:
-        print(f'Fetching {remote.name}...')
-        remote.fetch()
-    # Make sure branch is identical to its upstream
-    if branch.commit.hexsha != pr.head.sha:
-        print(f'Branch {local_branch} not up to date!')
-        # Bail and let the user figure it out, no way we can safely cover all possibilities here
-        sys.exit(ExitStatus.failure)
     # Get reviewers from PR
     reviewers = [rvw.user.login for rvw in pr.get_reviews() if rvw.state == 'APPROVED']
     if not reviewers:
@@ -58,6 +41,13 @@ def merge_pr(repo_path: str, token: str, pr_num: int) -> None:
     pr_title = pr.title
     # Get author from PR
     author = pr.user.login
+    # Make sure we have the commit
+    r.git.fetch(pr.head.repo.clone_url)
+    # Get the commit
+    branch = r.commit(pr.head.sha)
+    if not branch:
+        print(f'PR #{pr_num} commit {pr.head.sha} not found!')
+        sys.exit(ExitStatus.failure)
     # Merge the branch with no-commit and no-ff
     base = r.merge_base(branch, r.head)
     r.index.merge_tree(branch, base=base)

@HebaruSan HebaruSan force-pushed the feature/merge-script branch from be51b0d to b1d99eb Compare January 16, 2021 21:44
@HebaruSan
Copy link
Member Author

Pushed those changes. I do agree that eliminating the local branch is a nice advantage.

@DasSkelett
Copy link
Member

I was looking for a way to use GitHub's neat sefs/pull/{pr.number}/head ref system, but we still need to fetch that explicitly from the KSP-CKAN/CKAN remote, so I don't think we win anything over the current strategy.

bin/ckan-merge-pr.py Outdated Show resolved Hide resolved
Copy link
Member

@DasSkelett DasSkelett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One embarrassingly short SLS green run test later: Looks good, useful tooling. I'll give it a real-world try on this PR.

@DasSkelett DasSkelett force-pushed the feature/merge-script branch from 48da7d4 to 0d2bde3 Compare January 16, 2021 23:23
@DasSkelett
Copy link
Member

DasSkelett commented Jan 16, 2021

Did force-push a quick fix for this:

Traceback (most recent call last):uests/__init__.py:89:
  File "bin/ckan-merge-pr.py", line 76, in <module> a supported "
    merge_pr()n...
  File "/usr/lib/python3/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "bin/ckan-merge-pr.py", line 70, in merge_pr
    parent_commits=(r.head.commit, branch.commit))
  File "/home/user/.local/lib/python3.8/site-packages/gitdb/util.py", line 255, in __getattr__
    return object.__getattribute__(self, attr)
AttributeError: 'Commit' object has no attribute 'commit'

and apparently a rebase on master.

@DasSkelett DasSkelett merged commit acdadba into KSP-CKAN:master Jan 16, 2021
@HebaruSan HebaruSan deleted the feature/merge-script branch January 16, 2021 23:31
@DasSkelett
Copy link
Member

Looks like it worked flawlessly!

@DasSkelett
Copy link
Member

For some reason it didn't remove the deleted file from the working tree after the commit. Not sure what caused this and if it's a a fault of the script or something I did to my clone or whatever.
We'll see on the next PR that deletes files.

@HebaruSan
Copy link
Member Author

What does git status say? My bin got cleared out on pull. Maybe we need to pass a flag to allow deletion while updating the working copy.

@DasSkelett
Copy link
Member

$ git status
Auf Branch master
Ihr Branch ist auf demselben Stand wie 'origin/master'.

Unversionierte Dateien:
  (benutzen Sie "git add <Datei>...", um die Änderungen zum Commit vorzumerken)
        .idea/
        bin/changes-since-last-release.pl
        bin/ckan-build-info.py
        bin/ckan-build.py
        bin/ckan-release-maker.py
        bin/ckan-release-promoter.py
        bin/ckan_github_utils.py
        bin/close-old-support-tickets.pl

nichts zum Commit vorgemerkt, aber es gibt unversionierte Dateien
(benutzen Sie "git add" zum Versionieren)

German again, but yeah, it's the list of unstaged "new" files in the working dir.

@HebaruSan
Copy link
Member Author

HebaruSan commented Jan 16, 2021

https://gitpython.readthedocs.io/en/stable/reference.html#git.index.base.IndexFile.checkout

image

Ridiculous.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New features or functionality Infrastructure Issues affecting everything around CKAN (the GitHub repos, build process, CI, ...)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants