Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit gitattributes merge=union #21516

Closed
thomasjpfan opened this issue Oct 31, 2021 · 9 comments · Fixed by #23062
Closed

Revisit gitattributes merge=union #21516

thomasjpfan opened this issue Oct 31, 2021 · 9 comments · Fixed by #23062
Labels
workflow Development workflow changes

Comments

@thomasjpfan
Copy link
Member

I have been seeing the default merge result in diffs looking like: https://github.com/scikit-learn/scikit-learn/pull/18975/files#diff-5a8a66bfe0792b4a0e50eb7c8cc929f2a8e0d4d88ac2d0084e07b70d069ced89. In this case, the changelog for #21130 was moved into 1.0 as a bug fix.

I'm not sure what a good way is to handle this use case. WDYT @lesteve ?

@lesteve
Copy link
Member

lesteve commented Nov 2, 2021

Not sure either ... it seems like merge=union prevents the lines removal to happen.

From some quick tests, it seems like this can happen if you create your PR branch when the log entry for #21130 was still in v1.1.rst and you modify v1.1.rst. When you git pull you will have a conflict in v1.1.rst between your branch (the log entry for #21130 is in it) and main (the log entry for #21130 is not present since it has been removed). merge=union keeps both modifications and consequently re-adds the log entry for #21130.

Generally speaking I guess this is a matter of deciding whether the benefit of merge=union i.e. avoiding conflicts in PRs is worth the downsides i.e. this kind of caveat or potentially additions that are not in the right order (e.g. lexical order for the modules).

There is also the option to define custom merge drivers but this does not seem worth the effort I would say.

@lesteve
Copy link
Member

lesteve commented Nov 2, 2021

A script reproducing the problem:

set -e

# this is a commit that has the #21130 changelog line in v1.1.rst
git checkout -b test-merge b6f11554f > /dev/null
echo 'diff --git a/doc/whats_new/v1.1.rst b/doc/whats_new/v1.1.rst
index 93300bf67..54fb41a7d 100644
--- a/doc/whats_new/v1.1.rst
+++ b/doc/whats_new/v1.1.rst
@@ -55,6 +55,7 @@ Changelog

 :mod:`sklearn.linear_model`
 ...........................
+- add something in linear model

 - |Fix| :class:`linear_model.LogisticRegression` now raises a better error
   message when the solver does not support sparse matrices with int64 indices.' > test-merge.patch

git apply test-merge.patch > /dev/null
git ci -am 'update changelog'
git pull upstream main --no-edit > /dev/null
# This shows that 21130 is still in the doc/whats_new/v1.1.rst
git --no-pager grep 21130 doc/whats_new/v1.1.rst && echo changelog line has been readded

@thomasjpfan
Copy link
Member Author

I am okay with considering this a non issue for now and change it if it becomes a problem.

As for the whats new order, we can create a validator from maint_tools/sort_whats_new.py that runs as part of the CI. It will still be able to sort locally, which is used by contributors. Downside is that it introduces another thing to run in PRs.

@lesteve
Copy link
Member

lesteve commented Nov 4, 2021

Another data point I just bumped into, I thought that merge=union would avoid conflicts in the github web UI but apparently not.

github needs to do something to have the behaviour I want if I understand correctly:
isaacs/github#487

More context: I just had a conflict in doc/whats_new/v1.1.rst in one of my PR, and I still had to pull upstream master locally (then merge=union .gitattributes kicked in so I did not have conflicts to solve locally). Still not ideal.

@thomasjpfan
Copy link
Member Author

Data point: I recently ran into this when reviewing a PR with a merge conflict on v1.1.rst. When fixing up the PR, I had to manually confirm that the changelog item was moved into doc/whats_new/v1.0.rst before I felt comfortable removing the entry from v1.1.rst.

@lesteve
Copy link
Member

lesteve commented Nov 4, 2021

Just to make sure I follow you, you would have preferred to get an explicit conflict so that you know you have to pay attention, rather than merge=union solving the conflict and realising only later (e.g. in the github diff) that a changelog entry was added although it was not related to your PR at all.

It feels like in both cases you need to manually confirm that the entry has been moved to v1.0.rst. Maybe I am missing something though ...

@thomasjpfan
Copy link
Member Author

Just to make sure I follow you, you would have preferred to get an explicit conflict so that you know you have to pay attention, rather than merge=union solving the conflict and realising only later (e.g. in the github diff) that a changelog entry was added although it was not related to your PR at all.

Yup, this was exactly what I was thinking about. Workflow wise, I am adapting to merge=union by running:

git diff upstream/main..  doc/whats_new/v1.1.rst

after a merge to check if I need to make adjustments.

@jeremiedbb
Copy link
Member

I faced the same issue several times. I wouldn't mind removing the merge=union. It often messes up the what's new (see #23060)

@lesteve
Copy link
Member

lesteve commented Apr 6, 2022

I opened #23062 to remove gitattributes for whats_new entry, I don't it is that useful after all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
workflow Development workflow changes
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants