Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use .gitattributes to remove user configuration influence #28945

Closed
wants to merge 3 commits into from

Conversation

sharwell
Copy link
Member

@sharwell sharwell commented Jun 17, 2017

⚠️ Do not rebase, squash, or omit the merge commit during a merge of this pull request.

This change eliminates the influence of the core.autocrlf setting on local checkouts. While it looks like a substantial change, users should see binary identical content of the entire workspace, with the exception of the .gitattributes file, before (for core.autocrlf=input) and after (for any setting) this change.

  • Add a .gitattributes file to normalize the line endings for text files in the repository
  • Configure the .gitattributes file to precisely preserve the current use of LF and CRLF in the repository in local checkouts
  • Remove execute bit from non-script files
    • Several .svg files
    • resources/linux/debian/*.template
  • Add execute bit to one shell script file which was missing it
    • build/tfs/linux/smoketest.sh

💡 It is highly preferable to remove as many eol=crlf and eol=lf lines from the current .gitignore as we are able to. For each such case, Windows users will see the file locally with CRLF line endings and other users will see the file locally with LF line endings, but neither system will ever influence the other. For any case where use of a specific line ending is not required for proper behavior, we can allow this platform-specific use without negative consequences.

*.jxs eol=crlf

# Some scripts require specific endings, others use specific in vscode
*.bat eol=lf
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Normally this would use eol=crlf, but the repository is not currently following this convention.

# Some scripts require specific endings, others use specific in vscode
*.bat eol=lf
*.cmd eol=crlf
*.ps1 eol=lf
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Normally this would use eol=crlf, but the repository is not currently following this convention.

@@ -0,0 +1,149 @@
# Auto-detect text files
* text=auto
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 If the repository requires the use of a specific line ending by all users, it can be added to this line, and we can remove everything below except the exceptions to this case. However, the preferable case is to not require a specific line ending and instead allow it to be controlled by the user (or automated system) which checks out the code. For example, a build machine which validates line endings (I believe this is "hygiene" or something), could use the following configuration without being a noisy neighbor to all other contributors:

git config core.eol lf

@joaomoreno
Copy link
Member

@sharwell How did you normalize all line endings?

I can't find any difference between master's build/win32/code.iss and the one in your branch, though I see in the commit you actually changed something. What actually happened to that file?

@sharwell
Copy link
Member Author

sharwell commented Aug 10, 2017

@joaomoreno

What actually happened to that file?

The .gitattributes file tells git to store files as text instead of storing files as binary. In a way, you can consider a "line ending" in such a file to be a opaque sentinel - it has some representation in the Git database but when you view the blob or check out the commit to files on disk it will be written to either LF or CRLF according to the current platform and configuration in .gitattributes. For the cases where a file appears to have changed in source control due to this PR, it's almost always due to a conversion of LF and/or CRLF to this sentinel.

How did you normalize all line endings?

In the past, you could do the following:

rm .git/index
git reset

However, this no longer works as one would expect. I now use the following process to normalize files:

  1. git clean -dxf (remove all untracked files and reset unwanted changes)
  2. Set .gitattributes to * text=auto
  3. Create a folder tmp in the root of the repository
  4. Move all files except .gitattributes to the folder
  5. Commit the result to source control
  6. Move all files back out of the folder
  7. Commit the result to source control
  8. Use an interactive rebase to squash the last two commits

I then manually review the result to correct any unwanted changes. For this repository, it mostly meant setting the execute bit back on some files that changed. I watch for other "bad things" as well, such as the unintentional renaming of files in source control due to differences caused by file system case sensitivity.

After the above process was complete, I made a second clone of the repository. With core.autocrlf=input configured for this clone, I deleted the checkout and then did a hard reset to the commit prior to working on the repository. I used Beyond Compare to run a binary diff between that directory and the one resulting from my normalized line endings. Using the result of that diff, I manually constructed a best-effort .gitattributes representing the current state of the repository. The remaining differences that would show up in a binary diff before and after by change are due to a limited selection of files in source control previously containing mixed line endings. Each of these cases now uses either LF or CRLF uniformly.

@joaomoreno
Copy link
Member

joaomoreno commented Aug 10, 2017

Wow. I know our hygiene scripts aren't exactly state of the art, but they get the job done. How is this exactly going to benefit us in the future? Did you happen to come across a situation in which you had the wrong line endings?

Also, can you resolve the conflicts?

@sharwell
Copy link
Member Author

sharwell commented Aug 10, 2017

How is this exactly going to benefit us in the future?

Users will no longer need to configure their repositories locally for development. Regardless of configuration or platform, the submitted results will end up as a NOP for hygiene.

Did you happen to come across a situation in which you had the wrong line endings?

Yes, my system is configured for core.autocrlf=true. Many things were wrong.

Also, can you resolve the conflicts?

Yes, I'll get it in later today hopefully. I'll do another verification pass to ensure everything is still consistent.

@microsoft microsoft deleted a comment from msftclas Sep 27, 2017
@joaomoreno
Copy link
Member

joaomoreno commented Dec 11, 2017

Will close this for now. Too many conflicts to deal with. Will re-examine this in the future.

Sorry @sharwell

@joaomoreno joaomoreno closed this Dec 11, 2017
@joaomoreno joaomoreno mentioned this pull request Mar 5, 2018
4 tasks
@github-actions github-actions bot locked and limited conversation to collaborators Mar 27, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants