🎨 Color moved passages differently #259

Open
nobodyinperson opened this issue Mar 1, 2022 · 5 comments

Comments

@nobodyinperson

git diff --color-moved=zebra colors moved lines differently, so they don't show up as a large number of removed and added lines. Having latexdiff also color moved passages differently (e.g. in darkgreen) would add a lot of value to the output.

@ftilmann
Owner

ftilmann commented Mar 1, 2022

I have been thinking about this as well and miss this functionality myself, but I expect it to be relatively difficult to implement, particularly in the current framework, and there are some subtleties in defining the desired behaviour.
Do you know which algorithm is used to detect moving, as opposed to deleting and adding? What happens if, in addition to moving, a small change is made in the moved block? Does this then get highlighted as an addition or deletion within the moved block, or does it somehow interfere so that the whole block is no longer recognised? Is there a minimum size for the moved block?

@nobodyinperson
Author

nobodyinperson commented Mar 1, 2022

> Do you know which algorithm is used to detect moving, as opposed to deleting and adding?

I browsed through the git diff source code and quickly got lost... 😅

Python's difflib also doesn't seem to have this functionality, which would have been a nice starting point...
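To illustrate the point, here is a minimal sketch of what difflib reports when a block is moved. The helper name `opcode_tags` and the sample lines are made up for this example; `SequenceMatcher.get_opcodes()` only knows the tags `equal`, `replace`, `delete` and `insert`, so a move shows up as a deletion in one place plus an insertion in another.

```python
import difflib


def opcode_tags(old_lines, new_lines):
    """Return only the opcode tags difflib produces for two line lists."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [tag for tag, *_ in matcher.get_opcodes()]


old = ["alpha", "beta", "gamma", "delta", "epsilon"]
# Same lines, but "alpha" and "beta" were moved to the end.
new = ["gamma", "delta", "epsilon", "alpha", "beta"]

print(opcode_tags(old, new))  # ['delete', 'equal', 'insert']
```

Any move detection would therefore have to be a post-processing step that pairs up deleted and inserted blocks.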

> What happens if, in addition to moving, a small change is made in the moved block? Does this then get highlighted as an addition or deletion within the moved block, or does it somehow interfere so that the whole block is no longer recognised? Is there a minimum size for the moved block?

Good questions. The easy answer would be to "just make it configurable" and use sane defaults.

Off the top of my head, I would introduce an absolute and a relative moving threshold. The absolute threshold would mean: "Don't consider sequences of 20 characters or less for moving". The relative threshold would say: "If less than 10% of a moved block was also changed, still consider the whole block 'moved' and color the differences accordingly".
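A rough sketch of how the two thresholds could interact, assuming the diff has already produced lists of removed and added text blocks. Everything here is hypothetical: the function name `find_moved_blocks`, the parameters `min_chars` and `max_changed_fraction`, and the choice to approximate "fraction changed" as one minus difflib's similarity ratio. This is not how latexdiff or git work.

```python
import difflib


def find_moved_blocks(removed, added, min_chars=20, max_changed_fraction=0.10):
    """Pair removed and added blocks that look like the same text moved elsewhere.

    Returns a list of (removed_index, added_index, changed_fraction) tuples.
    """
    moves = []
    used_added = set()
    for r_idx, r_text in enumerate(removed):
        if len(r_text) <= min_chars:
            continue  # absolute threshold: too short to count as a move
        best = None
        for a_idx, a_text in enumerate(added):
            if a_idx in used_added or len(a_text) <= min_chars:
                continue
            matcher = difflib.SequenceMatcher(None, r_text, a_text, autojunk=False)
            # Assumption: 1 - ratio() stands in for "fraction of the block changed".
            changed = 1.0 - matcher.ratio()
            # Relative threshold: tolerate small edits inside a moved block.
            if changed < max_changed_fraction and (best is None or changed < best[1]):
                best = (a_idx, changed)
        if best is not None:
            used_added.add(best[0])
            moves.append((r_idx, best[0], best[1]))
    return moves


removed = ["The quick brown fox jumps over the lazy dog.", "short text"]
added = ["short text", "The quick brown fox jumped over the lazy dog."]
print(find_moved_blocks(removed, added))
```

With these inputs the long sentence is paired as a move despite the small edit, while "short text" is ignored because it falls under the absolute threshold. The differences inside a paired block could then be colored separately.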

@ftilmann
Owner

ftilmann commented Mar 1, 2022

Thank you for looking into this so quickly. It will be an interesting feature but I will need a block of time to think about this and implement something, and those 'blocks of time' are hard to come by these days.

@nobodyinperson
Author

> 'blocks of time' are hard to come by these days.

Absolutely. No pressure, I just wanted to put this idea here so it is out there.

@awillats

awillats commented Jan 2, 2023

Just wanted to drop by and +1 to the idea of coloring block moves differently.
It sounds like that will be non-trivial to do, so I'm not trying to add to the time pressure. But I wanted to fill in a couple of details that hopefully will make the process easier in the future.

After playing around with getting this to work on the command line with standard diff tools, I found that both
git diff --color-moved=zebra oldfile newfile
and
git diff --color-moved=plain oldfile newfile
get the job done. The plain mode removes the 20-character minimum block size. Here's the relevant section of the git diff docs, which answers a lot of your detail questions and could be a starting point for various design decisions.

As mentioned in Detecting moved sections #162, the core algorithm seems to be the Heckel diff algorithm described on the page for wikEd diff Implementation. See also Paul Heckel: A technique for isolating differences between files, Communications of the ACM 21(4):264 (1978). Here's an in-browser demo of the wikEd implementation to try.

Here are a couple of Python implementations: m-matelski/mdiff, lahwaacz/python-wikeddiff, and a Stack Overflow discussion which might be helpful: Difficulty understanding Paul Heckel's Diff Algorithm.
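For orientation, the unique-line matching idea from Heckel's paper can be sketched in a few lines of Python. This is a simplified illustration written for this comment, not the wikEd diff or git implementation: lines that occur exactly once in both files act as anchors, and matches are then extended forwards and backwards over identical neighbours. The function names and the final "out of order means moved" rule in `moved_blocks` are assumptions of this sketch.

```python
from collections import Counter


def heckel_match(old, new):
    """Map each index in `new` to the matching index in `old`, or None."""
    old_counts, new_counts = Counter(old), Counter(new)
    old_pos = {line: i for i, line in enumerate(old)}  # only used for unique lines
    new_to_old = [None] * len(new)
    old_to_new = [None] * len(old)

    # Anchors: a line occurring exactly once in both files must be the same line.
    for j, line in enumerate(new):
        if old_counts[line] == 1 and new_counts[line] == 1:
            i = old_pos[line]
            new_to_old[j], old_to_new[i] = i, j

    # Forward pass: extend each match downwards over identical unmatched lines.
    for j in range(len(new) - 1):
        i = new_to_old[j]
        if (i is not None and i + 1 < len(old)
                and new_to_old[j + 1] is None and old_to_new[i + 1] is None
                and new[j + 1] == old[i + 1]):
            new_to_old[j + 1], old_to_new[i + 1] = i + 1, j + 1

    # Backward pass: the same extension, upwards.
    for j in range(len(new) - 1, 0, -1):
        i = new_to_old[j]
        if (i is not None and i > 0
                and new_to_old[j - 1] is None and old_to_new[i - 1] is None
                and new[j - 1] == old[i - 1]):
            new_to_old[j - 1], old_to_new[i - 1] = i - 1, j - 1
    return new_to_old


def matched_blocks(new_to_old):
    """Group consecutive matches into (old_start, new_start, length) blocks."""
    blocks = []
    for j, i in enumerate(new_to_old):
        if i is None:
            continue
        if blocks:
            o, n, length = blocks[-1]
            if o + length == i and n + length == j:
                blocks[-1] = (o, n, length + 1)
                continue
        blocks.append((i, j, 1))
    return blocks


def moved_blocks(blocks):
    """Naive heuristic: a block is 'moved' if it is out of order in the old file."""
    moved, furthest_old = [], -1
    for old_start, new_start, length in blocks:
        if old_start < furthest_old:
            moved.append((old_start, new_start, length))
        else:
            furthest_old = old_start + length
    return moved


old = ["\\section{A}", "text a", "", "\\section{B}", "text b", ""]
new = ["\\section{B}", "text b", "", "\\section{A}", "text a", ""]
print(moved_blocks(matched_blocks(heckel_match(old, new))))
```

The empty lines are not unique, so they are only matched by the extension passes. Which of two swapped blocks gets labelled as "moved" is a design decision; this sketch simply flags the one that appears later in the new file.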

Thanks again for working on this tool; I hope these resources will be helpful at some point in the future.
