Flip core algorithm so everything is no longer the mirror image of Myers's paper #440
Until now, jsdiff's entire implementation has been a mirror image of the algorithm described in Myers's paper. In the paper, each column in the edit graph corresponds to a character in the OLD text (and hence horizontal movements correspond to deletions), while each ROW corresponds to a character in the NEW text (and hence vertical movements correspond to insertions). See page 4:
Diagonals, meanwhile, are numbered so that a larger diagonal number corresponds to a higher x, i.e. to having done more deletions:
Note that this is the opposite of how diagonals are numbered in jsdiff:
The algorithm on page 6 shows us recording only the greatest x coordinate reached on each diagonal in array V, and shows us choosing whether to reach a diagonal via a vertical or horizontal move as follows:
x remaining the same (the top case) corresponds to a vertical move, i.e. an insertion. So the logic here says to do an insertion if the path on the more-deletion-heavy side of our target diagonal has made it further through the old string. This means we break ties by preferring to add an insertion on the end of a path that has previously done more deletions rather than adding a deletion on the end of a path which has previously done more insertions.
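For reference, the selection rule above can be sketched in isolation. This is a minimal illustration of the greedy forward pass as described in the paper, not jsdiff's code: the function name `myersEditDistance` and the names `v` and `offset` are hypothetical, and the sketch only returns the length of a shortest edit script rather than the script itself.

```javascript
// Minimal sketch of the greedy forward pass from Myers's paper (page 6).
// Hypothetical helper, not part of jsdiff. Returns D, the length of a
// shortest edit script that turns oldStr into newStr.
function myersEditDistance(oldStr, newStr) {
  const n = oldStr.length;
  const m = newStr.length;
  const max = n + m;
  // v[k + offset] holds the furthest x reached so far on diagonal k = x - y.
  // The offset lets diagonals from -(max + 1) to max + 1 index into the array.
  const offset = max + 1;
  const v = new Array(2 * max + 3).fill(0);

  for (let d = 0; d <= max; d++) {
    for (let k = -d; k <= d; k += 2) {
      let x;
      if (k === -d || (k !== d && v[k - 1 + offset] < v[k + 1 + offset])) {
        // Vertical move (insertion) down from diagonal k + 1: x is unchanged.
        // On a tie this branch is not taken, so the deletion below wins.
        x = v[k + 1 + offset];
      } else {
        // Horizontal move (deletion) right from diagonal k - 1: x grows by 1.
        x = v[k - 1 + offset] + 1;
      }
      let y = x - k;
      // Follow the "snake": free diagonal moves while characters match.
      while (x < n && y < m && oldStr[x] === newStr[y]) {
        x++;
        y++;
      }
      v[k + offset] = x;
      if (x >= n && y >= m) {
        return d;
      }
    }
  }
  return max;
}

// The example strings from the paper need 5 edits.
console.log(myersEditDistance("ABCABBA", "CBABAC"));
```

Because only the furthest x per diagonal is kept, the comparison between `v[k - 1]` and `v[k + 1]` is the single place where the tie-break described above is decided.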
jsdiff does the mirror image of all of this. Its `diagonalPath` numbers go up as you do insertions, not deletions. In `bestPath` it stores objects that have a `newPos` property, corresponding to the y coordinate on a Myers edit graph, not the x. And it breaks ties in favour of doing a deletion, not an insertion.
This makes jsdiff's diffs differ from the diffs of more popular/canonical tools like the Unix `diff` command, which break ties in favour of doing insertions (i.e. in favour of putting deletions earlier). It also makes it confusing to compare this library to other Myers diff implementations or to apply optimizations suggested in Myers's paper (or elsewhere).
This PR rewrites the algorithm to better match the paper (though it introduces a hack, noted with a TODO, to preserve the incorrect tiebreak behaviour for now; fixing that is a breaking change and so will go out in a later release).