Skip to content

Commit

Permalink
Sort out behaviour of newlineIsToken and ignoreWhitespace (#486)
Browse files Browse the repository at this point in the history
* Sort out behaviour of newlineIsToken and ignoreWhitespace

* Fix linting errors I just introduced

* Fix broken super call

* Update failing test to reflect changes

* Update passing test to more completely test the new behaviour

* Fix a gibberish test name

* Add test of newlineIsToken/ignoreWhitespace compatibility

* Add tests that demonstrate a bug in the ignoreWhitespace/newlineIsToken compat

* Document quirk of using the two options together

* Add release notes

* Fix comment typo
  • Loading branch information
ExplodingCabbage authored Feb 15, 2024
1 parent fc2e36d commit 5f9cd41
Show file tree
Hide file tree
Showing 4 changed files with 57 additions and 16 deletions.
10 changes: 2 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -48,18 +48,12 @@ Broadly, jsdiff's diff functions all take an old text and a new text and perform
* `Diff.diffLines(oldStr, newStr[, options])` - diffs two blocks of text, treating each line as a token.

Options
* `ignoreWhitespace`: `true` to strip all leading and trailing whitespace characters from each line before performing the diff. Defaults to `false`.
* `ignoreWhitespace`: `true` to ignore leading and trailing whitespace characters when checking if two lines are equal. Defaults to `false`.
* `stripTrailingCr`: `true` to remove all trailing CR (`\r`) characters before performing the diff. Defaults to `false`.
This helps to get a useful diff when diffing UNIX text files against Windows text files.
* `newlineIsToken`: `true` to treat the newline character at the end of each line as its own token. This allows for changes to the newline structure to occur independently of the line content and to be treated as such. In general this is the more human friendly form of `diffLines`; the default behavior with this option turned off is better suited for patches and other computer friendly output. Defaults to `false`.

Returns a list of [change objects](#change-objects).

* `Diff.diffTrimmedLines(oldStr, newStr[, options])` - diffs two blocks of text, comparing line by line, after stripping leading and trailing whitespace. Equivalent to calling `diffLines` with `ignoreWhitespace: true`.

Options
* `stripTrailingCr`: Same as in `diffLines`. Defaults to `false`.
* `newlineIsToken`: Same as in `diffLines`. Defaults to `false`.
Note that while using `ignoreWhitespace` in combination with `newlineIsToken` is not an error, results may not be as expected. With `ignoreWhitespace: true` and `newlineIsToken: false`, changing a completely empty line to contain some spaces is treated as a non-change, but with `ignoreWhitespace: true` and `newlineIsToken: true`, it is treated as an insertion. This is because the content of a completely blank line is not a token at all in `newlineIsToken` mode.

Returns a list of [change objects](#change-objects).

Expand Down
1 change: 1 addition & 0 deletions release-notes.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
- [#460](https://github.com/kpdecker/jsdiff/pull/460) Added `oneChangePerToken` option.
- [#467](https://github.com/kpdecker/jsdiff/pull/467) When passing a `comparator(left, right)` to `diffArrays`, values from the old array will now consistently be passed as the first argument (`left`) and values from the new array as the second argument (`right`). Previously this was almost (but not quite) always the other way round.
- [#480](https://github.com/kpdecker/jsdiff/pull/480) Passing `maxEditLength` to `createPatch` & `createTwoFilesPatch` now works properly (i.e. returns undefined if the max edit distance is exceeded; previous behavior was to crash with a `TypeError` if the edit distance was exceeded).
- [#486](https://github.com/kpdecker/jsdiff/pull/486) The `ignoreWhitespace` option of `diffLines` behaves more sensibly now. `value`s in returned change objects now include leading/trailing whitespace even when `ignoreWhitespace` is used, just like how with `ignoreCase` the `value`s still reflect the case of one of the original texts instead of being all-lowercase. `ignoreWhitespace` is also now compatible with `newlineIsToken`. Finally, `diffTrimmedLines` is deprecated (and removed from the docs) in favour of using `diffLines` with `ignoreWhitespace: true`; the two are, and always have been, equivalent.

## v5.2.0

Expand Down
29 changes: 26 additions & 3 deletions src/diff/line.js
Original file line number Diff line number Diff line change
Expand Up @@ -23,17 +23,40 @@ lineDiff.tokenize = function(value) {
if (i % 2 && !this.options.newlineIsToken) {
retLines[retLines.length - 1] += line;
} else {
if (this.options.ignoreWhitespace) {
line = line.trim();
}
retLines.push(line);
}
}

return retLines;
};

lineDiff.equals = function(left, right) {
// If we're ignoring whitespace, we need to normalise lines by stripping
// whitespace before checking equality. (This has an annoying interaction
// with newlineIsToken that requires special handling: if newlines get their
// own token, then we DON'T want to trim the *newline* tokens down to empty
// strings, since this would cause us to treat whitespace-only line content
// as equal to a separator between lines, which would be weird and
// inconsistent with the documented behavior of the options.)
if (this.options.ignoreWhitespace) {
if (!this.options.newlineIsToken || !left.includes('\n')) {
left = left.trim();
}
if (!this.options.newlineIsToken || !right.includes('\n')) {
right = right.trim();
}
}
return Diff.prototype.equals.call(this, left, right);
};

export function diffLines(oldStr, newStr, callback) { return lineDiff.diff(oldStr, newStr, callback); }

// Kept for backwards compatibility. This is a rather arbitrary wrapper method
// that just calls `diffLines` with `ignoreWhitespace: true`. It's confusing to
// have two ways to do exactly the same thing in the API, so we no longer
// document this one (library users should explicitly use `diffLines` with
// `ignoreWhitespace: true` instead) but we keep it around to maintain
// compatibility with code that used old versions.
export function diffTrimmedLines(oldStr, newStr, callback) {
let options = generateOptions(callback, {ignoreWhitespace: true});
return lineDiff.diff(oldStr, newStr, options);
Expand Down
33 changes: 28 additions & 5 deletions test/diff/line.js
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ describe('diff/line', function() {
'line\nnew value\nline');
expect(convertChangesToXML(diffResult)).to.equal('line\n<del>old value\n</del><ins>new value\n</ins>line');
});
it('should the same lines in diff', function() {
it('should treat identical lines as equal', function() {
const diffResult = diffLines(
'line\nvalue\nline',
'line\nvalue\nline');
Expand Down Expand Up @@ -128,7 +128,7 @@ describe('diff/line', function() {
'line\nnew value\nline');
expect(convertChangesToXML(diffResult)).to.equal('line\n<del>old value\n</del><ins>new value\n</ins>line');
});
it('should the same lines in diff', function() {
it('should treat identical lines as equal', function() {
const diffResult = diffTrimmedLines(
'line\nvalue\nline',
'line\nvalue\nline');
Expand All @@ -138,16 +138,39 @@ describe('diff/line', function() {
it('should ignore leading and trailing whitespace', function() {
const diffResult = diffTrimmedLines(
'line\nvalue \nline',
'line\nvalue\nline');
expect(convertChangesToXML(diffResult)).to.equal('line\nvalue\nline');
'line \nvalue\nline');
expect(convertChangesToXML(diffResult)).to.equal('line \nvalue\nline');
});

it('should not consider adding whitespace to an empty line an insertion', function() {
const diffResult = diffTrimmedLines('foo\n\nbar', 'foo\n \nbar');
expect(convertChangesToXML(diffResult)).to.equal('foo\n \nbar');
});

it('should handle windows line endings', function() {
const diffResult = diffTrimmedLines(
'line\r\nold value \r\nline',
'line\r\nnew value\r\nline');
expect(convertChangesToXML(diffResult)).to.equal('line\r\n<del>old value\r\n</del><ins>new value\r\n</ins>line');
expect(convertChangesToXML(diffResult)).to.equal('line\r\n<del>old value \r\n</del><ins>new value\r\n</ins>line');
});

it('should be compatible with newlineIsToken', function() {
const diffResult = diffTrimmedLines(
'line1\nline2\n \nline4\n \n',
'line1\nline2\n\n\nline4\n \n',
{newlineIsToken: true}
);
expect(convertChangesToXML(diffResult)).to.equal('line1\nline2\n<del> </del>\n<ins>\n</ins>line4\n \n');
});

it(
'should not consider adding whitespace to an empty line an insertion ' +
'even in newlineIsToken mode where a token may be an empty string',
function() {
const diffResult = diffTrimmedLines('foo\n\nbar', 'foo\n \nbar', {newlineIsToken: true});
expect(convertChangesToXML(diffResult)).to.equal('foo\n \nbar');
}
);
});

describe('#diffLinesNL', function() {
Expand Down

0 comments on commit 5f9cd41

Please sign in to comment.