Introduced fastDiff() #238

f1ames · 2018-04-16T12:28:01Z

Suggested merge commit message (convention)

Other: Introduced fastDiff diffing function. Closes ckeditor/ckeditor5#5000.

Additional information

See ckeditor/ckeditor5#5000 for more details.

coveralls · 2018-04-16T12:33:20Z

Coverage remained the same at 100.0% when pulling 47b2c4f on t/235 into 2974f62 on master.

scofalik · 2018-04-20T13:04:24Z

src/fastdiff.js

+ * Finds position of the first and last change in the given strings and generates set of changes. Set of changes
+ * can be applied to the input text in order to transform it into the output text, for example:
+ *
+ *		let input = '12abc3';


I'd add at least two more, basic examples. One example: adding a few consecutive characters somewhere in the string. Second example: removing a few consecutive characters from the string. The example you provided is actually kind of an edge case, where characters were removed from the beginning and from the end. So, the output might actually be surprising for someone.

Other than that, I am not sure about the "usage" snippet. I have a feeling that the example is a bit uninspiring. Even though it shows how to use the output, somebody would think "why would I want to do it?". I have a feeling that we might want to drop this snippet.

I'd also shorten the examples:

* (...) for example:
*
* fastDiff( '12a', '12xyza' ); // [ { ... } ]
* fastDiff( '12a', '12aa' ); // [ { ... } ]
* fastDiff( '12xyza', '12a' ); // [ { ... } ]
* fastDiff( '12aa', '12a' ); // [ { ... } ]

If the output is too long for the comment in the same line, it can be above or below starting with "Following produces:" and with extra line between each example.
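To make the four basic cases above concrete, here is a minimal sketch of a prefix/suffix diff. `sketchFastDiff` is a hypothetical stand-in written for this illustration, not the code from this PR, and the shape of the change objects (`index`, `type`, `values`, `howMany`) is an assumption.

```javascript
// Hypothetical stand-in, NOT the function under review.
// The change object shape is an assumption made for this sketch.
function sketchFastDiff( oldText, newText ) {
	// Skip the common prefix.
	let start = 0;

	while ( start < oldText.length && start < newText.length && oldText[ start ] === newText[ start ] ) {
		start++;
	}

	// Skip the common suffix without running into the prefix.
	let oldEnd = oldText.length;
	let newEnd = newText.length;

	while ( oldEnd > start && newEnd > start && oldText[ oldEnd - 1 ] === newText[ newEnd - 1 ] ) {
		oldEnd--;
		newEnd--;
	}

	const changes = [];

	if ( newEnd > start ) {
		changes.push( { index: start, type: 'insert', values: newText.slice( start, newEnd ).split( '' ) } );
	}

	if ( oldEnd > start ) {
		// The deletion index is shifted by the length of the text inserted just before it.
		changes.push( { index: newEnd, type: 'delete', howMany: oldEnd - start } );
	}

	return changes;
}

console.log( sketchFastDiff( '12a', '12xyza' ) ); // [ { index: 2, type: 'insert', values: [ 'x', 'y', 'z' ] } ]
console.log( sketchFastDiff( '12a', '12aa' ) );   // [ { index: 3, type: 'insert', values: [ 'a' ] } ]
console.log( sketchFastDiff( '12xyza', '12a' ) ); // [ { index: 2, type: 'delete', howMany: 3 } ]
console.log( sketchFastDiff( '12aa', '12a' ) );   // [ { index: 3, type: 'delete', howMany: 1 } ]
```

In this sketch, identical strings produce an empty array, because the prefix scan consumes both strings entirely.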

scofalik · 2018-04-20T13:34:54Z

src/fastdiff.js

+// The above indexes means that in `oldText` modified part is `1[23]4` and in the `newText` it is `1[342]4`.
+// Based on such indexes, array with `insert`/`delete` operations which allows transforming
+// old text to the new one could be generated.
+//


Does it return correct values if strings are the same? If not, then I'd add a note that it is assumed that the strings are different.

scofalik · 2018-04-20T13:36:59Z

src/fastdiff.js

+	// If not found, it means first change is at the end of the string.
+	let firstIndex = oldTextLength;
+	for ( let i = 0; i < oldTextLength; i++ ) {
+		if ( i >= newTextLength || oldText[ i ] !== newText[ i ] ) {


i >= newTextLength should not be needed. If it is true, then newText[ i ] == undefined so surely the second condition will be met.
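A quick sketch of why the bounds check is redundant: indexing a JavaScript string past its end yields `undefined`, so the strict comparison alone already reports a difference. The `firstDifference` helper below is hypothetical and only mirrors the loop from the diff hunk above.

```javascript
const oldText = 'abcd';
const newText = 'ab';

// Reading past the end of a string yields undefined rather than throwing.
console.log( newText[ 2 ] ); // undefined

// So the comparison is already true once `i` runs past the end of `newText`.
console.log( oldText[ 2 ] !== newText[ 2 ] ); // true

// Hypothetical helper mirroring the loop under review, without the extra bounds check.
function firstDifference( a, b ) {
	for ( let i = 0; i < a.length; i++ ) {
		if ( a[ i ] !== b[ i ] ) {
			return i;
		}
	}

	// If not found, the first change is at the end of `a`.
	return a.length;
}

console.log( firstDifference( 'abcd', 'ab' ) ); // 2
console.log( firstDifference( 'ab', 'abcd' ) ); // 2
```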

scofalik · 2018-04-20T14:31:34Z

src/fastdiff.js

+		// oldText: '321ba' -> '21ba' -> 'ab12'
+		// newText: '31ba'  -> '1ba'  -> 'ab1'
+		// { firstIndex: 1, lastIndexOld: 2, lastIndexNew: 1 }
+		if ( i >= newTextReversedLength ) {


Those additional conditions are probably not needed either.

Well, unfortunately, they are needed, because otherwise the calculations in the last if get messed up. It matters whether the texts differ because of an actually different letter or because one of them is shorter.

scofalik · 2018-04-20T14:37:33Z

I've spent some time analyzing this code, because it looked to me that the fastDiff function is too complicated. There are many variables initialized, strings are reversed and cut, and you quickly lose track of what's what and why.

So, the algorithm itself cannot be simpler. Of course, finding the first change is easy. Finding the last change is more difficult. Cutting and reversing the string leads in fact to the simpler implementation (you could do the same using proper offsets but it would be even worse).
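The cut-and-reverse idea described above can be sketched roughly as follows. This is an illustration under assumptions, not the code pushed to this PR: the helper names are hypothetical, and returning `-1` for identical strings is a choice made for this sketch only.

```javascript
// Hypothetical helper: index of the first position where `a` and `b` differ, or -1 when identical.
function firstDifferenceIndex( a, b ) {
	const limit = Math.max( a.length, b.length );

	for ( let i = 0; i < limit; i++ ) {
		if ( a[ i ] !== b[ i ] ) {
			return i;
		}
	}

	return -1;
}

// Drops the first `howMany` characters (the common prefix) and reverses the rest.
function cutAndReverse( text, howMany ) {
	return text.slice( howMany ).split( '' ).reverse().join( '' );
}

function sketchChangeBoundaries( oldText, newText ) {
	const firstIndex = firstDifferenceIndex( oldText, newText );

	// Assumption of this sketch: identical strings are reported with -1 everywhere.
	if ( firstIndex === -1 ) {
		return { firstIndex: -1, lastIndexOld: -1, lastIndexNew: -1 };
	}

	// Finding the last change becomes another "find the first change" on the reversed remainders.
	const lastIndex = firstDifferenceIndex(
		cutAndReverse( oldText, firstIndex ),
		cutAndReverse( newText, firstIndex )
	);

	return {
		firstIndex,
		lastIndexOld: oldText.length - lastIndex,
		lastIndexNew: newText.length - lastIndex
	};
}

console.log( sketchChangeBoundaries( '1234', '13424' ) ); // { firstIndex: 1, lastIndexOld: 3, lastIndexNew: 4 }
console.log( sketchChangeBoundaries( '321ba', '31ba' ) ); // { firstIndex: 1, lastIndexOld: 2, lastIndexNew: 1 }
```

Cutting off the common prefix first keeps the reversed scan from walking back into characters already matched from the front, which is what the offset bookkeeping would otherwise have to guard against.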

However, I think that it may be written in a simpler way. I'll propose something.

scofalik · 2018-04-20T16:24:32Z

I've pushed two commits.

In the first one, I only changed @f1ames solution a bit.

In the second one, I proposed a different solution. Since I am subjective on this matter, maybe @Reinmar could take a look and say which solution looks cleaner.

BTW. Because of an imprecise description in findChangeBoundaryIndexes, I thought that my algorithm should not work. The original documentation says that lastNewIndex and lastOldIndex are indexes of change, so there should be no difference between:

123 vs 123123 and 1234 vs 123123.

Because both of them differ on the third index, both of them should have lastOldIndex equal to 3. That meant that my solution needed an additional if, because it produced a different lastOldIndex in each scenario. Now imagine my shock when I saw that the tests failed with the special if and that they succeeded without it :D. So I looked at the docs, and even in the example from the description you can see that lastOldIndex and lastNewIndex are not indexes of change but rather the index of the last common character.
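The two scenarios can be checked with a small offset-based sketch. `sketchBoundaries` is a hypothetical helper written for this illustration and is not the code from either commit. Here the old-text boundary ends up at the position where the shared tail begins, or at the string length when there is no shared tail.

```javascript
// Hypothetical helper, NOT the PR code: boundaries computed with plain offsets.
function sketchBoundaries( oldText, newText ) {
	// Length of the common prefix.
	let firstIndex = 0;

	while (
		firstIndex < oldText.length &&
		firstIndex < newText.length &&
		oldText[ firstIndex ] === newText[ firstIndex ]
	) {
		firstIndex++;
	}

	// Length of the common suffix that does not overlap the prefix.
	let common = 0;

	while (
		oldText.length - common > firstIndex &&
		newText.length - common > firstIndex &&
		oldText[ oldText.length - 1 - common ] === newText[ newText.length - 1 - common ]
	) {
		common++;
	}

	return {
		firstIndex,
		lastIndexOld: oldText.length - common,
		lastIndexNew: newText.length - common
	};
}

// Both pairs first differ at index 3, yet the old-text boundary is not the same.
console.log( sketchBoundaries( '123', '123123' ) );  // { firstIndex: 3, lastIndexOld: 3, lastIndexNew: 6 }
console.log( sketchBoundaries( '1234', '123123' ) ); // { firstIndex: 3, lastIndexOld: 4, lastIndexNew: 6 }
```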

scofalik · 2018-04-20T16:29:32Z

BTW. I've also added two tests - one that I thought should fail for my algorithm, and the one that is used in the docs.

Reinmar · 2018-05-04T15:29:26Z

Since I am subjective on this matter, maybe @Reinmar could take a look and say which solution looks cleaner.

I'm sure you'll be able to figure out which one you like more, guys ;) You know far more about this code than I'll be able to learn now.

f1ames · 2018-05-09T10:26:56Z

src/fastdiff.js

+// @param {String} newText
+// @returns {Object}
+// @returns {Number} return.firstIndex Index of the first change in both strings (always the same for both).
+// @returns {Number} result.lastIndexOld Index of the last common character in `oldText` string looking from back.


I think it should be:

result.lastIndexOld Index of the last common character in oldText string.

or

result.lastIndexOld Index of the first common character in oldText string looking from back.

Because looking for the last common character starting from the back basically means you are looking for the first common character.

f1ames · 2018-05-09T11:02:05Z

I just added some docs improvements. TBH @scofalik, your refactor made it more readable and simplified it a little so I'm fine with it. Looks good IMHO.

Other than that, I am not sure about the "usage" snippet. I have a feeling that the example is a bit uninspiring. Even though it shows how to use the output, somebody would think "why would I want to do it?". I have a feeling that we might want to drop this snippet.

I see you left the snippet, was it on purpose? I wasn't sure either, but it gives a general picture of how the fastDiff output can be used, so maybe we could leave it there.
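For context, this is roughly the kind of usage being discussed: applying a set of changes to the input text to obtain the output text. It is a sketch only. `applyChanges` is a hypothetical helper, the change object shape is assumed, and the `changes` array is hand-written for the illustration rather than produced by the function under review.

```javascript
// Hypothetical helper: applies insert/delete changes to a string, in order.
// The change object shape (`index`, `type`, `values`, `howMany`) is an assumption.
function applyChanges( input, changes ) {
	let text = input;

	for ( const change of changes ) {
		if ( change.type === 'insert' ) {
			text = text.slice( 0, change.index ) + change.values.join( '' ) + text.slice( change.index );
		} else if ( change.type === 'delete' ) {
			text = text.slice( 0, change.index ) + text.slice( change.index + change.howMany );
		}
	}

	return text;
}

// Hand-written changes for this example: insert the new text first, then delete the old one.
const changes = [
	{ index: 0, type: 'insert', values: [ '2', 'a', 'b' ] },
	{ index: 3, type: 'delete', howMany: 6 }
];

console.log( applyChanges( '12abc3', changes ) ); // '2ab'
```

Each change is applied to the result of the previous one, which is why the deletion index accounts for the three characters inserted before it.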

Merged latest changes from master too.

I think it is ready for review. From my perspective it looks good, nothing to add here.

f1ames · 2018-05-09T11:14:48Z

As agreed with @scofalik F2F, I have merged the PR as it was ready 🎉

f1ames added 2 commits April 16, 2018 14:24

Introduced 'fastDiff' function.

4423c4c

Tests: 'fastDiff' unit tests.

2458006

Reinmar requested a review from scofalik April 20, 2018 11:02

scofalik reviewed Apr 20, 2018

Changed: Refactored fastDiff.

d867925

scofalik force-pushed the t/235 branch from eb563f9 to f72bf11 Compare April 20, 2018 16:28

scofalik added 2 commits April 20, 2018 18:37

Changed: Refactored fastDiff.

f72bf11

Docs: Improved docs.

f021fb1

Reinmar mentioned this pull request May 4, 2018

Use CharacterData.insertData and CharacterData.deleteData to render changes in text nodes ckeditor/ckeditor5-engine#1407

Merged

f1ames commented May 9, 2018

f1ames added 2 commits May 9, 2018 12:31

Docs adjustments.

e341cb6

Merge branch 'master' into t/235

47b2c4f

f1ames merged commit 81fefc9 into master May 9, 2018

f1ames deleted the t/235 branch May 9, 2018 11:14
