Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of attribute removal #4504

Closed
mlewand opened this issue Apr 3, 2019 · 17 comments
Closed

Improve performance of attribute removal #4504

mlewand opened this issue Apr 3, 2019 · 17 comments
Assignees
Labels
package:engine support:2 An issue reported by a commercially licensed client. type:bug This issue reports a buggy (incorrect) behavior.

Comments

@mlewand
Copy link
Contributor

mlewand commented Apr 3, 2019

While working on remove formatting (ckeditor/ckeditor5-remove-format#7) we found that removing attributes from relatively simply code takes too much time.

image

For repro case see ckeditor/ckeditor5-remove-format#7.

@Reinmar
Copy link
Member

Reinmar commented Apr 3, 2019

How much content did you test?

Because I tested removing styles (quite a lot of them – I used a duplicated content from the remove-formatting.md docs page) from 50 paragraphs:

image

It took 225ms to remove all the styles. I don't think this is a problem because those 50 paragraphs represent are not a short piece:

image

@Reinmar
Copy link
Member

Reinmar commented Apr 3, 2019

OK, I can see that you used the content from https://github.com/ckeditor/ckeditor5-remove-format/issues/7.

Sounds for me like an edge case. This content is causing some turbo deopts, but it does not look like a real content. We definitely have a problem, but I'd check with some real-life scenarios (copy-paste from other websites/word/etc) and prioritise it based on the results from these tests.

@Reinmar
Copy link
Member

Reinmar commented Apr 3, 2019

Or maybe this is actually a pretty realistic scenario...

@scofalik
Copy link
Contributor

scofalik commented Apr 3, 2019

The question would be whether it's simply a lot of operations to make or if attribute operations are slower than other.

This could be affected by text nodes splitting and joining after each operation.

@Reinmar
Copy link
Member

Reinmar commented Apr 3, 2019

OK, I found the main culprit. It was actually quite funny. I reduced the execution time from 34s to 1.5s which still isn't great, but it's much better.

BEFORE:

image

AFTER:

image

The issue was this bit of code in ImageLoadObserver:

		viewRoot.on( 'change:children', ( evt, node ) => {
			// Wait for the render to be sure that `<img/>` elements are rendered in the DOM root.
			this.view.once( 'render', () => this._updateObservedElements( domRoot, node ) );
		} );

Why is it a problem? It's actually 3 problems:

  1. We add a listener on every child change – and in this case, we have thousands of these.
  2. We use once() so on every #render we now need to remove those thousands callbacks. This isn't cheap too.
  3. We call the _updateObservedElements() function thousands times too.

We need to rewrite this observer completely. The first thing that came to my mind was – can we use event capturing for image#load?

The next thing to look into would be _getRawTag() (which is the most significant position in the "self time" category) which is actually a lodash's function so there's a big chance we don't need to call it at all. Most likely we use some overcomplicated (for our case) lodash function instead of doing something simple with our own code.

@oleq oleq changed the title Improve performance for attribute removing Improve performance for attribute removal Apr 3, 2019
@oleq oleq changed the title Improve performance for attribute removal Improve performance of attribute removal Apr 3, 2019
@jodator
Copy link
Contributor

jodator commented Apr 4, 2019

I've found out that the longer the text is in single paragraph the longer the method take. This is counterintuitive to me as I's suspect the same number of operations - but maybe there's something more going on which is dependent on the text length not the number of attributes.

@scofalik
Copy link
Contributor

The same happens the other way around too -- when you add multiple attributes at the same time. This is a case in track changes. I debugged it and got a very similar performance chart.

@mlewand mlewand transferred this issue from ckeditor/ckeditor5-engine Oct 9, 2019
@mlewand mlewand added this to the next milestone Oct 9, 2019
@mlewand mlewand added status:confirmed type:bug This issue reports a buggy (incorrect) behavior. package:engine labels Oct 9, 2019
@Reinmar Reinmar added the support:2 An issue reported by a commercially licensed client. label Oct 29, 2019
@Reinmar Reinmar modified the milestones: next, iteration 28 Oct 29, 2019
@Reinmar
Copy link
Member

Reinmar commented Nov 20, 2019

Unfortunately, ckeditor/ckeditor5-image#335 does not resolve this issue completely. The performance is still fundamentally wrong.

I did some profiling and out of 8s that the whole thing takes, 3.3s are spent in parseAttributes() in the view element constructor. That'd be the first thing to check because we can gain the most here. But it'd still not be all.

Here are stats of view element constructor calls, split per element name. In the arrays I stored the attributes param of the constructor:

image

As you can see, in summary, the constructor is called 70k times 🤯

👉 70.000 times 👈

Most of the time, in this case, with undefined as attributes. Hence... perhaps we can add a condition for it.

@Reinmar
Copy link
Member

Reinmar commented Nov 20, 2019

Most of the time, in this case, with undefined as attributes. Hence... perhaps we can add a condition for it.

Nope. The number of times we call this with undefined is actually around 500 compared to 70k with some empty map/object. We could still find ways to optimize this locally but I think that we should figure out why that many elements are created and whether we can prevent that.

Reinmar added a commit to ckeditor/ckeditor5-image that referenced this issue Nov 20, 2019
Fix: Improved markup operation performance of the editor with the image plugin enabled. See ckeditor/ckeditor5#4504.
@Reinmar
Copy link
Member

Reinmar commented Nov 20, 2019

I closed ckeditor/ckeditor5-image#335 but I think we should keep this issue opened. I think that a reasonable time for this content is around 1s. So there's still 80% to go.

@Reinmar
Copy link
Member

Reinmar commented Nov 20, 2019

I extracted the image load observer to #5766 and I'd remove this ticket from the iteration as there's more to do here.

@Reinmar Reinmar modified the milestones: iteration 28, next Nov 20, 2019
@Reinmar Reinmar reopened this Nov 20, 2019
@Reinmar
Copy link
Member

Reinmar commented Nov 27, 2019

I was testing loading a quite long content (50 pages) into the editor and I can also see the getRawTag there:

image

Interesting if it would be possible to optimise that bit 🤔

@Reinmar
Copy link
Member

Reinmar commented Nov 27, 2019

In this case, there are 30k model nodes (10k elements and 20k text nodes) created, even though the final document has 3k nodes. Again, we're executing toMap() just too many times.

@Reinmar
Copy link
Member

Reinmar commented Nov 28, 2019

Okeydokey – I was able to take toMap()'s execution time from something around 20-30% of the total task's time to close to 0%. Details in #5854.

However, we should still check if we can save some time here. Unfortunately, in both cases (remove format and data loading) this will not be a micro-optimization any more (unless someone knows how to shove some time off the Position#parent getter). This time we have to figure out:

  • how to do fewer things (e.g. create fewer elements),
  • how to not do some things at all.

@Reinmar
Copy link
Member

Reinmar commented Nov 28, 2019

Actually, I'm wrong, there's still more to do in here (will affect only remove formatting case):

image

This is parseAttributes() from view element and it also does the isPlainObject() thing. We can change the condition there too.

@Reinmar Reinmar self-assigned this Nov 28, 2019
@jodator
Copy link
Contributor

jodator commented Nov 28, 2019

@Reinmar would you mind creating a manual sample similar to the ones I've created for the memory leak tests?

  • It might be just one editor build with some complex content to parse.
  • to make tests easier it could initialize the editor on a button (to setup performance test in dev tools)
  • it should have as many features (in the editor and in loaded content) to test in some full setup
  • it could also load the same content to different setups (like current Classic and some with many features enabled) so one button will tests subset of features from the content the other would test all of them.

This would help to manually test various improvements and compare results in the team.

@Reinmar Reinmar modified the milestones: next, backlog Dec 2, 2019
@Reinmar
Copy link
Member

Reinmar commented May 5, 2020

As I commented in #2336 (comment). We reduced the time over 10x. NICE :)

@Reinmar Reinmar closed this as completed May 5, 2020
@Reinmar Reinmar removed this from the backlog milestone May 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
package:engine support:2 An issue reported by a commercially licensed client. type:bug This issue reports a buggy (incorrect) behavior.
Projects
None yet
Development

No branches or pull requests

4 participants