
Performance Investigation #115

Closed
bd82 opened this issue Jan 8, 2019 · 2 comments


bd82 commented Jan 8, 2019

Context

The early performance numbers of the parser were not stellar:
on a modern, powerful laptop, parsing the java-design-patterns repo took about 4 seconds.

  • That repo contains about 28K lines of Java code (net).

7K lines per second is actually not that bad for the prettier scenario,
where normally only the files which have changed (staged in git) are incrementally re-printed.

  • Consider that the average Java file takes only a few milliseconds to parse, which would not be noticeable to a human user.

However, those are not good performance numbers relative to the previous parser implementation (which was ~3x faster), nor compared to other Chevrotain based parsers. For example, the ECMAScript parser example runs at over 100K LOC per second.

Causes

The main culprit is the use of backtracking to keep the grammar as aligned as possible
with the official specification's style.
In some cases backtracking is required to resolve ambiguities regardless of the need to align with the spec, e.g. lambda expression vs cast expression vs parenthesized expression.
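To illustrate why that ambiguity forces backtracking: all three constructs start with `(`, so no fixed-length lookahead can pick the right alternative. The sketch below is a hypothetical, heavily simplified hand-rolled parser (not the actual java-parser or Chevrotain code): it saves the token position, tries each alternative in turn, and rewinds on failure.

```javascript
// Hypothetical sketch: backtracking over three "("-prefixed alternatives.
// None of these rule names come from the real java-parser code base.
function classify(tokens) {
  let pos = 0;

  // Attempt one alternative; on any parse error, rewind and report failure.
  function attempt(rule) {
    const mark = pos;
    try { rule(); return true; }
    catch (e) { pos = mark; return false; }
  }

  const expect = (t) => {
    if (tokens[pos] !== t) throw new Error(`expected ${t}`);
    pos++;
  };
  const ident = () => {
    if (!/^[A-Za-z]\w*$/.test(tokens[pos] ?? "")) throw new Error("expected identifier");
    pos++;
  };

  // Deliberately toy-sized grammar rules, just enough to show the shapes:
  const lambda = () => { expect("("); ident(); expect(")"); expect("->"); ident(); };
  const cast = () => { expect("("); ident(); expect(")"); ident(); };
  const paren = () => { expect("("); ident(); expect(")"); };

  if (attempt(lambda)) return "lambda";
  if (attempt(cast)) return "cast";
  if (attempt(paren)) return "parenthesized";
  throw new Error("no alternative matched");
}
```

The cost is visible in the structure: for an input that is really just `(x)`, the parser fully attempts (and discards) the lambda and cast alternatives before the cheap one succeeds, which is exactly the overhead the issue describes.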

Current Status

Several quick wins have been applied recently: optimizations focused on reducing the use of backtracking
or "fast failing" during backtracking.

This has already more than doubled the performance, to 16-17K LOC per second,
and I am confident more optimizations are possible.
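One way to picture the "fast failing" idea: instead of paying for a full backtracking attempt, scan ahead for a cheap, distinguishing token and bail out immediately when it is absent. The sketch below is a hypothetical guard (again, not the actual java-parser code): for the lambda-vs-cast ambiguity, an `->` right after the matching `)` settles the question without parsing either alternative.

```javascript
// Hypothetical "fast fail" guard: decide lambda vs non-lambda by scanning
// for "->" after the ")" that balances the opening "(", in linear time,
// instead of fully parsing and discarding an alternative.
function looksLikeLambda(tokens, start) {
  if (tokens[start] !== "(") return false;
  let depth = 0;
  for (let i = start; i < tokens.length; i++) {
    if (tokens[i] === "(") depth++;
    else if (tokens[i] === ")") {
      depth--;
      // Found the ")" closing the opening "(": lambda iff "->" follows.
      if (depth === 0) return tokens[i + 1] === "->";
    }
  }
  return false; // unbalanced parentheses: cannot be a lambda header
}
```

A guard like this runs before (or instead of) the expensive backtracking attempt, so the common non-lambda case never pays the full parse-and-discard cost.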

So this is not going to be the fastest Java parser in the world, partly
due to the design goal of aligning closely with the spec's style, and partly because the Java
grammar is not LL(k) while Chevrotain is at its core an LL(k) parser library.

Still, it should be more than fast enough for the prettier scenario, and for most
other scenarios that do not need to process millions of lines of code at once.


bd82 commented Jan 8, 2019

I will keep the issue open to update it with future performance related work.

bd82 added a commit that referenced this issue Jan 8, 2019
Avoids another case of backtracking in a hotspot.
Gained another ~20%+ performance boost.
Now at ~20K LOC per second on the benchmark.

Part Of #115
bd82 added a commit that referenced this issue Jan 8, 2019
- Avoid backtracking during the parsing of annotations.

Part of #115

bd82 commented Feb 14, 2019

Let's re-open this if/when there is actual future performance work 😄

@bd82 bd82 closed this as completed Feb 14, 2019
@bd82 bd82 mentioned this issue Mar 27, 2019