-
-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix performance issue due to search in tokens #210
fix performance issue due to search in tokens #210
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! Thanks a lot for your work!
Let's run the CI and check it out. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like, you've accidentally deleted pyproject.toml
77c34db
to
40cd990
Compare
omg... sorry for that. I'm using pip to install it locally, not poetry so I had to clean those files temporarily... |
40cd990
to
7e34ae4
Compare
7e34ae4
to
5a46fde
Compare
Alright, now it should be good. I've set it up with poetry locally. |
Hi, please re-run the pipeline when you have a chance ;) |
Done! |
Codecov Report
@@ Coverage Diff @@
## master #210 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 1 1
Lines 41 42 +1
Branches 5 7 +2
=========================================
+ Hits 41 42 +1
Continue to review full report at Codecov.
|
50f30fa
to
6a3ccfc
Compare
Note that consuming the file_tokens from flake8 made it slightly less trivial/efficient to ignore the file encoding |
Would you mind running the pipeline on it ? |
Let me know if there is anything else I should do or if it's fine as it is |
@bagerard I will return to this PR sometime soon! Please, stay tuned 🙂 |
I finally got the time to work on this, sorry for the long wait! |
|
||
### Features | ||
|
||
- Imrpoves performance on long files #210 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed! Thanks!
@bagerard thanks a lot for your work! 👍 |
Fixes #176
I looked into this and the issue was in the following block
self._tokens is actually containing the tokens from beginning of the file until the physical line yielded by flake8, this means that self._tokens is getting bigger and bigger every time flake8 invokes this plugin on the next lines, thus searching in self._tokens gets more and more expensive for large files as it's inefficient to search in large list by design (the large file is > 12000 lines so that makes a lot of tokens).
I switch the code to process the whole file instead of work line by line, that way it has to search for the comment token only once.
On the long python file attached in #176, this makes the processing time go from 30 sec to just 2 sec 🚀