Slow pandas completion #520

Closed
dpavlic opened this issue Dec 14, 2014 · 7 comments
Labels: database-index (Needs a database index/Rewrite in Rust (#1059)), performance

Comments

@dpavlic

dpavlic commented Dec 14, 2014

With pandas 0.15.1, I'm getting significant lag in Emacs (company-mode with the anaconda backend), while in Vim (YouCompleteMe) completions seem to break entirely: nothing completes anymore for a long, long time. Here is some example code:

import pandas as pd
a = pd.DataFrame({'v1': ['a', 'b', 'c']})
a.v1.cat.

At this point the lag in Emacs becomes unbearable. Vim will happily continue to work, but YouCompleteMe completions stop working altogether (including completions for other libraries).

Now, it seems the issue is that getting the completion candidates is a costly affair in Jedi itself:

import jedi
%timeit jedi.Script("import pandas as pd; a = pd.DataFrame({'v1': ['a','b','c']}); a.v1.cat.").completions()

Which gives me:

1 loops, best of 3: 2.66 s per loop

A similar timeit run on os.path. completion gives me 9 ms, so it seems to me that Jedi's handling of pandas might be the source of the problem.
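For anyone who wants to reproduce the comparison outside IPython, here is a minimal stdlib-only sketch. The helper names (`best_of`, `complete_with_jedi`) are made up for illustration. It assumes that newer Jedi releases expose `Script.complete()` and falls back to the `completions()` call used above for older ones; the Jedi part is skipped if Jedi isn't installed.

```python
import timeit


def best_of(func, repeat=3, number=1):
    """Return the best per-call wall time in seconds, similar to %timeit."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


def complete_with_jedi(source):
    """Return completion names at the end of `source`, or None without Jedi."""
    try:
        import jedi
    except ImportError:
        return None
    script = jedi.Script(source)
    # Newer Jedi has complete(); older releases only have completions().
    complete = getattr(script, "complete", None) or script.completions
    return [c.name for c in complete()]


PANDAS_SOURCE = (
    "import pandas as pd\n"
    "a = pd.DataFrame({'v1': ['a', 'b', 'c']})\n"
    "a.v1.cat."
)
OS_SOURCE = "import os\nos.path."

if __name__ == "__main__":
    for label, source in [("pandas", PANDAS_SOURCE), ("os.path", OS_SOURCE)]:
        if complete_with_jedi(source) is None:
            print("jedi is not installed; skipping", label)
            continue
        seconds = best_of(lambda: complete_with_jedi(source))
        print(f"{label}: {seconds:.3f} s per call")
```

The check for Jedi also acts as a warm-up call, so the timed runs measure repeated completions rather than the first import of pandas.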

@davidhalter
Owner

Thanks for the report!

@tomsheep

tomsheep commented Jun 3, 2015

Any update? I came across the same issue with Pandas 0.16.0 + MacOSX 10.10

@davidhalter
Owner

@FrankFeng YouCompleteMe uses Jedi with an async client/server model. This obviously makes it "fast", but some things will not complete.

@tomsheep Not really. I don't really spend a lot of time on Jedi at the moment, sorry.

@davidhalter
Owner

Still that bad with the current dev branch?

@davidhalter
Owner

davidhalter commented Jan 5, 2020

I finally had time to understand what the problem was here.

I brought this example down from a few seconds to 0.02s. The slowest thing (DataFrame completions, all types and docstrings with signatures) went from more than 10s to 0.25s. I feel like this is good enough.

The changes are kind of a "PyCharm mode". I just disabled a lot of dynamic features for pandas that are problematic, like decorators and parameter resolving (**kwargs to what they effectively are), as well as ignoring __getattr__ and if branching. These features are nice, but don't really provide any value for pandas (and other libraries like numpy/tensorflow), so I just disabled them for some big libraries. I called it "PyCharm mode", because those are all things PyCharm always ignores, which is a bad thing in my opinion, but makes perfect sense for those libraries.
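To make the list of disabled features concrete, here is a toy module with one example of each: a decorator, `**kwargs` forwarding, `__getattr__`, and `if` branching. This is an illustrative sketch only. The names (`logged`, `build`, `make_frame`, `Lazy`, `HAS_FAST_PATH`) are made up, and nothing here is taken from pandas or from Jedi's implementation. In each case, the parameters or attributes can only be known by following extra code, which is the work a completion engine takes on when it supports these features.

```python
import functools


def logged(func):
    # Decorator: a completer must look through `wrapper` to find `func`.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def build(rows=0, columns=0):
    return {"rows": rows, "columns": columns}


@logged
def make_frame(**kwargs):
    # **kwargs forwarding: the effective parameters are those of `build`.
    return build(**kwargs)


class Lazy:
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # __getattr__: attributes exist only at runtime, via delegation.
        return getattr(self._target, name)


HAS_FAST_PATH = False

if HAS_FAST_PATH:
    def describe(frame):
        return "fast"
else:
    # if branching: which `describe` is defined depends on a runtime flag.
    def describe(frame):
        return "slow: %d rows" % frame["rows"]


frame = make_frame(rows=3, columns=1)
print(describe(frame))      # slow: 3 rows
print(Lazy("abc").upper())  # ABC
```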

Feel free to test already on master, before release.

Further improvements might happen for #1116, and to be really fast we might eventually need to do #1059.
