Performance regression from 5.6.1 to 6.0.7 #1439
Comments
A nit, but to be clear: I think the regression was introduced in 6.0.
Is it the file system work or the Python module importing with the extensions? I thought it was the latter. The stat calls have always been around, because the notebook server does a lot of lstat calls to check file existence.
Instantiating the extensions to check their `enabled` attribute.
Ahh, yes, I see that …
The help string was just a quick way to demonstrate that …
I believe from searching the repo that …
Changing … The specific place we're seeing the issue is here: when the notebook template is rendered …
I also did some experimenting today and found that having …
Ahh, I see. Yes, outside of nbconvert, calling the function would trigger disk loads on each call. We could cache the result so it's only loaded once, but the first load would still be slow. I'm still a little surprised by the times you posted, which suggest there are more calls being made than expected in those class loads. The upgrade was from #1273, which allowed the download options to be disabled by application config. The intent was to read from the config and find which classes are enabled, so loading each class and all its disk attributes was somewhat unnecessary just to determine enabled state. Rebuilding the correct config hierarchy without loading the class would be a bit of a headache... maybe a shortcut could be taken. Another option here would be an environment variable or argument to disable the enablement check, so you could launch with the feature check disabled.
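The caching idea mentioned above could be sketched like this. Everything here is hypothetical: `_load_and_check_enabled` stands in for the real (slow) exporter class load, and the name list is a placeholder for the actual entry points.

```python
import functools

# Hypothetical stand-in for the expensive per-exporter check: in 6.x,
# each call imports and instantiates the exporter class from disk.
def _load_and_check_enabled(name):
    return name != "disabled_exporter"  # placeholder for the slow class load

@functools.lru_cache(maxsize=1)
def get_export_names():
    # Memoize the result so the slow scan of exporter classes runs only
    # once per process; only the first call (the first notebook load)
    # would pay the s3fs disk-access cost.
    registered = ("html", "latex", "script", "disabled_exporter")
    return tuple(n for n in registered if _load_and_check_enabled(n))
```

The trade-off is exactly the one raised in the next comment: the first call is still slow, which is why caching alone was not considered sufficient.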
Yeah with 30s+ load times on loading the first notebook we probably can't go with that.
FWIW this is kind of what I was thinking.
Shoot a PR along, happy to merge it.
This introduces an environment variable named `NBCONVERT_DISABLE_CONFIG_EXPORTERS` that will cause `get_export_names` to return all entrypoints instead of checking the "enabled" status of each one. Closes: jupyter#1439
* Allow `get_export_names` to skip configuration check

  This introduces an environment variable named `NBCONVERT_DISABLE_CONFIG_EXPORTERS` that will cause `get_export_names` to return all entrypoints instead of checking the "enabled" status of each one. Closes: #1439

* Update nbconvert/exporters/tests/test_exporter.py

Co-authored-by: Matt Riedemann <mriedem.os@gmail.com>
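The shape of the fix described in the commit message could look roughly like this. This is a simplified sketch, not the actual nbconvert code: `entrypoint_names` stands in for the registered `nbconvert.exporters` entry points, and `is_enabled` for the slow per-exporter config check.

```python
import os

def get_export_names(entrypoint_names, is_enabled):
    """Sketch of the escape hatch (hypothetical signature)."""
    if os.environ.get("NBCONVERT_DISABLE_CONFIG_EXPORTERS"):
        # Variable set: return every entry point name without loading
        # any exporter classes from disk.
        return sorted(entrypoint_names)
    # Default: keep the 6.x behavior of filtering by enabled status.
    return sorted(n for n in entrypoint_names if is_enabled(n))
```

The design choice is deliberate: the default behavior is unchanged, and users who hit the s3fs slowdown opt out by setting the variable before launching the server.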
Loading notebooks and running `jupyter nbconvert` commands became much slower after we upgraded from 5.6.1 to 6.x. Our notebook load times went from 400-500ms to 30s or more. Our user notebooks run on Kubernetes and use s3fs for the home directory.

Nbconvert version: 6.0.7
This was introduced in #1273. Previously `get_export_names` just returned the list of exporters in the `nbconvert.exporters` group. Now it loads and instantiates each one to check its `enabled` attribute. This causes a lot of stat calls and directory listings, which is very slow on s3fs.