Ensure that the SymPy version always remains up to date #26

Merged
merged 10 commits into from
Jan 20, 2025

@agriyakhetarpal agriyakhetarpal commented Jan 16, 2025

Description

This PR adds machinery to the shell's configuration to ensure that the SymPy version is always up to date. This change would become redundant with Pyodide 0.28 where package recipes will be unvendored from its runtime and multiple versions of a package will be available to install for a particular Pyodide version, but that release and the subsequent stability of that system are quite a few months away for us as of now – and hence this change should stay relevant till that time.

Changes made

  1. We download SymPy's wheel from PyPI and ship it with JupyterLite. This feature just ships the wheel and makes it available for installation – JupyterLite doesn't install it beforehand. We use the jupyter_lite_config.json file for this, as the CLI currently doesn't work (Ship additional Pyodide wheels at build time does not work jupyterlite/jupyterlite#1502).
  2. A script unvendor_tests_from_wheel.py has been added, based on pyodide-build.
  3. The index.html page generation script has been updated to automatically add the %pip install sympy command to the REPL initialisation code, so that SymPy is always up to date in the REPL.
  4. The documentation has been updated in accordance with these points. I initially designed this workflow to run in CI, but I think it makes sense to be able to try the change locally, too.
  5. The deployment workflow will now run on a scheduled basis daily so that the SymPy version can be grabbed and will be available to run on manual requests to fix things in case something breaks.
  6. The rest of the changes are purely cosmetic; in particular:
    • Added inline script metadata to both Python scripts so that we can use popular command-line runners to run the scripts without having to install dependencies in requirements.txt, which is used by JupyterLite to manage its environment
    • Some minor formatting changes (I ran black over the Python files)
    • Renamed generateindex.py to generate_index.py
    • Added a custom_wheels/ directory that is to be kept in version control

Plan of action

The version of SymPy in Pyodide 0.27.1 is 1.13.3 at the time of writing. When SymPy 1.x.y (where either x > 13 or y > 3 or both) gets released upstream, we will use pip download to place the wheel in custom_wheels/, and run unvendor_tests_from_wheel.py to remove all test-related files from the wheel (like how it is in the Pyodide distribution). Then, generate_index.py will notice that a wheel exists in that directory, and it will embed the %pip install sympy command in the REPL initialisation code. This will ensure that the SymPy version is always up to date.
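The last step of this plan can be sketched roughly as follows. This is only an illustration under assumptions: the helper name `build_repl_cells` and the placeholder import line are hypothetical and are not the actual contents of generate_index.py; only the custom_wheels/ directory and the %pip install sympy command come from this PR.

```python
import tempfile
from pathlib import Path


def build_repl_cells(wheel_dir: Path) -> list[str]:
    """Return REPL initialisation lines (hypothetical helper)."""
    cells = []
    # Embed the install command only if a SymPy wheel was shipped.
    if any(wheel_dir.glob("sympy-*.whl")):
        cells.append("%pip install sympy")
    cells.append("from sympy import *")  # placeholder initialisation code
    return cells


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        wheel_dir = Path(tmp) / "custom_wheels"
        wheel_dir.mkdir()
        print(build_repl_cells(wheel_dir))  # no wheel present yet
        (wheel_dir / "sympy-1.13.3-py3-none-any.whl").touch()
        print(build_repl_cells(wheel_dir))  # wheel present
```

Keeping the check on the file system means the page generator needs no knowledge of which version Pyodide bundles.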

Additional context

@agriyakhetarpal
Collaborator Author

Hi @oscarbenjamin, this enhancement might be of interest to you based on our recent correspondence. Could you please take a look when you have time? Thank you!

@agriyakhetarpal agriyakhetarpal changed the title [DRAFT]: Ensure that the SymPy version always remains up to date Ensure that the SymPy version always remains up to date Jan 16, 2025
@agriyakhetarpal agriyakhetarpal marked this pull request as ready for review January 16, 2025 06:25
Comment on lines 108 to 115
def main():
    pyodide_version = get_pyodide_lock()
    pypi_version = get_pypi_version()

    print(f"SymPy version in Pyodide: {pyodide_version}")
    print(f"Latest PyPI SymPy version: {pypi_version}")

    should_download = Version(pypi_version) > Version(pyodide_version)


I don't quite understand why all of this is needed. Would it not make sense to just run pip download sympy and always use the latest final release?

I'm not sure why the pyodide version would be relevant at all or is there some advantage in using it from pyodide rather than from PyPI?

Collaborator Author

That makes sense to me, I will refactor the script. TIL that a pip download command exists!

The idea is to add the %pip install sympy command only when it is needed, as it is redundant otherwise.


The idea is to add the %pip install sympy command only when it is needed, as it is redundant otherwise.

I don't understand when/where that command is being run. Is it running in the end users browser? Or is it just run locally as part of building the site?

Collaborator Author

The first one – the command runs in the end user's browser, when the REPL is instantiated.

@agriyakhetarpal
Collaborator Author

So, we now have two options:

A. 3bfc047, which goes halfway and adds the %pip install sympy command in the REPL conditionally based on the outdatedness of the SymPy version in Pyodide (and with no effect if Pyodide already has the same version).
B. a5a3cb2, which goes the full way and removes check_sympy_version.py entirely, so that we will always have the %pip install sympy command in the REPL.

Which one would you prefer? B is much nicer, I feel.

@oscarbenjamin

So, we now have two options:

Do these options have any implications for performance or anything else?

Generally my preference is to use the latest release of SymPy always and not worry about the version in pyodide.

@agriyakhetarpal
Collaborator Author

agriyakhetarpal commented Jan 16, 2025

So, we now have two options:

Do these options have any implications for performance or anything else?

Generally my preference is to use the latest release of SymPy always and not worry about the version in pyodide.

Note: please see the "Tip" admonition below for a TL;DR.

Good question – I don't think performance will be impacted at all, at least at a level that will be glaringly noticeable, as the SymPy wheel will be shipped alongside JupyterLite's static files and not downloaded externally from Pyodide's CDN. I did try to benchmark this in the dev console, and the installation for the wheel for Pyodide 0.26.4 (where I loaded SymPy from the local download) was almost instant, coming at 96.98 ms from time of pressing Enter without browser throttling enabled. Getting SymPy's wheel from the jsDelivr CDN took 161.27 ms. Fetching SymPy from PyPI and not locally took 299.53 ms.

With throttling enabled on a "Slow 4G" network, the local/hosted wheel took 583.83 ms to install, and the one from the CDN took 50.81 ms (not sure why, perhaps my request hit a cache – I wasn't able to do this experiment reliably). Fetching from PyPI on this throttled network took 45.17 ms, and I was unable to evict the cache because of some JupyterLite niceties/annoyances.

In all cases, mpmath was fetched from the CDN, so its effect can be nullified.

One thing to note is that Pyodide's SymPy wheel/recipe has the unvendor-tests: true option, so the wheel is smaller as many files are removed. However, we disable the wheel compression with the build frontend by setting the compression level to 0, as our wheels carry the application/wasm MIME type, which Cloudflare/jsDelivr compresses (dynamically, based on the file size and location, with one of gzip/zstd/Brotli) better than Python packaging tooling and PyPI can. Thus, while the size of Pyodide's SymPy wheel is ~17 MiB, the compression offered by the CDN is stronger than that of the PyPI wheel (which uses level 6). So, a download from jsDelivr ends up transferring less than half of the PyPI SymPy wheel size, which is 6.2 MiB at the time of writing, and the wheel is decompressed later on.
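To illustrate the trade-off described here, a minimal sketch using only the standard library follows. The file names and contents are synthetic stand-ins (not SymPy's), and gzip stands in for whatever compression a CDN applies in transit, so the actual numbers for real wheels will differ.

```python
import gzip
import io
import zipfile


def make_archive(files: dict[str, bytes], compression: int) -> bytes:
    """Build an in-memory zip (a wheel is a zip archive) with the given method."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


# Synthetic, repetitive stand-in for a package's source files.
files = {
    f"pkg/module_{i}.py": (f"def f_{i}(x):\n    return x + {i}\n" * 200).encode()
    for i in range(50)
}

stored = make_archive(files, zipfile.ZIP_STORED)      # no compression in the archive
deflated = make_archive(files, zipfile.ZIP_DEFLATED)  # each file compressed separately
served = gzip.compress(stored)                        # whole archive compressed in transit

print(
    f"stored: {len(stored)} B, deflated: {len(deflated)} B, "
    f"stored + gzip in transit: {len(served)} B"
)
```

The stored archive is the largest on disk, but a whole-file compressor sees redundancy across files and archive headers that per-file compression cannot exploit. How the last two numbers compare depends on the data and on the compressor the server uses.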

GitHub Pages will use the binary/octet-stream MIME type for the wheel file, and based on https://docs.github.com/en/pages/getting-started-with-github-pages/about-github-pages#mime-types-on-github-pages, I don't think it would be possible for us to customise it beforehand, and we will be limited to gzip compression/decompression, most likely.

Tip

Hence, I can't offer a concrete answer for this, unfortunately, and I can't thoroughly verify if my experiment above was empirical enough to be insightful without having a hosted GitHub Pages site, but I would prefer this order (option B), and I hope it clears your concerns somewhat:

  1. The local wheel, as it requires no external bandwidth connection and can be cached by JupyterLite's storage on subsequent runs by a user (ranked at least as high as the next entry)
  2. The jsDelivr CDN/Pyodide wheel, which has a smaller wheel size + better compression (ranked higher than the next entry)
  3. The PyPI wheel, which has a similar speed to that of jsDelivr and can be cached by JupyterLite, but is larger in size

However, the jsDelivr CDN might not come with the latest SymPy; hence, it's not a viable option at the moment. I just performed the experiment as it made sense to compare all options, especially when Pyodide will complete unvendoring recipes, and we could think about it again at a later time.

Hence, choosing option B—where we always have a local and up-to-date wheel—shouldn't cause any adverse side effects right now, and we can ask @ivanistheone to run https://bit.ly/sympyjstest over it once we merge this. However, if this warrants a closer inspection, I would be open to deploying SymPy Live on GH Pages via my fork, and we could run those tests targeting my site instead of the upstream one.


P.S. Since we are downloading SymPy from PyPI anyway, I could undoubtedly port the test unvendoring to a new reusable tool, but I don't think doing so is worth the effort as JupyterLite and piplite offer abstractions over micropip to cache pure Python wheels they fetch from PyPI or are indexed from elsewhere (locally, in this case). However, having the PyPI wheel means that #10 can now be closed (if it is not meant to be closed with SymPy's Pyodide CI job).

@oscarbenjamin

I don't follow all the details so this is my main takeaway:

I don't think performance will be impacted at all, at least at a level that will be glaringly noticeable

I just wanted to know if there was some reason why the previous code attempted to use pyodide first before then using PyPI. It sounds like there isn't any important reason to do that rather than just always install from PyPI.

The main thing from my perspective is just that I want this to be always the latest version because e.g. we get bug reports from people who are not using the latest version and they do often test against the live website.

The next thing from my perspective is that it would be great if we had a latest dev version that people could test e.g. to see if some issue is fixed already or try out new features.

It would also be good if the interface makes it clear what version is being used because I think many users will not think to print __version__.

@agriyakhetarpal
Collaborator Author

I don't follow all the details so this is my main takeaway:

I don't think performance will be impacted at all, at least at a level that will be glaringly noticeable

Yes, I did write a lot here, and that's the main takeaway, indeed. FWIW, it's going to serve as a self-note for me in the future :)

I just wanted to know if there was some reason why the previous code attempted to use pyodide first before then using PyPI.

Ah, that comes from the long-standing issue in Pyodide: pyodide/pyodide#2580, where removing SymPy and other pure Python packages was previously discussed. The gist is that it could be a bad idea to remove them, as:

  • a package being pure Python does not guarantee its compatibility with a WASM environment and we need patches over quite a few of them,
  • the wheel size reductions and compression offered by the CDN make their inclusion viable,
  • historically, there has been a lack of a dependency resolver, and the handling of alternative indices is being improved only recently.

It sounds like there isn't any important reason to do that rather than just always install from PyPI.

Actually, I noticed when typing this comment that the code we have for unvendoring the tests is not too complicated: https://github.com/pyodide/pyodide-build/blob/7ebf9a274de346477910ae9b6885a5aed0b98fff/pyodide_build/buildpkg.py#L690-L741. This opens two options for an alternative measure, where the benefit is that we have a reduced wheel size to ship:

  1. Download SymPy's sdist from PyPI instead of the wheel and build a wheel from it using pyodide-build
  2. Adapt that code into a short script here, which strips the wheel downloaded from PyPI

We would have to add the %pip install command anyway, for both using SymPy directly from PyPI's CDN and using SymPy downloaded in advance from PyPI, so maybe doing this and shipping a smaller wheel does make sense?

Otherwise, I think installing from PyPI could be tried out too, as the wheel caching is quite reliable based on my experiments.

The main thing from my perspective is just that I want this to be always the latest version because e.g. we get bug reports from people who are not using the latest version and they do often test against the live website.

Yes, that makes sense to me – this PR should be in line with those thoughts – barring the delay in which a new SymPy version is released and the nightly deployment hasn't been built yet. A cron job can run as often as every five minutes, but that's overkill; during that delay, someone would need to trigger the deployment workflow manually.

The next thing from my perspective is that it would be great if we had a latest dev version that people could test e.g. to see if some issue is fixed already or try out new features.

That is a good idea, and we will have the necessary machinery for that after this PR – all we will need for a "dev" version of the shell is to pip download the nightly wheels from the Anaconda.org index, grab their version number and modify the %pip install command with it (at least until the issue I linked upstream is resolved), ship them with our JupyterLite deployment, and that should be it. However, we would need a separate deployment and repository for this – perhaps a sympy/live-dev repository would be nice; what do you think? Unless I am missing something, that is. I don't know if there is a way to have GH Pages built from two branches in one repository.
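As a rough sketch of the "grab their version number and modify the %pip install command" step: the helper name `pip_command_for_wheel` and the example file name are hypothetical, and the parsing relies only on the standard wheel file name layout.

```python
def pip_command_for_wheel(wheel_filename: str) -> str:
    """Build a pinned %pip install line from a wheel file name (hypothetical helper)."""
    # Wheel names follow {distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    stem = wheel_filename.removesuffix(".whl")
    distribution, version = stem.split("-")[:2]
    return f"%pip install {distribution}=={version}"


# The file name below is an assumed example of a nightly wheel.
print(pip_command_for_wheel("sympy-1.14.dev0-py3-none-any.whl"))
```

Reading the version from the file name avoids unpacking the wheel; reading it from the wheel's metadata would be the more robust alternative.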

It would also be good if the interface makes it clear what version is being used because I think many users will not think to print __version__.

I agree. I've opened #23 for this recently. I haven't looked at a solution so far, as it might require writing a kernel of our own based on the design for isympy. I can follow up after a conversation with JupyterLite developers.

@oscarbenjamin

maybe doing this and shipping a smaller wheel does make sense?

Yes, I mean all these ideas sound good to me. I am somewhat out of my depth in understanding the details.

In general I would actually like to just remove the tests from the installed sympy package and have them be a separate package but that's another bunch of work.

barring the delay in which a new SymPy version is released and the nightly deployment hasn't been built yet

That is not an issue. If the live website is updated even weekly then that is fine. Anything faster than yearly beats the current situation.

perhaps a sympy/live-dev repository would be nice; what do you think? Unless I am missing something, that is. I don't know if there is a way to have GH Pages built from two branches in one repository.

I'm not sure what can or can't be done. In my mind the ideal thing would be something like a drop down box where you select the sympy version and maybe which other packages like python-flint you want. I have no idea whether that is an easy thing or something that is much harder than it sounds though.

@agriyakhetarpal
Collaborator Author

agriyakhetarpal commented Jan 17, 2025

Yes, I mean all these ideas sound good to me. I am somewhat out of my depth in understanding the details.

In general I would actually like to just remove the tests from the installed sympy package and have them be a separate package but that's another bunch of work.

No worries; thanks for your input! I'll add these changes.

Yes, moving tests away into a package of their own is a tricky scenario. In Meson land, this can be done using Install tags, which one can use to separate what files end up being included, and build two wheels with one command: sympy-1.X.Y-none-any.whl and sympy-tests.1.X.Y-none-any.whl (see SciPy docs example). I could take a look at doing that as well, but I don't know if SymPy as a pure Python package is supported at all, and doing so does require moving SymPy's build backend that was recently updated to hatchling, IIRC.

That is not an issue. If the live website is updated even weekly then that is fine. Anything faster than yearly beats the current situation.

Sounds good. In that case, I think we should match the release schedule of https://github.com/sympy/sympy/blob/master/.github/workflows/nightly-wheels.yml, so daily sounds like a good bet as per my previous commit. A weekly build would also be fine with me.

I'm not sure what can or can't be done. In my mind the ideal thing would be something like a drop down box where you select the sympy version and maybe which other packages like python-flint you want. I have no idea whether that is an easy thing or something that is much harder than it sounds though.

That was what I had in mind, too, but I self-rejected the idea :D So, I spent a bit of time thinking about it, and I found that we can do this with the same JupyterLite distribution itself – multiple wheels can be added to the index:

JupyterLite REPL deployment showing the use of SymPy 1.14dev0 in WASM, which is a tip-of-tree build

So it should be possible, but we need to add a file dev/index.html and link it with a dropdown menu to switch between stable and nightly versions. It will require some configuration at deployment, though, if we need it to be at https://live-dev.sympy.org/. Or, we could also just have https://live.sympy.org/dev/.

@agriyakhetarpal
Collaborator Author

This is ready for review now; thank you @oscarbenjamin, for continually pinging me with questions about all of this in general and for responding to mine!

I added a script to remove all test-related files from the wheel, and it reduces the SymPy 1.13.3 wheel size by 33.4%, from 6.2 MiB to 4.13 MiB (while retaining the compression level from PyPI).
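For readers who want the general idea without opening the script, here is a simplified sketch of stripping test files from a wheel. It is not the contents of unvendor_tests_from_wheel.py: the rule "anything inside a `tests` directory" and both helper names are assumptions, and RECORD rows are split naively on commas.

```python
import io
import zipfile


def is_test_path(name: str) -> bool:
    """Treat anything inside a `tests` directory as test-related (assumption)."""
    return "tests" in name.split("/")[:-1]


def strip_tests(wheel_bytes: bytes) -> bytes:
    """Return a copy of the wheel without test files and their RECORD rows."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as src, zipfile.ZipFile(
        out, "w", compression=zipfile.ZIP_DEFLATED
    ) as dst:
        for info in src.infolist():
            if is_test_path(info.filename):
                continue
            data = src.read(info.filename)
            if info.filename.endswith(".dist-info/RECORD"):
                # RECORD rows look like "path,hash,size"; drop rows for removed files.
                kept = [
                    line
                    for line in data.decode().splitlines()
                    if not is_test_path(line.split(",")[0])
                ]
                data = ("\n".join(kept) + "\n").encode()
            dst.writestr(info.filename, data)
    return out.getvalue()


def make_demo_wheel() -> bytes:
    """Build a tiny fake wheel in memory for demonstration purposes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as whl:
        whl.writestr("demo/__init__.py", "")
        whl.writestr("demo/tests/test_core.py", "assert True\n")
        whl.writestr(
            "demo-0.1.dist-info/RECORD",
            "demo/__init__.py,,\ndemo/tests/test_core.py,,\ndemo-0.1.dist-info/RECORD,,\n",
        )
    return buffer.getvalue()


with zipfile.ZipFile(io.BytesIO(strip_tests(make_demo_wheel()))) as stripped:
    print(stripped.namelist())
```

Filtering RECORD alongside the files keeps the wheel's manifest consistent with what is actually inside the archive.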

TODO items for me after this is merged:

@ivanistheone
Collaborator

Thanks @agriyakhetarpal for all your work on this. I didn't have time to follow all the discussions today, but I'll reserve some time to look at the PR tomorrow afternoon/evening.

Collaborator

@ivanistheone ivanistheone left a comment


I read it in detail and it looks good. Ready to ship!

Shall I press the green button or do you want to do it?

@agriyakhetarpal
Collaborator Author

agriyakhetarpal commented Jan 19, 2025

Thanks for the review, @ivanistheone! Unfortunately (or rather fortunately), I think I've discovered an upstream bug here while testing my changes that build on this branch for a "dev" REPL – if I were to install the dev version here through %pip install sympy==1.14.0dev0 by shipping a wheel built for SymPy tip-of-tree, it fails, and displays this error message:

Tap to show error message
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[1], line 1
----> 1 await __import__("piplite").install(**{'requirements': ['sympy==1.14.dev0']})
      2 import importlib.metadata
      3 sympy_version = importlib.metadata.version('sympy')

File /lib/python3.12/site-packages/piplite/piplite.py:121, in _install(requirements, keep_going, deps, credentials, pre, index_urls, verbose)
    119 """Invoke micropip.install with a patch to get data from local indexes"""
    120 with patch("micropip.package_index.query_package", _query_package):
--> 121     return await micropip.install(
    122         requirements=requirements,
    123         keep_going=keep_going,
    124         deps=deps,
    125         credentials=credentials,
    126         pre=pre,
    127         index_urls=index_urls,
    128         verbose=verbose,
    129     )

File /lib/python3.12/site-packages/micropip/package_manager.py:133, in PackageManager.install(self, requirements, keep_going, deps, credentials, pre, index_urls, verbose)
    130 if index_urls is None:
    131     index_urls = self.index_urls
--> 133 return await install(
    134     requirements,
    135     index_urls,
    136     keep_going,
    137     deps,
    138     credentials,
    139     pre,
    140     verbose=verbose,
    141 )

File /lib/python3.12/site-packages/micropip/install.py:53, in install(requirements, index_urls, keep_going, deps, credentials, pre, verbose)
     41 wheel_base = Path(getsitepackages()[0])
     43 transaction = Transaction(
     44     ctx=ctx,  # type: ignore[arg-type]
     45     ctx_extras=[],
   (...)
     51     index_urls=index_urls,
     52 )
---> 53 await transaction.gather_requirements(requirements)
     55 if transaction.failed:
     56     failed_requirements = ", ".join([f"'{req}'" for req in transaction.failed])

File /lib/python3.12/site-packages/micropip/transaction.py:55, in Transaction.gather_requirements(self, requirements)
     52 for requirement in requirements:
     53     requirement_promises.append(self.add_requirement(requirement))
---> 55 await asyncio.gather(*requirement_promises)

File /lib/python3.12/site-packages/micropip/transaction.py:62, in Transaction.add_requirement(self, req)
     59     return await self.add_requirement_inner(req)
     61 if not urlparse(req).path.endswith(".whl"):
---> 62     return await self.add_requirement_inner(Requirement(req))
     64 # custom download location
     65 wheel = WheelInfo.from_url(req)

File /lib/python3.12/site-packages/micropip/transaction.py:141, in Transaction.add_requirement_inner(self, req)
    138 # Is some version of this package is already installed?
    139 req.name = canonicalize_name(req.name)
--> 141 satisfied, ver = self.check_version_satisfied(req)
    142 if satisfied:
    143     logger.info("Requirement already satisfied: %s (%s)", req, ver)

File /lib/python3.12/site-packages/micropip/transaction.py:86, in Transaction.check_version_satisfied(self, req)
     82 if req.specifier.contains(ver, prereleases=True):
     83     # installed version matches, nothing to do
     84     return True, ver
---> 86 raise ValueError(
     87     f"Requested '{req}', " f"but {req.name}=={ver} is already installed"
     88 )

ValueError: Requested 'sympy==1.14.dev0', but sympy==1.13.3 is already installed

This means that once SymPy 1.14 is released sometime later this year and pip download sympy starts downloading it (and if Pyodide has not yet updated SymPy to 1.14 and made a release by that point), it would break the REPL as we won't be able to install the latest SymPy. I don't think this should be a hard error – perhaps a warning at best, with an option to force-install...

The other thing to note here is why this occurs: statements like from sympy import * are already detected/intercepted in JupyterLite/jupyterlite-pyodide-kernel to automatically install SymPy (which is currently 1.13.3 from the Pyodide CDN). So, this logic doesn't respect custom wheels (or indices built out of them) that are to be installed in the first line of a code cell, like ours. I would also note that if I were to run %pip install sympy==1.14.0dev0 first and press Enter, and then add the from sympy import * statement and the other statements in a separate input, outside the initialisation code inserted in the REPL, then it works, as it should. So, it's probably a matter of letting %pip install sympy==<some newer/older version> complete first, before proceeding to the rest of the lines added. I'll mark it as a draft because SymPy is already up to date in Pyodide, and it would be nice to handle this kind of situation now that we've discovered it.

@agriyakhetarpal agriyakhetarpal marked this pull request as draft January 19, 2025 19:56
@ivanistheone
Collaborator

@agriyakhetarpal Can you try running the code from 26efec5 with 1.14.0dev0 to see if that works?

I added the %pip install command as a separate code block (the URL query string supports multiple code instances, which get interpreted as separate notebook input cells).
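For illustration, a URL with repeated code parameters could be assembled like this. The base URL is a placeholder, and the kernel parameter and the second cell are assumptions for the example rather than a copy of what the commit generates.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical REPL address; the real path depends on the deployment.
BASE_URL = "https://example.org/repl/index.html"

cells = [
    "%pip install sympy",   # first cell: runs to completion on its own
    "from sympy import *",  # second cell: imports the freshly installed version
]

# A list of (key, value) pairs lets the same key appear more than once.
query = urlencode([("kernel", "python")] + [("code", cell) for cell in cells])
url = f"{BASE_URL}?{query}"
print(url)

# Round-trip check: both cells come back, in order, under the same key.
print(parse_qs(urlsplit(url).query)["code"])
```

Because each code value becomes its own input, the install finishes before the import line is evaluated, which sidesteps the ordering problem described above.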

@agriyakhetarpal
Collaborator Author

the URL query string supports multiple code instances, which get interpreted as separate notebook input cells

Ah, interesting, TIL, and thank you so much, @ivanistheone! That works quite well, and I confirmed that import importlib.metadata; print(importlib.metadata.version('sympy')) returns the installed version.

I don't know if this behaviour that I noticed is a bug then, as it thus has an easy workaround. Either way, we should be good to merge this in this case. I've just added ec640fa to bump to https://github.com/jupyterlite/pyodide-kernel/releases/tag/v0.5.2, and will press the button.

@agriyakhetarpal agriyakhetarpal marked this pull request as ready for review January 20, 2025 17:27
@agriyakhetarpal
Collaborator Author

Also, I think I should rewrite history a bit here, as I believe squashing isn't too helpful because I've also added a few files. Could you please post some benchmarks from https://bit.ly/sympyjstest after this? It would be helpful to see that this doesn't cause significant downsides, as there is indeed a lack of CDN compression when downloading SymPy.

@agriyakhetarpal agriyakhetarpal merged commit 64ae18b into sympy:main Jan 20, 2025
2 checks passed
@agriyakhetarpal agriyakhetarpal deleted the update/sympy-workflow branch January 20, 2025 18:14
@ivanistheone
Collaborator

Could you please post some benchmarks from https://bit.ly/sympyjstest after this?

I did a quick test from my old phone (iOS), the loading and code execution is pretty fast. Like so fast it would be difficult to do exact timing since it's less than 3 seconds...

I can do more tests from friends' Android phones later, but overall it's looking good.

@eli-schwartz

Yes, moving tests away into a package of their own is a tricky scenario. In Meson land, this can be done using Install tags, which one can use to separate what files end up being included, and build two wheels with one command: sympy-1.X.Y-none-any.whl and sympy-tests.1.X.Y-none-any.whl (see SciPy docs example). I could take a look at doing that as well, but I don't know if SymPy as a pure Python package is supported at all, and doing so does require moving SymPy's build backend that was recently updated to hatchling, IIRC.

Meson (and meson-python) has no problems with being used to build pure-python packages. :)

@agriyakhetarpal
Collaborator Author

Yes, moving tests away into a package of their own is a tricky scenario. In Meson land, this can be done using Install tags, which one can use to separate what files end up being included, and build two wheels with one command: sympy-1.X.Y-none-any.whl and sympy-tests.1.X.Y-none-any.whl (see SciPy docs example). I could take a look at doing that as well, but I don't know if SymPy as a pure Python package is supported at all, and doing so does require moving SymPy's build backend that was recently updated to hatchling, IIRC.

Meson (and meson-python) has no problems with being used to build pure-python packages. :)

Sounds good, thanks for the information @eli-schwartz!

I am willing to do such work for SymPy, though it would be at the discretion of @oscarbenjamin and @ivanistheone :)
