Limit Request: librec-auto (300 MB, both PyPI and TestPyPI) #152
Comments
Thanks for the report. It doesn't seem to answer the most important question though: Why does the release have to be so big?
Thanks for your response. This project uses a Java project, librec, as the engine for performing experiments. As part of the installation, its .jar file needs to be installed as well, so the whole project including this .jar file needs 270MB-300MB.
Thanks for the reply! Because such large projects place a burden on PyPI and do not provide a good experience for end users, PyPI moderators are encouraged to help find ways to avoid the need to distribute such large packages. This is especially true if the package isn't actually distributing Python code (PyPI is not built to distribute data sets or non-Python code at scale). At 300MB, moderators are asked to grant limit increases only to established projects, so finding ways to reduce the size is especially important. The JAR in question is 250MB. It appears to be an amalgamation of many disparate Java libraries. It bundles compiled binary libraries for numerous platforms (Windows x86, Windows x86-64, Linux x86-64, macOS x86-64, Linux Arm, Linux PPC). When expanded, it occupies nearly 1GB of disk space, largely because of all the binary code.
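One way to check where the space goes is to inspect the JAR as a ZIP archive. The following is a minimal sketch, not something taken from the thread: the function name `summarize_jar` and the set of suffixes treated as native binaries are assumptions made for illustration.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Suffixes counted as native binaries. This set is an assumption for the
# sketch; adjust it to whatever the JAR under inspection actually contains.
NATIVE_SUFFIXES = {".so", ".dll", ".dylib", ".jnilib"}


def summarize_jar(jar_path):
    """Return (total uncompressed bytes, uncompressed bytes per native suffix).

    `jar_path` may be a filesystem path or a seekable binary file object,
    since a JAR is an ordinary ZIP archive.
    """
    total = 0
    native = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            # file_size is the size of the entry after decompression.
            total += info.file_size
            suffix = PurePosixPath(info.filename).suffix.lower()
            if suffix in NATIVE_SUFFIXES:
                native[suffix] += info.file_size
    return total, dict(native)
```

Calling `summarize_jar` on the bundled JAR and comparing the per-suffix totals with the overall total would show how much of the expanded size comes from native libraries.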
The best recommendation I have is to distribute multiple binary wheels, one for each platform you wish to support, using the appropriate platform tags. Only include the needed binary libraries for each platform in that platform's wheel, instead of distributing 5 copies of a library, 4 of which will be unused after download. Perhaps dividing the 250MB jar file into 5 platform-specific variants will let it fit within the size limit.
Of course, that just sidesteps the issue of using PyPI to distribute non-Python artifacts. So another recommendation is to provide a way to install the dependency from its intended distribution center (e.g., Maven), either at install time (have 'setup.py' execute Maven) or by having the user do so later (often this is done by providing a Python script that does the installation; this could be used to update this dependency later, too). Using the native distribution mechanism in this way might allow using its own dependency resolution, so that there's no need to create an amalgamation jar.
Note that PyPI is currently introducing certain additional scans of uploaded files. In the future that might extend to verifying that uploaded binary code is compatible with the platforms it claims, specifically for manylinux, so it is also recommended to build all such binary code on a compatible system.
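The "Python script that does the installation" idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names `maven_jar_url` and `fetch_jar` are hypothetical, the coordinates in the usage note are placeholders rather than the project's real ones, and it downloads a single JAR over HTTPS instead of running Maven's dependency resolution. The URL pattern follows the standard Maven repository layout.

```python
import hashlib
import urllib.request
from pathlib import Path

# Base URL of Maven Central; another repository can be passed to maven_jar_url.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def maven_jar_url(group_id, artifact_id, version, repo=MAVEN_CENTRAL):
    """Build a JAR URL using the standard Maven repository layout."""
    group_path = group_id.replace(".", "/")
    return f"{repo}/{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.jar"


def fetch_jar(url, dest, expected_sha256=None):
    """Download `url` to `dest`, optionally verifying a SHA-256 digest.

    Hypothetical helper for this sketch; raises ValueError and removes the
    file if the digest does not match.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256()
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        # Stream in 1 MiB chunks so a large JAR is never held in memory.
        while chunk := response.read(1 << 20):
            out.write(chunk)
            digest.update(chunk)
    if expected_sha256 and digest.hexdigest() != expected_sha256:
        dest.unlink()
        raise ValueError(f"checksum mismatch for {url}")
    return dest
```

A package could expose this as a post-install command, for example `fetch_jar(maven_jar_url("org.example", "example-engine", "1.0.0"), "jar/example-engine.jar")`, where the coordinates and destination are placeholders. Rerunning it with a newer version would also cover the "update this dependency later" case mentioned above.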
Thanks for your suggestions.
Since this package seems to wrap a relatively large JAR file, I'm going to opt to decline this request. If you're able to reduce the size of the JAR file contained in your distribution, please let us know the reduced size and we can reconsider this request.
Project
librec-auto
https://pypi.org/project/librec-auto/
https://github.com/that-recsys-lab/librec-auto
https://github.com/that-recsys-lab/librec-auto-java
Size of release
300MB
Which indexes
Both
Reasons for the request
The librec-auto project aims to automate recommender system experiments using Librec. The workflow of an experiment involves identifying appropriate data, creating training / test splits, implementing or choosing algorithms, running experiments (possibly with a range of different parameters), and reporting on the results.