Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve dependencies versioning and ci #141
Improve dependencies versioning and ci #141
Changes from 10 commits
e4865bd
587bea7
afe8025
e39ec66
ea33ed5
3ef122d
55aefe5
4bb2abf
221e34e
9e0dc05
54b757e
a23800e
528ae4c
f269f73
5a7271c
72f6405
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these the current max versions or the end of our support versions?
IMO, as long as we do not have an upper limit, we should not define one.
But I think this is up to the taste of the maintainer and boils to one of the following choices:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the current max versions. My thinking was that it would prevent failures for users, and dependabot would update these max versions weekly (and then we could merge in one click if it works, otherwise investigate).
But yeah I'm not sure, not specifying the max version does reduce the maintainer work a little bit, and could be combined with some other github action which checks that things are working with the latest version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to chime in:
Both options have tradeoffs and I want to be a bit more explicit about them:
You effectively are at the whim of the lib to not break their API. This rarely happens with well maintained libraries except in
X
changes of a version (X.Y.Z
, see semvar). However it will eventually happen, depending on how much functionality you use of a library. For example, withnumpy
, they're very unlikely to change hownp.add
works and if that's all you use, then you shouldn't need to upperbound it. However if you were to rely on numpy's C bindings, then they may change it as they did in the bump to2.0.0
. The upside of this is that if a user installs a higher version and it just works, then they don't have to come ask you to bump it.You effectively specify a set of dependencies which guarantee you library works (with cave-eats such as pre-existing bugs in existing versions within bounds). The downside here is that if the user uses a library that needs a library version with a lower bound such as
2.6.0
and you upperbound on2.5.0
, then they can not use your library alongside their other one, and you will likely get a request to up it at some point.So the tradeoff is when do you want to patch a version change? With no upperbound, it will happen immediately for a user when an issue occurs and they will raise an issue, at which point you recommend downgrading the conflict (if they can, same problem you get with Explicit upperbound but harder to debug.
With Explicit upperbound, you save the user from unknown dependancy hell, replacing it with known dependancy hell. You also save yourself immediate dependency management, but as things go with time, eventually your upperbounds become the lowerbounds of the eco-system, and you will have to solve them bit by bit. You also prevent perfectly compatible libraries from working when they otherwise would.
Given that, my recommendation and strategy is use your vibe-checks to know which libraries to upperbound and which to not. For example, I would very comfortably set numpy to have an upperbound on
X
, i.e.numpy<3
, as they are unlikely to break anything anytime soon. Likewise withtorch
, but you need to consider thattorch
is usually a lot more unstable due to the hardware eco-system they build upon and the functionality you utilize fromtorch
, such as their attention mechanisms. There are some libraries liketyping_extensions
which are developed byPython
themselves and (almost) never backwards breaking, as the library is pure python and doesn't rely on CPython or otherwise system-level libraries. I would never upper-bound this.If you were to use something like
SMAC
(sorry SMAC) or otherwise less production grade libraries, then they are unlikely to follow semvar strictly, and from version to version, they could break things unknowingly, usually not intentionally. These are often the hardest to judge as if several popular libraries depend on it, each of them distrusting it and setting a relatively strict upperbound, this will cause lots of conflicts.Other considerations is the frequency of library updates. Things like
pyyaml
are basically complete, in that it is unlikely to update, meaning a stricter upperbound isn't going to cause as much dep maintanence as a package which updates extremely frequently.---
lock
filesRecommendation: Upperbound libraries like
numpy
andtorch
to their nextX
. If you are unsure of the development practices of a dependancy, upperbound on theirY
. Never upper bound on aZ
unless there's a history of issues.If you want to provide a this setup works assuming no other dependancies, such as in a Docker environment for deployment, then you provide
lock
files which specify exact dependancies.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @eddiebergman, really interesting!! Based on your comments I made the max versions much laxer: no bound for
typing-extension
, bound on X for all the rest exceptscikit-learn
andeinops
, for which bound on Y. Tell me if you disagree! I would say that using these lax upper bounds + dependabot is a good balance between making life easy for maintainers while being robust for users.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very well put into words @eddiebergman!
I agree with going forward like this and like the reasoning.
One more comment: I think it is also useful to no upperbound to push ourselves to support higher versions.