More frequent major releases for arrow-rs #1120
Comments
I think this touches on something key. In my opinion, the crates in this repository are still somewhat beta if not alpha software, with major APIs still getting fleshed out. As such it would perhaps be more expected for it to be 0.x software with the accompanying frequency of breaking releases. I accept the version ship has sailed, but I think it makes sense to pretend it hasn't and adopt a more rapid breaking release cadence as suggested 👍 FWIW similar arguments apply to many of the arrow-rs dependencies. It isn't so much that the ecosystem itself is "fast changing", just a lot of the ecosystem isn't yet v1 😆 |
Also another pro is that we can hopefully bring arrow2 back into the fold (though that would need confirmation/additional work). As I remember it, @jorgecarleitao's main sticking point was that the Rust version of arrow was sticking with the C++/Python versioning scheme. |
In conclusion, from my perspective maintaining an active release branch with stable patches and major releases every 3 months doesn't provide enough value for the cost. The cost is borne both by me actually creating the releases as well as users who are slowed down picking up updates for dependent libraries (the latest version of tonic, for example) or waiting for their changes to be released. So unless there is significant pushback (ideally along with a volunteer to help maintain a stable branch) I will plan to start doing releases for arrow-rs directly from master as proposed on [1], starting with arrow-rs 7.0 (will make a candidate later this week and hope to release early next). I think the only downside to this will be the oft discussed "arrow will end up with versions 22.0.0 rather than version 0.22.0" which may lead to confusion about its relative stability. We can add some documentation about this and adjust if it causes too much confusion. |
TLDR: I plan to release arrow-rs 7.0 as normally scheduled at the end of this week, and after that only release from master every other week (rather than maintaining a more stable active_release branch). So the first time something different will happen is in 2.5 weeks' time. |
If master has an API-changing commit after the last release, how will this be handled? |
I wonder if we might further simplify matters by releasing weekly and incrementing the version based on if there are any new breaking changes on master since the last release. The major benefit would be this could be completely automated, potentially using existing tooling. In the past I've used this approach for binaries, in particular using goreleaser, but I see no obvious reason something similar couldn't be done here... Edit: a somewhat related question is must all the crates in this repo advance their version numbers in lockstep? Should a breaking release to parquet mandate a breaking semver bump to arrow? |
@liukun4515 the proposal is that the following release would have a new major (rather than minor) version. So for example, if there was an API change on Jan 12 (after 7.0.0 was released), the next release after Jan 12 would become 8.0.0 (rather than 7.1.0). |
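The rule described in this reply can be sketched in a few lines of Rust. This is only an illustration of the proposal, not the project's release tooling: the `Version` struct, the `next_version` helper, and the `has_breaking_changes` flag are hypothetical names, and deciding whether a change is breaking is assumed to happen elsewhere (for example, by a reviewer).

```rust
/// A semantic version: MAJOR.MINOR.PATCH.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

/// Hypothetical helper: picks the version of the next release from master.
/// A breaking change since the last release bumps the major version;
/// otherwise only the minor version is bumped.
fn next_version(last: Version, has_breaking_changes: bool) -> Version {
    if has_breaking_changes {
        Version { major: last.major + 1, minor: 0, patch: 0 }
    } else {
        Version { major: last.major, minor: last.minor + 1, patch: 0 }
    }
}

fn main() {
    // 7.0.0 was released, then an API change landed on master.
    let released = Version { major: 7, minor: 0, patch: 0 };
    let next = next_version(released, true);
    println!("{}.{}.{}", next.major, next.minor, next.patch);
}
```

With the Jan 12 example above, an API change after 7.0.0 yields 8.0.0, while a release with no API change would yield 7.1.0.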
The major thing that prevents more frequent and/or automated releases is the release voting process. I often have to chase (via slack, etc) to get 3 PMC members to approve a release; In terms of automation, sqlparser-rs uses
I don't know of any technical reason they need to advance their numbers in lockstep -- it is a convenience so that we (mostly I) don't have to scrutinize the changes to each crate and determine if/when a new version is warranted on each release |
Waiting for the release of 7.0.0! |
Maybe we need more active PMC for arrow in the rust ecosystem. |
Yeah -- I don't really know how to improve this. We do have several representatives of the Rust implementation on the PMC now (@jorgecarleitao @nevi-me @andygrove and @Dandandan) but their time is limited (as all of ours is). I do wonder how much of the issue is the current process, which suggests running an automated script. I am not convinced this extra level of "quality assurance" is a real value add -- the real value add to me is having more than one person look at the release's content and say that the "content seems reasonable." Are there any other thoughts on this thread? |
I don't have anything to add other than that I fully support this move :) It will certainly reduce @alamb's maintenance burden and make it easier for downstream to get access to breaking changes 👍 |
Given there are no more comments I'll proceed with this proposal 👍 |
I know this is an old issue but just giving my 2 cents. While I definitely agree with working out of master directly, bumping a major version so frequently is not super convenient.
I am not sure where this is coming from. From my experience there are many important crates which are trying hard not to release new major versions unless necessary (rocket, actix-web, chrono, tokio, log for the most important ones). As explained in a reddit post, I have a hard time keeping up with the major version changes. Updating one library is easy; updating all its reverse dependencies is much harder. Anyway, thanks a lot for the hard work! |
Thank you for your feedback @tafia -- it is nice to hear from someone who valued the incremental updates! |
So this means arrow-rs is currently in a 0.x state, right? I think it would be better if a relatively stable API can be guaranteed once we reach 1.x. (Do you have any idea about a timeline?)
+1. However, the drawbacks are also clear, especially when
|
After reading the thread a few times, the main need seems to be "release from main and don't maintain a stable branch", which makes a lot of sense to me, rather than "frequent major releases". 🤔 The latter happened only because arrow-rs isn't stable enough yet. Maybe we can just be more conservative about breaking changes after "1.0", e.g., only merge them in a batch before a major release. |
Yes this is the primary motivation, we aren't setting out to cut frequent breaking releases 😅 . FWIW most breaking changes are fairly innocuous, e.g. adding Send bounds to a trait object, or returning Result instead of panicking, however, semver doesn't really give us a good way to convey this nuance. Whilst we could defer these upgrades, even the current two week lag to get changes into DataFusion causes people frustration.
I don't have a timeline, but the pace of breaking changes is noticeably slowing down. There are some breaking changes in the pipeline concerning scalar representation #1047 but I hope that following that we will be in a better place to maintain API stability. |
On a side note, I think the fact that the Rust ecosystem is more willing to use third-party dependencies and update them, and that crates are released more often, actually requires libraries to obey semver more strictly and do major updates less often (because Cargo has a nice solution for diamond dependencies). Otherwise it actually discourages users from using or updating the library, and the situation will fall back to that of C++/Python. |
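To make the semver point concrete, here is a minimal sketch of how Cargo's default (caret) version requirement decides which releases count as compatible. It is a simplified model, not Cargo's implementation: `caret_matches` is a hypothetical helper, versions are plain `(major, minor, patch)` tuples, the requirement is assumed to spell out all three components, and pre-release tags are ignored.

```rust
/// Hypothetical helper modelling Cargo's default caret requirement `^req`
/// for a fully specified MAJOR.MINOR.PATCH requirement.
/// Pre-release versions are not modelled.
fn caret_matches(req: (u64, u64, u64), candidate: (u64, u64, u64)) -> bool {
    // The candidate must never be older than the requirement.
    // Tuples compare lexicographically: major, then minor, then patch.
    if candidate < req {
        return false;
    }
    if req.0 > 0 {
        // ^7.0.0 allows >=7.0.0, <8.0.0
        candidate.0 == req.0
    } else if req.1 > 0 {
        // ^0.22.0 allows >=0.22.0, <0.23.0
        candidate.0 == 0 && candidate.1 == req.1
    } else {
        // ^0.0.3 allows only 0.0.3
        candidate == req
    }
}

fn main() {
    // A dependency declared as "7.0.0" accepts 7.1.0 but not 8.0.0.
    println!("{}", caret_matches((7, 0, 0), (7, 1, 0)));
    println!("{}", caret_matches((7, 0, 0), (8, 0, 0)));
    // For 0.x crates the minor version plays the role of the major version.
    println!("{}", caret_matches((0, 22, 0), (0, 23, 0)));
}
```

Under this rule a new major release is never picked up automatically, so every downstream crate has to opt in by editing its manifest, which is the cost the comment above is pointing at.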
I predict this crate will adopt a more measured pace of versions once the pace of development slows down (and most of what goes in is bug fixes). Interestingly we haven't gotten there yet even after 3-4 years of development. I am feeling good that in the next 6 months or so we'll start seeing versions with non-major bumps each time |
I agree -- and furthermore given the nice tooling in Cargo (and dependabot) the versions keep updating even though it does take ongoing work |
TLDR: I propose doing major releases for arrow-rs more frequently (up to every other week) directly from master, breaking the correspondence with the main arrow release.
For example, the release cadence might look like
7.0.0
7.1.0 (no backwards incompatible changes)
8.0.0 (new backwards incompatible change)
9.0.0 (new backwards incompatible change)
9.1.0 etc
9.2.0
9.3.0
10.0.0
...
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The rust ecosystem as a whole is "fast changing". This means:
This means the ability to quickly upgrade to new downstream libraries is critical to help fix issues like #1101
With a few notable exceptions (e.g. tokio) most of the rust ecosystem pushes new major releases frequently (monthly if not more so), often via 0.X releases.
The current release cadence and versioning scheme of arrow-rs inherits from the C/C++ and python ecosystems with a major release every 3 months while maintaining a backwards compatible branch.
I propose moving the rust implementation in closer alignment with the rest of the rust ecosystem with more frequent major version releases.
Describe the solution you'd like
Continue to release every 2 weeks; However, release all new versions directly from the master branch, picking a new version number based on the changes in the crate (a new major version if there were semver changes, minor if not)
Pros:
Cons:
cargo update doesn't pick up new major versions)
Much of the rust ecosystem, as described above, is used to the "do major updates frequently" mindset, and furthermore tools such as dependabot reduce the effort required to do so, I believe the first con is manageable.
Describe alternatives you've considered
Option 1: "major release on demand", where we released when there were "enough" changes built up to do a regular release;
Option 2: keep the same structure (a master and active_release branch); do a major release every month instead of every 3 months, and do a minor release every other week.