Build against arrow-ballista in CI & remove ballista code from this repo #2582
Nice!
I think this is a good step forward. I have a fair few breaking changes in flight, so I guess we'll get to exercise the process for them soon 😅
Edit: I do wonder if it might be less friction to do what we do with arrow-rs: have Ballista track a released DataFusion version and update it as part of the DataFusion release process. Whilst this does result in a delay getting new features, keeping this delay low is probably of interest to more than just Ballista...
# clone the repo
# TODO make repo/branch configurable
git clone https://github.com/apache/arrow-ballista
Do we want to always pin this so that Ballista changes can't have side effects on unrelated DataFusion PRs?
That's a good question and I'm really not sure what the best approach is. Perhaps we wait until we hit this situation and then we can pin to the commit prior to the breaking change.
DataFusion no longer depends on Ballista so I am not sure what kind of change would cause us to deal with this situation but I am sure it is possible somehow.
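For illustration only, here is a minimal sketch of how the clone step could be made configurable and optionally pinned, along the lines of the TODO and the pinning question above. The variable names `BALLISTA_REPO` and `BALLISTA_REF` and the function `clone_ballista` are hypothetical; none of them are part of this PR.

```shell
# Hypothetical names, not part of the PR: BALLISTA_REPO, BALLISTA_REF, clone_ballista.
# Repository to build against; defaults to the URL used in the CI script.
BALLISTA_REPO="${BALLISTA_REPO:-https://github.com/apache/arrow-ballista}"
# Optional branch, tag, or commit SHA to pin to; empty means the default branch.
BALLISTA_REF="${BALLISTA_REF:-}"

clone_ballista() {
  # $1: destination directory for the clone
  git clone "$BALLISTA_REPO" "$1" || return 1
  if [ -n "$BALLISTA_REF" ]; then
    # Pin to a known-good revision, e.g. the commit prior to a breaking change
    git -C "$1" checkout --quiet "$BALLISTA_REF" || return 1
  fi
}
```

With this in place, a CI step would call `clone_ballista arrow-ballista`, and setting `BALLISTA_REF` to a commit SHA would pin the build to that revision until Ballista catches up with a breaking change.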
Thanks for the review @tustvold. I also have plenty of breaking API changes planned, so this will definitely get tested soon, and I am sure we will either iterate on this process or abandon it in favor of the approach you suggest. With more frequent DataFusion releases and the current level of activity in Ballista, I think that this would likely be fine. I suggest we try this approach out over the next week or two and see how it goes.
I think this is the ideal outcome. However, it currently takes a non-trivial effort to release arrow bi-weekly to keep the code flowing, and we haven't yet been able to do it with DataFusion. Until we have the experience and confidence that we'll have regular DataFusion releases, I would recommend we keep developing Ballista directly from a GitHub SHA. That said, I see this separation (and the slowdown of API changes) as hopefully a good sign for DataFusion's eventual stabilization.
Which issue does this PR close?
Part of #2502
Rationale for this change
We need to build against arrow-ballista to detect breaking changes.
What changes are included in this PR?
- Update Cargo.toml, datafusion-cli, and benchmarks/tpch to remove dependencies on Ballista
- In CI, clone arrow-ballista, change dependency paths, and build arrow-datafusion and arrow-ballista
- #2583
- rm -rf ballista*
Are there any user-facing changes?
Yes, CI and process changes.
The PR template has a new section:
This PR originally failed (for good reason) because arrow-ballista was already "broken" by recent changes merged to master here.
After updating Ballista in apache/datafusion-ballista#31 I reran the build here and it passed.