Initial definition for Spark 4.0.0 shim #10725
Conversation
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
Force-pushed from ab2fcbe to dae448f.
The pre-commit checks are failing due to …
Changes seem OK, but the CI failure needs to be resolved. Any reason we cannot add 400 to all.buildvers? I didn't see evidence of scripts keying off of all.buildvers, so I think we can add the currently unbuildable 400 shim to that definition as long as we keep it out of noSnapshot.buildvers and snapshot.buildvers definitions until it is buildable.
Could this be due to the complication that Spark 4.0.0 defaults to Scala 2.13 and does not support Scala 2.12?
pom.xml (outdated)
@@ -810,6 +810,7 @@
     351
   </noSnapshot.buildvers>
   <snapshot.buildvers>
+    400
This should actually only be in the snapshotScala213.buildvers at the moment. Spark 4.0.0 does not support Scala 2.12, so the CI cannot actually build the shim under the default Scala version.
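As an illustrative sketch (the property name comes from the existing pom layout; the surrounding entries are elided), the version would instead be declared along these lines:

  <snapshotScala213.buildvers>
    ...,
    400
  </snapshotScala213.buildvers>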
You are right, and that was my initial thought as well, but as you have noted it needs to be added to all.buildvers for the Scala 2.12 pom.xml. At this point, I have added it to all.buildvers for both Scala 2.12 and 2.13; as things get clearer closer to the release of Spark 4.0.0, we will have to ignore the check against all.buildvers for the Scala 2.12 build while keeping the check for the Scala 2.13 build.
I don't understand why 400 was added to snapshot.buildvers. That's not what we want, right? 400 is not ready to be built. We want 400 to be in all.buildvers but not in any definition of what is buildable. We need 400 to be declared as a shim but not one that builds yet. Therefore I would expect the change to be more like this:
diff --git a/pom.xml b/pom.xml
index e898c1735a..7b45bbbd3e 100644
--- a/pom.xml
+++ b/pom.xml
@@ -849,6 +849,8 @@
     ${noSnapshot.buildvers},
     ${snapshot.buildvers},
     ${databricks.buildvers},
+    <!-- 400 is not buildable yet, only declaring it as a known shim by placing it here -->
+    400
   </all.buildvers>
   <noSnapshotScala213.buildvers>
     330,
Hmm, probably something for a future PR, but we should consider how to organize this for future 40x shims that will only build under Scala 2.13. These future shims (400, 401, etc.) should be able to live in all.buildvers in a way the build system can still handle. Maybe put them in another section that is shared with the *Scala213.buildvers sections as well? A rough sketch of the idea follows.
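For illustration only, one hypothetical layout (the scala213only.buildvers property name is invented here, not an existing definition in the pom):

  <!-- hypothetical: shims that only build under Scala 2.13 -->
  <scala213only.buildvers>
    400
  </scala213only.buildvers>
  <all.buildvers>
    ${noSnapshot.buildvers},
    ${snapshot.buildvers},
    ${databricks.buildvers},
    ${scala213only.buildvers}
  </all.buildvers>
  <snapshotScala213.buildvers>
    ...,
    ${scala213only.buildvers}
  </snapshotScala213.buildvers>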
Actually, I missed that the error was in scala2.13/pom.xml:
scala2.13/pom.xml (outdated)
@@ -810,6 +810,7 @@
     351
   </noSnapshot.buildvers>
   <snapshot.buildvers>
+    400
Need to regenerate scala2.13 pom
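If memory serves, the repo ships a helper for this; assuming the usual spark-rapids layout (the script path is an assumption), regenerating would look like:

  # regenerate scala2.13/pom.xml from the Scala 2.12 build files
  ./build/make-scala-version-build-files.sh 2.13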
Done. I was curious why we need to commit the 2.13 pom when it can be generated when needed.
In other words, isn't this a lot like generating shims for different versions of Spark?
It's committed for convenience. Developers can point their IDE directly at the scala2.13 pom or build right after pulling the source. If it required manual generation, switching branches in your local repo would be fraught with problems if you forgot to re-generate the scala2.13 pom after moving to a new commit, something that would be very easy to forget.
build
This is a PR in a series of PRs to come that will add support for Spark 4.0.0.
In this PR we add shims generated with shimplify, along with manual changes to a few shims.
Changes made:
- Ran shimplify to add the 400 shim using 351 as the base (see the annotated command below):
  mvn generate-sources -Dshimplify=true -Dshimplify.move=true -Dshimplify.overwrite=true -Dshimplify.add.shim=400 -Dshimplify.add.base=351
- Updated GpuArrowPythonRunner.scala, GpuCoGroupedArrowPythonRunner.scala, and GpuArrowPythonOutput.scala to use the shim from 341db.

Contributes to #9259.
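Coming back to the shimplify invocation above: for readability, the same command split across lines, with each flag annotated. The flag descriptions are my reading of shimplify and should be treated as assumptions rather than authoritative documentation:

  # shimplify=true            enable shimplify processing during generate-sources
  # shimplify.move=true       allow moving source files into their owning shim directories
  # shimplify.overwrite=true  allow rewriting existing per-file shim headers
  # shimplify.add.shim=400    declare the new 400 shim
  # shimplify.add.base=351    clone the existing 351 shim as the starting point
  mvn generate-sources -Dshimplify=true -Dshimplify.move=true \
    -Dshimplify.overwrite=true -Dshimplify.add.shim=400 -Dshimplify.add.base=351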
NOTE: If you want to test the changes, I have pushed a branch with all the necessary changes needed to build this commit here. There should be 24 compilation errors.