[Bug]: Spark2 runner fails to deserialize PipelineOptions due to NoSuchMethodError #23568
Comments
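Unfortunately, classpath issues are a common source of trouble with both Beam and Spark. Spark offers a workaround to handle this using its userClassPathFirst settings.
When enabling them, the dependencies listed below should be kept out of the user jar. E.g., if using the Maven Shade plugin:
<artifactSet>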
<excludes>
<exclude>log4j</exclude>
<exclude>org.slf4j</exclude>
<exclude>io.dropwizard.metrics</exclude>
<exclude>org.scala-lang</exclude>
<exclude>org.scala-lang.modules</exclude>
<exclude>org.apache.spark</exclude>
<exclude>org.apache.hadoop</exclude>
</excludes>
</artifactSet>
Additionally, you might have to explicitly bump the version of some Jackson modules to match the version used by Beam.
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-scala_2.11</artifactId>
<version>${jackson.version}</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-jaxb-annotations</artifactId>
<version>${jackson.version}</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-paranamer</artifactId>
<version>${jackson.version}</version>
<scope>runtime</scope>
</dependency>
Step-by-step example
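The fragments above could fit into a pom.xml roughly as follows. This is only a minimal sketch and is not taken from the original comment. It assumes the `${jackson.version}` property is pinned to the Beam-side Jackson version quoted in this issue (2.13.3) and that shading runs in the `package` phase; adjust both to your build.

```xml
<!-- Sketch only: the property value and the plugin wiring are assumptions. -->
<project>
  <properties>
    <!-- Assumed: matches the Jackson version Beam uses according to this issue. -->
    <jackson.version>2.13.3</jackson.version>
  </properties>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <!-- Pin a plugin version in a real build. -->
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <artifactSet>
                <excludes>
                  <!-- A bare groupId excludes every artifact in that group. -->
                  <exclude>org.apache.spark</exclude>
                  <exclude>org.scala-lang</exclude>
                  <!-- ... remaining excludes from the list above ... -->
                </excludes>
              </artifactSet>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```

The Jackson module dependencies shown earlier would go into the regular `<dependencies>` section of the same pom.xml.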
Spark job-server / PortableRunner
If you depend on the Spark job-server to submit your jobs, there's no obvious way to enable userClassPathFirst.
Unfortunately this is not sufficient to successfully run the job-server for Spark 2, likely due to an incompatible user classpath.
Actually, in my case I changed both fasterxml libs to runtime scope, and it works without needing to set userClassPathFirst.
@haoyuche The problem here is that the Jackson version used by Spark 2 is too old for Beam.
What happened?
Spark 2.4.8 uses a fairly old version of Jackson, 2.6.7, while Beam is far ahead at 2.13.3.
When attempting to deserialize PipelineOptions on a Spark worker, it will fail with a NoSuchMethodError because it attempts to use a newer Jackson API that doesn't exist in the older version shipped with Spark.
Note: The Spark 2 runner is already deprecated, so this is likely a NO-FIX and mostly for reference.
Issue Priority
Priority: 3
Issue Component
Component: runner-spark