NoSuchMethodError in sparksql35-scalapb0_11 after update #385

Open
leoeareis opened this issue May 6, 2024 · 4 comments

@leoeareis

Hi! I upgraded my project from Spark 3.4.1 to Spark 3.5.0 and switched the ScalaPB dependency from sparksql34-scalapb0_11 to sparksql35-scalapb0_11. After the upgrade, I get this error:

 java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.objects.StaticInvoke.<init>(Ljava/lang/Class;Lorg/apache/spark/sql/types/DataType;Ljava/lang/String;Lscala/collection/Seq;Lscala/collection/Seq;ZZZ)V
	at scalapb.spark.ToCatalystHelpers.fieldToCatalyst(ToCatalystHelpers.scala:165)
	at scalapb.spark.ToCatalystHelpers.fieldToCatalyst$(ToCatalystHelpers.scala:107)
	at scalapb.spark.Implicits$$anon$1.fieldToCatalyst(TypedEncoders.scala:123)
	at scalapb.spark.ToCatalystHelpers.$anonfun$messageToCatalyst$2(ToCatalystHelpers.scala:39)

I run my jobs in a Databricks environment using Runtime 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12), and the UDF that decodes the protobuf is defined as:

import org.apache.log4j.Logger
import org.apache.spark.sql.Column
import scalapb.spark.ProtoSQL
import example.root.root.{Event => RootEvent}
import scalapb.spark.Implicits.{messageTypedEncoder, typedEncoderToEncoder}

import scala.util.{Failure, Success, Try}

object ProtobufExample extends Serializable {

  val logger: Logger = org.apache.log4j.LogManager.getLogger(this.getClass.getSimpleName)
  val rootDecoderUdf: Column => Column = ProtoSQL.udf(ProtobufExample.decodeRootEvent)

  def decodeRootEvent(input: Array[Byte]): Option[RootEvent] = {
    val result = Try {
      RootEvent.parseFrom(input)
    }
    result match {
      case Success(value) => Some(value)
      case Failure(e) =>
        logger.error("Decode error:", e)
        None
    }
  }
}

Could you help me figure out this error?

@thesamet
Contributor

thesamet commented May 7, 2024

Let's try to isolate the problem (is it related to the Databricks environment?) and make it reproducible so I can confirm that a given solution works. Can you try to reproduce the problem outside of the Databricks environment?

It would also be really helpful if you could prepare and share a minimal project (it can be based on https://github.com/thesamet/sparksql-scalapb-test) and try to reproduce the problem both inside and outside Databricks. Since the project would also include the specific protos that cause the failure, it might point in another direction.

@anamariavisan

Hello @thesamet! I tried to upgrade a service that uses the sparksql35-scalapb0_11_2.12 dependency to Databricks 14.2 and above, and I got the following error:

java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.objects.StaticInvoke.<init>(Ljava/lang/Class;Lorg/apache/spark/sql/types/DataType;Ljava/lang/String;Lscala/collection/Seq;Lscala/collection/Seq;ZZZ)V
	at frameless.TypedEncoder$$anon$1.toCatalyst(TypedEncoder.scala:69)
	at frameless.RecordEncoder.$anonfun$toCatalyst$2(RecordEncoder.scala:155)
	at scala.collection.immutable.List.map(List.scala:293)
	at frameless.RecordEncoder.toCatalyst(RecordEncoder.scala:153)
	at frameless.TypedExpressionEncoder$.apply(TypedExpressionEncoder.scala:28)
	at scalapb.spark.Implicits.typedEncoderToEncoder(TypedEncoders.scala:119)
	at scalapb.spark.Implicits.typedEncoderToEncoder$(TypedEncoders.scala:116)
	at scalapb.spark.Implicits$.typedEncoderToEncoder(TypedEncoders.scala:122)

This doesn't happen locally. Following your suggestion, I forked the repo https://github.com/thesamet/sparksql-scalapb-test/tree/master to see whether the problem is related to the Databricks environment. The code can be found here: https://github.com/anamariavisan/sparksql-scalapb-test. To build the app, I ran these commands:

curl -s "https://get.sdkman.io" | bash
sdk install java 11.0.24-zulu
sdk install sbt 1.6.2
sbt assembly

And to test it locally:

sdk install spark 3.5.0

spark-submit \
  --jars . \
  --class myexample.RunDemo \
  target/scala-2.12/sparksql-scalapb-test-assembly-1.0.0.jar

To test it in Databricks, I created a job and I uploaded the library target/scala-2.12/sparksql-scalapb-test-assembly-1.0.0.jar with the main class being myexample.RunDemo. I submitted the job locally and it worked, but in Databricks 14.2 and above, it failed with:

java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.objects.StaticInvoke.<init>(Ljava/lang/Class;Lorg/apache/spark/sql/types/DataType;Ljava/lang/String;Lscala/collection/Seq;Lscala/collection/Seq;ZZZ)V
	at scalapb.spark.ToCatalystHelpers.fieldToCatalyst(ToCatalystHelpers.scala:165)
	at scalapb.spark.ToCatalystHelpers.fieldToCatalyst$(ToCatalystHelpers.scala:107)
	at scalapb.spark.ProtoSQL$$anon$1$$anon$2.fieldToCatalyst(ProtoSQL.scala:84)
	at scalapb.spark.ToCatalystHelpers.$anonfun$messageToCatalyst$2(ToCatalystHelpers.scala:39)

I searched how to fix it and I found these issues that describe the same problem:

I also left a comment on this issue on the frameless repo typelevel/frameless#787.

What is your planned course of action on this for sparksql-scalapb?

@thesamet
Contributor

This is not actionable by sparksql-scalapb until there's a fix for frameless on Spark 3.5 and DBR 14.2.

@chris-twiner

> This is not actionable by sparksql-scalapb until there's a fix for frameless on Spark 3.5 and DBR 14.2.

FYI: the second stack trace is internal to sparksql-scalapb and is caused by its use of Spark internal APIs rather than by frameless itself. The shim-based solution proposed for frameless (#787) could also be leveraged for the sparksql-scalapb API usage (it has been tested across all supported DBRs, at least for frameless usage).
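
For anyone trying to confirm the mismatch themselves, here is a minimal diagnostic sketch (not part of sparksql-scalapb or frameless). The descriptor in the error asks for a `StaticInvoke` constructor taking `(Class, DataType, String, Seq, Seq, boolean, boolean, boolean)`. Listing the constructors that the runtime actually exposes, once on a local Spark 3.5.0 install and once on the Databricks cluster, should show whether the two differ. `ConstructorProbe` and `constructorsOf` are hypothetical names used only for illustration; the code relies on standard Java reflection and assumes the class of interest is on the classpath.

```scala
// Diagnostic sketch: list the public constructors a class exposes at runtime.
// ConstructorProbe and constructorsOf are illustrative names, not a library API.
object ConstructorProbe {

  /** One line per public constructor of `className`; Nil if the class is not on the classpath. */
  def constructorsOf(className: String): List[String] =
    try {
      Class.forName(className).getConstructors.toList.map(_.toString).sorted
    } catch {
      case _: ClassNotFoundException => Nil
    }

  def main(args: Array[String]): Unit = {
    // Defaults to the class named in the stack traces above.
    val target = args.headOption.getOrElse(
      "org.apache.spark.sql.catalyst.expressions.objects.StaticInvoke")
    val found = constructorsOf(target)
    if (found.isEmpty) println(s"$target was not found on the classpath")
    else found.foreach(println)
  }
}
```

Running the body of `main` in a notebook cell on the cluster and through `spark-shell` locally, then comparing the two listings, is one way to check whether the 8-argument constructor is present in each environment.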
