I have searched the existing issues and did not find a match.
Who can help?
No response
What are you working on?
I am trying to use DocumentAssembler on a column that holds an array of strings. The documentation says: "The DocumentAssembler can read either a String column or an Array[String]"
Current Behavior
But I am getting an error:
AnalysisException: [CANNOT_UP_CAST_DATATYPE] Cannot up cast input from "ARRAY<STRING>" to "STRING".
The type path of the target object is:
- root class: "java.lang.String"
You can either add an explicit cast to the input data or choose a higher precision type of the field in the target object
Expected Behavior
Steps To Reproduce
For example, I want to pass a text column of type array<string> to DocumentAssembler and get a document column back.
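import sparknlp
from sparknlp.base import DocumentAssembler
spark = sparknlp.start()
data = spark.createDataFrame([[["Spark NLP is an open-source text processing library."]]]).toDF("text")  # "text" is array<string>
documentAssembler = DocumentAssembler().setInputCol("text").setOutputCol("document")
result = documentAssembler.transform(data)  # raises the AnalysisException shown above
result.select("document").show(truncate=False)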
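Until the array input path is clarified, one possible workaround is to give DocumentAssembler a plain string column instead. The sketch below is only an illustration under assumptions, not a fix in Spark NLP itself: the helper names `flatten_text_rows`, `build_string_dataframe`, and `explode_text_column` are hypothetical, and it assumes that one document per array element is acceptable. Only standard PySpark calls (`createDataFrame`, `withColumn`, `explode`) are used.

```python
from typing import List, Sequence, Tuple


def flatten_text_rows(rows: Sequence[Sequence[List[str]]]) -> List[Tuple[str]]:
    """Turn rows shaped like [[["a", "b"]], [["c"]]] into one
    single-string row per array element: [("a",), ("b",), ("c",)]."""
    return [(text,) for row in rows for text in row[0]]


def build_string_dataframe(spark, rows):
    """Build a DataFrame whose `text` column is a plain string.
    Assumes `spark` is a live SparkSession (e.g. from sparknlp.start())."""
    return spark.createDataFrame(flatten_text_rows(rows), ["text"])


def explode_text_column(df, column: str = "text"):
    """For data that is already a DataFrame: replace the array<string>
    column with one string row per element."""
    # Imported lazily so the pure-Python helper above works without Spark.
    from pyspark.sql import functions as F

    return df.withColumn(column, F.explode(F.col(column)))
```

With either helper, the resulting `text` column has type string, so `DocumentAssembler().setInputCol("text")` receives the input type that the error message asks for. If the array should become a single document instead, joining the elements with `pyspark.sql.functions.concat_ws` would be the analogous step.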
Spark NLP version and Apache Spark
Spark NLP version sparknlp.version(): 4.4.0
Apache Spark version spark.version: 3.4.0
Type of Spark Application
No response
Java Version
No response
Java Home Directory
No response
Setup and installation
No response
Operating System and Version
No response
Link to your project (if available)
No response
Additional Information
No response