DocumentAssembler on array<string> #13816
Comments
@HeyBossy your question was not deleted; it was answered with an example and a Colab link in #13815. Please have a look at our step-by-step tutorials to get started with Spark NLP (the input for DocumentAssembler must be a STRING): https://github.com/JohnSnowLabs/spark-nlp-workshop/tree/master/open-source-nlp
Leaving it open to avoid another duplicate question. @HeyBossy please close it once you have read the answer.
I didn't add extra square brackets. I want to pass a column of type array<string>.
The DocumentAssembler can read either a String column or an Array[String])
I see the problem now; I hadn't seen the docs saying that. The documentation is wrong! Actually, DocumentAssembler only accepts a String column. @HeyBossy As a workaround, you need to explode your array of text, and the resulting column can then be used as the input to DocumentAssembler.
@DevinTDHa Could you please make sure this is fixed in all the docs (pydoc, scaladoc, website, etc.)? Many thanks.
Okay, I got it. Thanks for the replies!
Thanks @HeyBossy. Re-opening this; it will be closed once we have fixed this issue in the docs.
Is there an existing issue for this?
Who can help?
No response
What are you working on?
I don't know why, but my question was deleted, so I am asking it again.
I am working with a dataframe that I need to lemmatize. There, the input is an array of strings. I am trying to use DocumentAssembler for array of strings. The documentation says: "The DocumentAssembler can read either a String column or an Array[String])". But it doesn't work like that for me. Can you explain what I'm doing wrong? Or is the documentation out of date?
Current Behavior
I am getting an error
Expected Behavior
Steps To Reproduce
When I do a simple example:
Spark NLP version and Apache Spark
sparknlp.version() == '4.4.0'
spark.version == '3.4.0'
Type of Spark Application
No response
Java Version
No response
Java Home Directory
No response
Setup and installation
No response
Operating System and Version
No response
Link to your project (if available)
No response
Additional Information
No response