[FEATURE REQUEST]: Add Trigger() to DataStreamWriter. #139
Comments
For Spark 2.4.x, what command did you run? Did you make sure you are using the 2.4 version of spark-sql-kafka: https://mvnrepository.com/artifact/org.apache.spark/spark-sql-kafka-0-10?
This is how Spark streaming works: http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#programming-model
Please refer to the link I pasted above.
@imback82 Yes, you are right, the Spark SQL version was wrong.
@imback82 Thanks for the help regarding the versions, but I don't see how the Maven link can help me with the configuration. As I mentioned, I am trying to achieve continuous processing with .NET Spark. At https://spark.apache.org/docs/2.4.3/structured-streaming-programming-guide.html#triggers I see how this can be done in other languages, and I am looking for the equivalent for .NET Spark, since a Trigger API does not seem to exist.
Sorry, I fixed the link.
Got it. Looks like we are missing the API. We will add this in the next release.
Waiting for this feature to be ready.
@danny8002 we expect the next release to be in the second week of July.
That is too late for me; could you please send a working PR? I have implemented Trigger() myself, see #153 (I wrote it based on spark/Trigger.java). But when I tested it, I found that Trigger.Continuous() doesn't work (Trigger.ProcessingTime works). Here is the code with Trigger.Continuous():
Microsoft.Spark.Worker.exe runs without receiving data and then closes immediately (my other program keeps sending data to the input Kafka topic at 2 records/s). See stderr:
stdout
master log:
Hoping for your help!
@danny8002 I will review this.
This is expected for Can you first check if
Yes, it works without UDF
=>
As you say "it is expected", how can I get the UDF to work? How can I get the worker/task to start again? I wrote the same program in Java and it works with a UDF, and I am curious how Python works...
Any insight about running UDFs?
@danny8002 I've reproduced your test using pyspark and I'm also encountering issues using a continuous trigger with UDFs. This seems to be a known issue, SPARK-27234, and there is an active PR that should address it.
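For reference, a minimal sketch of what such a PySpark reproduction might look like. This is not the code actually used in this thread: the names `double_value` and `build_query` are hypothetical, and the built-in `rate` source and `console` sink stand in for Kafka so that no broker is needed.

```python
def double_value(v):
    """Plain Python function that is wrapped as a UDF below."""
    return v * 2


def build_query(spark, interval="1 second"):
    """Start a continuous-trigger query that applies a Python UDF.

    `spark` is an existing SparkSession. pyspark is imported lazily so that
    this module can be loaded without a Spark installation.
    """
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import LongType

    double_udf = udf(double_value, LongType())

    # The built-in "rate" source emits (timestamp, value) rows and supports
    # continuous processing; 2 rows/s matches the rate quoted in the thread.
    stream = spark.readStream.format("rate").option("rowsPerSecond", 2).load()

    return (
        stream.select(double_udf(col("value")).alias("doubled"))
        .writeStream.format("console")
        .trigger(continuous=interval)  # continuous mode, as in the failing case
        .start()
    )
```

Calling `build_query(spark).awaitTermination()` on Spark 2.4.x is where the UDF problem described above would be expected to show up. Replacing `continuous=interval` with `processingTime=interval` gives the micro-batch variant, which is the trigger reported as working earlier in this thread.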
Description
Hi, I am trying to create a simple proof-of-concept application using .NET Spark that streams data from Kafka.
Facing issues:
Exception with Spark 2.4.x
Moreover, the sample code below works fine (i.e. it receives data from Kafka) with Spark version 2.3.x (microsoft-spark-2.3.x-0.3.0.jar) but throws a Java exception with Spark version 2.4.x (microsoft-spark-2.4.x-0.3.0.jar).
Sample Code
To run the code, I build the project in VS2019 and then run the following in cmd:
Also, you must have a Producer sending data to a Kafka topic from which to stream data.