[SPARK-21701][CORE] Enable RPC client to use SO_RCVBUF and SO_SNDBUF in SparkConf. #18964
Conversation
@jerryshao please review my separate PR. Thanks!
The change looks OK to me. Did you hit an issue where you had to change the buffer size on the client side?
Not yet, since it is fine to keep the buffer size at the system default, but making it consistent for users who want to specify it makes sense.
This change looks reasonable, cc @zsxwing @cloud-fan for another look.
@cloud-fan would you take a look at the PR? The update is very simple. Thanks very much!
@neoremind did you see any performance issue caused by Spark RPC? Spark doesn't send a lot of RPC messages; I don't see it being a bottleneck even when we tried to optimize latency in Structured Streaming.
@zsxwing I did create a performance test against Spark RPC; the test results can be found here. Note that I created the project for study purposes and the code is based on 2.1.0. As you said, performance does not drop when the client leaves these options unset. For example, in a scenario with 10 concurrent callers and 100k total calls, keeping everything at the defaults, the QPS is around 11k. I also experimented with setting the buffer sizes explicitly. I admit that the update is trivial, but from the user's perspective, if these parameters are set in SparkConf, it makes sense for the client to honor them as well.
@@ -210,6 +210,14 @@ private TransportClient createClient(InetSocketAddress address)
      .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, conf.connectionTimeoutMs())
      .option(ChannelOption.ALLOCATOR, pooledAllocator);

    if (conf.receiveBuf() > 0) {
      bootstrap.option(ChannelOption.SO_RCVBUF, conf.receiveBuf());
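To make the guard above concrete: SO_RCVBUF and SO_SNDBUF are ordinary TCP socket options, and Netty's `ChannelOption.SO_RCVBUF`/`SO_SNDBUF` map down to the same setters the JDK exposes on `java.net.Socket`. The sketch below mirrors the patch's "only override when a positive size is configured" logic using plain JDK sockets; the class and method names here are illustrative, not from the Spark code.

```java
import java.net.Socket;
import java.net.SocketException;

// Illustration only (not the actual Spark/Netty code): shows the same
// "set only when configured" guard the patch adds, using java.net.Socket.
public class BufferOptions {
    // Mirrors the patch: a non-positive configured size means
    // "leave the OS default buffer size untouched".
    static void applyBuffers(Socket socket, int receiveBuf, int sendBuf)
            throws SocketException {
        if (receiveBuf > 0) {
            socket.setReceiveBufferSize(receiveBuf);  // SO_RCVBUF
        }
        if (sendBuf > 0) {
            socket.setSendBufferSize(sendBuf);        // SO_SNDBUF
        }
    }

    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket()) {
            applyBuffers(socket, 64 * 1024, 64 * 1024);
            // The OS may round the requested sizes, but they stay positive.
            System.out.println(socket.getReceiveBufferSize() > 0);
            System.out.println(socket.getSendBufferSize() > 0);
        }
    }
}
```

Note that the kernel is free to adjust the requested size (Linux, for instance, doubles it for bookkeeping), so code should not assume the getter returns exactly what was set.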
What's the difference between `option` and `childOption`?
On the server side, Netty follows the reactor pattern and `ServerBootstrap` is used here. The `option` method applies to the listening socket when binding and connecting occur, while `childOption` applies to the channels created after a connection is established, one per client. On the client side, Spark RPC uses `Bootstrap`, which has no `childOption` method because a client does not work like a server: it is much simpler, and Netty uses a single thread pool for its network I/O.
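The parent-versus-child distinction described above can be seen with plain JDK sockets, which is roughly what Netty's channels wrap. This is an analogy, not Netty code: `ServerBootstrap.option()` corresponds to configuring the listening `ServerSocket`, and `childOption()` corresponds to configuring each per-connection `Socket` returned by `accept()`.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Analogy for ServerBootstrap.option() vs childOption(), using JDK sockets.
public class OptionVsChildOption {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket()) {
            // "option": configures the listening (parent) socket before bind.
            server.setReceiveBufferSize(128 * 1024);
            server.bind(new InetSocketAddress(
                    InetAddress.getLoopbackAddress(), 0));

            try (Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort());
                 Socket accepted = server.accept()) {
                // "childOption": applied per accepted connection; there is
                // one such child socket for every connected client.
                accepted.setSendBufferSize(64 * 1024);
                System.out.println(accepted.getSendBufferSize() > 0);
            }
        }
    }
}
```

Since a client `Bootstrap` only ever owns its single outbound channel, there is no parent/child split to configure, which is why the patch uses `option` on the client side.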
ok to test
LGTM
@neoremind that's an interesting project. However, Spark RPC is not designed to be a high-performance, general-purpose RPC system. In general, Spark just needs a good-enough RPC system; that's why it's using Java serialization.
@zsxwing Thanks for reviewing. The project I mentioned above is for study purposes, and I hope it will help others who are interested. I totally agree that Spark RPC is mainly for internal use, but as I tested, its performance is still good in general cases, which is good news :)
Test build #81097 has finished for PR 18964 at commit
Thanks. Merging to master. |
What changes were proposed in this pull request?
TCP parameters like SO_RCVBUF and SO_SNDBUF can be set in SparkConf, and org.apache.spark.network.server.TransportServer can use those parameters to build the server by leveraging Netty. But for TransportClientFactory, there is no way to set those parameters from SparkConf. This can be inconsistent between the server and client sides when people set these parameters in SparkConf, so this PR enables the RPC client to use those TCP parameters as well.
How was this patch tested?
Existing tests.
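For context, a sketch of how such buffer sizes are typically configured in `spark-defaults.conf`. The key names follow the `spark.<module>.io.*` pattern that `TransportConf` reads; the module name (`shuffle` here) and the values are illustrative assumptions, not taken from this PR.

```properties
# Illustrative values only. When these keys are unset, receiveBuf()/sendBuf()
# return a non-positive value and the OS default buffer sizes are kept.
spark.shuffle.io.receiveBuffer  131072
spark.shuffle.io.sendBuffer     131072
```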