[SPARK-18838][CORE] Introduce multiple queues in LiveListenerBus #18253
Conversation
@vanzin @cloud-fan Can I have a review on this new PR?
I'm busy, but I'll get to it eventually. You could at least write a proper commit summary in the meantime.
ok to test
Test build #78001 has finished for PR 18253 at commit
Ok I rebased & updated my commit message.
Test build #78111 has finished for PR 18253 at commit
Test build #78116 has finished for PR 18253 at commit
Test build #78117 has finished for PR 18253 at commit
I took a quick look and this does indeed look very much like work in progress. I also have a feeling that it's way over-engineered; there's a lot of base classes that are not that interesting, for example:
Changing "post" to "postToAll" as part of this change is also adding a lot of unnecessary noise. I'm not a fan of the current class hierarchy of the listener bus and I think that change makes sense, but at the same time it should be done separately since it's distracting here. I also saw methods that are not fully implemented in the code, so I assume you're still working on this. I'd also like to see better justification for your custom queue implementation. Have you identified the existing queue as a bottleneck?
The approach in #16291 had a lot of good things going for it, and mostly needed some cleanup (and to be modified to only change the live listener bus, and not the replay one). Your current approach seems a lot more complicated than that.
Test build #78128 has finished for PR 18253 at commit
Test build #78134 has finished for PR 18253 at commit
Test build #78136 has finished for PR 18253 at commit
Test build #78138 has finished for PR 18253 at commit
Test build #78139 has finished for PR 18253 at commit
Test build #78141 has finished for PR 18253 at commit
Test build #78146 has finished for PR 18253 at commit
Test build #78165 has finished for PR 18253 at commit
Test build #78167 has finished for PR 18253 at commit
Test build #78168 has finished for PR 18253 at commit
Test build #78169 has finished for PR 18253 at commit
Test build #78186 has finished for PR 18253 at commit
Test build #78205 has finished for PR 18253 at commit
Test build #78219 has finished for PR 18253 at commit
Test build #78225 has finished for PR 18253 at commit
Test build #78226 has finished for PR 18253 at commit
@vanzin Ok it is ready now.
I removed it.
I simplified it and added usages in the other commits. It is basically useful for holding the metrics, and I need a common way to add a group of dependent listeners to the LiveListenerBus and the ReplayBus.
100% agree. I removed it.
This implementation has 2 advantages: it is a 1 producer - 1 consumer queue, whereas the BlockingQueue is an n producers - m consumers one, so it uses much less synchronization. The other advantage (the main one) is that no object is created for each message added to the queue, so it produces a lot less garbage. The more independent queues we have, the more significant this becomes.
I changed it to 1 ms instead of 20 ms. This time is much less than the average processing time of the fastest listener (around 5 ms for the HeartbeatListener). It just forces the consumer thread to back off when the queue is empty, to give the producer thread a better chance of being scheduled. I can remove it if you want.
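For illustration, a minimal sketch of the single-producer/single-consumer ring buffer design described above (names and details are mine, not the PR's actual code): the backing array is pre-allocated, so no node object is created per message, and only two counters need cross-thread visibility.

```scala
import java.util.concurrent.atomic.AtomicLong

// Minimal SPSC ring buffer sketch: exactly one thread calls offer(),
// exactly one other thread calls poll().
class SpscRingBuffer[T <: AnyRef](capacity: Int) {
  private val buffer = new Array[AnyRef](capacity)
  private val head = new AtomicLong(0L) // next slot to read (owned by consumer)
  private val tail = new AtomicLong(0L) // next slot to write (owned by producer)

  // Producer side: returns false when full, mirroring the bus's drop-event path.
  def offer(elem: T): Boolean = {
    val t = tail.get
    if (t - head.get == capacity) false
    else {
      buffer((t % capacity).toInt) = elem
      tail.set(t + 1) // volatile write publishes the element to the consumer
      true
    }
  }

  // Consumer side: returns None when empty (caller may then yield or sleep).
  def poll(): Option[T] = {
    val h = head.get
    if (h == tail.get) None
    else {
      val elem = buffer((h % capacity).toInt).asInstanceOf[T]
      head.set(h + 1)
      Some(elem)
    }
  }
}
```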
Yes, but have you quantified how much you win with that? If the blocking queue approach has enough throughput for the listener bus, it's safer to use it.
Well, you could use an ArrayBlockingQueue. Here's a link with numbers: 4M ops per sec in the 1P-1C case looks plenty fast for Spark's needs.
I think if you really insist on going this route, you should use
Yes, I agree. But you get the synchronization too. I still agree that it should not have a big impact yet. But using an ArrayBlockingQueue does not simplify the code a lot. The current implementation is not complicated, not too verbose, and based on a simple pure Scala array. I do not think it has a huge complexity cost compared to the Java ArrayBlockingQueue. I changed the Thread.sleep to a Thread.yield to be less aggressive about descheduling the thread. Even with very few messages it should not consume too much CPU, and it will be much more reactive when messages are bursting.
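For contrast, a sketch of the ArrayBlockingQueue alternative under discussion (hypothetical surrounding code, not from either branch): it is also array-backed, so no per-element node is allocated, and its blocking `take()` removes the need for any sleep/yield in the consumer loop.

```scala
import java.util.concurrent.ArrayBlockingQueue
import org.apache.spark.scheduler.SparkListenerEvent

val queue = new ArrayBlockingQueue[SparkListenerEvent](10000)

// Producer side: offer() returns false instead of blocking when the queue
// is full, which maps onto the bus's "drop event" behavior.
// if (!queue.offer(event)) onDropEvent(event)

// Consumer side: take() parks the thread while the queue is empty, so no
// Thread.sleep or Thread.yield is needed.
val consumer = new Thread("listener-bus-consumer") {
  setDaemon(true)
  override def run(): Unit = {
    try {
      while (true) {
        val event = queue.take()
        // process(event) ...
      }
    } catch {
      case _: InterruptedException => // interrupted on shutdown
    }
  }
}
consumer.start()
```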
I started reviewing, but again, I noticed the same thing I commented on before. This is way over-engineered. You can do this in a much, much simpler way. There's no need to create all the different abstractions you're adding - the current listener abstraction is enough to achieve what is being proposed here.
All you need is to add a "queue name" parameter to the addListener method, and potentially an "event filter" parameter. Everything else is hidden in the listener implementation, and doesn't need to be exposed to any calling code.
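A sketch of the API shape being suggested here (the signature and defaults are my guesses, not code from either branch):

```scala
import org.apache.spark.scheduler.{SparkListenerEvent, SparkListenerInterface}

// Hypothetical registration API: callers pick a queue by name and may pass
// an event filter; queue management stays hidden inside the bus.
trait SuggestedBusApi {
  def addListener(
      listener: SparkListenerInterface,
      queue: String = "default",
      filter: SparkListenerEvent => Boolean = _ => true): Unit
}

// Usage: the event-log listener gets its own queue with no new classes.
// bus.addListener(eventLogListener, queue = "eventLog")
```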
@@ -532,7 +533,10 @@ class SparkContext(config: SparkConf) extends Logging {
       new EventLoggingListener(_applicationId, _applicationAttemptId, _eventLogDir.get,
         _conf, _hadoopConfiguration)
     logger.start()
-    listenerBus.addListener(logger)
+    listenerBus.addProcessor(
I'm having a hard time finding the declaration of this method. I can't find it either in your code or in the existing master branch. Can you link to it?
In LiveListenerBus.scala line 86
@@ -2350,13 +2354,12 @@ class SparkContext(config: SparkConf) extends Logging {
     try {
       val listenerClassNames: Seq[String] =
         conf.get("spark.extraListeners", "").split(',').map(_.trim).filter(_ != "")
-      for (className <- listenerClassNames) {
-        // Use reflection to find the right constructor
+      val extraListeners = listenerClassNames.map{ className =>
.map { className =>
You have a lot of style issues in your code - indentation, spacing, etc. Please read the style section in http://spark.apache.org/contributing.html and try to follow it.
Fixed.
I do not understand: the PR passes the Scala style tests. How can I still have style issues?
      listener
    }
    if (extraListeners.nonEmpty) {
      val group = new FixGroupOfListener(extraListeners, "extraListeners")
FixGroupOfListener is a bad class name. I'm not even sure what it's supposed to be, but the closest I can think of is ListenerGroup.
But perhaps this shouldn't be exposed at all. If you add a queue name to the listener registration method, you can hide this from callers altogether. That is, if I understood what this class is in the first place.
Then you wouldn't need addIsolatedListener either.
@vanzin are you suggesting making this a call to addProcessor? Or an addListener override? Just trying to understand the code at this stage.
First, I think that modifying the existing addListener method is a bad idea. It would impact a lot of code. We want to keep this method with its current behavior (add a listener to the "default" queue) and still be able to add listeners to other queues. I think that adding a String label and matching on it to determine the queue is quite error-prone. I prefer having a more constrained API.
For the FixGroupOfListener name, I can change it. But I have 2 kinds of listener groups:
FixGroupOfListener: for groups of inter-dependent listeners (like the UI listeners). I can rename it to ListenerImmutableGroup.
ModifiableGroupOfListener: for the "default" queue. I can rename it to ListenerGroup.
Are these names OK for you?
> First, I think that modifying the existing addListener method is a bad idea. It will impact a lot the code.

That's why overloaded methods exist.

> But I have 2 kind of group of listeners

I don't think there's really a distinction between the two types of groups you mention. The "UI group" is just a modifiable group that you don't modify after it's been created.
@@ -227,6 +169,7 @@ private[spark] class EventLoggingListener(
    * ".inprogress" suffix.
    */
   def stop(): Unit = {
+    flush()
Shouldn't be necessary (close() does it).
Done
override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = { }

override def onOtherEvent(event: SparkListenerEvent): Unit = {
  def log(event: SparkListenerEvent): Unit = {
    if (event.logEvent) {
Since you're adding an event filter, you could perform this check there...
To keep the current behavior, it is not simple to put this filtering (if (event.logEvent)) in the event filter. Indeed, I want to perform it only if the type of the event is not a "basic" type. It would mean making the EventFilter, which acts here as a "pre-filter" (discarding only the events that we do not want to log), a lot more complex.
import org.apache.spark.scheduler.bus.ListenerBusQueue.{FixGroupOfListener, ModifiableGroupOfListener}

// For generic message processor (like event logging)
private[scheduler] class ProcessorListenerBusQueue(
First, the name of this file is weird. But more importantly, why are these classes even necessary?
Why can't you have a single queue implementation that manages a group of listeners? Whether the group has a single listener or multiple shouldn't matter - the implementation can be the same.
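A sketch of that suggestion (my naming, not actual code from either branch): one queue type that always dispatches to a group, where a "single listener" is just a group of size one.

```scala
import java.util.concurrent.CopyOnWriteArrayList
import scala.collection.JavaConverters._
import org.apache.spark.scheduler.{SparkListenerEvent, SparkListenerInterface}

// One implementation covering every case: the queue drains events to a
// group of listeners, and a single-listener queue is a one-element group.
class GroupQueue(val name: String) {
  private val group = new CopyOnWriteArrayList[SparkListenerInterface]()

  def add(listener: SparkListenerInterface): Unit = group.add(listener)

  // Called by this queue's dispatch thread for each dequeued event.
  def dispatchAll(event: SparkListenerEvent): Unit = {
    group.asScala.foreach { listener =>
      // per-event-type dispatch, as SparkListenerBus.doPostEvent does:
      // doPostEvent(listener, event)
    }
  }
}
```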
For the name of the file, I can change it! Do you have a better name? I can even put the content of the file (the 2 concrete implementations) in the BusQueue.scala file.
I refactored this file a bit. Now I have only 2 implementations:
ProcessorBusQueue: the implementation for a generic processor (in which we do not dispatch by event type)
ListenerBusQueue: the implementation for a listener (with dispatch by event type)
I'm still confused about why you need 2 implementations. Why doesn't ListenerBusQueue work for everybody? And why shouldn't it?
I need more time to actually grok all this code, but like Wenchen suggested before, this is a big change and it would benefit from a more detailed explanation of exactly how you're organizing the hierarchy of listeners, groups, etc. Your PR description only explains which queues you created, but not any of the changes that were needed to achieve that.
If it makes it easier, you can create a README.md file with a longer explanation of how things are organized (for example, check common/network-common/src/main/java/org/apache/spark/network/crypto, where I added a README to explain details of what that whole body of code is doing).
Hi, just some comments as I try to understand the code.
  }
  logEvent(toLog)
  nbMessageProcessed = nbMessageProcessed + 1
  if (nbMessageProcessed == FLUSH_FREQUENCY) {
This should be >= FLUSH_FREQUENCY.
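Sketched, the suggested form (surrounding names taken from the diff above; the reset is my assumption):

```scala
// '>=' keeps the flush robust even if the counter could ever move past the
// threshold without hitting it exactly.
nbMessageProcessed += 1
if (nbMessageProcessed >= FLUSH_FREQUENCY) {
  flush()
  nbMessageProcessed = 0 // assumed reset; the actual patch may differ
}
```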
Done
 * This method is thread-safe and can be called in any thread.
 */
final override def addListener(listener: SparkListenerInterface): Unit = {
  startStopAddRemoveLock.lock()
this should probably be in a try/finally block, with unlock in the finally.
same for other lock/unlocks.
Done using Scala Try
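The requested pattern, sketched: try/finally guarantees the unlock even when the body throws (a scala.util.Try wrapper achieves the same only if the failure is re-thrown afterwards).

```scala
import java.util.concurrent.locks.ReentrantLock
import org.apache.spark.scheduler.SparkListenerInterface

class BusRegistration {
  private val startStopAddRemoveLock = new ReentrantLock()

  def addListener(listener: SparkListenerInterface): Unit = {
    startStopAddRemoveLock.lock()
    try {
      // ... mutate listener/queue state here ...
    } finally {
      startStopAddRemoveLock.unlock() // runs on both normal and exceptional exit
    }
  }
}
```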
} else {
  onDropEvent(event)
  throw new IllegalStateException("LiveListener bus already started!")
definitely want to unlock before this.
@@ -27,7 +27,12 @@ private[spark] trait SparkListenerBus

   protected override def doPostEvent(
       listener: SparkListenerInterface,
-      event: SparkListenerEvent): Unit = {
+      event: SparkListenerEvent): Unit = SparkListenerEventDispatcher.dispatch(listener, event)
why is this change necessary?
I have just extracted the dispatch method to be able to use it in GroupOfListenersBusQueue and in SingleListenerBusQueue (in the file ListenerBusQueueImpl.scala).
retest this please
@vanzin I simplified the code a lot. There is now only one implementation for the queue and one for the group of listeners. I removed the extra trait in the listener hierarchy too.
/**
 * Add a generic listener to an isolated pool.
 */
def addProcessor(processor: SparkListenerEvent => Unit,
what's the difference between addProcessor and addListener?
With addProcessor, you do not have to provide a SparkListenerInterface (with a method per message type), but just a generic function which handles SparkListenerEvent (the supertype of each event type). So when you do generic processing (see EventLoggingListener for example) it is very convenient, and as a cherry on top you avoid the horrible and costly dispatch function, which in this case (generic processing) is a burden.
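To illustrate the distinction (a sketch; the PR's exact signatures may differ):

```scala
import org.apache.spark.scheduler._

// addListener route: a SparkListenerInterface implementation with one
// callback per event type, so the bus must dispatch on the concrete class.
class MyStageListener extends SparkListener {
  override def onStageCompleted(event: SparkListenerStageCompleted): Unit = {
    // react to exactly this event type
  }
}

// addProcessor route: one generic function over the supertype; no per-type
// dispatch is needed, which suits event logging, where every event gets the
// same serialization treatment.
val processor: SparkListenerEvent => Unit = { event =>
  // log(event): identical handling for every event type
}

// bus.addListener(new MyStageListener())  // dispatched per event type
// bus.addProcessor(processor, ...)        // generic, dispatch-free
```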
shall we do it in a separate PR?
It is just a small technical refactoring which came almost for free with the new queue object. It is also very convenient to be able to handle the asynchronous LiveListenerBus in the tests. I think that we can keep it in this PR.
The PR description is good for explaining the new behavior, but can you say more about the implementation? IMO, we just need to duplicate the event queue for each important listener, like the event logging listener, and non-important listeners can share one event queue as in the current behavior. Then each event queue is processed by an individual thread.
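A sketch of that shape (hypothetical names, not code from the PR): the bus fans each posted event out to every queue, and each queue is drained by its own thread.

```scala
import java.util.concurrent.LinkedBlockingQueue
import org.apache.spark.scheduler.{SparkListenerEvent, SparkListenerInterface}

// Hypothetical: one bounded queue plus one dispatch thread per listener group.
class AsyncListenerQueue(name: String, listeners: Seq[SparkListenerInterface]) {
  private val events = new LinkedBlockingQueue[SparkListenerEvent](10000)

  def post(event: SparkListenerEvent): Unit = {
    if (!events.offer(event)) { /* onDropEvent(event) */ }
  }

  private val thread = new Thread(s"listener-bus-$name") {
    setDaemon(true)
    override def run(): Unit = {
      try {
        while (true) {
          val event = events.take()
          // listeners.foreach(l => doPostEvent(l, event))
        }
      } catch {
        case _: InterruptedException => // interrupted on stop()
      }
    }
  }
  thread.start()
}

// Posting fans out to all queues, so a slow consumer (e.g. event logging)
// no longer delays the others:
// queues.foreach(_.post(event))
```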
@cloud-fan PR description updated with some details on the implementation
I think
Agree that it still seems like there are too many moving parts here. I don't see a whole lot of difference between
Also to reinforce a previous comment, your code has a ton of style issues. It doesn't matter that checkstyle doesn't complain about them; you still have to follow the code conventions of the project or we'll be forever pointing out style issues in your code.
@cloud-fan
@vanzin I will do a pass to try to fix the code style issues.
I really dislike
I think part of the confusion here is that the current code is trying to both refactor the listener bus class hierarchy and introduce multiple queues. I don't doubt that there's benefit in taking a holistic look into this part of the class hierarchy; but it would be good to do that separately, both so that we can clearly see that the proposed hierarchy makes sense, and so that it's easier to review things. It's easier to wrap your head around the code if it's focused on one problem instead of two.
I pushed some code to my repo: https://github.com/vanzin/spark/tree/SPARK-18838, which is an attempt to do things the way I've been trying to explain. It tries to keep changes as local as possible to
It's just a p.o.c. so I cut a few corners (like metrics), and I only ran
@vanzin I pushed some comments on your code. I think that trying to keep the exact same class hierarchy leads to very complex code, with many drawbacks.
The LiveListenerBus can now manage multiple queues for different listeners. This will greatly increase its dequeuing rate. All the listeners are still added to the main queue, so the behavior is the same as before. In further commits some listeners will be moved to dedicated queues. ## How was this patch tested? unit tests + manual tests have been run on the cluster
You commented on my code, not on the idea. My code was hacked together quickly; it can be cleaned up a lot. Your comments don't prove that separating the refactoring of the listener bus hierarchy from the introduction of queues is impossible or undesirable.
The eventLoggingListener is now in a dedicated asynchronous queue. This listener could represent 50% of the event processing time of the standard queue. ## How was this patch tested? unit tests + manual tests have been run on the cluster
The ExecutorAllocationManager is now in a dedicated asynchronous queue. This listener suffers a lot of event drops; putting it in a dedicated queue greatly decreases the chance of drops. ## How was this patch tested? unit tests + manual tests have been run on the cluster
The UI event listeners are now in a dedicated asynchronous queue. This set of listeners could represent 40% of the event processing time, and calls from the GUI no longer block the listener bus. ## How was this patch tested? unit tests + manual tests have been run on the cluster
The extra listeners are now in a dedicated asynchronous queue, so they cannot interfere with the execution of the Spark internal listeners. ## How was this patch tested? unit tests + manual tests have been run on the cluster
The streaming listener, which is a bus too (for streaming events & listeners), is now in a dedicated asynchronous queue, so the streaming listeners run without impact from the other listeners. ## How was this patch tested? unit tests + manual tests have been run on the cluster
- wait on empty queue instead of looping
## What changes were proposed in this pull request?
In this PR the single queue of the LiveListenerBus was replaced by multiple independent queues.
The queue and its processing thread have been extracted from `LiveListenerBus` into a class `BusQueue`. The definition of most of the methods of `ListenerBus` has been extracted into a trait `WithListenerBus`. The `LiveListenerBus` implements it directly. It holds the "default" queue, associated with a group of listeners, and a list of queues. The method `addListener` of `WithListenerBus` has a new optional boolean parameter (default value false) to request an independent queue for this listener instead of the default one. This parameter is ignored in the default implementation (in `ListenerBus`).
A listener which is also a set of listeners has been added. It allows keeping the current behavior for a group of dependent listeners or the default queue. It handles the per-listener metrics.
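Sketched, the surface described above looks roughly like this (simplified; parameters other than the names mentioned in the description are my assumptions):

```scala
import org.apache.spark.scheduler.{SparkListenerEvent, SparkListenerInterface}

// Trait extracted from ListenerBus; the extra parameter is honored by
// LiveListenerBus and ignored by the synchronous default implementation.
trait WithListenerBus {
  def addListener(
      listener: SparkListenerInterface,
      separateQueue: Boolean = false): Unit
}

// BusQueue bundles a queue with the thread that drains it (extracted from
// LiveListenerBus); its internals are omitted here.
// class BusQueue(...) { def post(event: SparkListenerEvent): Unit = ... }

// LiveListenerBus holds the default queue plus a list of BusQueues, and also
// accepts generic processors over the event supertype (remaining parameters
// assumed):
// def addProcessor(processor: SparkListenerEvent => Unit, ...): Unit
// def removeProcessor(...): Unit
```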
The methods `addProcessor` and `removeProcessor` have been added to `LiveListenerBus` to be able to add message processing at the supertype `SparkListenerEvent`, in addition to the per-event-type processing of the listener interface.
## How was this patch tested?
unit tests + manual tests on the cluster