SPARK-2533 - Add locality levels on stage summary view #9487
@kayousterhout suggestions:
I'm updating there. |
Reference to the "old" PR: #9117 |
Test build #1985 has finished for PR 9487 at commit
|
    val locality = taskUIData.taskInfo.taskLocality.toString
    localityCounts.put(locality, localityCounts.getOrElse(locality, 0L) + 1)
  })
  return localityCounts.map { _ match {
I think you can omit return and the _ match since you have just one case?
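A minimal sketch of that simplification, written against hypothetical sample data rather than the PR's actual code: the object name, the two method names, and the counts are all assumptions for illustration. A block containing only `case` clauses is already a pattern-matching function literal, so the `_ match` wrapper adds nothing, and the last expression of a method is its result, so `return` is redundant.

```scala
object PatternMatchingLambda {
  // Hypothetical sample data (not from the PR): locality name -> task count.
  val sampleCounts: Map[String, Long] = Map("NODE_LOCAL" -> 6L, "PROCESS_LOCAL" -> 3L)

  // Verbose form, in the style of the diff under review:
  // explicit `return` plus a `_ match` around a single case.
  def verbose(counts: Map[String, Long]): String = {
    return counts.map { _ match {
      case (localityLevel, count) => s"$localityLevel: $count task(s)"
    }}.mkString("; ")
  }

  // Concise form: the `case` block is itself the function passed to map,
  // and the final expression is the method's result.
  def concise(counts: Map[String, Long]): String =
    counts.map { case (localityLevel, count) => s"$localityLevel: $count task(s)" }
      .mkString("; ")
}
```

Both methods produce the same string for the same input, so the change is purely stylistic.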
Correct, Sean. Thanks! Actually, I'm implementing Kay's second suggestion (using the "Show Additional Metrics" checkbox). I will update the PR.
f7cf968
to
4388a89
Compare
Updated PR implementing Kay's suggestions: formatting the locality level summary string, and moving under the "Show Additional Metrics" drop down. |
    case (localityLevel, count) => s"$localityLevel: $count task(s)"
  }}.mkString("; ")
}
This might result in inconsistent order of output and always shows 0 counts, which may or may not be desirable. How about this which avoids some of the repetition:
val localities = stageData.taskData.values.map(_.taskInfo.taskLocality)
val localityCounts = localities.groupBy(identity).mapValues(_.size)
val localityNamesAndCounts = localityCounts.toSeq.map { case (locality, count) =>
  val localityName = locality match {
    case TaskLocality.PROCESS_LOCAL => "Process local"
    case TaskLocality.NODE_LOCAL => "Node local"
    case TaskLocality.RACK_LOCAL => "Rack local"
    case TaskLocality.ANY => "Any"
  }
  (localityName, count)
}
localityNamesAndCounts.sorted.mkString("; ")
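The counting idiom in this suggestion can be tried outside Spark with a small self-contained sketch. Everything here is an assumption for illustration: `Locality` is a stand-in enumeration with only the four levels named in the snippet (not Spark's `TaskLocality`), and `countByLocality` and `sortedNames` are hypothetical helpers. The sketch uses `map` where the suggestion uses `mapValues`, because on Scala 2.13 `Map.mapValues` is deprecated and returns a lazy view.

```scala
object LocalityCounting {
  // Stand-in for Spark's TaskLocality; an assumption so the sketch is self-contained.
  object Locality extends Enumeration {
    val PROCESS_LOCAL, NODE_LOCAL, RACK_LOCAL, ANY = Value
  }

  // Group equal values together and count each group.
  // Levels that never occur are simply absent, so no zero counts are reported.
  def countByLocality(localities: Seq[Locality.Value]): Map[Locality.Value, Int] =
    localities.groupBy(identity).map { case (level, group) => (level, group.size) }

  // Sort the names so the output order does not depend on Map iteration order.
  def sortedNames(counts: Map[Locality.Value, Int]): Seq[String] =
    counts.keys.toSeq.map(_.toString).sorted
}
```

For example, `LocalityCounting.countByLocality(Seq(Locality.NODE_LOCAL, Locality.PROCESS_LOCAL, Locality.NODE_LOCAL))` yields a map with two entries and no entry for `RACK_LOCAL` or `ANY`.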
Fair enough, let me try. Thanks again Sean, I appreciate all your help, support and guidance !
4388a89
to
4293666
Compare
PR rebased and updated according to Sean's suggestion |
@@ -21,6 +21,7 @@ import java.net.URLEncoder
import java.util.Date
import javax.servlet.http.HttpServletRequest

import scala.collection.mutable.HashMap
Looking good. Finally, I think this import is no longer needed.
True !!! Let me fix that.
397de2e
to
1f0f023
Compare
PR rebased and updated with alphabetized imports, thanks to @kayousterhout's suggestion (thanks ;))
Can you post the updated screenshot? |
    case TaskLocality.RACK_LOCAL => "Rack local"
    case TaskLocality.ANY => "Any"
  }
  (localityName, count)
It looks like this prints as "(Process local, 3); (Rack local, 5)"? Am I understanding that correctly? If so, can you replace line 83 with s"$localityName: $count" to get a nicer looking format?
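A small sketch of the difference, using hypothetical sample pairs (the object and value names are assumptions, not code from the PR). Joining tuples with `mkString` falls back to the tuple's default `toString`, while mapping each pair to an interpolated string first gives the friendlier format.

```scala
object SummaryFormat {
  // Hypothetical sample data: (display name, task count).
  val namesAndCounts: Seq[(String, Int)] = Seq(("Rack local", 5), ("Process local", 3))

  // Tuples are rendered with their default toString.
  val asTuples: String = namesAndCounts.sorted.mkString("; ")
  // "(Process local,3); (Rack local,5)"

  // Format each pair before sorting and joining.
  val asStrings: String =
    namesAndCounts.map { case (name, count) => s"$name: $count" }.sorted.mkString("; ")
  // "Process local: 3; Rack local: 5"
}
```

Sorting after formatting orders the entries by display name, which keeps the summary stable between page loads.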
Good point, let me do it.
Sure, let me prepare that. |
1f0f023
to
96a93ed
Compare
PR rebased and updated to implement Kay's suggestion (about the locality level summary string format)
LGTM |
@@ -177,6 +192,10 @@ private[ui] class StagePage(parent: StagesTab) extends WebUIPage("stage") {
<div class="additional-metrics collapsed">
  <ul>
    <li>
      <strong>Locality Level Summary: </strong>
Hm, actually on final thought, is this the right place to show this info? it's dumped into a list of metrics, but all the others are controlled by arrows and checkboxes.
What do you think about a new "Locality Levels Summary" table, displayed via a checkbox under additional metrics?
Yea, it didn't register to me that this was adding to the list of additional metrics enabled by checkboxes. If it goes here, I think it would have to be treated the same way. If a table displays well, that's reasonable to me. It might be harder to make render though; could be overkill at this stage. @kayousterhout has better sensibilities here.
Gotcha. Let me try something around this for you.
96a93ed
to
1f52f87
Compare
Hm, I find it a little weird that we put it under additional metrics. It's not really a metric and it's different from the rest of the things under that drop down. Should we just put it under |
@andrewor14 I think that was how @jbonofre originally did it, but the complaint with that approach was that it clutters the UI with information that may not be useful to many people. I'm torn here though because it is pretty small; I'm fine with just putting it under the total time if @srowen is OK with that. |
Correct Kay, it's the way that I did at the beginning.
Thoughts ? |
I'm updating the PR with the collapsed table. I will attach a screenshot. |
1f52f87
to
5a489fe
Compare
I'd prefer your original approach to this, because in either case, it takes up one line of text in the default view, but with this, it takes up one line of text and you have to click on that line to see the metrics. @andrewor14 @srowen what about putting the summary ("Locality summary: process local: 3 tasks; node local: 6 tasks; ...") under the "Summary metrics" heading and above the table (and always displayed)? @jbonofre it might be a good idea to hold off on changing the code until there's consensus on the right approach.
@kayousterhout OK, let's wait for feedback from the others. I will revert my commit and go back to a single string that is always displayed. Thanks!
Thanks, the picture helps me think about where this should go. This seems most related to the "Tasks" section which also reports the locality of each task. Is it too random to stick this in as a line of text before or after that table? Summary Metrics: could go here too. This is only for completed tasks though. The metrics are for active tasks too right? |
You are right: it's more related to tasks. And yes, it's valid for both completed and active tasks. |
Good point re: the summary metrics being only for completed tasks. I think it's a little weird to put the summary metric under the "tasks" table though; what about the original proposal of putting it near the top, under "total time"? |
+1 with @kayousterhout |
Ha OK, full circle there. At least we've considered every possible position on the page. I'm OK with that. Maybe later there's a big redesign that re-sorts all the info here, but for now it's not a bad place to stick this info, logically or aesthetically.
I agree, we should just put it under total time. The KV pairs at the top of the page are for summary info aggregated across tasks and that's exactly what this is. It looks clunky as a table and doesn't seem to be of equal importance to things like the event timeline. |
Let me revert the unnecessary commits.
5a489fe
to
b4b8c90
Compare
…ional Metrics" drop down
… Sean's suggestion
… Kay's suggestion
b4b8c90
to
bd7de49
Compare
PR rebased, locality levels summary on stage just under the time (as first proposal). |
OK that seems pretty good. |
retest this please |
Test build #45752 has finished for PR 9487 at commit
|
LGTM merging into master 1.6, thanks! |
Author: Jean-Baptiste Onofré <jbonofre@apache.org> Closes #9487 from jbonofre/SPARK-2533-2. (cherry picked from commit 74c3004) Signed-off-by: Andrew Or <andrew@databricks.com>
Thanks ! Let me prepare new PR ;) |
Author: Jean-Baptiste Onofré <jbonofre@apache.org> Closes apache#9487 from jbonofre/SPARK-2533-2.