-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-7627][SPARK-7472] DAG visualization: style skipped stages #6171
Conversation
This requires storing more state in the listener and making sure we clean them up later. Tests are coming in the next commit.
Previously there was an opportunity to leak in `completedStageIds`. This commit fixes that and includes tests that ensures that it is fixed.
Merged build triggered. |
@zsxwing it would be good if you could verify that the listener clean up logic is correct. I have added tests for that purpose but we should test it out in a real long-running streaming application. |
Merged build started. |
Also cc @mateiz who suggested this change. |
Test build #32773 has started for PR 6171 at commit |
Testing your PR. Let's see if there is any memory leak tomorrow morning. |
Test build #32773 has finished for PR 6171 at commit
|
Merged build finished. Test PASSed. |
Test PASSed. |
Conflicts: core/src/main/resources/org/apache/spark/ui/static/spark-dag-viz.js
Merged build triggered. |
Merged build started. |
Test build #32849 has started for PR 6171 at commit |
Merged build triggered. |
Merged build started. |
Test build #32851 has started for PR 6171 at commit |
Right now stage clusters assume the ID space of integers. This is not valid, however, after we merge with dag-viz-streaming, where the cluster ID may just be the operation ID. The result is that certain stage clusters disappear on the UI. This patch by itself doesn't fix anything noticeable. However, it does guard against the potential of collision in future changes.
dc32adb
to
762b541
Compare
Merged build triggered. |
Merged build started. |
Test build #32852 has started for PR 6171 at commit |
Test build #32849 has finished for PR 6171 at commit
|
Merged build finished. Test PASSed. |
Test PASSed. |
Test build #32852 has finished for PR 6171 at commit
|
Test build #32851 has finished for PR 6171 at commit
|
Merged build finished. Test PASSed. |
Test PASSed. |
A small note on the colors -- make them a grayscale version of the colors used in the normal stages, because right now they're quite a bit darker. Just take a picture of the normal stages and turn it to grayscale to grab the right colors. |
Alternatively you could also make them lighter and use dashed lines (http://www.w3schools.com/svg/svg_stroking.asp), but I'm not sure how good that will look. |
Ok, I've updated the screenshot. I tried out the dotted lines but I thought the gray looks better. |
Merged build triggered. |
Merged build started. |
Test build #32915 has started for PR 6171 at commit |
Test build #32915 has finished for PR 6171 at commit
|
Merged build finished. Test PASSed. |
Test PASSed. |
Cool, this looks pretty good. Maybe I'd make the border on skipped stages slightly darker since the red is darker than blue on the other ones, but overall it's fine. |
Merged build triggered. |
Merged build started. |
Test build #32973 has started for PR 6171 at commit |
Test build #32973 has finished for PR 6171 at commit
|
Merged build finished. Test PASSed. |
Test PASSed. |
I discussed with @zsxwing offline and he found that the memory usage is stable for long-running jobs. I'm merging this into master and 1.4. Thanks for all the feedback. |
This patch fixes two things: **SPARK-7627.** Cached RDDs no longer light up on the job page. This is a simple fix. **SPARK-7472.** Display skipped stages differently from normal stages. The latter is a major UX issue. Because we link the job viz to the stage viz even for skipped stages, the user may inadvertently click into the stage page of a skipped stage, which is empty. ------------------- <img src="https://cloud.githubusercontent.com/assets/2133137/7675241/de1a3da6-fcea-11e4-8101-88055cef78c5.png" width="300px" /> Author: Andrew Or <andrew@databricks.com> Closes #6171 from andrewor14/dag-viz-skipped and squashes the following commits: f261797 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped 0eda358 [Andrew Or] Tweak skipped stage border color c604150 [Andrew Or] Tweak grayscale colors 7010676 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped 762b541 [Andrew Or] Use special prefix for stage clusters to avoid collisions 51c95b9 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped b928cd4 [Andrew Or] Fix potential leak + write tests for it 7c4c364 [Andrew Or] Show skipped stages differently 7cc34ce [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped c121fa2 [Andrew Or] Fix cache color (cherry picked from commit 563bfcc) Signed-off-by: Andrew Or <andrew@databricks.com>
This patch fixes two things: **SPARK-7627.** Cached RDDs no longer light up on the job page. This is a simple fix. **SPARK-7472.** Display skipped stages differently from normal stages. The latter is a major UX issue. Because we link the job viz to the stage viz even for skipped stages, the user may inadvertently click into the stage page of a skipped stage, which is empty. ------------------- <img src="https://cloud.githubusercontent.com/assets/2133137/7675241/de1a3da6-fcea-11e4-8101-88055cef78c5.png" width="300px" /> Author: Andrew Or <andrew@databricks.com> Closes apache#6171 from andrewor14/dag-viz-skipped and squashes the following commits: f261797 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped 0eda358 [Andrew Or] Tweak skipped stage border color c604150 [Andrew Or] Tweak grayscale colors 7010676 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped 762b541 [Andrew Or] Use special prefix for stage clusters to avoid collisions 51c95b9 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped b928cd4 [Andrew Or] Fix potential leak + write tests for it 7c4c364 [Andrew Or] Show skipped stages differently 7cc34ce [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped c121fa2 [Andrew Or] Fix cache color
This patch fixes two things: **SPARK-7627.** Cached RDDs no longer light up on the job page. This is a simple fix. **SPARK-7472.** Display skipped stages differently from normal stages. The latter is a major UX issue. Because we link the job viz to the stage viz even for skipped stages, the user may inadvertently click into the stage page of a skipped stage, which is empty. ------------------- <img src="https://cloud.githubusercontent.com/assets/2133137/7675241/de1a3da6-fcea-11e4-8101-88055cef78c5.png" width="300px" /> Author: Andrew Or <andrew@databricks.com> Closes apache#6171 from andrewor14/dag-viz-skipped and squashes the following commits: f261797 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped 0eda358 [Andrew Or] Tweak skipped stage border color c604150 [Andrew Or] Tweak grayscale colors 7010676 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped 762b541 [Andrew Or] Use special prefix for stage clusters to avoid collisions 51c95b9 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped b928cd4 [Andrew Or] Fix potential leak + write tests for it 7c4c364 [Andrew Or] Show skipped stages differently 7cc34ce [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped c121fa2 [Andrew Or] Fix cache color
This patch fixes two things: **SPARK-7627.** Cached RDDs no longer light up on the job page. This is a simple fix. **SPARK-7472.** Display skipped stages differently from normal stages. The latter is a major UX issue. Because we link the job viz to the stage viz even for skipped stages, the user may inadvertently click into the stage page of a skipped stage, which is empty. ------------------- <img src="https://cloud.githubusercontent.com/assets/2133137/7675241/de1a3da6-fcea-11e4-8101-88055cef78c5.png" width="300px" /> Author: Andrew Or <andrew@databricks.com> Closes apache#6171 from andrewor14/dag-viz-skipped and squashes the following commits: f261797 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped 0eda358 [Andrew Or] Tweak skipped stage border color c604150 [Andrew Or] Tweak grayscale colors 7010676 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped 762b541 [Andrew Or] Use special prefix for stage clusters to avoid collisions 51c95b9 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped b928cd4 [Andrew Or] Fix potential leak + write tests for it 7c4c364 [Andrew Or] Show skipped stages differently 7cc34ce [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-skipped c121fa2 [Andrew Or] Fix cache color
This patch fixes two things:
SPARK-7627. Cached RDDs no longer light up on the job page. This is a simple fix.
SPARK-7472. Display skipped stages differently from normal stages.
The latter is a major UX issue. Because we link the job viz to the stage viz even for skipped stages, the user may inadvertently click into the stage page of a skipped stage, which is empty.