Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-12299][CORE] Remove history serving functionality from Master #10991

Closed

Conversation

BryanCutler
Copy link
Member

Remove history server functionality from standalone Master. Previously, the Master process rebuilt a SparkUI once the application was completed which sometimes caused problems, such as OOM, when the application event log is large (see SPARK-6270). Keeping this functionality out of the Master will help to simplify the process and increase stability.

Testing for this change included running core unit tests and manually running an application on a standalone cluster to verify that it completed successfully and that the Master UI functioned correctly. Also added 2 unit tests to verify killing an application and driver from MasterWebUI makes the correct request to the Master.

@SparkQA
Copy link

SparkQA commented Jan 29, 2016

Test build #50415 has finished for PR 10991 at commit 6918f0b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

Removing the history serving from Master is pretty straightforward, but leaves some gaps in the UI and serving JSON requests. I am marking this as WIP to get some discussion on how to best handle a few of these points.

  1. Currently, the Master page has a the application name as a link to the Application UI Page. While the app is running, it has a link to the live application server, then when complete the Master replaces that link with one for the rebuilt UI.
    Now, since there is no rebuilt UI, I left the live link, then once complete the application name is static and does not provide a link
  2. Similarly, the Master Application page has a link to "Application Detail" to open up the Application UI Page. While the app is running, it has a link to the live application server, then when complete the Master replaces that link with one for the rebuilt UI.
    Now, the live link is still there while the application is running and once complete the "Application Detail" is removed entirely.
  3. Currently, the Master follows the REST API to serve JSON data and the MasterPage extends the trait UIRoot to allow requests for ApiRootResources.
    Since a SparkUI with application info is required to serve these resources, now MasterPage no longer extends this trait.
  4. Although the Master can no long serve ApiRootResources, it still can list applications and also redirect resource requests to any currently running application.
    I put in some functionality that redirects requests, but I stopped short of serving the application list (the "/applications" endpoint) because it was getting a little messy. Is this something that is worth having in? Not going to do this, since there are probably not many people depending on this API

@BryanCutler
Copy link
Member Author

cc @andrewor14, @squito for thoughts on above

@andrewor14
Copy link
Contributor

Ah, that's too bad. When I first proposed to do this I did not think there would be a REST API. Overall I'm still in favor of removing the functionality from Master. I don't have strong opinions about what we should do about the REST API but I believe that's less important; I don't think many people depend on the specific get SparkUI call from the standalone Master.

I believe @JoshRosen has some opinions about how we should move forward in Spark 2.0.

@BryanCutler
Copy link
Member Author

Hi @JoshRosen , what are your thoughts on the Master REST API if history is no longer available?

@JoshRosen
Copy link
Contributor

The Master REST API functionality which existed prior to the unification with the Spark driver REST API should still work after this change, so we shouldn't break or remove those methods.

I think that there are a few ways to handle this JIRA / task, some more complex than others. The only "must have" requirement for 2.0 is removing the HistoryServer functionality; other cleanup and interface refactoring can be deferred for later. So, I wouldn't try to do any major refactoring or cleanup; just remove the feature so we get the API / functionality change in and address the other pieces later.

@BryanCutler
Copy link
Member Author

@JoshRosen @andrewor14 , just to be clear what needs to be done here - we want to keep the ability for Master to rebuild a SparkUI to be able to respond to REST API calls, but just not automatically rebuild every SparkUI when complete? Also, need to remove UI links to completed apps, etc.

@JoshRosen
Copy link
Contributor

I don't think that the Master should have any event-log consumption logic whatsoever.

@BryanCutler
Copy link
Member Author

Maybe I'm missing something then. Most of the REST API defined in ApiRootResource depends on having a SparkUI object which the Master currently creates by playing back event-log files. How can we keep this API in the Master and remove event-log consumption without some major refactoring? Is there some other way for the Master to create a SparkUI on the fly?

@steveloughran
Copy link
Contributor

There's a more fundamental issue which the history server has too: log replay is too expensive for long-lived applications. replay time is O(jobs)+O(stages), even if GC keeps those numbers in the main UI down.

I'm not proposing a solution other than to observe that it's essentially some form of ETL operation, one that may be possible to do prepare for before building the UI up, just running through the events and discarding old/surplus ones. Has anyone played with something like that ?

@andrewor14
Copy link
Contributor

@BryanCutler Actually, for simplicity I would recommend simply removing the REST API related to the Master serving history UIs. Then you can just remove the actual history serving functionality itself easily. I don't see any reason we'll possibly add it back after Spark 2.0 so it doesn't make sense to keep around a REST API that does not have a backing functionality.

Also I haven't heard of anyone relying on this obscure REST API so I don't think it matters a lot. We should never have added this API in the first place.

@BryanCutler
Copy link
Member Author

Thanks @andrewor14 , that sounds fine to me also - I just wanted to make sure I didn't remove some functionality that some people relied on. I already had everything removed, just need to clean up and merge with latest. I'm on vacation now, but I should be able to finish this up next week.

@andrewor14
Copy link
Contributor

Sure, thanks for your work!

@andrewor14
Copy link
Contributor

Also, please update docs/monitoring.md as well

…ster-SPARK-12299

Conflicts:
	core/src/main/scala/org/apache/spark/deploy/master/ApplicationInfo.scala
	core/src/main/scala/org/apache/spark/deploy/master/Master.scala
	core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala
	core/src/main/scala/org/apache/spark/status/api/v1/ApplicationListResource.scala
…stead of listing applications, which is no longer supported
@BryanCutler BryanCutler changed the title [SPARK-12299][CORE][WIP] Remove history serving functionality from Master [SPARK-12299][CORE] Remove history serving functionality from Master Apr 7, 2016
@SparkQA
Copy link

SparkQA commented Apr 8, 2016

Test build #55264 has finished for PR 10991 at commit fadd392.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

jenkins retest this please

@SparkQA
Copy link

SparkQA commented Apr 8, 2016

Test build #55365 has finished for PR 10991 at commit fadd392.

  • This patch fails build dependency tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

jenkins retest this please

@SparkQA
Copy link

SparkQA commented Apr 8, 2016

Test build #55366 has finished for PR 10991 at commit fadd392.

  • This patch fails build dependency tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

jenkins retest this please

@SparkQA
Copy link

SparkQA commented Apr 8, 2016

Test build #55382 has finished for PR 10991 at commit fadd392.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

@andrewor14 or @JoshRosen , this is ready for review whenever you get a chance, thanks!

…ster-SPARK-12299

Conflicts:
	core/src/main/scala/org/apache/spark/deploy/master/Master.scala
…ster-SPARK-12299

Conflicts:
	core/src/main/scala/org/apache/spark/deploy/master/Master.scala
@SparkQA
Copy link

SparkQA commented Apr 19, 2016

Test build #56243 has finished for PR 10991 at commit f9c09a1.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 19, 2016

Test build #56252 has finished for PR 10991 at commit d1814a3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

jenkins retest this please

@SparkQA
Copy link

SparkQA commented Apr 19, 2016

Test build #56262 has finished for PR 10991 at commit d1814a3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

jenkins retest this please

@SparkQA
Copy link

SparkQA commented Apr 20, 2016

Test build #56285 has finished for PR 10991 at commit d1814a3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BryanCutler
Copy link
Member Author

Ping @andrewor14 , please review. Is this still slated for 2.0 release? Thanks!

@andrewor14
Copy link
Contributor

LGTM! @JoshRosen should have a look too. Also retest this please because this hasn't run tests in a while.

@SparkQA
Copy link

SparkQA commented May 4, 2016

Test build #57697 has finished for PR 10991 at commit d1814a3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Copy link
Contributor

Merging into master 2.0. Thanks for your work @BryanCutler.

asfgit pushed a commit that referenced this pull request May 4, 2016
Remove history server functionality from standalone Master.  Previously, the Master process rebuilt a SparkUI once the application was completed which sometimes caused problems, such as OOM, when the application event log is large (see SPARK-6270).  Keeping this functionality out of the Master will help to simplify the process and increase stability.

Testing for this change included running core unit tests and manually running an application on a standalone cluster to verify that it completed successfully and that the Master UI functioned correctly.  Also added 2 unit tests to verify killing an application and driver from MasterWebUI makes the correct request to the Master.

Author: Bryan Cutler <cutlerb@gmail.com>

Closes #10991 from BryanCutler/remove-history-master-SPARK-12299.

(cherry picked from commit cf2e9da)
Signed-off-by: Andrew Or <andrew@databricks.com>
@asfgit asfgit closed this in cf2e9da May 4, 2016
@BryanCutler
Copy link
Member Author

Great, thanks!

@BryanCutler BryanCutler deleted the remove-history-master-SPARK-12299 branch January 3, 2017 22:28
dud225 added a commit to dud225/spark that referenced this pull request Apr 20, 2017
PR apache#10991 removed the built-in history view from Spark Standalone, so the history server is no longer useful to Yarn or Mesos only.
asfgit pushed a commit that referenced this pull request Apr 21, 2017
Hello
PR #10991 removed the built-in history view from Spark Standalone, so the history server is no longer useful to Yarn or Mesos only.

Author: Hervé <dud225@users.noreply.github.com>

Closes #17709 from dud225/patch-1.

(cherry picked from commit 3476799)
Signed-off-by: Sean Owen <sowen@cloudera.com>
asfgit pushed a commit that referenced this pull request Apr 21, 2017
Hello
PR #10991 removed the built-in history view from Spark Standalone, so the history server is no longer useful to Yarn or Mesos only.

Author: Hervé <dud225@users.noreply.github.com>

Closes #17709 from dud225/patch-1.

(cherry picked from commit 3476799)
Signed-off-by: Sean Owen <sowen@cloudera.com>
asfgit pushed a commit that referenced this pull request Apr 21, 2017
Hello
PR #10991 removed the built-in history view from Spark Standalone, so the history server is no longer useful to Yarn or Mesos only.

Author: Hervé <dud225@users.noreply.github.com>

Closes #17709 from dud225/patch-1.
@stanzhai
Copy link
Contributor

We've just upgraded our Spark cluster from 1.6.x to 2.x, I found that the REST APIs from Spark MasterUI is unavailable.

It's important for us to use the REST APIs to monitor our Applications. I believe that some other people would rely on this function too.

Right now, the only way to get them is using the Spark Master WebUI, it's too bad.

It would be great that we have some REST APIs to access Master、Workers and Applications information from Master.

@BryanCutler @andrewor14 @JoshRosen

@squito
Copy link
Contributor

squito commented May 30, 2017

@stanzhai I think you have a good point. I don't think removing the REST API for access to basic app info in the standalone master was really necessary for the original goals of SPARK-12299. But I'd open another jira, for restoring an api with the same info in the master UI, and move discussion there.

@squito
Copy link
Contributor

squito commented May 30, 2017

whoops, now I see this is already present as https://issues.apache.org/jira/browse/SPARK-18683

peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
Hello
PR apache#10991 removed the built-in history view from Spark Standalone, so the history server is no longer useful to Yarn or Mesos only.

Author: Hervé <dud225@users.noreply.github.com>

Closes apache#17709 from dud225/patch-1.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants