This repository has been archived by the owner on Dec 4, 2024. It is now read-only.

Spark checkpoint docs #181

Merged: 1 commit, Oct 5, 2017

Conversation

@skonto skonto commented Sep 8, 2017

Some basic stuff

skonto commented Sep 8, 2017

@ArtRand some light docs.

docs/hdfs.md Outdated
@@ -21,6 +21,10 @@ For more information, see [Inheriting Hadoop Cluster Configuration][8].

For DC/OS HDFS, these configuration files are served at `http://<hdfs.framework-name>.marathon.mesos:<port>/v1/connection`, where `<hdfs.framework-name>` is a configuration variable set in the HDFS package, and `<port>` is the port of its marathon app.

### Spark Checkpointing

In order to use spark with checkpointing make sure you follow the instructions [here](https://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing) and use an hdfs directory as the checkpoint directory. In the future we plan to support checkpointing at the driver level with local persistent volumes.

@skonto This is great. Could you add an example? Also, please remove "In the future we plan to support checkpointing at the driver level with local persistent volumes." Although I like the optimism, let's put this in when we have driver-level checkpointing.

@skonto skonto force-pushed the update_docs_checkpointing branch 3 times, most recently from 18a7133 to 5ac26ab Compare September 29, 2017 12:31

skonto commented Sep 29, 2017

@ArtRand pls review.

@susanxhuynh susanxhuynh left a comment

@skonto See comments:

docs/hdfs.md Outdated

In order to use spark with checkpointing make sure you follow the instructions [here](https://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing) and use an hdfs directory as the checkpointing directory. For example:
```
val checkpointDirectory = hdfs://hdfs/checkpoint

Add quotes around the string: "hdfs://hdfs/checkpoint"
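
For context, here is a minimal sketch of how the quoted path might be used in a Spark Streaming driver, following the recover-or-create pattern from the Spark checkpointing guide linked above. The object name, app name, batch interval, and socket source are hypothetical placeholders, not part of this PR, and the `hdfs://hdfs/checkpoint` path assumes the HDFS name service is called `hdfs`, as in the snippet under review.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointExample {
  // Assumption: "hdfs" is the HDFS name service, as in the snippet under review.
  val checkpointDirectory = "hdfs://hdfs/checkpoint"

  // Builds a fresh context; only invoked when no checkpoint data exists yet.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("CheckpointExample") // hypothetical app name
    val ssc = new StreamingContext(conf, Seconds(10))          // arbitrary batch interval
    ssc.checkpoint(checkpointDirectory)                        // write checkpoints to HDFS

    // Hypothetical source and computation, only to give the context some work.
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recover from existing checkpoint data if present, otherwise create a new context.
    val ssc = StreamingContext.getOrCreate(checkpointDirectory, () => createContext())
    ssc.start()
    ssc.awaitTermination()
  }
}
```

All stream setup lives inside `createContext()` because, on recovery, Spark rebuilds the context from the checkpoint data and does not call the creating function.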

docs/hdfs.md Outdated
@@ -87,7 +97,7 @@ Submit the job with the ticket:
```$bash
dcos spark run --submit-args="\
--kerberos-principal hdfs/name-0-node.hdfs.autoip.dcos.thisdcos.directory@LOCAL \
--tgt-secret-path /__dcos_base64__tgt
--tgt-secret-path /__dcos_base64__tgt

@susanxhuynh susanxhuynh Oct 3, 2017

Add a back-slash ("\") at end of this line

@skonto skonto force-pushed the update_docs_checkpointing branch from 5ac26ab to cea68b8 Compare October 5, 2017 10:01

skonto commented Oct 5, 2017

@susanxhuynh pls review, I think it's clear for merge.

@ArtRand ArtRand merged commit dcab0e4 into d2iq-archive:master Oct 5, 2017