
Commit

small changes for documentation
ding committed Feb 23, 2017
1 parent 2639eb1 commit 11bc349
Showing 2 changed files with 19 additions and 13 deletions.
14 changes: 14 additions & 0 deletions docs/configuration.md
@@ -2115,6 +2115,20 @@ showDF(properties, numRows = 200, truncate = FALSE)

</table>

### GraphX

<table class="table">
<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
<tr>
<td><code>spark.graphx.pregel.checkpointInterval</code></td>
<td>10</td>
<td>
Checkpoint interval for the graph and messages in Pregel. It is used to avoid a StackOverflowError caused by long
lineage chains after many iterations. Checkpointing can be disabled by setting this property to -1.
</td>
</tr>
</table>
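
As a minimal sketch of how the property in the table above might be set from application code, the snippet below builds a `SparkConf` with an explicit interval. The object name, application name, local master, and checkpoint path are assumptions made for illustration; they do not come from the commit. Spark checkpointing in general needs a checkpoint directory, which is why the sketch also calls `SparkContext.setCheckpointDir`.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PregelCheckpointConfig {
  // Build a configuration with the given Pregel checkpoint interval.
  // Per the table above, -1 disables checkpointing.
  def buildConf(interval: Int): SparkConf =
    new SparkConf()
      .setAppName("pregel-checkpoint-sketch") // hypothetical application name
      .setMaster("local[2]")                  // assumption: a local test run
      .set("spark.graphx.pregel.checkpointInterval", interval.toString)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(buildConf(5))
    try {
      // Hypothetical path; checkpoint files are written under this directory.
      sc.setCheckpointDir("/tmp/graphx-checkpoints")
      println(sc.getConf.get("spark.graphx.pregel.checkpointInterval"))
    } finally {
      sc.stop()
    }
  }
}
```

The same key can also be passed on the command line with `--conf` or placed in `spark-defaults.conf`, like any other Spark property.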

### Deploy

<table class="table">
18 changes: 5 additions & 13 deletions docs/graphx-programming-guide.md
@@ -710,7 +710,7 @@ messages remaining.
The following is the type signature of the [Pregel operator][GraphOps.pregel] as well as a *sketch*
of its implementation (note: to avoid a StackOverflowError caused by long lineage chains, the graph and
messages are periodically checkpointed; the checkpoint interval is set by
"spark.graphx.pregel.checkpointInterval"):
"spark.graphx.pregel.checkpointInterval"; checkpointing can be disabled by setting it to -1):

{% highlight scala %}
class GraphOps[VD, ED] {
@@ -723,30 +723,22 @@ class GraphOps[VD, ED] {
mergeMsg: (A, A) => A)
: Graph[VD, ED] = {
// Receive the initial message at each vertex
var g = graph.mapVertices((vid, vdata) => vprog(vid, vdata, initialMsg))
val graphCheckpointer = new PeriodicGraphCheckpointer[VD, ED](
checkpointInterval, graph.vertices.sparkContext)
graphCheckpointer.update(g)
var g = mapVertices( (vid, vdata) => vprog(vid, vdata, initialMsg) ).cache()

// compute the messages
var messages = GraphXUtils.mapReduceTriplets(g, sendMsg, mergeMsg)
val messageCheckpointer = new PeriodicRDDCheckpointer[(VertexId, A)](
checkpointInterval, graph.vertices.sparkContext)
messageCheckpointer.update(messages.asInstanceOf[RDD[(VertexId, A)]])
var messages = g.mapReduceTriplets(sendMsg, mergeMsg)
var activeMessages = messages.count()
// Loop until no messages remain or maxIterations is achieved
var i = 0
while (activeMessages > 0 && i < maxIterations) {
// Receive the messages and update the vertices.
g = g.joinVertices(messages)(vprog)
graphCheckpointer.update(g)
g = g.joinVertices(messages)(vprog).cache()
val oldMessages = messages
// Send new messages, skipping edges where neither side received a message. We must cache
// and periodically checkpoint the messages so that they can be materialized on the next
// line and so that the lineage chain does not grow too long.
messages = GraphXUtils.mapReduceTriplets(
g, sendMsg, mergeMsg, Some((oldMessages, activeDirection)))
messageCheckpointer.update(messages.asInstanceOf[RDD[(VertexId, A)]])
g, sendMsg, mergeMsg, Some((oldMessages, activeDirection))).cache()
activeMessages = messages.count()
i += 1
}
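
To show how the three functions in the sketch above (`vprog`, `sendMsg`, `mergeMsg`) fit together, here is a small, hedged example that calls the Pregel operator to compute single-source shortest paths. The three-edge graph, the object and method names, and the local master are assumptions chosen for illustration; only the `pregel` call itself mirrors the signature in the diff.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

object PregelShortestPaths {
  // Returns the shortest distance from sourceId to every vertex of a tiny example graph.
  def shortestPaths(sc: SparkContext, sourceId: VertexId): Map[VertexId, Double] = {
    // Hypothetical weighted edges: 1 -> 2 (1.0), 2 -> 3 (2.0), 1 -> 3 (5.0)
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 1.0), Edge(2L, 3L, 2.0), Edge(1L, 3L, 5.0)))

    // Every vertex starts at infinity except the source, which starts at 0.
    val graph = Graph.fromEdges(edges, Double.PositiveInfinity)
      .mapVertices((id, _) => if (id == sourceId) 0.0 else Double.PositiveInfinity)

    val result = graph.pregel(Double.PositiveInfinity)(
      // vprog: keep the smaller of the current distance and the incoming message
      (id, dist, newDist) => math.min(dist, newDist),
      // sendMsg: only message the destination if the path through src is shorter
      triplet =>
        if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
          Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
        else
          Iterator.empty,
      // mergeMsg: combine messages bound for the same vertex by taking the minimum
      (a, b) => math.min(a, b))

    result.vertices.collect().toMap
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("pregel-sssp-sketch").setMaster("local[2]"))
    try {
      shortestPaths(sc, 1L).toSeq.sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The loop stops once no vertex receives a message, which here happens after the distance to vertex 3 has been lowered from 5.0 to 3.0 via vertex 2.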
