Skip to content

Commit

Permalink
[Skymeld] Perform ALV collection in the same FJP that runs BuildDrive…
Browse files Browse the repository at this point in the history
…rFunction.

Prior to this CL, we created a separate FJP ("find-action-lookup-values-in-build")
and use `invoke(ForkJoinTask)` to start a new traversal. This would cause a
StackOverflowError where the stack trace didn't make sense:

```
	at com.google.devtools.build.lib.skyframe.SkyframeExecutor$TransitiveActionLookupKeysCollector.collect(SkyframeExecutor.java:3910)
	at com.google.devtools.build.lib.skyframe.SkyframeExecutor.collectTransitiveActionLookupValuesOfKey(SkyframeExecutor.java:3476)
	at com.google.devtools.build.lib.skyframe.SkyframeExecutor$1.collect(SkyframeExecutor.java:736)
	at com.google.devtools.build.lib.skyframe.BuildDriverFunction.checkActionConflicts(BuildDriverFunction.java:566)
	at com.google.devtools.build.lib.skyframe.BuildDriverFunction.compute(BuildDriverFunction.java:188)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:506)
	at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:399)
	at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1426)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool.awaitJoin(ForkJoinPool.java:1708)
	at java.base/java.util.concurrent.ForkJoinTask.doJoin(ForkJoinTask.java:397)
	at java.base/java.util.concurrent.ForkJoinTask.join(ForkJoinTask.java:721)
	at java.base/java.util.concurrent.ForkJoinPool.invoke(ForkJoinPool.java:2423)
	at com.google.devtools.build.lib.skyframe.SkyframeExecutor$TransitiveActionLookupKeysCollector.collect(SkyframeExecutor.java:3910)
	at com.google.devtools.build.lib.skyframe.SkyframeExecutor.collectTransitiveActionLookupValuesOfKey(SkyframeExecutor.java:3476)
	at com.google.devtools.build.lib.skyframe.SkyframeExecutor$1.collect(SkyframeExecutor.java:736)
	at com.google.devtools.build.lib.skyframe.BuildDriverFunction.checkActionConflicts(BuildDriverFunction.java:566)
	at com.google.devtools.build.lib.skyframe.BuildDriverFunction.compute(BuildDriverFunction.java:188)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:506)
	at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:399)
	at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1426)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool.awaitJoin(ForkJoinPool.java:1708)
	at java.base/java.util.concurrent.ForkJoinTask.doJoin(ForkJoinTask.java:397)
	at java.base/java.util.concurrent.ForkJoinTask.join(ForkJoinTask.java:721)
	at java.base/java.util.concurrent.ForkJoinPool.invoke(ForkJoinPool.java:2423)
```

There's no recursion expected that would cause this stack trace. Turns out, this
odd behavior was the result of using `invoke(ForkJoinTask)` in the wrong context.
It's only supposed to be used from non fork/join clients [1]. The internal
implementation of FJP also just checks if the current thread is a FJP, but not
which pool, so it's possible that this caused this undefined behavior.

This CL changed that by running the traversal directly in the same CPU-heavy
thread pool that runs the BuildDriverFunction, instead of spinning up an extra FJP for that. This makes sense also from a performance perspective because this traversal is CPU-bound.

[1] https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ForkJoinPool.html

PiperOrigin-RevId: 540546910
Change-Id: Ia1bc573ff350b0a803ec96d710e4d9696eb49421
  • Loading branch information
joeleba authored and copybara-github committed Jun 15, 2023
1 parent 7053947 commit 11affc3
Showing 1 changed file with 12 additions and 14 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -2283,7 +2283,6 @@ public void resetBuildDriverFunction() {
/** Resets the incremental artifact conflict finder to ensure incremental correctness. */
public void resetIncrementalArtifactConflictFindingStates() {
incrementalArtifactConflictFinder.shutdown();
incrementalTransitiveActionLookupKeysCollector.shutdown();
incrementalArtifactConflictFinder =
IncrementalArtifactConflictFinder.createWithActionGraph(
new MapBasedActionGraph(actionKeyContext));
Expand Down Expand Up @@ -3567,9 +3566,6 @@ private ClaimedLookupValueSentinel() {}
* action conflicts.
*/
private static class TransitiveActionLookupKeysCollector {
final ForkJoinPool executorService =
NamedForkJoinPool.newNamedPool(
"find-action-lookup-values-in-build", Runtime.getRuntime().availableProcessors());
private final WalkableGraph walkableGraph;
private final Set<ActionLookupKey> globalVisitedSet;

Expand All @@ -3585,6 +3581,9 @@ private TransitiveActionLookupKeysCollector(
*/
private Map<ActionLookupKey, SkyValue> collect(Iterable<ActionLookupKeyOrProxy> visitationRoots)
throws InterruptedException {
ForkJoinPool executorService =
NamedForkJoinPool.newNamedPool(
"find-action-lookup-values-in-build", Runtime.getRuntime().availableProcessors());
var collected = new ConcurrentHashMap<ActionLookupKey, SkyValue>();
List<Future<?>> futures = Lists.newArrayListWithCapacity(Iterables.size(visitationRoots));
for (ActionLookupKeyOrProxy keyOrProxy : visitationRoots) {
Expand All @@ -3601,7 +3600,10 @@ private Map<ActionLookupKey, SkyValue> collect(Iterable<ActionLookupKeyOrProxy>
} catch (ExecutionException e) {
throw new IllegalStateException("Error collecting transitive ActionLookupValues", e);
} finally {
shutdown();
if (!executorService.isShutdown() && ExecutorUtil.interruptibleShutdown(executorService)) {
// Preserve the interrupt status.
Thread.currentThread().interrupt();
}
}
}

Expand All @@ -3613,15 +3615,18 @@ private Map<ActionLookupKey, SkyValue> collect(Iterable<ActionLookupKeyOrProxy>
* elements in the transitive closure of the visitationRoot.
*
* <p>This method is the Skymeld implementation. Crucially, this differs from the non-Skymeld
* implementation in that it does not shutdown the executor.
* implementation in that it is launched from a Skyframe thread, likely belongs to the CPU heavy
* thread pool, and would queue up recursive tasks in this pool as well. No extra
* ExecutorService is created here.
*/
private ImmutableMap<ActionLookupKey, SkyValue> collect(ActionLookupKeyOrProxy visitationRoot) {
var collected = new ConcurrentHashMap<ActionLookupKey, SkyValue>();
ActionLookupKey key = visitationRoot.toKey();
if (!tryClaimVisitation(key, collected)) {
return ImmutableMap.of();
}
executorService.invoke(new VisitActionLookupKey(key, collected));
// Invoke the recursive task in the same FJP.
new VisitActionLookupKey(key, collected).invoke();
return ImmutableMap.copyOf(collected);
}

Expand All @@ -3637,13 +3642,6 @@ private boolean tryClaimVisitation(
&& collected.putIfAbsent(key, ClaimedLookupValueSentinel.INSTANCE) == null;
}

private void shutdown() {
if (!executorService.isShutdown() && ExecutorUtil.interruptibleShutdown(executorService)) {
// Preserve the interrupt status.
Thread.currentThread().interrupt();
}
}

protected final class VisitActionLookupKey extends RecursiveAction {
private final ActionLookupKey key;
private final ConcurrentHashMap<ActionLookupKey, SkyValue> collected;
Expand Down

0 comments on commit 11affc3

Please sign in to comment.