Graph aggregation refactoring #8082

Merged
merged 70 commits into from
May 8, 2024

Conversation

sokra
Member

@sokra sokra commented May 3, 2024

Description

  • Deletes the aggregation tree
  • Adds a new graph aggregation algorithm which is more efficient

The graph aggregation works as follows:

Every task is a node in the graph. Every parent-child relationship is an edge in the graph.

  • Every node has an "aggregation number" N.
  • There are 2 kinds of nodes: Leaf nodes and aggregating nodes.
  • If a node has N < LEAF_NUMBER, it's a leaf node, otherwise an aggregating node.
  • A higher N for a node usually means that a larger subgraph is aggregated into that node.
  • Besides normal edges, the graph aggregation uses two extra kinds of edges: upper edges and follower edges.
  • A node is considered "inner" to another node when it has an "upper" edge pointing to that node.
  • The inner node has a lower N than the upper node. (This invariant might be temporarily violated while tree balancing is scheduled but has not run yet.)
  • Aggregating nodes store an aggregated version of the state of all inner nodes and transitively inner nodes.
  • Changes in nodes are propagated to all upper nodes.
  • Every node except the root node of the graph has at least one upper node, which is more aggregated than the node itself. The root node has no upper edges.
  • An aggregating node also has follower edges. They point to the nodes that are one normal edge after all inner and transitively inner nodes.
  • A leaf node doesn't have follower edges. For all purposes the normal edges of leaf nodes are treated as follower edges.
  • Follower nodes have a higher N than the origin node. (This invariant might be temporarily violated while tree balancing is scheduled but has not run yet.)
  • This means larger and larger subgraphs are aggregated.
  • Graph operations will ensure that these invariants (Higher N on upper and follower edges) are not violated.
  • The N of a node can only increase, so graph operations need to "fix" the invariants either by increasing N or by changing upper/follower edges. The latter is preferred. N is usually only increased if two nodes have equal N.
  • When new edges between leaf nodes are added, the target node's N is increased to the origin node's N + 4 if it's smaller. This adds a small tolerance range so increasing N doesn't cause long chains of N += 1 between leaf nodes.
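The rules above can be sketched in a few lines of Rust. This is only an illustration, not the code from this PR: the value of `LEAF_NUMBER` and all function names are hypothetical, and the sketch assumes that "if it's smaller" means "whenever the target's N is not already higher than the origin's N", which follows from the follower-edge invariant but is not spelled out in the description.

```rust
/// Hypothetical threshold: the description names `LEAF_NUMBER` but not its value.
const LEAF_NUMBER: u32 = 16;

#[derive(Debug, PartialEq)]
enum NodeKind {
    Leaf,
    Aggregating,
}

/// A node with N < LEAF_NUMBER is a leaf node, otherwise an aggregating node.
fn node_kind(n: u32) -> NodeKind {
    if n < LEAF_NUMBER {
        NodeKind::Leaf
    } else {
        NodeKind::Aggregating
    }
}

/// Invariant for upper and follower edges: the node the edge points to
/// has a strictly higher N than the node the edge starts from.
fn edge_invariant_holds(origin_n: u32, target_n: u32) -> bool {
    target_n > origin_n
}

/// New edge between two leaf nodes: if the invariant would be violated,
/// raise the target's N to the origin's N + 4 (assumed reading, see above).
/// N never decreases.
fn target_n_after_new_leaf_edge(origin_n: u32, target_n: u32) -> u32 {
    if edge_invariant_holds(origin_n, target_n) {
        target_n
    } else {
        origin_n + 4
    }
}
```

Under this reading, once a target has been raised to the origin's N + 4, the origin's N can grow by up to 3 before the target has to be touched again, which is the tolerance the last bullet describes.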

Testing Instructions

Closes PACK-3036

Contributor

github-actions bot commented May 3, 2024

🟢 Turbopack Benchmark CI successful 🟢

Thanks

Contributor

github-actions bot commented May 3, 2024

✅ This change can build next-swc

Contributor

github-actions bot commented May 3, 2024

⚠️ CI failed ⚠️

The following steps have failed in CI:

  • Turbopack Rust tests (mac/win, non-blocking)

See workflow summary for details

@sokra sokra marked this pull request as ready for review May 3, 2024 15:57
@sokra sokra requested a review from a team as a code owner May 3, 2024 15:57
@sokra sokra force-pushed the sokra/aggregation-refactor branch from 40c155f to 759d075 Compare May 6, 2024 16:27
@sokra sokra force-pushed the sokra/aggregation-refactor branch from 7317f43 to 31eb100 Compare May 7, 2024 06:07
@sokra sokra force-pushed the sokra/aggregation-refactor branch from eedf18b to 4d25ad5 Compare May 7, 2024 16:31
let count = extra_followers + extra_uppers;
let target = ctx.node(target_id);
if is_in_progress(ctx, upper_id) {
    drop(target);
Contributor

What is the significance of borrowing before the branch and then dropping?

Member Author

We need to check the in-progress flag (which is an atomic) while holding either the target lock or the upper lock. The flag is only set while the target lock is held, so when we read it, it needs to be under the target lock. If it is not in progress, we can continue working with the target in the else branch.

Otherwise we want to enqueue our work to the upper node, so we acquire the upper lock. In the meantime the in_progress flag might have changed, so we need to check it again.
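A minimal sketch of that check, lock, re-check sequence, using plain `std::sync` types rather than the PR's `ctx` API. The `Target` and `Upper` structs, the `apply_or_enqueue` function, and the choice to retry when the flag has been cleared in the meantime are all assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical upper node: an atomic in-progress flag plus a queue of deferred work.
struct Upper {
    in_progress: AtomicBool,
    queue: Mutex<Vec<u32>>,
}

/// Hypothetical target node: some state guarded by the target lock.
struct Target {
    state: Mutex<u32>,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    AppliedToTarget,
    EnqueuedOnUpper,
}

fn apply_or_enqueue(target: &Target, upper: &Upper, count: u32) -> Outcome {
    loop {
        // Read the flag while holding the target lock.
        let mut target_guard = target.state.lock().unwrap();
        if !upper.in_progress.load(Ordering::Acquire) {
            // Not in progress: keep working with the target directly.
            *target_guard += count;
            return Outcome::AppliedToTarget;
        }
        // In progress: release the target lock before taking the upper lock.
        drop(target_guard);
        let mut queue = upper.queue.lock().unwrap();
        // The flag may have changed between the two locks, so check it again.
        if upper.in_progress.load(Ordering::Acquire) {
            queue.push(count);
            return Outcome::EnqueuedOnUpper;
        }
        // Flag was cleared in the meantime: retry (assumed behavior).
    }
}
```

Borrowing the target before the branch is what makes the first read of the flag happen under the target lock; the explicit `drop` releases that lock before the upper lock is taken.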

Co-authored-by: Alexander Lyon <arlyon@me.com>
@ForsakenHarmony
Contributor

Can we maybe add a Markdown doc next to the code describing the overall approach (or just copy the PR description in there)?

@sokra
Member Author

sokra commented May 8, 2024

> Can we maybe add a Markdown doc next to the code describing the overall approach (or just copy the PR description in there)?

I'll add this in a follow-up PR

@sokra sokra merged commit adfb599 into main May 8, 2024
46 of 47 checks passed
@sokra sokra deleted the sokra/aggregation-refactor branch May 8, 2024 18:01
sokra added a commit to vercel/next.js that referenced this pull request May 8, 2024
* vercel/turborepo#8082 <!-- Tobias Koppers - Graph
aggregation refactoring -->
Neosoulink pushed a commit to Neosoulink/turbo that referenced this pull request Jun 14, 2024
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Jul 25, 2024
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Jul 29, 2024
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Jul 29, 2024
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Aug 1, 2024
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Aug 16, 2024
* vercel/turborepo#8082 <!-- Tobias Koppers - Graph
aggregation refactoring -->
3 participants