Improve Garbage Collection #8154
Conversation
I didn't get through the whole review, but I left a few comments: some suggestions, some questions, and some observations.
pub fn run_gc(
    &self,
    idle: bool,
    _turbo_tasks: &dyn TurboTasksBackendApi<MemoryBackend>,
Can this argument be removed?
Because this is such a huge change, it would be nice if we could include some high-level benchmarks alongside it to make sure we're not significantly regressing on peak memory usage or CPU.
This all seems a bit different from generational GC as I'm familiar with it (in the context of tracing GCs). Maybe the terminology "generation" being used in a different context is what's confusing to me, though it's not wrong as a description of what this is. This seems more like a bucketed LRU cache.
In my understanding of generational GC:
- Objects in the older generation are less likely to be collected / iterated over. This seems like the inverse of that.
- Older objects get moved to a separate tier (possibly one of many tiers) of "survivors" that are less frequently traversed. We don't have any sort of logic for that.
That leads me to a few thoughts on potential future ways to improve this:
- Consider a "segmented LRU", which shares some similarities to generational tracing GCs: https://memcached.org/blog/modern-lru/
- There are edge cases where LRU cache eviction can severely degrade if the cache is too small to contain all frequently accessed items. If we think this is a potential concern, we could add some amount of randomization to cache eviction, which would lead to more graceful degradation.
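To make the randomization suggestion concrete, here is one hypothetical shape it could take (the function name and the `jitter` knob are invented for this sketch, not part of the PR): blend a uniform random factor into the eviction score so that, when the cache is too small for the working set, evictions spread across entries instead of repeatedly hitting the same hot set.

```rust
// Hypothetical sketch: `jitter` in [0, 1] controls how much randomness
// is mixed into the eviction score; `rand01` is a uniform sample in
// [0, 1) supplied by the caller. jitter = 0.0 reproduces the
// deterministic score exactly, jitter = 1.0 is fully random.
fn randomized_score(score: f64, jitter: f64, rand01: f64) -> f64 {
    score * ((1.0 - jitter) + jitter * rand01)
}
```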
Yep, that's true. We don't write a real GC in the sense of collecting unreferenced memory, as "unreferenced" doesn't exist in our system. We write a cache, where any cache entry might be accessed anytime in the future, but we can also evict any cache entry anytime without hurting correctness (only performance). So we want the opposite of a generational GC: we want older cache entries to be more likely to be evicted. But we also want the memory usage and compute time of cache entries to influence the eviction behavior. So we bucket cache entries into buckets of (currently) 100,000 items. We call them generations. When under memory pressure, we start processing the oldest generation. We select the 30% of cache entries with the highest GC priority and collect them. The remaining 70% of cache entries are pushed into the bucket of the freshest generation, so they are intermixed with that generation. Those entries will be reconsidered for GC once we have cycled through all old generations. There is a maximum of (currently) 200,000 items per bucket. Buckets are split evenly into two buckets when they get too full. Using buckets for age is kind of nice as it avoids having to include age (which is continuously increasing) in the priority, and it also avoids having to sort all cache items into a very big priority queue. We basically don't have to sort anything until GC is invoked, and then we only have to sort a "small" bucket of items.
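The GC priority mentioned here is spelled out in the PR description as `(memory_usage + C1) / (compute_duration + C2)`. A minimal sketch of that calculation, with invented values for the tuning constants (the real C1/C2 are not given here):

```rust
// C1 and C2 are tuning constants per the description; these concrete
// values and units are invented for the example.
const C1: u64 = 1_024; // softens the memory term (bytes)
const C2: u64 = 1_000; // softens the compute term (microseconds)

/// Higher score => better eviction candidate: a large footprint that
/// was cheap to compute is evicted before a small, expensive one.
fn gc_priority(memory_usage: u64, compute_duration_us: u64) -> f64 {
    (memory_usage + C1) as f64 / (compute_duration_us + C2) as f64
}
```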
Description
Simplify and improve GC
This improves the GC queue.
The job of the GC queue is to find tasks that should be garbage collected. There are three factors which influence that:
- Age of the task: time since last access.
- Memory usage of the task.
- Compute duration of the task: CPU time spent to compute the task.
Memory usage and compute duration combine into a GC priority by calculating:
`(memory_usage + C1) / (compute_duration + C2)`
C1 and C2 are constants to fine-tune the priority. The age of the task is constantly changing, so a different scheme is used for it:
Every task has a generation in which it was last accessed.
The generation is increased every 100,000 tasks.
We accumulate tasks in the current generation in a concurrent queue. Once 100,000 tasks are reached (atomic counter), we increase the generation and pop 100,000 tasks from the queue into an `OldGeneration`. These old generations are stored in another queue. No sorting is applied so far; these are just lists of task ids.
Once we need to perform GC, we pop the oldest old generation from the queue, filter out all tasks that are in a higher generation (they have been accessed in the meantime), and sort the list by GC priority.
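The accumulate-and-seal step described above (atomic counter, generation bump every 100,000 tasks) might look roughly like this; the threshold is shrunk here for readability, and the plain counter reset is a simplification that the real concurrent code has to handle more carefully:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// 100,000 in the PR; tiny here so the example is easy to follow.
const GENERATION_SIZE: usize = 4;

struct GenerationCounter {
    count: AtomicUsize,      // tasks accumulated in the current generation
    generation: AtomicUsize, // index of the current generation
}

impl GenerationCounter {
    /// Record one task entering the current generation. Returns
    /// Some(sealed_generation) when the threshold is reached and the
    /// generation number is bumped.
    fn record(&self) -> Option<usize> {
        if self.count.fetch_add(1, Ordering::Relaxed) + 1 == GENERATION_SIZE {
            self.count.store(0, Ordering::Relaxed);
            Some(self.generation.fetch_add(1, Ordering::Relaxed))
        } else {
            None
        }
    }
}
```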
Then we take the top 30% of tasks and garbage collect them.
The remaining tasks are pushed to the front of the queue again, intermixed with other tasks into existing old generations, until we reach a maximum of 200,000 tasks in a generation item. In that case, the generation item is split evenly into two items.
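Putting the eviction cycle together, a simplified sketch under stated assumptions: the 30% eviction ratio and the split-when-too-full rule come from this description, survivors join the freshest bucket as described in the review discussion, and all names, the priority closure, and the tiny bucket limit (200,000 in the PR) are invented for the example.

```rust
use std::collections::VecDeque;

type TaskId = u32;

// 200,000 in the PR; tiny here so the example is easy to follow.
const MAX_BUCKET: usize = 8;

struct GcQueue {
    generations: VecDeque<Vec<TaskId>>, // front = oldest
}

impl GcQueue {
    /// Pop the oldest generation, evict the 30% of tasks with the
    /// highest GC priority, and push the survivors back so they are
    /// reconsidered after a full cycle. Returns the evicted task ids.
    fn collect(&mut self, priority: impl Fn(TaskId) -> f64) -> Vec<TaskId> {
        let mut oldest = match self.generations.pop_front() {
            Some(g) => g,
            None => return Vec::new(),
        };
        // Highest GC priority first (large memory, cheap to recompute).
        oldest.sort_by(|a, b| priority(*b).partial_cmp(&priority(*a)).unwrap());
        let survivors = oldest.split_off(oldest.len() * 30 / 100);
        self.push_survivors(survivors);
        oldest
    }

    /// Intermix survivors with the freshest generation, splitting the
    /// bucket evenly in two if it grows past MAX_BUCKET.
    fn push_survivors(&mut self, survivors: Vec<TaskId>) {
        match self.generations.back_mut() {
            Some(bucket) => bucket.extend(survivors),
            None => self.generations.push_back(survivors),
        }
        let needs_split = self.generations.back().map_or(false, |b| b.len() > MAX_BUCKET);
        if needs_split {
            let bucket = self.generations.back_mut().unwrap();
            let upper = bucket.split_off(bucket.len() / 2);
            self.generations.push_back(upper);
        }
    }
}
```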
Testing Instructions