[DNM] Wrapper allocator PoC #7206
base: master
New file: Cargo.toml of the wrapper-allocator crate

[package]
name = "wrapper-allocator"
description = "Wrapper allocator to control amount of memory consumed by PVF preparation process"
version.workspace = true
authors.workspace = true
edition.workspace = true

[dependencies]
tikv-jemallocator = "0.5.0"
New file: the crate's library source (the wrapper allocator implementation)
// Copyright (C) Parity Technologies (UK) Ltd.
// This file is part of Polkadot.

// Polkadot is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.

// Polkadot is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.

// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.

use core::sync::atomic::{AtomicUsize, Ordering::SeqCst};
use core::alloc::{GlobalAlloc, Layout};
use tikv_jemallocator::Jemalloc;

pub struct WrapperAllocatorData {
    allocated: AtomicUsize,
    checkpoint: AtomicUsize,
    peak: AtomicUsize,
    // limit: AtomicUsize, // Should we introduce a checkpoint limit and fail allocation if the limit is hit?
}

impl WrapperAllocatorData {
    /// Marks a new checkpoint. Returns peak allocation, in bytes, since the last checkpoint.
    pub fn checkpoint(&self) -> usize {
        let allocated = ALLOCATOR_DATA.allocated.load(SeqCst);
        let old_cp = ALLOCATOR_DATA.checkpoint.swap(allocated, SeqCst);
        ALLOCATOR_DATA.peak.swap(allocated, SeqCst).saturating_sub(old_cp)
Comment: While access to each field of the struct is atomic, the struct as a whole is not updated atomically. Either these values shouldn't depend on each other and work as plain counters or something, or the whole struct should go under a mutex. I mean, I understand this is a best effort, and thus it depends on what it is we're trying to achieve.

Reply: Yes, that's exactly the concern I'm worried about! And I'm trying to convince myself it is okay. Not sure at all. It's okay to have a global lock in the checkpoint() itself. Say we have two threads, as in your example, and they're allocating at nearly the same point in time. … Thus it seems we can avoid a global lock inside the allocation path.

Comment: Using a spinlock should probably be fine here, as the critical section is going to be very short anyway. Another, more complicated alternative that would help with thread contention would be to make this per-thread: put the state in thread-local storage and have one spinlock per TLS, and only when grabbing a checkpoint, lock them all and collate the data. I don't think it's worth it, though.
    }
}

pub static ALLOCATOR_DATA: WrapperAllocatorData = WrapperAllocatorData {
    allocated: AtomicUsize::new(0),
    checkpoint: AtomicUsize::new(0),
    peak: AtomicUsize::new(0),
};
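Not part of the diff: a rough sketch of what the spinlock alternative discussed in the thread above might look like. The type and method names (LockedAllocatorData, on_alloc) are hypothetical, and the per-thread TLS variant mentioned there is omitted; the point is only that a short critical section lets checkpoint() see the three counters as one consistent unit.

use core::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Sketch: tracking state guarded by a simple spinlock (not part of the PR).
pub struct LockedAllocatorData {
    locked: AtomicBool,
    // Kept as atomics so no `unsafe` interior mutability is needed; the
    // spinlock is what makes the three updates appear as a single unit.
    allocated: AtomicUsize,
    checkpoint: AtomicUsize,
    peak: AtomicUsize,
}

impl LockedAllocatorData {
    fn lock(&self) {
        // Spin until `locked` is flipped from false to true.
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            core::hint::spin_loop();
        }
    }

    fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }

    /// Account for `size` newly allocated bytes and update the peak.
    pub fn on_alloc(&self, size: usize) {
        self.lock();
        let new = self.allocated.load(Ordering::Relaxed) + size;
        self.allocated.store(new, Ordering::Relaxed);
        if new > self.peak.load(Ordering::Relaxed) {
            self.peak.store(new, Ordering::Relaxed);
        }
        self.unlock();
    }

    /// Marks a new checkpoint and returns the peak allocation since the previous one.
    pub fn checkpoint(&self) -> usize {
        self.lock();
        let allocated = self.allocated.load(Ordering::Relaxed);
        let old_cp = self.checkpoint.swap(allocated, Ordering::Relaxed);
        let peak = self.peak.swap(allocated, Ordering::Relaxed);
        self.unlock();
        peak.saturating_sub(old_cp)
    }
}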
struct WrapperAllocator<A: GlobalAlloc>(A);

unsafe impl<A: GlobalAlloc> GlobalAlloc for WrapperAllocator<A> {
    // SAFETY: The wrapped methods are as safe as the underlying allocator implementation is.

    #[inline]
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let old_alloc = ALLOCATOR_DATA.allocated.fetch_add(layout.size(), SeqCst);
Comment: These two lines of code could be refactored out as a method on WrapperAllocatorData.
        ALLOCATOR_DATA.peak.fetch_max(old_alloc + layout.size(), SeqCst);
        self.0.alloc(layout)
Comment: I think the memory ordering can just be Relaxed. Locks would make this slightly more deterministic, but it doesn't seem worth the cost. Say you have a situation like thread 1 calling … But I think we decided on using …

    }
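Not part of the diff: a sketch combining the two suggestions above, i.e. factoring the two counter updates into a method on WrapperAllocatorData and using the relaxed ordering mentioned in the thread. The method name track_alloc and the choice of Relaxed are assumptions, not something the PR decides.

use core::sync::atomic::Ordering::Relaxed;

impl WrapperAllocatorData {
    /// Hypothetical helper: account for `size` newly allocated bytes and
    /// update the peak. Relaxed ordering is sufficient for plain statistics
    /// counters, since no other memory accesses need to be ordered around
    /// these updates.
    #[inline]
    fn track_alloc(&self, size: usize) {
        let old_alloc = self.allocated.fetch_add(size, Relaxed);
        self.peak.fetch_max(old_alloc + size, Relaxed);
    }
}

// With that, `alloc` (and `alloc_zeroed`) shrink to:
//
//     unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
//         ALLOCATOR_DATA.track_alloc(layout.size());
//         self.0.alloc(layout)
//     }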
    #[inline]
    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        let old_alloc = ALLOCATOR_DATA.allocated.fetch_add(layout.size(), SeqCst);
        ALLOCATOR_DATA.peak.fetch_max(old_alloc + layout.size(), SeqCst);
        self.0.alloc_zeroed(layout)
    }

    #[inline]
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        ALLOCATOR_DATA.allocated.fetch_sub(layout.size(), SeqCst);
        self.0.dealloc(ptr, layout)
    }

    #[inline]
    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        if new_size > layout.size() {
            let old_alloc = ALLOCATOR_DATA.allocated.fetch_add(new_size - layout.size(), SeqCst);
            ALLOCATOR_DATA.peak.fetch_max(old_alloc + new_size - layout.size(), SeqCst);
        } else {
            ALLOCATOR_DATA.allocated.fetch_sub(layout.size() - new_size, SeqCst);
        }
Comment: The …

Reply: Yes, you're right, I'm already thinking about making them …
        self.0.realloc(ptr, layout, new_size)
    }
}

#[global_allocator]
static ALLOC: WrapperAllocator<Jemalloc> = WrapperAllocator(Jemalloc);
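Not part of the diff: a sketch of how the checkpoint API might be consumed from the preparation worker, measuring the peak allocation of one job and comparing it against a limit. It assumes the crate is pulled in as a dependency named wrapper_allocator; the limit value, the prepare_artifact placeholder and the error handling are all made up for illustration, since the PR does not wire this up yet.

use wrapper_allocator::ALLOCATOR_DATA;

// Hypothetical limit; the PR only hints at one via the commented-out `limit` field.
const PREPARATION_MEMORY_LIMIT: usize = 2 * 1024 * 1024 * 1024; // 2 GiB

fn run_prepare_job() {
    // Reset the tracking state before the job; the returned peak of the
    // previous interval is not interesting here.
    let _ = ALLOCATOR_DATA.checkpoint();

    prepare_artifact(); // placeholder for the actual preparation work

    // Peak allocation, in bytes, observed while the job was running.
    let peak = ALLOCATOR_DATA.checkpoint();
    if peak > PREPARATION_MEMORY_LIMIT {
        // Placeholder: report a memory-limit error instead of a valid artifact.
        eprintln!("preparation exceeded the memory limit: {peak} bytes");
    }
}

fn prepare_artifact() { /* placeholder */ }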
Comment: A few questions: what about get_max_rss_thread, and why isn't that enough? …
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with that is it is not deterministic enough. (Is mainly being used for gathering metrics right now.) Different kernel configurations/versions may manage memory differently (simple example is some validators may have swap enabled and some not). So they may get different values for the resident memory (how much is actually held in RAM). So some validators may reach the limit and others not.
Reply: I hope so; in the preparation phase we're not executing any untrusted code, so we just presume that the Wasmtime developers know what they're doing and won't abuse stack usage. We only worry about malicious Wasm code that could force the compiler to allocate a lot of memory.

Comment: A lot of concerns here. …
Comment: We don't have disputes for preparation, but what can happen is that the attacker gets lucky and gets the PVF through pre-checking without hitting the limits, and then the limits are actually hit when preparing for execution, causing no-shows (since we don't dispute on preparation errors). I guess we can have a lower, stricter limit for pre-checking, which we should have anyway.
Reply: Good point. We can't use RUSAGE_SELF instead of RUSAGE_THREAD because there's no way to "reset" the max from a previous job.
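Not part of the diff: for context on the last two comments, a sketch of reading the kernel-reported peak RSS via getrusage, assuming the libc crate. The existing get_max_rss_thread helper is not shown in this PR, so this is only an approximation of what it presumably does. ru_maxrss is a high-water mark maintained by the kernel for the lifetime of the thread or process, which is why the per-process value (RUSAGE_SELF) cannot be reset between jobs.

/// Returns the peak resident set size, in kilobytes on Linux, for the given
/// scope: libc::RUSAGE_THREAD (calling thread) or libc::RUSAGE_SELF (whole
/// process, which would include previous jobs' peaks).
fn max_rss_kb(scope: libc::c_int) -> Option<i64> {
    let mut usage: libc::rusage = unsafe { std::mem::zeroed() };
    if unsafe { libc::getrusage(scope, &mut usage) } == 0 {
        Some(usage.ru_maxrss as i64)
    } else {
        None
    }
}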