[PROF-9476] Managed string storage for interning over several profiles #607
Conversation
Benchmarks
Comparison
Benchmark execution time: 2024-09-02 17:22:16
Comparing candidate commit 93f84f5 in PR branch
Found 33 performance improvements and 5 performance regressions! Performance is the same for 3 metrics, 2 unstable metrics.
scenario:credit_card/is_card_number/
scenario:credit_card/is_card_number/ 3782-8224-6310-005
scenario:credit_card/is_card_number/ 378282246310005
scenario:credit_card/is_card_number/37828224631
scenario:credit_card/is_card_number/378282246310005
scenario:credit_card/is_card_number/37828224631000521389798
scenario:credit_card/is_card_number/x371413321323331
scenario:credit_card/is_card_number_no_luhn/
scenario:credit_card/is_card_number_no_luhn/ 3782-8224-6310-005
scenario:credit_card/is_card_number_no_luhn/ 378282246310005
scenario:credit_card/is_card_number_no_luhn/37828224631
scenario:credit_card/is_card_number_no_luhn/378282246310005
scenario:credit_card/is_card_number_no_luhn/37828224631000521389798
scenario:credit_card/is_card_number_no_luhn/x371413321323331
scenario:normalization/normalize_name/normalize_name/Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Long-.Too-Lo...
scenario:normalization/normalize_name/normalize_name/bad-name
scenario:normalization/normalize_name/normalize_name/good
scenario:normalization/normalize_service/normalize_service/A0000000000000000000000000000000000000000000000000...
scenario:normalization/normalize_service/normalize_service/Data🐨dog🐶 繋がっ⛰てて
scenario:normalization/normalize_service/normalize_service/Test Conversion 0f Weird !@#$%^&**() Characters
scenario:normalization/normalize_service/normalize_service/[empty string]
scenario:normalization/normalize_service/normalize_service/test_ASCII
scenario:normalization/normalize_trace/test_trace
scenario:redis/obfuscate_redis_string
scenario:sql/obfuscate_sql_string
scenario:tags/replace_trace_tags
Candidate benchmark details: Groups 1-10 (collapsed; not shown here).
Baseline: omitted due to size.
This looks quite nice!
The blast radius is not very big, and to be honest, 99% of it seems to be the change to make APIs take either an id or a string.
I think if we can solve that one, then the diff for this feature would be quite small, plus it'll be very close to zero cost for libdatadog clients that aren't using it, which seems like a really nice property.
#[derive(Copy, Clone, Default, Debug)]
pub struct Label<'a> {
    pub key: CharSlice<'a>,
    pub key_id: u32,

    /// At most one of the following must be present
    pub str: CharSlice<'a>,
    pub str_id: u32,
    pub num: i64,
Maybe not for the PoC, but it occurs to me that perhaps we could introduce a new type that can represent either a charslice or an interned string:
struct {
    uint8_t kind; // discriminates which member of the union is active
    union {
        ddog_Slice_CChar slice;
        uint32_t id;
    };
}
We could tweak the ddog_Slice_CChar definition to even make sure that the final structure/union is the same size as a char slice is today (as it's kinda wasteful -- a ptr + a whole 64-bit length for the string, which we'll never need).
(It's less clear to me if this nicety would be only for the ffi or if it would make sense to propagate to the rust-level API...)
/// # Arguments
/// * `sample_types`
/// * `period` - Optional period of the profile. Passing None/null translates to zero values.
/// * `start_time` - Optional time the profile started at. Passing None/null will use the current
///   time.
Minor: Missing the new argument ;)
@@ -38,6 +41,7 @@ pub struct Profile {
    stack_traces: FxIndexSet<StackTrace>,
    start_time: SystemTime,
    strings: StringTable,
    string_storage: Option<Rc<RwLock<ManagedStringStorage>>>,
Not for the first version, but my concurrency background does immediately make me think that perhaps we could get better performance by having something more specific than just wrapping this whole thing with a lock.
For instance: if we give up concurrent writing and assume only a single writer, we may be able to do more of a transaction-like thing, where the writer can do whatever it wants, and the reader needs to do a few reads and may "abort" and restart if it was concurrent with a writer that mutated something it wanted.
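As a very rough sketch of that reader-retry idea (purely illustrative, not proposed code: a real version would need UnsafeCell/unsafe and much more careful memory ordering), something seqlock-like:

use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical wrapper: the single writer bumps `version` to an odd value
// before mutating and to an even value afterwards; readers retry when they
// observe an odd value or a change while reading.
struct Versioned<T> {
    version: AtomicU64,
    data: T, // a real implementation would need UnsafeCell here
}

impl<T> Versioned<T> {
    fn read<R>(&self, f: impl Fn(&T) -> R) -> R {
        loop {
            let before = self.version.load(Ordering::Acquire);
            if before % 2 == 1 {
                continue; // writer in progress, retry
            }
            let result = f(&self.data);
            if self.version.load(Ordering::Acquire) == before {
                return result; // no concurrent write observed
            }
            // a write raced with this read; discard the result and retry
        }
    }
}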
pub unsafe extern "C" fn ddog_prof_ManagedStringStorage_intern(
    storage: ManagedStringStorage,
    string: Option<&CharSlice>,
) -> ManagedStringStorageInternResult {
    (|| {
        let storage = get_inner_string_storage(storage, true)?;

        let string: &CharSlice = string.expect("non null string");
        let string: &str = CStr::from_ptr(string.as_ptr())
            .to_str()
            .expect("valid utf8 string");

        let string_id = storage
            .write()
            .expect("acquisition of write lock on string storage should succeed")
            .intern(string);

        anyhow::Ok(string_id)
    })()
    .context("ddog_prof_ManagedStringStorage_intern failed")
    .into()
}

#[must_use]
#[no_mangle]
pub unsafe extern "C" fn ddog_prof_ManagedStringStorage_unintern(
    storage: ManagedStringStorage,
    id: u32,
) -> ManagedStringStorageResult {
    (|| {
        let storage = get_inner_string_storage(storage, true)?;
        storage
            .read()
            .expect("acquisition of read lock on string storage should succeed")
            .unintern(id);
        anyhow::Ok(())
    })()
    .context("ddog_prof_ManagedStringStorage_unintern failed")
    .into()
}
I wonder if it would make sense to have versions of these methods that take in arrays of strings/indexes as an optimization.
Specifically, thus far most API calls we did to libdatadog on the "hot path" are big, heavy things, and we do a very small amount of them, so any overhead in ffi and whatnot is quite small.
For interning, we'll often be interning multiple values in a row (or even just pairs -- function name and filename), so the caller side could cheaply stack-allocate an array, put all the strings in there, and feed them all to libdatadog in one step.
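Sketching what such a batch entry point might look like (the name, signature, and error message are hypothetical, not part of this PR; it reuses CharSlice, get_inner_string_storage, and ManagedStringStorageResult from the functions above):

#[must_use]
#[no_mangle]
pub unsafe extern "C" fn ddog_prof_ManagedStringStorage_intern_all(
    storage: ManagedStringStorage,
    strings: *const CharSlice,
    count: usize,
    out_ids: *mut u32, // caller-provided array of at least `count` entries
) -> ManagedStringStorageResult {
    (|| {
        let storage = get_inner_string_storage(storage, true)?;
        let mut guard = storage
            .write()
            .map_err(|_| anyhow::anyhow!("string storage lock poisoned"))?;
        for i in 0..count {
            let s: &str = CStr::from_ptr((*strings.add(i)).as_ptr()).to_str()?;
            *out_ids.add(i) = guard.intern(s);
        }
        anyhow::Ok(())
    })()
    .context("ddog_prof_ManagedStringStorage_intern_all failed")
    .into()
}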
if function.name_id > 0 || function.system_name_id > 0 || function.filename_id > 0 {
    name = self.resolve(function.name_id);
    system_name = self.resolve(function.system_name_id);
    filename = self.resolve(function.filename_id);
} else {
    name = self.intern(function.name);
    system_name = self.intern(function.system_name);
    filename = self.intern(function.filename);
}
I guess seeing this pattern of "id or not id" in the code makes me "double down" on thinking that it would be nice if we could have a string representation that could either be a string or a pre-interned id, so that most of these functions would not need any change (the change would live in intern or something like that).
(It would also solve the problem that right now the code forces us to either always use ids or never use them, but doesn't really barf if we try to mix them up; it will just silently do the wrong thing.)
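A rough Rust-level sketch of that idea (the type and method names are hypothetical, not from the PR):

/// Either a not-yet-interned string or an id previously interned in the
/// managed string storage.
pub enum StringRef<'a> {
    Raw(&'a str),
    Interned(u32),
}

impl<'a> StringRef<'a> {
    /// Resolve to a profile-local id using caller-supplied closures standing
    /// in for the `intern`/`resolve` methods from the snippet above.
    pub fn to_string_id<Id>(
        self,
        intern: impl FnOnce(&str) -> Id,
        resolve: impl FnOnce(u32) -> Id,
    ) -> Id {
        match self {
            StringRef::Raw(s) => intern(s),
            StringRef::Interned(id) => resolve(id),
        }
    }
}

With something like this, each field would carry a single StringRef instead of a (string, id) pair, and mixing the two modes per field becomes impossible by construction.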
fn get_string(&self, id: StringId) -> Rc<str> {
    self.set
        .get_index(id.to_offset())
        .expect("StringId to have a valid interned index")
        .clone()
}
Minor: We should probably return an error (or option) to the caller, rather than blowing up if an id is incorrect
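For example, a minimal non-panicking variant (a sketch, assuming the same set field as in the snippet above):

fn get_string(&self, id: StringId) -> Option<Rc<str>> {
    self.set.get_index(id.to_offset()).cloned()
}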
pub fn unintern(&self, id: u32) {
    let data = self.get_data(id);
    let usage_count = &data.usage_count;
    usage_count.set(usage_count.get() - 1);
}
Minor: Should we immediately clean the entry if usage_count gets below zero?
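A sketch of what eager cleanup might look like (hypothetical: it assumes the id_to_data/str_to_id maps used in the retain snippet below, takes &mut self instead of &self, and ignores the generation bookkeeping):

pub fn unintern(&mut self, id: u32) {
    let remaining = {
        let data = self.get_data(id);
        let count = data.usage_count.get() - 1;
        data.usage_count.set(count);
        count
    };
    if remaining == 0 {
        // Drop the entry as soon as it becomes unused instead of waiting
        // for the next retain() sweep.
        if let Some(data) = self.id_to_data.remove(&id) {
            self.str_to_id.remove(&data.str);
        }
    }
}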
self.id_to_data.retain(|_, data| {
    /*let used_in_this_gen = data
        .last_usage_gen
        .get()
        .map_or(false, |last_usage_gen| last_usage_gen == self.current_gen);
    if used_in_this_gen || *id == 0 {
        // Empty string (id = 0) or anything that was used in the gen
        // we are now closing, is kept alive
        return true;
    }*/
    if data.usage_count.get() > 0 {
        return true;
    }

    self.str_to_id.remove_entry(&data.str);
    false
});
I don't have a great suggestion for this one, but I'll point out that this may get a bit expensive, especially as we accumulate strings.
(GC-like generational storage anyone? 🔥 Half-kidding, half-not-kidding ;D)
For the PoC it seems fine, but yeah I think once we want to merge this in, it's worth writing down a bit of text to explain what all the weirdness going on in here is doing.
(I was able to follow it along with a lot of context, but I think it's going to be hard to wrap our heads around this once this gets in)
storage
    .write()
    .expect("acquisition of write lock on string storage should succeed")
I think expect panics on error; since we already have a structure for returning an error from this function, I think it's a bit nicer to just return an error ;)
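i.e., something like (the message text is just an example):

let string_id = storage
    .write()
    .map_err(|_| anyhow::anyhow!("string storage lock was poisoned"))?
    .intern(string);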
fn intern(&mut self, item: Rc<str>) -> StringId {
    // For performance, delay converting the [&str] to a [String] until
    // after it has been determined to not exist in the set. This avoids
    // temporary allocations.
    let index = match self.set.get_index_of(&item) {
        Some(index) => index,
        None => {
            let (index, _inserted) = self.set.insert_full(item.clone());
            // This wouldn't make any sense; the item couldn't be found so
            // we try to insert it, but suddenly it exists now?
            debug_assert!(_inserted);
            index
        }
    };
    StringId::from_offset(index)
}
It occurred to me that while ideally all this structure would be lock-free and whatnot, inserting and claiming an index is actually the part that would be really nice to be able to do in a non-blocking manner.
I was googling and didn't find a lot of solid, production-ready pre-existing data structure implementations with the kinds of characteristics we'd want (which is super disappointing -- I miss you, java.util.concurrent), but yeah, it may be worth keeping an eye out in case we find something.
Maybe https://crates.io/crates/dashmap ? But we'd need to solve the index assignment (easy if we don't do index reuse, really annoying if we want to do that)
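A rough sketch of what that could look like with dashmap and no index reuse (crate usage and names are illustrative, not a concrete proposal):

use std::sync::Arc;
use std::sync::atomic::{AtomicU32, Ordering};

use dashmap::DashMap;

// Ids are handed out monotonically and never reused, which sidesteps the
// index-assignment problem at the cost of never reclaiming ids.
struct ConcurrentInterner {
    ids: DashMap<Arc<str>, u32>,
    next_id: AtomicU32,
}

impl ConcurrentInterner {
    fn intern(&self, s: &str) -> u32 {
        if let Some(id) = self.ids.get(s) {
            return *id;
        }
        // entry() makes the check-then-insert race-free per key, so two
        // threads interning the same new string end up with the same id.
        *self
            .ids
            .entry(Arc::from(s))
            .or_insert_with(|| self.next_id.fetch_add(1, Ordering::Relaxed))
    }
}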
Replaced by #725
What does this PR do?
A brief description of the change being made with this pull request.
Motivation
What inspired you to submit this pull request?
Additional Notes
Anything else we should know when reviewing?
How to test the change?
Describe here in detail how the change can be validated.