-
Notifications
You must be signed in to change notification settings - Fork 4.5k
Add push and get methods for RestartLastVotedForkSlots #33613
Add push and get methods for RestartLastVotedForkSlots #33613
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #33613 +/- ##
========================================
Coverage 81.9% 81.9%
========================================
Files 816 816
Lines 219753 219911 +158
========================================
+ Hits 180001 180169 +168
+ Misses 39752 39742 -10 |
gossip/src/cluster_info.rs
Outdated
let mut last_voted_fork_slots = RestartLastVotedForkSlots::new( | ||
self.id(), | ||
now, | ||
last_vote_bankhash, | ||
self.my_shred_version(), | ||
); | ||
last_voted_fork_slots.fill(update); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this 2 stage initialization?
Can't new
fill out the slots as well?
also, the update logic is quite different from push_epoch_slots
; i.e the existing value in gossip crds is entirely discarded. is that intentional? if so then RestartLastVotedForkSlots
api could be quite simpler (e.g. no need for 2 stage initialization).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we only have one packet per validator, yes the value with newer timestamp should always replace the old one.
The 2 stage initialization was still there because we need self.max_compressed_slot_size(), which can't be called in a constructor. But maybe we can approximate assuming header etc is around 100+?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this initialization, max_compressed_slot_size
is always called on a RestartLastVotedForkSlots
with empty slots
. So it is a constant number. You can define it as a const
and then define a test which validates the const
value. Similar to:
https://github.com/solana-labs/solana/blob/09e858d93/gossip/src/cluster_info.rs#L3442-L3448
Also looking at the code, when there are too many slots, fill
is taking the initial set of slots and ignoring the remaining ones.
You probably want to do the opposite; i.e. take the last slots, not the first slots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, actually I don't care too much about time complexity here, because it is only called once on start. So I can try to pack in as many slots as possible into one packet. Changed.
gossip/src/crds_value.rs
Outdated
last_voted_fork.sort(); | ||
let mut compressed_slots = CompressedSlots::new(1); | ||
let fork_length = last_voted_fork.len(); | ||
let last_voted_slot = last_voted_fork.last(); | ||
for slice_length in | ||
max(1, min(fork_length, MAX_RESTART_LAST_VOTED_FORK_SLOTS_SPACE))..fork_length + 1 | ||
{ | ||
let mut new_slots = CompressedSlots::new(slice_length); | ||
let new_slice = &last_voted_fork[fork_length.saturating_sub(slice_length)..]; | ||
new_slots.add(new_slice); | ||
let _ = new_slots.deflate(); | ||
if last_voted_slot.unwrap() - new_slots.first_slot() < MAX_SLOTS_PER_ENTRY as u64 | ||
&& serialized_size(&new_slots).unwrap() | ||
< MAX_RESTART_LAST_VOTED_FORK_SLOTS_SPACE as u64 | ||
{ | ||
compressed_slots = new_slots; | ||
} else { | ||
break; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now pretty confusing and hacky. Can we do something better in here?
I think part of the complexity is because of using CompressedSlots
here, but we need a much simpler construct here because unlike EpochSlots
this is not updating/appending to old values.
For example, just encoding the last slot, and a bit-vector for representing the parents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we only have ~800 bytes available, a straight bit-vector can only encode 7.2k slots, so I put flate2 compression in there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure that flate2 compression don't make it worse?
I mean the bits representing parent slots can be totally random and not have any patterns, in which case compression could make it worse.
Also there is this concern with using compression: #14362 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is valid concern. The problem of not using compression is that we can only have 7.2k slots going back in time which is roughly 48 minutes, this is really too short because currently we rarely resolve an outage within 48 minutes. (Even if your local vote is chaining to a parent generated 2 hours ago, your bitmap in the past 48 minutes will all be 0's so it's zero information.)
About whether compression will make it worse, I tested a few case, it sounds like compression can compress large sequence of 0's or 1's really well. If you think about what's happening in an outage (@carllin since he expressed similar concerns):
- outages where blocks arrive very slowly (outage 20230225, 20220430): most validators can't really vote on the blocks generated by others, so it's voting on a local fork with blocks generated by itself
- outages where new blocks can't chain off each other, they all keep building from the parent (outage 20220930): same as above, voting on a local fork with blocks generated by itself
- outages where >1/3 of the validators don't agree any more (outage 20220601): in this case we normally have several big forks, so you should see >1/3 of 1's in the forks we care about, that should be many consecutive 1's
So I think in the outage scenarios we have seen, compression will generally help. If we are concerned about compression here, we can implement our own compression in a separate PR if needed, or we can consider more than one RestartLastVotedForkSlots packets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you expect a lot of consecutive 1
's in the parent chain, then maybe we can use a run-length encoding for 1
s
https://en.wikipedia.org/wiki/Run-length_encoding
The lengths can also be encoded with varint encoding:
https://github.com/solana-labs/solana/blob/54b796f5a/sdk/program/src/serde_varint.rs
I don't know if it is going to be better or not, and the worst case still adds overhead compared to not encoded raw data.
do you want to do some experiments on how does above compare against flate2 compression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think during outages there will be lots of consecutive 0's or consecutive 1's in the bitmap, consecutive 0's happen when blocks arrive very slowly, consecutive 1's happen when >1/3 of the validators diverge. In both cases run-length encoding can probably compress the bitmap tight enough to fit 16k slots. I'll have a try at implementing a very simple run-length encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replaced flate2 compression with a simple run-length encoding implementation. Under the best case (tons of consecutive 0's or 1's) the compression rate is 127:1. Under the worst case (very random) it performs worse than raw bits. The performance is comparable to flate2, but it's so much easier to fit everything into given byte array here. The previous flate2 implementation has a bug because at end of compression it needs to write some headers into the output buffer, and the header length is very hard to predict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And RestartLastVotedForkSlots will select raw bits if that performs better.
gossip/src/crds_value.rs
Outdated
pub last_voted_hash: Hash, | ||
pub shred_version: u16, | ||
} | ||
|
||
impl Sanitize for RestartLastVotedForkSlots { | ||
fn sanitize(&self) -> std::result::Result<(), SanitizeError> { | ||
if self.slots.is_empty() { | ||
if self.last_voted_slot == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you almost will never encode last_vote_slot to be 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so is the case for slot 1 or 2 or 3. should we rule them out as well?
if nothing breaks with last_voted_slot == 0
then I don't think sanitize should bother with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
gossip/src/crds_value.rs
Outdated
); | ||
} | ||
Err(e) => error!("Error compressing slots {:?}", e), | ||
_ => (), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What cases is _
ignoring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
gossip/src/crds_value.rs
Outdated
pub fn to_slots(&self, min_slot: Slot) -> Result<Vec<Slot>, DecompressError> { | ||
let mut slots = Vec::new(); | ||
if self.last_voted_slot >= min_slot { | ||
let mut uncompressed = Vec::with_capacity(self.uncompressed_bytes as usize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come EpochSlots
does not need to encode uncompressed_bytes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EpochSlots uses self.num to see how many slots are encoded, which is similar to self.uncompressed_bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A malicious node can cause OOM by sending very large uncompressed_bytes
value, and you are not sanitizing this field.
This has at most a minor impact on performance. We can remove it entirely an instead use something like:
Vec::with_capacity(self.compressed_slots.len() * 2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually we don't need more than MAX_SLOTS_PER_ENTRY, changed.
gossip/src/crds_value.rs
Outdated
pub fn to_slots(&self, min_slot: Slot) -> Result<Vec<Slot>, DecompressError> { | ||
let mut slots = Vec::new(); | ||
if self.last_voted_slot >= min_slot { | ||
let mut uncompressed = Vec::with_capacity(self.uncompressed_bytes as usize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A malicious node can cause OOM by sending very large uncompressed_bytes
value, and you are not sanitizing this field.
This has at most a minor impact on performance. We can remove it entirely an instead use something like:
Vec::with_capacity(self.compressed_slots.len() * 2)
gossip/src/crds_value.rs
Outdated
pub last_voted_hash: Hash, | ||
pub shred_version: u16, | ||
} | ||
|
||
impl Sanitize for RestartLastVotedForkSlots { | ||
fn sanitize(&self) -> std::result::Result<(), SanitizeError> { | ||
if self.slots.is_empty() { | ||
if self.last_voted_slot == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so is the case for slot 1 or 2 or 3. should we rule them out as well?
if nothing breaks with last_voted_slot == 0
then I don't think sanitize should bother with that.
gossip/src/crds_value.rs
Outdated
let uncompressed_bytes; | ||
if last_voted_fork.is_empty() { | ||
return Err("Last voted slot must be specified".to_string()); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The line above is a return
. so this else
and the extra indention below is redundant.
We also don't need the uninitialized variables above (e.g. let last_voted_slot;
) if the else
is removed.
gossip/src/crds_value.rs
Outdated
return Err("Last voted slot must be specified".to_string()); | ||
} else { | ||
last_voted_fork.sort_by_key(|a| Reverse(*a)); | ||
last_voted_slot = last_voted_fork[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[0]
can panic. There is last_voted_fork.is_empty()
above , but the better style is to remove that and instead change this line to:
let Some(last_voted_slot) = last_voted_fork.get(0).copied() else {
return Err(...);
}
gossip/src/crds_value.rs
Outdated
Err(e) => return Err(e.to_string()), | ||
// compression ended successfully, proceed. | ||
Ok(flate2::Status::StreamEnd) => (), | ||
Ok(flate2::Status::BufError) => (), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am guessing in this case some of the slots are dropped because there is no more buffer to write to.
Can you comment this out here in the code?
Also can you have a test that in this case the other nodes will still receive a valid payload and can decompress to a valid list of slots consistent with the most recent slots in the original vector.
gossip/src/crds_value.rs
Outdated
pub fn new(bits: &BitVec<u8>) -> Self { | ||
let mut encoded: Vec<U16> = Vec::new(); | ||
let mut current_bit = 1; | ||
let mut current_bit_count: u16 = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why u16 is sufficient here?
can't the gaps exceed 65536?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because when we generate the code we only send MAX_SLOTS_PER_ENTRY slots, which is 16k, so u16 is enough.
gossip/src/crds_value.rs
Outdated
// The vector always starts with 1. Encode number of 1's and 0's consecutively. | ||
// For example, 110000111 is [2, 4, 3]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why start with 1
?
are you including last_voted_slot
in the bit-vec as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, last_voted_slot is included in the bit-vec. If we can always start with 1 we don't encode whether the first value is 0 or 1.
gossip/src/crds_value.rs
Outdated
let encoded = (0..bits.len()) | ||
.map(|i| bits.get(i)) | ||
.dedup_with_count() | ||
.map(|(count, _)| count) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.map_while(|(count, _value)| u16::try_from(count).ok())
instead of .map
here and remove u16::try_from
from valid_and_has_space
.
Also don't need .try_inot().unwrap()
below either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
.map(|i| bits.get(i)) | ||
.dedup_with_count() | ||
.map(|(count, _)| count) | ||
.take_while(|count| Self::valid_and_has_space(&mut total_serde_bytes, *count)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This transformation is technically a scan
, not a take_while
.
See the example here:
https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.scan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
} | ||
|
||
pub fn slots_count(&self) -> usize { | ||
self.encoded.iter().map(|x| x.0 as usize).sum::<usize>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as
has no type-safety checks.
usize::from
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
.encoded | ||
.iter() | ||
.zip([1, 0].iter().cycle()) | ||
.flat_map(|(bit_count, bit)| std::iter::repeat(bit).take(bit_count.0.into())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.into
is ambiguous that what type change happens here.
can you use usize::from
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
.filter_map(|(offset, bit): (usize, &i32)| { | ||
if *bit > 0 { | ||
Some(last_slot.saturating_sub(offset as Slot)) | ||
} else { | ||
None | ||
} | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
separate filter
then map
here would be more readable than filter_map
with and if
branch.
.filter(|(_offset, bit)| *bit == 1)
.map(|(offset, _bit)| last_slot.saturating_sub(offset as Slot))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
for i in 0..self.bits.len() { | ||
if self.bits.get(i) { | ||
let slot = last_slot.saturating_sub(i as Slot); | ||
if slot < min_slot { | ||
return result; | ||
} | ||
result.push(slot); | ||
if result.len() > MAX_SLOTS_PER_ENTRY { | ||
break; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can also be written using iterators instead of manual for
loops and if
branches;
something like:
(0..self.bits.len())
.filter(|k| self.bits.get(k))
.map(|offset| last_slot.saturing_sub(offset as Slot))
.take_while(|slot| slot >= min_slot)
.take(MAX_SLOTS_PER_ENTRY)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
let (first_voted_slot, last_voted_slot) = match last_voted_fork.iter().minmax() { | ||
NoElements => return Err("Last voted slot must be specified".to_string()), | ||
OneElement(slot) => (slot, slot), | ||
MinMax(min, max) => (min, max), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let Some((first_voted_slot, last_voted_slot)) = last_voted_fork.iter().minmax().into_option() else {
return Err(...);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
uncompressed_bitvec.set(last_voted_slot - *slot, true); | ||
} | ||
let run_length_encoding = RunLengthEncoding::new(&uncompressed_bitvec); | ||
let raw_offsets = RawOffsets::new(uncompressed_bitvec); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line should go inside the else
block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/cluster_info.rs
Outdated
@@ -962,6 +962,23 @@ impl ClusterInfo { | |||
} | |||
} | |||
|
|||
pub fn push_restart_last_voted_fork_slots(&self, update: &[Slot], last_vote_bankhash: Hash) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update
is pretty ambiguous here.
can we name the arg fork
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
if *current_bytes > RestartLastVotedForkSlots::MAX_BYTES as u32 { | ||
return None; | ||
} | ||
Some(U16(count)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*current_bytes += ((u16::BITS - count.leading_zeros() + 6) / 7).max(1) as usize;
(*current_bytes <= RestartLastVotedForkSlots::MAX_BYTES).then_some(U16(count))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
Self { encoded } | ||
} | ||
|
||
pub fn slots_count(&self) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename slots_count
to num_encoded_slots
?
Also this function shouldn't be pub
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
} | ||
|
||
impl RunLengthEncoding { | ||
pub fn new(bits: &BitVec<u8>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not pub
.
can you go over all extraneous pub
s in this code and remove them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
.zip([1, 0].iter().cycle()) | ||
.flat_map(|(bit_count, bit)| std::iter::repeat(bit).take(usize::from(bit_count.0))) | ||
.enumerate() | ||
.filter(|(_, bit)| **bit > 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**bit == 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
} | ||
|
||
pub fn to_slots(&self, last_slot: Slot, min_slot: Slot) -> Vec<Slot> { | ||
let mut result: Vec<Slot> = self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename result
to slots
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
.flat_map(|(bit_count, bit)| std::iter::repeat(bit).take(usize::from(bit_count.0))) | ||
.enumerate() | ||
.filter(|(_, bit)| **bit > 0) | ||
.map(|(offset, _)| last_slot.saturating_sub(offset as Slot)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably stop if offset > last_slot
.
We cannot assume values received from other nodes are sane.
.map_while(|(offset, _bit)| {
let offset = Slot::try_from(offset).ok()?;
last_slot.checked_sub(offset)
})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
} | ||
|
||
pub fn to_slots(&self, last_slot: Slot, min_slot: Slot) -> Vec<Slot> { | ||
let mut result: Vec<Slot> = (0..self.bits.len()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename result
to slots
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
pub fn to_slots(&self, last_slot: Slot, min_slot: Slot) -> Vec<Slot> { | ||
let mut result: Vec<Slot> = (0..self.bits.len()) | ||
.filter(|index| self.bits.get(*index)) | ||
.map(|offset| last_slot.saturating_sub(offset)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.map_while(|offset| last_slot.checked_sub(offset))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
last_voted_fork: &[Slot], | ||
last_voted_hash: Hash, | ||
shred_version: u16, | ||
) -> Result<Self, String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you define an actual error type, say RestartLastVotedForkSlotsError
, instead of String
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
#[allow(dead_code)] | ||
#[derive(Debug, Clone)] | ||
pub struct RestartLastVotedForkSlotsError { | ||
error: String, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By an actual error type I meant something like this:
https://github.com/solana-labs/solana/blob/f00507536/gossip/src/contact_info.rs#L43-L67
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
struct U16(#[serde(with = "serde_varint")] u16); | ||
|
||
#[derive(Deserialize, Serialize, Clone, Debug, PartialEq, Eq, AbiExample)] | ||
struct RunLengthEncoding { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this just be
struct RunLengthEncoding(Vec<u8>);
to be less verbose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/cluster_info.rs
Outdated
CrdsData::RestartLastVotedForkSlots(last_voted_fork_slots), | ||
&self.keypair(), | ||
)), | ||
Err(e) => error!("failed to create RestartLastVotedForkSlots {e:?}"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code here does not have enough context if logging the error and then ignoring it is sufficient or not.
I would rather this returns the error to the caller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
gossip/src/crds_value.rs
Outdated
let offset = Slot::try_from(offset).ok()?; | ||
last_slot.checked_sub(offset) | ||
}) | ||
.take(MAX_SLOTS_PER_ENTRY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this MAX_SLOTS_PER_ENTRY
here?
This seems like a left-over from the older code which was using EpochSlots
.
There is already a RestartLastVotedForkSlots::MAX_BYTES
which limits number of slots for entries created by this node.
The entries received from other nodes are also bounded by what fits into a PACKET_DATA_SIZE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RunLengthEncoding can potentially pack tons of zeroes in it, but we only send back 1's to the caller, and it's bound by local root, so I think you are right we can remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, we can pack tones of ones as well, and last_root
can be equal to u64::MAX
, and then the sender can send a lot of slots due to RLE encoding.
so now that I am thinking, we actually need to limit the length of output vector here because you cannot trust the value.
The limit does not need to be MAX_SLOTS_PER_ENTRY
because those are not related anymore but we do in fact need a .take(max_slots)
here. sorry for the back and forth on this.
RawOffsets::to_slots
does not need a limit because that is already limited by the packet size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this. I think we can go back to 81k slots in the original design doc then. However, if it's 81k slots then u16 no longer fits, I've changed it into u32.
gossip/src/crds_value.rs
Outdated
.filter(|index| self.0.get(*index)) | ||
.map_while(|offset| last_slot.checked_sub(offset)) | ||
.take_while(|slot| *slot >= min_slot) | ||
.take(MAX_SLOTS_PER_ENTRY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly here re MAX_SLOTS_PER_ENTRY
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
remove unnecessary constraints in to_slots.
gossip/src/crds_value.rs
Outdated
@@ -566,6 +571,9 @@ impl RawOffsets { | |||
} | |||
} | |||
|
|||
// Per design doc, we should start wen_restart within 9 hours. | |||
pub const MAX_SLOTS_RESTART_LAST_VOTED_FORK_SLOTS: usize = 81000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definition also needs to go inside struct RestartLastVotedForkSlots
. Similar to MAX_BYTES
.
With 5000 nodes, each sending 81k slot, it would take 3GB of memory just to store all those Vec<Slot>
.
Can we reduce this limit to 65,535
so that we can stick with u16
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's fine. The original argument for 9 hours is so that admins can wake up and join the restart, but if they sleep sound enough then even 9 hours wouldn't be enough. In reality we normally have >80% of the admins wake up within 7 hours.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
gossip/src/crds_value.rs
Outdated
@@ -600,6 +597,9 @@ impl RestartLastVotedForkSlots { | |||
// This number is MAX_CRDS_OBJECT_SIZE - empty serialized RestartLastVotedForkSlots. | |||
const MAX_BYTES: usize = 824; | |||
|
|||
// Per design doc, we should start wen_restart within 7 hours. | |||
pub const MAX_SLOTS: usize = 65535; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be pub
?
also u16::MAX as usize
or 0xFFFF
might be more descriptive where this number comes from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed. It needs to be pub because later when we create the fork from wen_restart code we can limit the fork size at 65535.
Add push_restart_last_voted_fork_slots and get_restart_last_voted_fork_slots.