Skip to content

Commit

Permalink
pitfall on merge iterator
Browse files Browse the repository at this point in the history
Signed-off-by: Alex Chi <iskyzh@gmail.com>
  • Loading branch information
skyzh committed Jan 21, 2024
1 parent 892e6ab commit c6e700e
Show file tree
Hide file tree
Showing 3 changed files with 15 additions and 1 deletion.
10 changes: 10 additions & 0 deletions mini-lsm-book/src/week1-02-merge-iterator.md
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,16 @@ a->1, b->del, c->4, d->5, e->4

The constructor of the merge iterator takes a vector of iterators. We assume the one with a lower index (i.e., the first one) has the latest data.

One common pitfall is on error handling. For example,

```rust
let Some(mut inner_iter) = self.iters.peek_mut() {
inner_iter.next()?; // <- will cause problem
}
```

If `next` returns an error (i.e., due to disk failure, network failure, checksum error, etc.), it is no longer valid. However, when we go out of the if condition and return the error to the caller, `PeekMut`'s drop will try move the element within the heap, which causes an access to an invalid iterator. Therefore, you will need to do all error handling by yourself instead of using `?` within the scope of `PeekMut`.

We want to avoid dynamic dispatch as much as possible, and therefore we do not use `Box<dyn StorageIterator>` in the system. Instead, we prefer static dispatch using generics.

## Task 3: LSM Iterator + Fused Iterator
Expand Down
3 changes: 2 additions & 1 deletion mini-lsm-book/src/week2-02-simple.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,8 @@ In this chapter, you will:

## Test Your Understanding

* (I know this is stupid but) could you please repeat the definition of read/write/space amplifications? What are the ways to accurately compute them, and what are the ways to estimate them?
* (I know this is stupid but) could you please repeat the definition of read/write/space amplifications?
* What are the ways to accurately compute the read/write/space amplifications, and what are the ways to estimate them?
* Is it correct that a key will only be purged from the LSM tree if the user requests to delete it and it has been compacted in the bottom-most level?
* Is it a good strategy to periodically do a full compaction on the LSM tree? Why or why not?
* Actively choosing some old files/levels to compact even if they do not violate the level amplifier would be a good choice, is it true? (Look at the Lethe paper!)
Expand Down
3 changes: 3 additions & 0 deletions mini-lsm/src/tests/day2.rs
Original file line number Diff line number Diff line change
Expand Up @@ -120,6 +120,7 @@ fn test_task2_merge_1() {
(Bytes::from("a"), Bytes::from("1.1")),
(Bytes::from("b"), Bytes::from("2.1")),
(Bytes::from("c"), Bytes::from("3.1")),
(Bytes::from("e"), Bytes::new()),
]);
let i2 = MockIterator::new(vec![
(Bytes::from("a"), Bytes::from("1.2")),
Expand All @@ -146,6 +147,7 @@ fn test_task2_merge_1() {
(Bytes::from("b"), Bytes::from("2.1")),
(Bytes::from("c"), Bytes::from("3.1")),
(Bytes::from("d"), Bytes::from("4.2")),
(Bytes::from("e"), Bytes::new()),
],
);

Expand All @@ -158,6 +160,7 @@ fn test_task2_merge_1() {
(Bytes::from("b"), Bytes::from("2.3")),
(Bytes::from("c"), Bytes::from("3.3")),
(Bytes::from("d"), Bytes::from("4.3")),
(Bytes::from("e"), Bytes::new()),
],
);
}
Expand Down

0 comments on commit c6e700e

Please sign in to comment.