Support recycling log files #224

Merged
merged 54 commits into from
Jul 20, 2022
9f9fb54
[Enhancement]Support recycling log files from expired logs.
LykxSassinator Jun 8, 2022
32aa562
Minor modifications on `bench_falloc`
LykxSassinator Jun 9, 2022
f98c76c
Merge branch 'tikv:master' into recycle_logs
LykxSassinator Jun 9, 2022
3da6ac5
[Bugfix] Fix the unexpected bug on the judgement of loops in `bench_f…
LykxSassinator Jun 10, 2022
f322af0
Supplements for UTs on `RecycleFileCollection` and codes clean-up.
LykxSassinator Jun 14, 2022
d2cec63
Merge branch 'tikv:master' into recycle_logs
LykxSassinator Jun 14, 2022
9b5a9a4
Supplement necessary UTs for validating the new format_version with V2
LykxSassinator Jun 14, 2022
5dd42b1
Clean-up unnecessary annotations and codes.
LykxSassinator Jun 15, 2022
834b996
Clean-up unnecessary annotations and unused codes.
LykxSassinator Jun 15, 2022
392bb6e
[Refinement] Simplify the configuration on the capacity of `RecycleFi…
LykxSassinator Jun 15, 2022
9ef0f8d
[Refinement] Simplify the configuration on the capacity of `RecycleFi…
LykxSassinator Jun 15, 2022
b508c04
Merge branch 'recycle_logs' of github.com:LykxSassinator/raft-engine …
LykxSassinator Jun 20, 2022
0a32a25
Reconstruct the implementation of FileCollection for recycling.
LykxSassinator Jun 20, 2022
5820b13
Refactor the `sign_checksum(...Version, Signature)` with `pre_write(.…
LykxSassinator Jun 21, 2022
2047d29
Merge branch 'master' into recycle_logs
LykxSassinator Jun 21, 2022
1aaeb59
Remove unnecessary annotations in `pipe.rs`
LykxSassinator Jun 22, 2022
2a07b53
Modification for code-style consistency.
LykxSassinator Jun 23, 2022
55e5fac
Merge branch 'master' into recycle_logs
LykxSassinator Jun 23, 2022
b786881
Merge branch 'master' into recycle_logs
LykxSassinator Jun 23, 2022
cf9b77e
Make `Recycle log files` compatible to the new feature tikv#229.
LykxSassinator Jun 23, 2022
0a2b085
Merge branch 'master' into recycle_logs
LykxSassinator Jun 24, 2022
7e6a5cb
Update changelog.
LykxSassinator Jun 24, 2022
ccf5ef0
Merge branch 'master' into recycle_logs
LykxSassinator Jun 24, 2022
09103a8
Supply extra setting `allow_recycle` for on / off `Recycle log files`
LykxSassinator Jun 24, 2022
042607a
Merge branch 'recycle_logs' of github.com:LykxSassinator/raft-engine …
LykxSassinator Jun 24, 2022
c13eed0
Fix code-style bugs flagged by `clippy`
LykxSassinator Jun 25, 2022
b1ac73c
Merge branch 'tikv:master' into recycle_logs
LykxSassinator Jun 28, 2022
c37c1c3
Supply `allow-recycle` flag for `stress` tool.
LykxSassinator Jun 28, 2022
db4c0de
Modifications for compatibilities on code-style.
LykxSassinator Jun 30, 2022
1614ef1
Supplement extra UTs for abnormal cases.
LykxSassinator Jul 1, 2022
47e5f7f
Supplement extra ut for testing the abnormal case when `recycle` is o…
LykxSassinator Jul 5, 2022
577ba0f
Modifications on code-style for compatibilities.
LykxSassinator Jul 7, 2022
930fe3f
Minor modifications for compatibilities to other components.
LykxSassinator Jul 8, 2022
3e0cfd5
Fix bugs in pipe.rs
LykxSassinator Jul 12, 2022
4060018
Support removing stale files when `engine` exit.
LykxSassinator Jul 12, 2022
190b0e8
Merge branch 'master' into recycle_logs
LykxSassinator Jul 13, 2022
3dd41bf
Supplement necessary code coverage on `rename` in `test_manage_file_r…
LykxSassinator Jul 13, 2022
605c173
Remove unnecessary prints in `test_manage_file_rename`
LykxSassinator Jul 13, 2022
2797ccb
Merge branch 'master' into recycle_logs
LykxSassinator Jul 14, 2022
ce6f42f
Simplify the strategy of computing `purge` in the func `purge_to()`.
LykxSassinator Jul 14, 2022
b187930
Refine the performance of `fetch_active_file` by accessing `files`.
LykxSassinator Jul 14, 2022
e3e7c92
Merge branch 'master' into recycle_logs
LykxSassinator Jul 14, 2022
adabbe2
Merge branch 'master' into recycle_logs
LykxSassinator Jul 15, 2022
2276597
Modifications for code-style compatibilities.
LykxSassinator Jul 17, 2022
51b11b9
Merge branch 'recycle_logs' of github.com:LykxSassinator/raft-engine …
LykxSassinator Jul 17, 2022
7df1fee
Modifications on `purge_to` for more readabilities.
LykxSassinator Jul 18, 2022
80802b4
Fix fmt errs.
LykxSassinator Jul 18, 2022
71f19c0
Supply extra purge strategy for reclaiming redundant space occupied b…
LykxSassinator Jul 18, 2022
1d81e69
Bugfix for the incorrect strategy of counting purged files in `purge_…
LykxSassinator Jul 19, 2022
69a4015
Remove unnecessary functions.
LykxSassinator Jul 19, 2022
0a88245
Bugfix for the strategy in `purge_to` and supplement extra uts for ne…
LykxSassinator Jul 19, 2022
5353d0a
Replace the interface `rename` with `reuse` for more compatibilities.
LykxSassinator Jul 20, 2022
f1a9f9e
Minor modifications to the API in the `FileSystem` trait.
LykxSassinator Jul 20, 2022
9d98e7b
Unify the testcase name with the name of its relevant dir.
LykxSassinator Jul 20, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@
### New Features

* Add `PerfContext` which records detailed time breakdown of the write process to thread-local storage.
+* Support recycling obsolete log files to reduce the cost of `fallocate`-ing new ones.

### Public API Changes

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -51,6 +51,8 @@ rayon = "1.5"
rhai = { version = "1.7", features = ["sync"], optional = true }
scopeguard = "1.1"
serde = { version = "1.0", features = ["derive"] }
+serde_repr = "0.1"
+strum = { version = "0.24.0", features = ["derive"] }
thiserror = "1.0"

[dev-dependencies]
2 changes: 1 addition & 1 deletion README.md
@@ -85,7 +85,7 @@ cargo +nightly test --test failpoints --all-features -- --test-threads 1

```
cargo +nightly bench --all-features <bench-case-name>
-cargo run --release --package stress --help
+cargo run --release --package stress -- --help
```

## License
76 changes: 61 additions & 15 deletions src/config.rs
@@ -3,7 +3,7 @@
use log::warn;
use serde::{Deserialize, Serialize};

-use crate::file_pipe_log::Version;
+use crate::pipe_log::Version;
use crate::{util::ReadableSize, Result};

const MIN_RECOVERY_READ_BLOCK_SIZE: usize = 512;
@@ -58,7 +58,7 @@ pub struct Config {
/// Version of the log file.
///
/// Default: 1
-pub format_version: u64,
+pub format_version: Version,

/// Target file size for rotating log files.
///
@@ -83,6 +83,14 @@ pub struct Config {
///
/// Default: None
pub memory_limit: Option<ReadableSize>,
+
+/// Whether to recycle stale logs.
+/// If `true`, `purge` operations first put stale files into a
+/// list for recycling. This is only available if
+/// `format_version` >= `2`.
+///
+/// Default: false
+pub enable_log_recycle: bool,
}

impl Default for Config {
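The doc comment above describes purged files being parked in a list for later reuse. A minimal in-memory sketch of that idea follows. It is not the crate's implementation: `RecycleList`, `on_purge`, and `take_for_reuse` are hypothetical names, and plain `u64` ids stand in for real log files.

```rust
use std::collections::VecDeque;

/// Hypothetical model of a bounded recycle list. File ids stand in for
/// real log files; no I/O is performed.
#[derive(Debug)]
struct RecycleList {
    capacity: usize,
    stale: VecDeque<u64>,
}

impl RecycleList {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            stale: VecDeque::with_capacity(capacity),
        }
    }

    /// Called when a file is purged. Returns `true` if the file was kept
    /// for reuse, or `false` if the list is full and the caller should
    /// delete the file instead.
    fn on_purge(&mut self, file_id: u64) -> bool {
        if self.stale.len() < self.capacity {
            self.stale.push_back(file_id);
            true
        } else {
            false
        }
    }

    /// Called when a new log file is needed. Returns the oldest stale
    /// file to reuse, or `None` if a fresh file must be allocated.
    fn take_for_reuse(&mut self) -> Option<u64> {
        self.stale.pop_front()
    }
}
```

With a capacity of 2, a third purged file is rejected and would be deleted, while the first two are handed back in purge order when new files are requested.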
@@ -95,12 +103,13 @@ impl Default for Config {
recovery_threads: 4,
batch_compression_threshold: ReadableSize::kb(8),
bytes_per_sync: ReadableSize::mb(4),
-format_version: 1, // 1 => Version::V1
+format_version: Version::V1,
target_file_size: ReadableSize::mb(128),
purge_threshold: ReadableSize::gb(10),
purge_rewrite_threshold: None,
purge_rewrite_garbage_ratio: 0.6,
memory_limit: None,
+enable_log_recycle: false,
};
// Test-specific configurations.
#[cfg(test)]
@@ -140,20 +149,40 @@ impl Config {
);
self.recovery_threads = MIN_RECOVERY_THREADS;
}
-if !Version::is_valid(self.format_version) {
-warn!(
-"format-version ({}) is invalid, setting it to {}",
-self.format_version,
-Version::default() as u64
-);
-self.format_version = Version::default() as u64;
+if self.enable_log_recycle {
+if !self.format_version.has_log_signing() {
+return Err(box_err!(
+"format_version: {:?} is invalid when 'enable_log_recycle' is on; set it to V2",
+self.format_version
+));
+}
+if self.purge_threshold.0 / self.target_file_size.0 >= std::u32::MAX as u64 {
+return Err(box_err!(
+"File count exceeds UINT32_MAX, calculated by 'purge-threshold / target-file-size'"
+));
+}
}
#[cfg(not(feature = "swap"))]
if self.memory_limit.is_some() {
warn!("memory-limit will be ignored because swap feature is not enabled");
}
Ok(())
}
+
+/// Returns the capacity for recycling log files.
+pub(crate) fn recycle_capacity(&self) -> usize {
+// Note: log files with Version::V1 cannot be recycled. Reusing them could
+// leave stale LogBatches mixed into the recycled file, so a reader might
+// unexpectedly get obsolete entries from it.
+if !self.format_version.has_log_signing() {
+return 0;
+}
+if self.enable_log_recycle && self.purge_threshold.0 >= self.target_file_size.0 {
+(self.purge_threshold.0 / self.target_file_size.0) as usize
+} else {
+0
+}
+}
}

#[cfg(test)]
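The checks in `sanitize` and the arithmetic in `recycle_capacity` can be tried in isolation with the sketch below. It rests on several assumptions: `RecycleSettings` and `validate` are hypothetical names, sizes are plain byte counts rather than `ReadableSize`, the local `Version` enum is a stand-in for the crate's type, and the zero-size guard is added for this sketch only.

```rust
/// Hypothetical stand-in for the crate's `Version` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Version {
    V1,
    V2,
}

impl Version {
    /// Assumption: only V2 files are signed, which is what makes reuse safe.
    fn has_log_signing(self) -> bool {
        matches!(self, Version::V2)
    }
}

/// Hypothetical subset of `Config`, with sizes as plain byte counts.
#[derive(Debug)]
struct RecycleSettings {
    format_version: Version,
    enable_log_recycle: bool,
    purge_threshold: u64,
    target_file_size: u64,
}

/// Mirrors the two recycle-related checks shown in the diff.
fn validate(s: &RecycleSettings) -> Result<(), String> {
    if !s.enable_log_recycle {
        return Ok(());
    }
    if !s.format_version.has_log_signing() {
        return Err(format!(
            "format_version {:?} does not support log recycling",
            s.format_version
        ));
    }
    // Guard added for this sketch only: avoid dividing by zero.
    let file_count = s
        .purge_threshold
        .checked_div(s.target_file_size)
        .ok_or_else(|| "target_file_size must be non-zero".to_string())?;
    if file_count >= u32::MAX as u64 {
        return Err("file count exceeds UINT32_MAX".to_string());
    }
    Ok(())
}

/// Number of stale files worth keeping: purge_threshold / target_file_size.
fn recycle_capacity(s: &RecycleSettings) -> usize {
    if !s.format_version.has_log_signing() || !s.enable_log_recycle {
        return 0;
    }
    if s.target_file_size == 0 || s.purge_threshold < s.target_file_size {
        return 0;
    }
    (s.purge_threshold / s.target_file_size) as usize
}
```

Assuming binary units, the default sizes in the diff (a 10 GiB purge threshold and 128 MiB target file size) give a capacity of 80 files once recycling is enabled with V2.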
@@ -176,16 +205,16 @@ mod tests {
bytes-per-sync = "2KB"
target-file-size = "1MB"
purge-threshold = "3MB"
-format-version = 11
+format-version = 1
"#;
let load: Config = toml::from_str(custom).unwrap();
assert_eq!(load.dir, "custom_dir");
assert_eq!(load.recovery_mode, RecoveryMode::TolerateTailCorruption);
assert_eq!(load.bytes_per_sync, ReadableSize::kb(2));
assert_eq!(load.target_file_size, ReadableSize::mb(1));
assert_eq!(load.purge_threshold, ReadableSize::mb(3));
-assert_eq!(load.format_version, 11_u64);
-assert!(!Version::is_valid(load.format_version));
+assert_eq!(load.format_version, Version::V1);
+assert!(!load.enable_log_recycle);
}

#[test]
@@ -202,7 +231,8 @@ mod tests {
recovery-threads = 0
bytes-per-sync = "0KB"
target-file-size = "5000MB"
-format-version = 20
+format-version = 2
+enable-log-recycle = true
"#;
let soft_load: Config = toml::from_str(soft_error).unwrap();
let mut soft_sanitized = soft_load;
@@ -214,7 +244,23 @@ mod tests {
soft_sanitized.purge_rewrite_threshold.unwrap(),
soft_sanitized.target_file_size
);
-assert_eq!(soft_sanitized.format_version, Version::default() as u64);
+assert_eq!(soft_sanitized.format_version, Version::V2);
+assert!(soft_sanitized.enable_log_recycle);
+
+let format_error = r#"
+enable-log-recycle = true
+"#;
+let mut cfg_load: Config = toml::from_str(format_error).unwrap();
+assert!(cfg_load.sanitize().is_err());
+
+let file_count_error = r#"
+target-file-size = "1B"
+purge-threshold = "4GB"
+format-version = 2
+enable-log-recycle = true
+"#;
+let mut file_count_load: Config = toml::from_str(file_count_error).unwrap();
+assert!(file_count_load.sanitize().is_err());
}

#[test]
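The old tests loaded `format-version = 11` as a raw `u64` and flagged it afterwards. The new tests only use the values 1 and 2 against a typed `Version` field. The diff does not show how `Version` is parsed, so the sketch below is only one plausible reading: unknown numbers are rejected at conversion time. It uses the standard `TryFrom` trait, and the local `Version` enum and the error text are assumptions, not the crate's definitions.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the crate's `Version` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u64)]
enum Version {
    V1 = 1,
    V2 = 2,
}

impl TryFrom<u64> for Version {
    type Error = String;

    /// Accepts only the known format versions and rejects anything else,
    /// so a value such as 11 never reaches the rest of the config logic.
    fn try_from(raw: u64) -> Result<Self, Self::Error> {
        match raw {
            1 => Ok(Version::V1),
            2 => Ok(Version::V2),
            other => Err(format!("unsupported format-version: {}", other)),
        }
    }
}
```

Under this reading, the `sanitize` step no longer has to reset an invalid version to a default, because an invalid value cannot be represented by the field type.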