Performance Benchmarks
These benchmarks measure RocksDB performance when data resides on flash storage. The benchmarks on this page were generated in June 2020 with RocksDB 6.10.0 unless otherwise noted.
All of the benchmarks are run on the same AWS instance. Here are the details of the test setup:
- Instance type: m5d.2xlarge (8 vCPUs, 32 GB memory, 1 x 300 GB NVMe SSD)
- Kernel version: Linux 4.14.177-139.253.amzn2.x86_64
- File System: XFS with discard enabled
To characterize the performance of the SSD, we ran an fio test and observed 117K IOPS for 4KB random reads (see the fio test results section of this page for the output).
All tests were run by executing benchmark.sh with the following parameters (unless otherwise specified): NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944. Long-running tests were executed with a duration of 5400 seconds (DURATION=5400).
All other parameters used the default values, unless explicitly mentioned here. Tests were executed sequentially against the same database instance. The db_bench tool was built via "make release".
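For orientation, the parameter values above can be converted to more familiar units. This is a minimal Python sketch; the dictionary and the `describe` helper are illustrative names and are not part of benchmark.sh.

```python
# Parameter values copied from the setup description above.
PARAMS = {
    "NUM_KEYS": 900_000_000,
    "NUM_THREADS": 32,
    "CACHE_SIZE": 6_442_450_944,  # bytes
    "DURATION": 5_400,            # seconds
}


def describe(params):
    """Return the benchmark parameters in human-friendly units."""
    return {
        "keys_millions": params["NUM_KEYS"] / 1e6,
        "cache_gib": params["CACHE_SIZE"] / 2**30,
        "duration_minutes": params["DURATION"] / 60,
    }


summary = describe(PARAMS)
print(summary)
```

In other words, the block cache is 6 GiB and each timed test runs for 90 minutes.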
The following test sequence was executed:
Test Case 1 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 benchmark.sh bulkload
Measure the performance of loading 900 million keys into the database. The keys are inserted in random order. The database is empty at the beginning of this benchmark run and gradually fills up. No data is read while the load is in progress.
Test Case 2 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh overwrite
Measure the performance of randomly overwriting keys in the database. The database was created by the previous benchmark.
Test Case 3 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh readwhilewriting
Measure the performance of random reads while existing keys are being updated. The database from Test Case 2 was used as the starting point.
Test Case 4 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh readrandom
Measure the random read performance of the database.
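The sequence above can be written down as data, which makes it easy to regenerate the exact command lines. The sketch below only builds the strings and does not run anything; `COMMON`, `SEQUENCE`, and `command_line` are hypothetical names, and the last test uses the `readrandom` spelling that appears in the results tables.

```python
# Environment settings shared by every test in the sequence.
COMMON = {
    "NUM_KEYS": "900000000",
    "NUM_THREADS": "32",
    "CACHE_SIZE": "6442450944",
}

# (test name, duration in seconds); bulkload runs to completion, so no duration.
SEQUENCE = [
    ("bulkload", None),
    ("overwrite", 5400),
    ("readwhilewriting", 5400),
    ("readrandom", 5400),
]


def command_line(test, duration=None):
    """Build the benchmark.sh command line for one test."""
    env = dict(COMMON)
    if duration is not None:
        env["DURATION"] = str(duration)
    assignments = " ".join(f"{key}={value}" for key, value in env.items())
    return f"{assignments} benchmark.sh {test}"


commands = [command_line(test, duration) for test, duration in SEQUENCE]
for command in commands:
    print(command)
```

Because the tests run sequentially against the same database, the order of `SEQUENCE` matters: each test starts from the state left by the previous one.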
The following shows results of these tests using various releases and parameters.
The test cases were executed with various block sizes. The Direct I/O (DIO) test was executed with an 8K block size. In the "RL" tests, a timed rate-limited operation was placed before the reported operation. For example, between the "bulkload" and "overwrite" operations, a 30-minute rate-limited overwrite (limited to 2MB/sec) was conducted. This timed operation was meant to help ensure that any flush or other background operation completed before the timed, reported operation, thereby making the percentile performance numbers more predictable.
- 8K: Complete bulkload in 4560 seconds
- 4K: Complete bulkload in 5215 seconds
- 16K: Complete bulkload in 3996 seconds
- DIO: Complete bulkload in 4547 seconds
- 8K RL: Complete bulkload in 4388 seconds

Test Case 1 : bulkload
Block | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8K | 924468 | 370.3 | 0.2 | 157.1 | 157.1 | 1.0 | 167.5 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1119 | 960 | 00:03:45.193 | 23.5 | 101411592 |
4K | 853217 | 341.8 | 0.2 | 165.3 | 165.3 | 1.0 | 165.9 | 1.2 | 0.5 | 0.8 | 2 | 4 | 1159 | 1020 | 00:04:41.465 | 27.6 | 108748512 |
16K | 1027567 | 411.6 | 0.1 | 149.0 | 149.0 | 1.0 | 181.6 | 1.0 | 0.5 | 0.8 | 2 | 3 | 1021 | 840 | 00:02:23.600 | 17.1 | 99070240 |
DIO | 921342 | 369.0 | 0.2 | 156.6 | 156.6 | 1.0 | 167.0 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1104 | 960 | 00:03:27.280 | 21.6 | 101412440 |
8K RL | 989786 | 396.5 | 0.2 | 159.4 | 159.4 | 1.0 | 179.5 | 1.0 | 0.5 | 0.8 | 2 | 4 | 1043 | 909 | 00:02:41.514 | 17.8 | 101406496 |
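The Stall% column in the table above appears to be the stall time divided by the uptime. That relationship is inferred from the numbers, not taken from the benchmark script. A short Python check with values copied from the table (function names are illustrative):

```python
def to_seconds(hms):
    """Convert an 'HH:MM:SS.mmm' string to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def stall_percent(stall_time, uptime_seconds):
    """Percentage of the uptime spent stalled."""
    return 100.0 * to_seconds(stall_time) / uptime_seconds


# block size -> (Stall-time, Uptime in seconds, reported Stall%)
ROWS = {
    "8K": ("00:03:45.193", 960, 23.5),
    "4K": ("00:04:41.465", 1020, 27.6),
    "16K": ("00:02:23.600", 840, 17.1),
    "DIO": ("00:03:27.280", 960, 21.6),
    "8K RL": ("00:02:41.514", 909, 17.8),
}

for block, (stall, uptime, reported) in ROWS.items():
    computed = stall_percent(stall, uptime)
    print(f"{block}: computed {computed:.1f}%, reported {reported}%")
```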
Test Case 2 : overwrite
Block | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8K | 85756 | 34.3 | 0.1 | 161.4 | 739.9 | 4.5 | 142.2 | 373.1 | 9.7 | 274.1 | 5613 | 25620 | 47726 | | 00:20:18.388 | 22.9 | 159903832 |
4K | 79856 | 32.0 | 0.2 | 166.0 | 716.9 | 4.3 | 136.3 | 400.7 | 9.7 | 268.9 | 5914 | 25394 | 47296 | | 00:25:37.183 | 28.5 | 168094916 |
16K | 93678 | 37.5 | 0.1 | 174.4 | 825.0 | 4.7 | 156.8 | 341.6 | 9.4 | 279.2 | 4453 | 24796 | 47038 | | 00:16:24.878 | 18.3 | 155953232 |
DIO | 85655 | 34.3 | 0.1 | 163.9 | 734.9 | 4.4 | 140.7 | 373.6 | 9.7 | 263.1 | 6250 | 25807 | 47678 | | 00:18:51.145 | 21.2 | 159470752 |
8K RL | 85542 | 34.3 | 0.1 | 161.2 | 757.8 | 4.7 | 143.6 | 748.1 | 340.5 | 735.8 | 11852 | 30851 | 59137 | 5401 | 00:08:18.359 | 9.2 | |
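In the table above, the W-Amp column is close to the ratio Sum_GB / L0_GB, that is, total bytes written divided by bytes flushed into L0. This reading is inferred from the reported values rather than from a documented definition, and the match is only approximate because the inputs are rounded. A small Python check:

```python
def write_amplification(l0_gb, sum_gb):
    """Approximate write amplification as total GB written per GB flushed to L0."""
    return sum_gb / l0_gb


# block size -> (L0_GB, Sum_GB, reported W-Amp), copied from the table above
ROWS = {
    "8K": (161.4, 739.9, 4.5),
    "4K": (166.0, 716.9, 4.3),
    "16K": (174.4, 825.0, 4.7),
    "DIO": (163.9, 734.9, 4.4),
    "8K RL": (161.2, 757.8, 4.7),
}

for block, (l0_gb, sum_gb, reported) in ROWS.items():
    computed = write_amplification(l0_gb, sum_gb)
    print(f"{block}: computed {computed:.2f}, reported {reported}")
```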
Test Case 3 : readwhilewriting
Block | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8K | 89285 | 28.0 | 0.1 | 4.2 | 199.6 | 47.5 | 37.9 | 358.4 | 281.1 | 427.9 | 2935 | 7587 | 19029 | | 00:13:07.325 | 14.6 | 139287936 |
4K | 116759 | 36.2 | 0.1 | 3.6 | 203.8 | 56.6 | 38.9 | 274.1 | 224.4 | 328.0 | 2534 | 6131 | 13678 | | 00:20:58.789 | 23.5 | 147504716 |
16K | 64393 | 20.4 | 0.1 | 4.1 | 194.0 | 47.3 | 36.8 | 496.9 | 402.3 | 642.7 | 3488 | 7251 | 8880 | | 00:10:58.906 | 12.2 | 138132068 |
DIO | 98698 | 30.9 | 0.1 | 3.9 | 197.4 | 50.6 | 37.6 | 324.2 | 257.7 | 353.7 | 2764 | 6583 | 13742 | | 00:16:47.979 | 18.8 | 139319040 |
8K RL | 101598 | 31.9 | 0.1 | 3.2 | 97.2 | 30.3 | 18.4 | 629.9 | 587.5 | 805.9 | 3922 | 6881 | 19699 | 5402 | 00:00:00.054 | 0.0 | |
Test Case 4 : readrandom
Block | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8K | 101647 | 32.0 | 0.1 | 0.0 | 3.9 | 0 | 0.7 | 314.8 | 410.7 | 498.8 | 761 | 1247 | 3092 | | 139119060 |
4K | 130846 | 40.7 | 0.1 | 0.0 | 1.0 | 0 | 0.1 | 244.6 | 291.7 | 347.5 | 663 | 865 | 2626 | | 147417776 |
16K | 70884 | 22.6 | 0.1 | 0.0 | 1.3 | 0 | 0.2 | 451.4 | 547.5 | 715.0 | 1039 | 1397 | 2598 | | 138040824 |
DIO | 144737 | 45.5 | 0.1 | 0.1 | 0.7 | 7.0 | 0.1 | 221.1 | 239.8 | 320.9 | 578 | 866 | 2133 | | 139239620 |
8K RL | 105790 | 33.4 | 0.1 | 0.0 | 0.0 | 0 | | 605.0 | 683.0 | 807.9 | 1579 | 3133 | 6152 | 5403 | 139681920 |
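For the 8K, 4K, 16K, and DIO rows above, the usec/op column matches the mean per-operation latency implied by the throughput when 32 threads issue requests back to back (NUM_THREADS / ops per second). This is an observation about the reported numbers, sketched here in Python with an illustrative helper name:

```python
NUM_THREADS = 32  # from the benchmark parameters


def mean_latency_usec(ops_per_sec, threads=NUM_THREADS):
    """Mean latency per operation, in microseconds, for `threads` busy threads."""
    return threads / ops_per_sec * 1e6


# block size -> (ops/sec, reported usec/op), copied from the table above
ROWS = {
    "8K": (101647, 314.8),
    "4K": (130846, 244.6),
    "16K": (70884, 451.4),
    "DIO": (144737, 221.1),
}

for block, (ops_per_sec, reported) in ROWS.items():
    computed = mean_latency_usec(ops_per_sec)
    print(f"{block}: computed {computed:.1f} usec/op, reported {reported}")
```

This is why the 4K and DIO configurations, which deliver more operations per second with the same thread count, also show the lowest average latency.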
The test cases were executed with the default block size and a value size of 2K. Only 100M keys were written to the database. The bulkload completed in 2018 seconds.
Test | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
bulkload | 272448 | 537.3 | 0.1 | 85.3 | 85.3 | 1.0 | 242.6 | 3.7 | 0.7 | 1.1 | 3 | 1105 | 1285 | 360 | 00:03:52.679 | 64.6 | 57285876 |
overwrite | 22940 | 45.2 | 0.1 | 229.3 | 879.4 | 3.8 | 169.0 | 1394.9 | 212.9 | 350.7 | 7603 | 26151 | 160352 | 5328 | 01:06:21.977 | 74.7 | 110458852 |
readwhilewriting | 87093 | 154.2 | 0.1 | 5.4 | 162.6 | 30.1 | 31.0 | 367.4 | 369.2 | 491.9 | 2209 | 6302 | 13544 | 5360 | 00:00:01.160 | 0.0 | 92081776 |
readrandom | 95666 | 169.9 | 0.1 | 0.0 | 0.0 | 0 | 0 | 334.5 | 411.1 | 498.7 | 742 | 1214 | 2789 | 5358 | 00:00:00.000 | 0.0 | 92092164 |
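For the write tests, the mb/sec column is consistent with ops/sec multiplied by the size of one key-value pair. The sketch below assumes a 20-byte key, a 400-byte default value, and 1 MB = 2^20 bytes; these figures are assumptions chosen because they reproduce the reported numbers, and they are not stated on this page.

```python
KEY_SIZE = 20             # bytes (assumed)
DEFAULT_VALUE_SIZE = 400  # bytes (assumed default)
LARGE_VALUE_SIZE = 2048   # bytes, the 2K value size used in the table above
MIB = 2**20


def write_mb_per_sec(ops_per_sec, value_size, key_size=KEY_SIZE):
    """User data written per second, in MB of 2**20 bytes."""
    return ops_per_sec * (key_size + value_size) / MIB


# (label, ops/sec, value size in bytes, reported mb/sec)
CASES = [
    ("bulkload, 2K values", 272448, LARGE_VALUE_SIZE, 537.3),
    ("overwrite, 2K values", 22940, LARGE_VALUE_SIZE, 45.2),
    ("bulkload, 8K block, default values", 924468, DEFAULT_VALUE_SIZE, 370.3),
]

for label, ops_per_sec, value_size, reported in CASES:
    computed = write_mb_per_sec(ops_per_sec, value_size)
    print(f"{label}: computed {computed:.1f} MB/s, reported {reported}")
```

The read tests do not follow this formula exactly, so it is applied here to the write tests only.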
These tests were executed against different versions of RocksDB, by checking out the corresponding branch and doing a "make release".
- 6.10.0: Complete bulkload in 4560 seconds
- 6.3.6: Complete bulkload in 4584 seconds
- 6.0.2: Complete bulkload in 4668 seconds

Test Case 1 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 benchmark.sh bulkload
Version | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Uptime | Stall-time | Stall% | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6.10.0 | 924468 | 370.3 | 0.2 | 157.1 | 157.1 | 1.0 | 167.5 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1119 | 960 | 00:03:45.193 | 23.5 | 101411592 |
6.3.6 | 921714 | 369.2 | 0.2 | 156.7 | 156.7 | 1.0 | 167.1 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1133 | 960 | 00:04:02.070 | 25.2 | 101437836 |
6.0.2 | 933665 | 374.0 | 0.2 | 158.7 | 158.7 | 1.0 | 169.2 | 1.1 | 0.5 | 0.8 | 2 | 4 | 1105 | 960 | 00:03:31.627 | 22.0 | 101434096 |
Test Case 2 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh overwrite
Version | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Stall-time | Stall% | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6.10.0 | 85756 | 34.3 | 0.1 | 161.4 | 739.9 | 4.5 | 142.2 | 373.1 | 9.7 | 274.1 | 5613 | 25620 | 47726 | 00:20:18.388 | 22.9 | 159903832 |
6.3.6 | 92328 | 37.0 | 0.2 | 174.0 | 818.4 | 4.7 | 155.4 | 346.6 | 8.9 | 263.8 | 4432 | 24581 | 46753 | 00:20:24.697 | 22.7 | 162288400 |
6.0.2 | 86767 | 34.8 | 0.2 | 164.8 | 740.4 | 4.4 | 141.4 | 368.8 | 9.8 | 294.7 | 5900 | 25623 | 47755 | 00:17:06.887 | 19.2 | 162797372 |
Test Case 3 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh readwhilewriting
Version | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | Stall-time | Stall% | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6.10.0 | 89285 | 28.0 | 0.1 | 4.2 | 199.6 | 47.5 | 37.9 | 358.4 | 281.1 | 427.9 | 2935 | 7587 | 19029 | 00:13:07.325 | 14.6 | 139287936 |
6.3.6 | 90189 | 28.6 | 0.1 | 4.1 | 213.1 | 51.9 | 40.6 | 354.8 | 288.1 | 430.2 | 2781 | 6357 | 15268 | 00:13:58.835 | 15.6 | 141082740 |
6.0.2 | 90140 | 28.3 | 0.1 | 4.1 | 209.8 | 51.1 | 39.9 | 355.0 | 290.1 | 445.1 | 2789 | 6354 | 15951 | 00:12:13.384 | 13.6 | 139700676 |
Test Case 4 : NUM_KEYS=900000000 NUM_THREADS=32 CACHE_SIZE=6442450944 DURATION=5400 benchmark.sh readrandom
Version | ops/sec | mb/sec | Size-GB | L0_GB | Sum_GB | W-Amp | W-MB/s | usec/op | p50 | p75 | p99 | p99.9 | p99.99 | du -s -k |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6.10.0 | 101647 | 32.0 | 0.1 | 0.0 | 3.9 | 0 | 0.7 | 314.8 | 410.7 | 498.8 | 761 | 1247 | 3092 | 139119060 |
6.3.6 | 100168 | 31.8 | 0.1 | 0.0 | 0.9 | 0 | 0.1 | 319.5 | 411.3 | 499.2 | 769 | 1248 | 2787 | 140911608 |
6.0.2 | 101023 | 31.8 | 0.1 | 0.0 | 6.0 | 0 | 1.1 | 316.8 | 412.5 | 499.7 | 763 | 1239 | 3900 | 139423196 |
fio test results
]$ fio --randrepeat=1 --ioengine=sync --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread --numjobs=32 --group_reporting
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=64
...
fio-2.14
Starting 32 processes
Jobs: 3 (f=3): [_(3),r(1),_(1),E(1),_(10),r(1),_(13),r(1),E(1)] [100.0% done] [445.3MB/0KB/0KB /s] [114K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=32): err= 0: pid=28042: Fri Jul 24 01:36:19 2020
read : io=131072MB, bw=469326KB/s, iops=117331, runt=285980msec
cpu : usr=1.29%, sys=3.26%, ctx=33585114, majf=0, minf=297
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=33554432/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: io=131072MB, aggrb=469325KB/s, minb=469325KB/s, maxb=469325KB/s, mint=285980msec, maxt=285980msec
Disk stats (read/write):
nvme1n1: ios=33654742/61713, merge=0/40, ticks=8723764/89064, in_queue=8788592, util=100.00%
]$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=/data/test_file --bs=4k --iodepth=64 --size=4G --readwrite=randread
test: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.14
Starting 1 process
Jobs: 1 (f=1): [r(1)] [100.0% done] [456.3MB/0KB/0KB /s] [117K/0/0 iops] [eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=28385: Fri Jul 24 01:36:56 2020
read : io=4096.0MB, bw=547416KB/s, iops=136854, runt= 7662msec
cpu : usr=22.20%, sys=48.81%, ctx=144112, majf=0, minf=73
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=1048576/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: io=4096.0MB, aggrb=547416KB/s, minb=547416KB/s, maxb=547416KB/s, mint=7662msec, maxt=7662msec
Disk stats (read/write):
nvme1n1: ios=1050868/1904, merge=0/1, ticks=374836/2900, in_queue=370532, util=98.70%
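As a consistency check on the fio output above, the reported bandwidth should equal IOPS times the 4 KB block size, and the total I/O should equal the number of issued reads times the block size. A short Python sketch using the figures from both runs (variable names are illustrative):

```python
BLOCK_KB = 4  # --bs=4k

# run -> (iops, reported bw in KB/s, reads issued, reported io in MB)
RUNS = {
    "sync, 32 jobs": (117331, 469326, 33554432, 131072),
    "libaio, 1 job": (136854, 547416, 1048576, 4096),
}


def bandwidth_kb_per_sec(iops, block_kb=BLOCK_KB):
    """Bandwidth implied by an IOPS figure at a fixed block size."""
    return iops * block_kb


def total_io_mb(reads_issued, block_kb=BLOCK_KB):
    """Total data read, in MB of 1024 KB."""
    return reads_issued * block_kb / 1024


for name, (iops, bw_kb, reads, io_mb) in RUNS.items():
    print(
        f"{name}: {bandwidth_kb_per_sec(iops)} KB/s computed vs {bw_kb} reported, "
        f"{total_io_mb(reads):.0f} MB computed vs {io_mb} reported"
    )
```

The small difference in the first run (469324 vs. 469326 KB/s) is expected, since fio reports IOPS as a rounded integer.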
- July 2018: Performance Benchmark 201807
- 2014: Performance Benchmark 2014