fix valueLog doRunGC to use consistent units #724
Some comparisons were made between bytes and megabytes. This change tracks all values in bytes and makes comparisons using consistent units.
This could have been addressed in a couple of ways; I chose to do everything in bytes instead of MB. This removes a few divisions, and the only downside is that the trace messages will output less friendly numbers.
Not sure if the AppVeyor failure is related or not, but fixing this bug may expose other issues in value log GC. Specifically, because this line was comparing bytes and megabytes, we never would have stopped sampling because of the number of bytes we read: https://github.com/dgraph-io/badger/blob/master/value.go#L1188

That means we either ran out of time (10s), read 10000 entries, or got to the end of the log. This then relates to the checks we perform here: https://github.com/dgraph-io/badger/blob/master/value.go#L1253

Notice how the row count check here is stricter than the size check. This matters because there will now be more cases where we stopped sampling due to size, and thus more cases where the row count is no longer satisfied. Obviously all of this depends on the data and tuning in practice, but it's worth pointing out.

On that same topic, I have concerns about users being able to tune this effectively. I think it makes total sense that you want lots of ways to constrain how much data is sampled (entries, size, time), but at this point we're trying to decide if the sample was good enough. It seems like satisfying size or entries ought to be enough, but as coded today we require both. I anticipate this being hard for some users to tune, as satisfying both requires a good idea of the value sizes, which might vary considerably in practice. Any thoughts on this check?
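To make the unit mismatch concrete, here is a minimal sketch of the three stop conditions described above. It is not the code in value.go: `sampleLimits`, `stopReason`, `defaultLimits`, and the 100 MB size window are hypothetical names and values chosen for illustration. Only the 10s and 10000-entry limits come from the discussion.

```go
package main

import (
	"fmt"
	"time"
)

const mb = 1 << 20 // bytes per megabyte

// sampleLimits holds hypothetical stop conditions for sampling a value log file.
// maxBytes is expressed in bytes, like every other size in this sketch.
type sampleLimits struct {
	maxBytes   int64
	maxEntries int
	maxElapsed time.Duration
}

// defaultLimits returns illustrative limits: an assumed 100 MB size window,
// plus the 10000-entry and 10s limits mentioned in the discussion.
func defaultLimits() sampleLimits {
	return sampleLimits{maxBytes: 100 * mb, maxEntries: 10000, maxElapsed: 10 * time.Second}
}

// stopReason reports which limit ends sampling, or "" if sampling should continue.
func stopReason(l sampleLimits, bytesRead int64, entries int, elapsed time.Duration) string {
	switch {
	case entries >= l.maxEntries:
		return "entries"
	case bytesRead >= l.maxBytes:
		return "size"
	case elapsed >= l.maxElapsed:
		return "time"
	}
	return ""
}

func main() {
	l := defaultLimits()

	// Consistent units: 150 MB read, tracked in bytes, trips the size limit.
	fmt.Printf("%q\n", stopReason(l, 150*mb, 500, time.Second))

	// Mixed units: the same 150 MB tracked as the number 150 (megabytes)
	// is compared against a byte threshold and never trips the size limit.
	fmt.Printf("%q\n", stopReason(l, 150, 500, time.Second))
}
```

With mixed units the size case is effectively dead, so sampling can only end on the entry count, the timeout, or the end of the log.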
Actually, perhaps we should disregard my last comment. At least when doing offline compaction, just applying this fix makes it work very well, without any other changes or tuning.
To add some context, we used to do it purely based on size, but if key-values are really small, then that ended up causing a lot of LSM tree lookups to sample the log file. That's why we added the criteria for the number of keys, so we can quit early if we have enough key samples.
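The interaction between the two criteria can be sketched as follows. This is an illustration only, not the check in value.go: `sample`, `enoughBoth`, `enoughEither`, and the thresholds (1000 entries, 10 MB) are hypothetical. It shows how a log full of tiny key-values can satisfy an entry threshold while falling far short of a size threshold.

```go
package main

import "fmt"

// sample summarizes what was read from a value log file (hypothetical type).
type sample struct {
	entries int   // number of entries sampled
	bytes   int64 // total size sampled, in bytes
}

// enoughBoth accepts a sample only if it meets both thresholds,
// which is how the thread describes the current behavior.
func enoughBoth(s sample, minEntries int, minBytes int64) bool {
	return s.entries >= minEntries && s.bytes >= minBytes
}

// enoughEither accepts a sample that meets either threshold,
// the alternative raised in the discussion.
func enoughEither(s sample, minEntries int, minBytes int64) bool {
	return s.entries >= minEntries || s.bytes >= minBytes
}

func main() {
	const mb = 1 << 20

	// Many tiny values: plenty of entries, very little data.
	small := sample{entries: 10000, bytes: 1 * mb}

	fmt.Println("both:  ", enoughBoth(small, 1000, 10*mb))
	fmt.Println("either:", enoughEither(small, 1000, 10*mb))
}
```

Under these assumed thresholds the small-value sample is rejected when both criteria are required and accepted when either suffices, which is the tuning difficulty raised earlier in the thread.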
I'd prefer keeping the units in MB and the fields in reason as floats.
Reviewable status: 0 of 1 files reviewed, all discussions resolved