Completely remove cache #96

Merged
merged 66 commits into master from linearized_tree on Jul 23, 2019
Conversation

@dstuebe (Contributor) commented Nov 16, 2018

Changes:

  • Use large mapped buffers for reading in PageFiles
  • Use mapped buffers for reading in Blocked longs as well
  • Remove caffeine caches
  • Improve metadata lookup cache (Map -> Array)
  • Update config for JDK 11
  • Add stochastic first flush for lookupData
  • Simplify VirtualPageFile (remove extra metadata)
  • Improve logger stats for partitions
  • Add parallel performance tests

@codecov-io commented Nov 17, 2018

Codecov Report

Merging #96 into master will increase coverage by 1.13%.
The diff coverage is 85.93%.

@@             Coverage Diff              @@
##             master      #96      +/-   ##
============================================
+ Coverage     85.69%   86.83%   +1.13%     
+ Complexity      743      682      -61     
============================================
  Files            55       53       -2     
  Lines          2524     2294     -230     
  Branches        187      175      -12     
============================================
- Hits           2163     1992     -171     
+ Misses          259      215      -44     
+ Partials        102       87      -15
Impacted Files | Coverage Δ | Complexity Δ
src/main/java/com/upserve/uppend/CounterStore.java | 100% <ø> (ø) | 1 <0> (ø) ⬇️
...c/main/java/com/upserve/uppend/blobs/FilePage.java | 100% <ø> (+24%) | 5 <0> (+2) ⬆️
...pserve/uppend/metrics/CounterStoreWithMetrics.java | 100% <ø> (ø) | 15 <0> (-4) ⬇️
...serve/uppend/blobs/VirtualAppendOnlyBlobStore.java | 84.21% <ø> (+5.26%) | 7 <0> (+1) ⬆️
...n/java/com/upserve/uppend/CounterStoreBuilder.java | 66.66% <0%> (-18.34%) | 6 <3> (-2)
src/main/java/com/upserve/uppend/BlockStats.java | 88.88% <0%> (+16.66%) | 10 <1> (+3) ⬆️
...java/com/upserve/uppend/CounterStorePartition.java | 84.09% <100%> (-10.36%) | 14 <4> (-3)
...main/java/com/upserve/uppend/FileCounterStore.java | 89.28% <100%> (+14.28%) | 16 <3> (-7) ⬇️
...ava/com/upserve/uppend/AppendOnlyStoreBuilder.java | 100% <100%> (ø) | 10 <0> (-7) ⬇️
...main/java/com/upserve/uppend/blobs/MappedPage.java | 100% <100%> (ø) | 4 <2> (+1) ⬆️
... and 35 more

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0cdc4c9...d90e8b2.

// Consider sorting by blob pos or even grouping by the page of the blob pos and then flat-mapping the reads by page.
// Values stream can now be parallel, but it breaks everything...
LongStream longStream = blocks.values(lookups[hash].getValue(lookupKey));
return longStream.mapToObj(blobs[hash]::read);
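
A minimal sketch of the sorting idea from the TODO above (an editor illustration reusing the surrounding names, not code from this PR; note it trades the stream's laziness for a materialized array of positions):

// Sorting the blob positions makes the reads sweep each mapped page once, in order
long[] positions = blocks.values(lookups[hash].getValue(lookupKey)).toArray();
java.util.Arrays.sort(positions);
return java.util.Arrays.stream(positions).mapToObj(blobs[hash]::read);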
dstuebe (Author):
Applying parallel() here means the calling application cannot also be using parallel streams.

The previous version, using lazy concat, did not suffer this issue because calling parallel() on a stream is more of an aspirational statement than a demand.

The result is that our reads have never been parallel, and I am unsure how to resolve this. Previously, when we hit this issue using parallel during writes, we simply removed the offending code, having decided it was not necessary (it was validating the page file metadata on open; that check and the metadata are now gone).

I think we should look at this in the larger context of how to make the gastronomer list queries actually execute in parallel.
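
To illustrate the "aspirational" point above (a generic Java sketch, not Uppend code): parallel() and sequential() merely set a flag on the whole pipeline, and the last call before the terminal operation wins, so neither a library nor its caller can force the other's hand:

LongStream s = LongStream.range(0, 1_000_000).parallel(); // library requests parallel
long sum = s.sequential().sum(); // caller flips the flag back; the entire pipeline runs sequentially

The reverse also holds, and all parallel pipelines share the common ForkJoinPool, which is why nesting parallel streams tends to contend rather than compound.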

dstuebe (Author):
At present, some portions of the query are parallel, but many are not.

@dstuebe dstuebe requested a review from bfulton February 25, 2019 00:29
@jeffrey-a-meunier jeffrey-a-meunier self-requested a review May 28, 2019 19:20
StandardOpenOption[] openOptions;
- if (readOnly) {
+ if (readOnly){
Contributor:
space after paren?

@@ -213,7 +202,7 @@ public void append(final long pos, final long val) {
log.trace("appended value {} to {} at {}", val, file, pos);
}

- public LongStream values(Long pos) {
+ public LongStream values(Long pos){
Contributor:
space after paren?

} else {
int numValues = (int) size;
long[] values = new long[numValues];
for (int i = 0; i < numValues; i++) {
Contributor:
Use an offset variable here as well?

I have a feeling that this method could be rewritten to make it clearer, and the two loops could probably be combined into one. You could also move the if(size > valuesPerBlock) and if(size == 0) checks to the top of the ladder and make them separate if statements -- no else clauses would be needed; see the sketch below.

On the other hand, if you're going for speed, there's something to be said for putting the most likely case first in the if ladder. The value of doing that depends on how often the method gets called, though.
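
Something like this, perhaps (an editor sketch only: it reuses readLong, valuesPerBlock, and LongStreams.lazyConcat from this diff, assumes the size encoding shown here -- negative size links to the next block, non-negative size is a value count -- and guesses that size > valuesPerBlock indicates a corrupt block):

public LongStream values(Long pos) {
    final long size = readLong(pos);
    if (size == 0) return LongStream.empty();
    if (size > valuesPerBlock) throw new IllegalStateException("Invalid block size: " + size);
    if (size < 0) {
        // full block: read all slots, then lazily continue at the linked block
        long[] values = new long[valuesPerBlock];
        for (int i = 0; i < valuesPerBlock; i++) {
            values[i] = readLong(pos + 16 + i * 8);
        }
        return LongStreams.lazyConcat(Arrays.stream(values), () -> values(-size));
    }
    // tail block: size is the number of values actually present
    long[] values = new long[(int) size];
    for (int i = 0; i < (int) size; i++) {
        values[i] = readLong(pos + 16 + i * 8);
    }
    return Arrays.stream(values);
}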

dstuebe (Author):

You are right - this method is definitely in need of clarification.

I am going to leave the implementation alone for now. I have marked it as deprecated. The next task for Uppend is implementing this method using a spliterator that allows parallel reads.
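
One shape that could take, short of a full custom spliterator (an editor sketch; blockValues(long) is a hypothetical helper that streams a single block's values, and it assumes 0 is never a valid block position):

LongStream parallelValues(long pos) {
    // Walk the chain cheaply first -- one readLong per block -- collecting block
    // positions, then read the blocks themselves in parallel.
    List<Long> blockPositions = new ArrayList<>();
    while (pos != 0) {
        blockPositions.add(pos);
        long size = readLong(pos);
        pos = size < 0 ? -size : 0; // negative size encodes the next block's position
    }
    return blockPositions.parallelStream().flatMapToLong(this::blockValues);
}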


- long size = buf.getLong();
- buf.getLong();
+ final long size = readLong(pos);

if (size < 0) {
long nextPos = -size;
long[] values = new long[valuesPerBlock];
for (int i = 0; i < valuesPerBlock; i++) {
Contributor:
Use offset variable here?

dstuebe (Author):
Commented out the lazy method entirely.


if (size < 0) {
long nextPos = -size;
long[] values = new long[valuesPerBlock];
for (int i = 0; i < valuesPerBlock; i++) {
- values[i] = buf.getLong();
+ values[i] = readLong(pos + 16 + i * 8);
}
return LongStreams.lazyConcat(Arrays.stream(values), () -> values(nextPos));
} else if (size > valuesPerBlock) {
Contributor:
Reorganize with if(size > valuesPerBlock) and if(size == 0) at top?

dstuebe (Author):
Commented out the lazy method entirely.
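
For reference, the laziness that lazyConcat provides -- not touching the next block until the current one's values are consumed -- can be expressed with flatMap (an editor sketch of the general technique, not necessarily Uppend's LongStreams implementation):

import java.util.function.Supplier;
import java.util.stream.LongStream;
import java.util.stream.Stream;

// `next` is not invoked until the elements of `first` have been consumed
static LongStream lazyConcat(LongStream first, Supplier<LongStream> next) {
    return Stream.<Supplier<LongStream>>of(() -> first, next)
            .flatMapToLong(Supplier::get);
}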

@CommandLine.Option(names = {"-b", "--buffer-size"}, description = "Buffer Size (small|medium|large)")
BufferSize bufferSize = BufferSize.medium;

@CommandLine.Option(names = {"-s", "--size"}, description = "Benchmark size (nano|micro|small|medium|large|huge|gigantic)")
Contributor:
Reminds me of this:
https://en.wikipedia.org/wiki/Intel_Memory_Model#Memory_models
The combinations were for code and/or data to be less than/greater than 64KB.

package com.upserve.uppend.cli.benchmark;

public enum BufferSize {
small(16 * 1024 * 1024 + 16),
Contributor:
Can you add a comment on the need for + 16?

dstuebe (Author):
Page size used to include 16 bytes of metadata.
To allow page size == buffer size, I added 16 here.
Removed.

@@ -156,21 +203,36 @@ public Long findKey(VirtualLongBlobStore longBlobStore, LookupKey key) {
if (comparison < 0) {
Contributor:
Since this is the positive comparison, I'd put this one first:

if(comparison == 0) {
  key.setPosition(keyPosition);
  hitCount.increment();
  return longBlobStore.readLong(keyPosition);
}

The remaining if/else can be a bit simpler:

if (comparison < 0) {
  upperKey = midpointKey;
  keyIndexUpper = midpointKeyIndex;
  bisectKeyTreeArrayIndex = bisectKeyTreeArrayIndex * 2;
} else {
  lowerKey = midpointKey;
  keyIndexLower = midpointKeyIndex;
  bisectKeyTreeArrayIndex = bisectKeyTreeArrayIndex * 2 + 1;
}

return 0;
}
return -1;
}
@jeffrey-a-meunier (Contributor) commented Jul 8, 2019:

Can shorten the previous if statement:

if(o1 == null)
  return o2 == null ? 0 : -1;

SafeDeleting.removeTempPath(path);

appendOnlyStore = new AppendOnlyStoreBuilder()
.withPartitionSize(numPartitions)
Contributor:
This method name confuses me. Is it the size of the partition or the number of partitions? Should the method be renamed withPartitionCount?

dstuebe (Author):
Ugh - you're right - let's hope the IDE does most of the work on this one...

SafeDeleting.removeTempPath(path);

appendOnlyStore = new AppendOnlyStoreBuilder()
.withPartitionSize(numPartitions)
Contributor:
withPartitionCount?

@dstuebe dstuebe merged commit a03663e into master Jul 23, 2019
@dstuebe dstuebe deleted the linearized_tree branch July 23, 2019 17:02