Searchable snapshot read FST performance issue. #88366

howardhuanghua · 2022-07-08T08:57:45Z

Elasticsearch Version

8.0

Installed Plugins

No response

Java Version

bundled

OS Version

CentOS

Problem Description

Searchable snapshot read tip file has performance issue, currently BaseSearchableSnapshotIndexInput extends from BufferedIndexInput.
When we trigger the follow terms query,

GET /index/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "terms": {
      "id": [ "7129519", "7191131" ]
    }
  }
}
'

FST read tip file would use ReverseRandomAccessReader which read from the tail.
I have printed the doReadInternal details:

[2022-07-07T17:05:24,351][INFO ][o.e.x.s.s.i.MetadataCachingIndexInput] [node] cache readInternal: read [_28_Lucene84_0.tip][4134192-4135216] ([1024] bytes) from [randomaccess[length=1847637, file pointer=102073, offset=4032119]] name[randomaccess]
[2022-07-07T17:05:24,351][INFO ][o.e.x.s.s.i.MetadataCachingIndexInput] [node] cache readInternal: read [_28_Lucene84_0.tip][4134191-4135215] ([1024] bytes) from [randomaccess[length=1847637, file pointer=102072, offset=4032119]] name[randomaccess]
[2022-07-07T17:05:24,351][INFO ][o.e.x.s.s.i.MetadataCachingIndexInput] [node] cache readInternal: read [_28_Lucene84_0.tip][4134190-4135214] ([1024] bytes) from [randomaccess[length=1847637, file pointer=102071, offset=4032119]] name[randomaccess]
[2022-07-07T17:05:24,351][INFO ][o.e.x.s.s.i.MetadataCachingIndexInput] [node] cache readInternal: read [_28_Lucene84_0.tip][4134189-4135213] ([1024] bytes) from [randomaccess[length=1847637, file pointer=102070, offset=4032119]] name[randomaccess]
[2022-07-07T17:05:24,351][INFO ][o.e.x.s.s.i.MetadataCachingIndexInput] [node] cache readInternal: read [_28_Lucene84_0.tip][4134188-4135212] ([1024] bytes) from [randomaccess[length=1847637, file pointer=102069, offset=4032119]] name[randomaccess]
[2022-07-07T17:05:24,351][INFO ][o.e.x.s.s.i.MetadataCachingIndexInput] [node] cache readInternal: read [_28_Lucene84_0.tip][4134187-4135211] ([1024] bytes) from [randomaccess[length=1847637, file pointer=102068, offset=4032119]] name[randomaccess]

We could see that, each time read 1024 bytes, and only single byte is useful, and 1023 are duplicated reading, 4134192-4135216, 4134191-4135215..., that's because we use ReverseRandomAccessReader read from the tail.

Could we use ByteBufferIndexInput instead of BufferedIndexInput for BaseSearchableSnapshotIndexInput ?

Steps to Reproduce

Create a searchable snapshot index.
Trigger terms query.

Logs (if relevant)

No response

The text was updated successfully, but these errors were encountered:

elasticmachine · 2022-07-08T11:20:50Z

Pinging @elastic/es-distributed (Team:Distributed)

howardhuanghua · 2022-07-12T11:04:36Z

I got a proposal, for searchable snapshot .tip file reading, enhance BufferedIndexInput to support reverse cache:

tlrx · 2022-07-26T12:25:09Z

Thanks @howardhuanghua for reporting.

I agree that ReverseRandomAccessReader works inefficiently when the wrapped RandomAccessInput is a BufferedIndexInput: the internal buffer of BufferedIndexInput is forced to be refilled from the frozen/cold cache for every backward byte read. In Lucene the RandomAccessInput is created using IndexInput#randomAccessSlice(long offset, long length) and there is no indication of the forward/backward nature of reads (it's random, after all :) so enhancing BufferedIndexInput to support reverse cache reads might not be straightforward.

A possible solution could be to detect successive backward (or forward) reads and make a bet that the next read will also be backward (or forward) and refill the internal buffer accordingly. I can give this a try but I'd like to wait for an on-going effort we have that should add frozen tier to our nightly benchmarks. Once this is done we should hopefully better see the benefits of implementing this change.

howardhuanghua · 2022-07-26T12:54:05Z

Thanks @tlrx , I mean we could still read forward, but just reverse fill 1k buffer.
For example, for tip file, if we read one byte from position 3000, than we read 1977-3000 into buffer, and retrive the last byte.
Here is the override readByte function demo in BaseCacheIndexInput.

+    @Override
+    public byte readByte(long pos) throws IOException {
+        long index = pos - bufferStart;
+        if (index < 0 || index >= buffer.limit()) {
+            if (fileInfo.physicalName().endsWith(TERMS_INDEX_EXTENSION)) {
+                // tip use ReverseRandomAccessReader, reverse fill buffer
+                long reverseStart = pos - bufferSize + 1;
+                bufferStart = reverseStart > 0 ? reverseStart : 0;
+                buffer.limit(0);  // trigger refill() on read
+                seekInternal(bufferStart);
+                refill();
+                index = reverseStart > 0 ? bufferSize - 1 : pos;
+            } else {
+                bufferStart = pos;
+                buffer.limit(0);  // trigger refill() on read
+                seekInternal(pos);
+                refill();
+                index = 0;
+            }
+        }
+        return buffer.get((int) index);
+    }

romseygeek · 2023-06-13T12:47:53Z

This should be fixed by apache/lucene#12357

nik9000 · 2023-06-20T15:46:30Z

That Lucene change should go live in Elasticsearch 8.9.0.

howardhuanghua added >bug needs:triage Requires assignment of a team area label labels Jul 8, 2022

DaveCTurner added :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs and removed needs:triage Requires assignment of a team area label labels Jul 8, 2022

elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jul 8, 2022

nik9000 closed this as completed Jun 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Searchable snapshot read FST performance issue. #88366

Searchable snapshot read FST performance issue. #88366

howardhuanghua commented Jul 8, 2022

elasticmachine commented Jul 8, 2022

howardhuanghua commented Jul 12, 2022

tlrx commented Jul 26, 2022

howardhuanghua commented Jul 26, 2022

romseygeek commented Jun 13, 2023

nik9000 commented Jun 20, 2023 •

edited

Loading

Searchable snapshot read FST performance issue. #88366

Searchable snapshot read FST performance issue. #88366

Comments

howardhuanghua commented Jul 8, 2022

Elasticsearch Version

Installed Plugins

Java Version

OS Version

Problem Description

Steps to Reproduce

Logs (if relevant)

elasticmachine commented Jul 8, 2022

howardhuanghua commented Jul 12, 2022

tlrx commented Jul 26, 2022

howardhuanghua commented Jul 26, 2022

romseygeek commented Jun 13, 2023

nik9000 commented Jun 20, 2023 • edited Loading

nik9000 commented Jun 20, 2023 •

edited

Loading