
[BUG] Evaluate the performance of hybridfs against mmapfs #8298

Open
jainankitk opened this issue Jun 27, 2023 · 8 comments
Labels
bug Something isn't working distributed framework

Comments

@jainankitk
Collaborator

Describe the bug
Elasticsearch ran some benchmarks 4-5 years back before defaulting to hybridfs. Given that a lot has changed since then, it makes sense to rerun the benchmarks to compare the performance of the different store types.

Expected behavior
Run the performance benchmarks for the different store types and change the default to the best one. The store type can be changed using the index.store.type setting.

@jainankitk jainankitk added bug Something isn't working untriaged labels Jun 27, 2023
@bbarani
Member

bbarani commented Jun 27, 2023

@rishabh6788 Can you see if you can help here?

@rishabh6788
Contributor

@jainankitk Does this require a change in the underlying hardware, i.e. the attached EBS volume type, or is this something that OpenSearch handles logically?

@jainankitk
Collaborator Author

jainankitk commented Jun 27, 2023

@rishabh6788 - OpenSearch (more precisely, Lucene) takes care of the abstraction on top of the EBS volume or any other storage type. We just need to specify the different fs types using the index.store.type setting.
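
To make the abstraction concrete, here is a minimal sketch (not OpenSearch code; the class name and arguments are made up for illustration): Lucene reads through the same Directory/IndexInput API whether the bytes come from a memory-mapped region or NIO channel reads, so the block device underneath (EBS or anything else) is invisible at this layer.

```java
import java.io.IOException;
import java.nio.file.Path;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

public class DirectoryAbstractionSketch {

    // The read path is identical no matter which Directory implementation is plugged in.
    static long sumFirstBytes(Directory dir, String file) throws IOException {
        try (IndexInput in = dir.openInput(file, IOContext.DEFAULT)) {
            long sum = 0;
            long toRead = Math.min(in.length(), 4096);
            for (long i = 0; i < toRead; i++) {
                sum += in.readByte();
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path indexPath = Path.of(args[0]);   // directory holding Lucene segment files
        String fileName = args[1];           // e.g. some .tim or .doc file
        // Swapping the implementation (which is effectively what index.store.type does)
        // does not change any of the reading code above.
        System.out.println(sumFirstBytes(new MMapDirectory(indexPath), fileName));
        System.out.println(sumFirstBytes(new NIOFSDirectory(indexPath), fileName));
    }
}
```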

@rishabh6788
Contributor

As per my understanding, you want to run benchmarks with different index.store.type settings. Could you please share the types you want to benchmark with?

@jainankitk
Collaborator Author

We can try both workloads (nyc_taxis / http_logs) with a few different memory settings, such as:

  • 2 GB memory
  • 4 GB memory
  • 8 GB memory
  • 16 GB memory
  • 32 GB memory

Options for the index.store.type setting in opensearch.yml (a short sketch of these values follows the list):

  • mmapfs
  • hybridfs
  • niofs
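
For reference, a minimal sketch of the three candidate values, assuming OpenSearch's org.opensearch.common.settings.Settings builder (the class and loop below are purely illustrative); it is the same index.store.type key that opensearch.yml accepts as a node-wide default or that can be set per index at creation time.

```java
import org.opensearch.common.settings.Settings;

public class StoreTypeCandidates {
    public static void main(String[] args) {
        // The same key can be set per index at creation time or node-wide in opensearch.yml.
        String[] storeTypes = {"mmapfs", "hybridfs", "niofs"};
        for (String type : storeTypes) {
            Settings settings = Settings.builder()
                    .put("index.store.type", type)
                    .build();
            System.out.println("candidate store type: " + settings.get("index.store.type"));
        }
    }
}
```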

@rishabh6788
Contributor

rishabh6788 commented Jun 27, 2023

As of now, the benchmark platform supports r5.xlarge/r6g.xlarge instance types with the heap set to 50% of instance memory. We have a backlog item to add more instance types for benchmark runs and are working on it. For now, I have initiated performance runs for the following configuration:

Single-node cluster on r5.xlarge, 16 GB heap, for the nyc_taxis and http_logs workloads.

NYC_TAXIS:
mmapfs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/813/pipeline
hybridfs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/814/pipeline
niofs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/815/pipeline

HTTP_LOGS:
mmapfs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/816/pipeline
hybridfs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/817/pipeline
niofs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/818/pipeline

All the performance metrics get ingested into a separate datastore cluster, which we use to generate visualizations and dashboards.
We will be running these benchmarks once daily for the next couple of days to be able to visualize the data.

If you want to take a look at the final benchmark results generated for the above-mentioned runs, you can check the console logs by clicking on the ./test.sh benchmark-test --bundle-manifest bundle-manifest.yml dropdown after the run has completed successfully.
Feel free to reach out to me if you need help with viewing the results.

@jainankitk
Collaborator Author

A quick comparison of the nyc_taxis numbers shows that the performance of mmapfs and hybridfs is almost identical and, as expected, much better than niofs.

That being said, I doubt that the nyc_taxis / http_logs workloads are hitting Lucene segment files outside of ("nvd", "dvd", "tim", "tip", "dim", "kdd", "kdi", "cfs", "doc"), which are mmapped even on hybridfs. Hence, we need to identify a workload that hits segment files outside of this list, such as positions (.pos), payloads (.pay), term vectors (.tvd), or stored fields (.fdt).
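
To make the mmap/nio split concrete, here is a rough, hybridfs-like sketch built on Lucene's stock FileSwitchDirectory: files whose extension is in the listed set are served by MMapDirectory, everything else (.pos, .pay, .tvd, .fdt, ...) goes through NIOFSDirectory. OpenSearch's actual hybrid implementation lives in its store layer and its extension set may differ; this is illustration only.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Set;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FileSwitchDirectory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

public class HybridLikeDirectorySketch {

    // Extension set copied from the comment above; may drift from the real default.
    static final Set<String> MMAP_EXTENSIONS =
            Set.of("nvd", "dvd", "tim", "tip", "dim", "kdd", "kdi", "cfs", "doc");

    static Directory open(Path indexPath) throws IOException {
        Directory mmap = new MMapDirectory(indexPath);
        Directory nio = new NIOFSDirectory(indexPath);
        // Files whose extension is in MMAP_EXTENSIONS go to the "primary" (mmap) directory,
        // everything else is read through NIO.
        return new FileSwitchDirectory(MMAP_EXTENSIONS, mmap, nio, true);
    }
}
```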

@mikemccand – In case we don't have such a workload in OpenSearch, I am wondering if we can leverage a Lucene microbenchmark to compare the mmap / nio performance for these segment files? Also, I came across this GitHub issue that advocates the MADV_RANDOM flag for randomly accessed Lucene segment files to prevent page cache thrashing. Any thoughts on that?
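
On the microbenchmark idea, something along these lines could be a starting point: a small JMH sketch (not an existing luceneutil benchmark) that does random reads against the same file through MMapDirectory and NIOFSDirectory. The file name, sizes, and iteration counts below are arbitrary.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;
import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class DirectoryRandomReadBench {

    @Param({"mmap", "nio"})
    String impl;

    Directory dir;
    long fileLen;
    Random random;

    @Setup
    public void setup() throws IOException {
        Path path = Files.createTempDirectory("dir-bench");
        dir = impl.equals("mmap") ? new MMapDirectory(path) : new NIOFSDirectory(path);
        // Write one ~256 MB file to read back randomly; stands in for a .pos/.fdt style file.
        try (IndexOutput out = dir.createOutput("data.bin", IOContext.DEFAULT)) {
            byte[] block = new byte[1 << 20];
            new Random(42).nextBytes(block);
            for (int i = 0; i < 256; i++) {
                out.writeBytes(block, block.length);
            }
        }
        fileLen = dir.fileLength("data.bin");
        random = new Random(42);
    }

    @Benchmark
    public long randomReads() throws IOException {
        long sum = 0;
        try (IndexInput in = dir.openInput("data.bin", IOContext.DEFAULT)) {
            byte[] buf = new byte[4096];
            for (int i = 0; i < 1024; i++) {
                long offset = Math.floorMod(random.nextLong(), fileLen - buf.length);
                in.seek(offset);
                in.readBytes(buf, 0, buf.length);
                sum += buf[0];
            }
        }
        return sum;
    }
}
```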

@uschindler
Contributor

See apache/lucene#13196, which is going into Lucene 9.11. It is used with Java 21 or later.

With the recent Java 21/22 changes around Project Panama, this is no longer needed, as you can pass a MemorySegment (used by the new version of MMapDirectory) directly to native code using a MethodHandle. See the above PR.

The only remaining problem is fadvise(), to reduce the impact on merging or when using NIOFSDir. fadvise needs a file descriptor, so native support in the JDK is a requirement. There is already a discussion going on in the OpenJDK bug tracker: https://bugs.openjdk.org/browse/JDK-8292771
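
To make the Panama route concrete, here is a minimal sketch of the approach described above: map a file as a MemorySegment (as a MemorySegment-based MMapDirectory does internally) and pass it straight to posix_madvise(2) through a downcall MethodHandle. This assumes Java 21 with preview features enabled (or Java 22+), a 64-bit Linux libc, and that POSIX_MADV_RANDOM is 1 on the target platform; error handling is minimal.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MadviseSketch {

    private static final int POSIX_MADV_RANDOM = 1; // platform-specific constant (Linux)

    private static final MethodHandle POSIX_MADVISE;
    static {
        Linker linker = Linker.nativeLinker();
        POSIX_MADVISE = linker.downcallHandle(
                linker.defaultLookup().find("posix_madvise").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.JAVA_INT,     // int return
                        ValueLayout.ADDRESS,                    // void *addr
                        ValueLayout.JAVA_LONG,                  // size_t length (LP64)
                        ValueLayout.JAVA_INT));                 // int advice
    }

    public static void main(String[] args) throws Throwable {
        Path file = Path.of(args[0]);
        try (Arena arena = Arena.ofConfined();
             FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map the file as a MemorySegment and hand it directly to native code.
            MemorySegment mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size(), arena);
            int rc = (int) POSIX_MADVISE.invokeExact(mapped, mapped.byteSize(), POSIX_MADV_RANDOM);
            System.out.println("posix_madvise returned " + rc); // 0 on success
        }
    }
}
```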
