
[BUG] Evaluate the performance of hybridfs against mmapfs #8298

Open
jainankitk opened this issue Jun 27, 2023 · 8 comments
Labels
bug Something isn't working distributed framework

Comments

@jainankitk
Collaborator

Describe the bug
Elasticsearch ran some benchmarks 4-5 years back before defaulting to hybridfs. Given that a lot has changed since then, it makes sense to rerun the benchmarks to compare the performance of the different store types.

Expected behavior
Run the performance benchmarks for the different store types and change the default to the best one. The store type can be changed using the index.store.type setting.

@jainankitk jainankitk added bug Something isn't working untriaged labels Jun 27, 2023
@bbarani
Member

bbarani commented Jun 27, 2023

@rishabh6788 Can you see if you can help here?

@rishabh6788
Contributor

@jainankitk Does this require a change in the underlying hardware, i.e. the attached EBS volume type, or is this something that OpenSearch handles logically?

@jainankitk
Collaborator Author

jainankitk commented Jun 27, 2023

@rishabh6788 - OpenSearch (more precisely, Lucene) takes care of the abstraction on top of the EBS volume or any other storage type. We just need to specify the different fs types using the index.store.type setting.
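
To make the abstraction concrete, here is a minimal sketch (not OpenSearch code; the class name and arguments are made up for illustration): Lucene reads through the same Directory/IndexInput API whether the bytes come from a memory-mapped region or NIO channel reads, so the block device underneath (EBS or anything else) is invisible at this layer.

```java
import java.io.IOException;
import java.nio.file.Path;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

public class DirectoryAbstractionSketch {

    // The read path is identical no matter which Directory implementation is plugged in.
    static long sumFirstBytes(Directory dir, String file) throws IOException {
        try (IndexInput in = dir.openInput(file, IOContext.DEFAULT)) {
            long sum = 0;
            long toRead = Math.min(in.length(), 4096);
            for (long i = 0; i < toRead; i++) {
                sum += in.readByte();
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path indexPath = Path.of(args[0]);   // directory holding Lucene segment files
        String fileName = args[1];           // e.g. some .tim or .doc file
        // Swapping the implementation (which is effectively what index.store.type does)
        // does not change any of the reading code above.
        System.out.println(sumFirstBytes(new MMapDirectory(indexPath), fileName));
        System.out.println(sumFirstBytes(new NIOFSDirectory(indexPath), fileName));
    }
}
```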

@rishabh6788
Contributor

As per my understanding, you want to run benchmarks with different index.store.type settings. Could you please share the types you want to benchmark with?

@jainankitk
Collaborator Author

We can try both workloads (nyc_taxis / http_logs) with a few different memory settings, such as:

  • 2 GB memory
  • 4 GB memory
  • 8 GB memory
  • 16 GB memory
  • 32 GB memory

Options for the index.store.type setting in opensearch.yml (a short sketch of these values follows the list):

  • mmapfs
  • hybridfs
  • niofs
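
For reference, a minimal sketch of the three candidate values, assuming OpenSearch's org.opensearch.common.settings.Settings builder (the class and loop below are purely illustrative); it is the same index.store.type key that opensearch.yml accepts as a node-wide default or that can be set per index at creation time.

```java
import org.opensearch.common.settings.Settings;

public class StoreTypeCandidates {
    public static void main(String[] args) {
        // The same key can be set per index at creation time or node-wide in opensearch.yml.
        String[] storeTypes = {"mmapfs", "hybridfs", "niofs"};
        for (String type : storeTypes) {
            Settings settings = Settings.builder()
                    .put("index.store.type", type)
                    .build();
            System.out.println("candidate store type: " + settings.get("index.store.type"));
        }
    }
}
```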

@rishabh6788
Contributor

rishabh6788 commented Jun 27, 2023

As of now, the benchmark platform supports r5.xlarge/r6g.xlarge instance types with the heap set to 50% of instance memory. We have a backlog item to add more instance types for benchmark runs and are working on it. For now, I have initiated performance runs for the following configuration:

Single-node cluster on r5.xlarge, 16 GB heap, for the nyc_taxis and http_logs workloads.

NYC_TAXIS:
mmapfs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/813/pipeline
hybridfs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/814/pipeline
niofs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/815/pipeline

HTTP_LOGS:
mmapfs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/816/pipeline
hybridfs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/817/pipeline
niofs: https://build.ci.opensearch.org/blue/organizations/jenkins/benchmark-test/detail/benchmark-test/818/pipeline

All the performance metrics get ingested into a separate datastore cluster, which we use to generate visualizations and dashboards.
We will be running these benchmarks once daily for the next couple of days to be able to visualize the data.

If you want to take a look at the final benchmark results generated for the above-mentioned runs, you can check the console logs by clicking on the ./test.sh benchmark-test --bundle-manifest bundle-manifest.yml dropdown after the run has completed successfully.
Feel free to reach out to me if you need help with viewing the results.

@jainankitk
Collaborator Author

A quick comparison of the nyc_taxis numbers shows that the performance of mmapfs and hybridfs is almost identical and, as expected, much better than niofs.

That being said, I doubt that the nyc_taxis / http_logs workloads are hitting Lucene segment files outside of ("nvd", "dvd", "tim", "tip", "dim", "kdd", "kdi", "cfs", "doc"), which are mmapped even on hybridfs. Hence, we need to identify a workload that hits segment files outside of this list, such as positions (.pos), payloads (.pay), term vectors (.tvd), or stored fields (.fdt).
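
To make the mmap/nio split concrete, here is a rough, hybridfs-like sketch built on Lucene's stock FileSwitchDirectory: files whose extension is in the listed set are served by MMapDirectory, everything else (.pos, .pay, .tvd, .fdt, ...) goes through NIOFSDirectory. OpenSearch's actual hybrid implementation lives in its store layer and its extension set may differ; this is illustration only.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Set;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FileSwitchDirectory;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;

public class HybridLikeDirectorySketch {

    // Extension set copied from the comment above; may drift from the real default.
    static final Set<String> MMAP_EXTENSIONS =
            Set.of("nvd", "dvd", "tim", "tip", "dim", "kdd", "kdi", "cfs", "doc");

    static Directory open(Path indexPath) throws IOException {
        Directory mmap = new MMapDirectory(indexPath);
        Directory nio = new NIOFSDirectory(indexPath);
        // Files whose extension is in MMAP_EXTENSIONS go to the "primary" (mmap) directory,
        // everything else is read through NIO.
        return new FileSwitchDirectory(MMAP_EXTENSIONS, mmap, nio, true);
    }
}
```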

@mikemccand – In case we don't have such a workload in OpenSearch, I am wondering if we can leverage a Lucene microbenchmark to compare the mmap / nio performance for these segment files? Also, I came across this GitHub issue that advocates the MADV_RANDOM flag for randomly accessed Lucene segment files to prevent page cache thrashing. Any thoughts on that?
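
On the microbenchmark idea, something along these lines could be a starting point: a small JMH sketch (not an existing luceneutil benchmark) that does random reads against the same file through MMapDirectory and NIOFSDirectory. The file name, sizes, and iteration counts below are arbitrary.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.IOContext;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.store.MMapDirectory;
import org.apache.lucene.store.NIOFSDirectory;
import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class DirectoryRandomReadBench {

    @Param({"mmap", "nio"})
    String impl;

    Directory dir;
    long fileLen;
    Random random;

    @Setup
    public void setup() throws IOException {
        Path path = Files.createTempDirectory("dir-bench");
        dir = impl.equals("mmap") ? new MMapDirectory(path) : new NIOFSDirectory(path);
        // Write one ~256 MB file to read back randomly; stands in for a .pos/.fdt style file.
        try (IndexOutput out = dir.createOutput("data.bin", IOContext.DEFAULT)) {
            byte[] block = new byte[1 << 20];
            new Random(42).nextBytes(block);
            for (int i = 0; i < 256; i++) {
                out.writeBytes(block, block.length);
            }
        }
        fileLen = dir.fileLength("data.bin");
        random = new Random(42);
    }

    @Benchmark
    public long randomReads() throws IOException {
        long sum = 0;
        try (IndexInput in = dir.openInput("data.bin", IOContext.DEFAULT)) {
            byte[] buf = new byte[4096];
            for (int i = 0; i < 1024; i++) {
                long offset = Math.floorMod(random.nextLong(), fileLen - buf.length);
                in.seek(offset);
                in.readBytes(buf, 0, buf.length);
                sum += buf[0];
            }
        }
        return sum;
    }
}
```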

@uschindler
Contributor

See apache/lucene#13196, which is going into Lucene 9.11. It is used with Java 21 or later.

With the recent Java 21/22 changes around Project Panama, this is no longer needed, as you can pass a MemorySegment (used by the new version of MMapDirectory) directly to native code using a MethodHandle. See the above PR.

The only remaining problem is fadvise(), to reduce the impact on merging or when using NIOFSDir. fadvise needs a file descriptor, so native support in the JDK is a requirement. There is already a discussion going on in the OpenJDK bug tracker: https://bugs.openjdk.org/browse/JDK-8292771
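
To make the Panama route concrete, here is a minimal sketch of the approach described above: map a file as a MemorySegment (as a MemorySegment-based MMapDirectory does internally) and pass it straight to posix_madvise(2) through a downcall MethodHandle. This assumes Java 21 with preview features enabled (or Java 22+), a 64-bit Linux libc, and that POSIX_MADV_RANDOM is 1 on the target platform; error handling is minimal.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MadviseSketch {

    private static final int POSIX_MADV_RANDOM = 1; // platform-specific constant (Linux)

    private static final MethodHandle POSIX_MADVISE;
    static {
        Linker linker = Linker.nativeLinker();
        POSIX_MADVISE = linker.downcallHandle(
                linker.defaultLookup().find("posix_madvise").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.JAVA_INT,     // int return
                        ValueLayout.ADDRESS,                    // void *addr
                        ValueLayout.JAVA_LONG,                  // size_t length (LP64)
                        ValueLayout.JAVA_INT));                 // int advice
    }

    public static void main(String[] args) throws Throwable {
        Path file = Path.of(args[0]);
        try (Arena arena = Arena.ofConfined();
             FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map the file as a MemorySegment and hand it directly to native code.
            MemorySegment mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size(), arena);
            int rc = (int) POSIX_MADVISE.invokeExact(mapped, mapped.byteSize(), POSIX_MADV_RANDOM);
            System.out.println("posix_madvise returned " + rc); // 0 on success
        }
    }
}
```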
