Split the remote global metadata file to metadata attribute files #12190

Merged (29 commits) on May 16, 2024

@shiv0408 shiv0408 commented Feb 6, 2024

Description

The global metadata of a cluster state is now uploaded as a separate file for each metadata attribute: coordination metadata, settings, templates, and each of the custom metadata attributes. The remote global state directory will look like this:

base folder/
    |
    |--> index/
    |     | --> index_UUID/
    |              | --> metadata__<inverted_index_metadata_version>__<inverted_codec_version>__<timestamp>.dat
    |              | --> metadata__<inverted_index_metadata_version>__<inverted_codec_version>__<timestamp>.dat  
    |
    |--> global-metadata/
    |       | --> coordination__<inverted_metadata_version>__<inverted_codec_version>__<timestamp>.dat
    |       | --> settings__<inverted_metadata_version>__<inverted_codec_version>__<timestamp>.dat
    |       | --> templates__<inverted_metadata_version>__<inverted_codec_version>__<timestamp>.dat
    |       | --> custom__<type>__<inverted_metadata_version>__<inverted_codec_version>__<timestamp>.dat
    |
    |
    |--> manifest/
    |       | --> manifest__<inverted_term>__<inverted_version>__<inverted_codec_version>__<timestamp>
    |       | --> manifest__<inverted_term>__<inverted_version>__<inverted_codec_version>__<timestamp>
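
The naming scheme in the layout above can be sketched as follows. This is a minimal illustration, not the code from this PR: it assumes that "inverted" means subtracting the value from the largest signed 64-bit integer and zero-padding the result to 19 digits, so that a lexicographic listing returns the newest version first. The helper names `invert_long` and `attribute_file_name` are hypothetical.

```python
LONG_MAX = 2**63 - 1  # largest signed 64-bit value (assumed base for the inversion)
DELIMITER = "__"


def invert_long(value: int) -> str:
    """Return LONG_MAX - value, zero-padded to 19 digits (assumed scheme)."""
    if not 0 <= value <= LONG_MAX:
        raise ValueError("value must be a non-negative signed 64-bit integer")
    return f"{LONG_MAX - value:019d}"


def attribute_file_name(attribute, metadata_version, codec_version,
                        timestamp_ms, custom_type=None):
    """Build a name such as settings__<inv_version>__<inv_codec>__<timestamp>.dat."""
    parts = [attribute]
    if custom_type is not None:
        # Custom attributes carry their type right after the "custom" prefix.
        parts.append(custom_type)
    parts += [
        invert_long(metadata_version),
        invert_long(codec_version),
        str(timestamp_ms),  # the timestamp is not inverted in the layout above
    ]
    return DELIMITER.join(parts) + ".dat"
```

Under this assumption, a file written for metadata version 2 sorts before the file for version 1, so a prefix listing in ascending order yields the latest file first.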

Splitting the global metadata into multiple files has improved the incremental metadata upload time to S3 by 50-70%, and the full metadata upload time by up to 5%, because the global metadata attribute files and the index metadata files are uploaded in parallel.
These numbers come from a microbenchmark run on main (shiv0408/OpenSearch@fe5fad8) and on top of the PR branch (shiv0408/OpenSearch@88ab1ac).
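
The idea behind the incremental case can be sketched like this. It is a simplified model under stated assumptions, not the implementation in `RemoteClusterStateService`: metadata attributes are modeled as plain dictionary entries, `upload` is a hypothetical callback standing in for the remote store write, and attributes that were removed are not handled.

```python
from concurrent.futures import ThreadPoolExecutor


def changed_attributes(previous: dict, current: dict) -> list:
    """Names of attributes whose content differs from the previous state."""
    return sorted(
        name for name, value in current.items() if previous.get(name) != value
    )


def upload_incremental(previous: dict, current: dict, upload,
                       max_workers: int = 4) -> list:
    """Upload only the changed attributes, one file each, in parallel.

    `upload(name, value)` is a hypothetical callback; a real one would
    serialize the attribute and write it to the remote store.
    """
    names = changed_attributes(previous, current)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for every upload and re-raises the first failure.
        list(pool.map(lambda name: upload(name, current[name]), names))
    return names
```

With a single combined global metadata file, a change to one attribute rewrites everything; in this model a change to the coordination metadata alone results in exactly one small upload.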

The benchmark results are as follows:

Benchmark on main

Benchmark                                                                            (indicesAliasesTemplates)  Mode  Cnt     Score    Error  Units
RemoteClusterStateBenchmark.measureFullMetadataUpload                                1000|     100|       100|  avgt   30    60.832 ±  0.642  ms/op
RemoteClusterStateBenchmark.measureFullMetadataUpload                               10000|    1000|      1000|  avgt   30   615.146 ±  1.765  ms/op
RemoteClusterStateBenchmark.measureFullMetadataUpload                               20000|    2000|      2000|  avgt   30  1227.299 ±  4.178  ms/op
RemoteClusterStateBenchmark.measureFullMetadataUpload                               50000|    5000|      5000|  avgt   30  3031.392 ± 19.117  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Coordination        1000|     100|       100|  avgt   30     2.440 ±  0.014  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Coordination       10000|    1000|      1000|  avgt   30    25.849 ±  0.105  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Coordination       20000|    2000|      2000|  avgt   30    52.243 ±  0.476  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Coordination       50000|    5000|      5000|  avgt   30   139.867 ±  1.062  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_IndexMetadata       1000|     100|       100|  avgt   30    32.722 ±  0.541  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_IndexMetadata      10000|    1000|      1000|  avgt   30   311.668 ±  2.694  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_IndexMetadata      20000|    2000|      2000|  avgt   30   622.160 ±  3.091  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_IndexMetadata      50000|    5000|      5000|  avgt   30  1578.523 ±  2.661  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Settings            1000|     100|       100|  avgt   30     2.470 ±  0.007  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Settings           10000|    1000|      1000|  avgt   30    26.391 ±  0.250  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Settings           20000|    2000|      2000|  avgt   30    53.320 ±  0.876  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Settings           50000|    5000|      5000|  avgt   30   144.819 ±  1.237  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Templates           1000|     100|       100|  avgt   30     2.814 ±  0.024  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Templates          10000|    1000|      1000|  avgt   30    29.080 ±  0.160  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Templates          20000|    2000|      2000|  avgt   30    60.032 ±  0.397  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Templates          50000|    5000|      5000|  avgt   30   155.970 ±  2.390  ms/op

Benchmark after splitting the global metadata

Benchmark                                                                            (indicesAliasesTemplates)  Mode  Cnt     Score    Error  Units
RemoteClusterStateBenchmark.measureFullMetadataUpload                                1000|     100|       100|  avgt   30    59.594 ±  0.323  ms/op
RemoteClusterStateBenchmark.measureFullMetadataUpload                               10000|    1000|      1000|  avgt   30   599.334 ±  2.941  ms/op
RemoteClusterStateBenchmark.measureFullMetadataUpload                               20000|    2000|      2000|  avgt   30  1198.450 ±  5.466  ms/op
RemoteClusterStateBenchmark.measureFullMetadataUpload                               50000|    5000|      5000|  avgt   30  2990.730 ± 15.318  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Coordination        1000|     100|       100|  avgt   30     0.800 ±  0.019  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Coordination       10000|    1000|      1000|  avgt   30     8.483 ±  0.059  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Coordination       20000|    2000|      2000|  avgt   30    17.231 ±  0.271  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Coordination       50000|    5000|      5000|  avgt   30    65.734 ±  1.375  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_IndexMetadata       1000|     100|       100|  avgt   30    31.890 ±  0.295  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_IndexMetadata      10000|    1000|      1000|  avgt   30   304.154 ±  0.994  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_IndexMetadata      20000|    2000|      2000|  avgt   30   606.649 ±  1.042  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_IndexMetadata      50000|    5000|      5000|  avgt   30  1530.920 ± 14.235  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Settings            1000|     100|       100|  avgt   30     0.832 ±  0.008  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Settings           10000|    1000|      1000|  avgt   30     8.253 ±  0.226  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Settings           20000|    2000|      2000|  avgt   30    20.208 ±  0.280  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Settings           50000|    5000|      5000|  avgt   30    65.269 ±  0.439  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Templates           1000|     100|       100|  avgt   30     1.166 ±  0.005  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Templates          10000|    1000|      1000|  avgt   30    12.657 ±  0.245  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Templates          20000|    2000|      2000|  avgt   30    26.883 ±  0.419  ms/op
RemoteClusterStateBenchmark.measureIncrementalClusterStateUpdate_Templates          50000|    5000|      5000|  avgt   30    88.283 ±  0.754  ms/op
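
To compare the two tables row by row, the relative reduction can be recomputed from the reported scores. The short script below is only a convenience sketch: the scores are copied from the tables above, the benchmark names are shortened for readability, and the JMH error margins are ignored.

```python
# Scores in ms/op, ordered by parameter set:
# 1000|100|100, 10000|1000|1000, 20000|2000|2000, 50000|5000|5000.
MAIN = {
    "FullMetadataUpload": [60.832, 615.146, 1227.299, 3031.392],
    "Coordination": [2.440, 25.849, 52.243, 139.867],
    "IndexMetadata": [32.722, 311.668, 622.160, 1578.523],
    "Settings": [2.470, 26.391, 53.320, 144.819],
    "Templates": [2.814, 29.080, 60.032, 155.970],
}
SPLIT = {
    "FullMetadataUpload": [59.594, 599.334, 1198.450, 2990.730],
    "Coordination": [0.800, 8.483, 17.231, 65.734],
    "IndexMetadata": [31.890, 304.154, 606.649, 1530.920],
    "Settings": [0.832, 8.253, 20.208, 65.269],
    "Templates": [1.166, 12.657, 26.883, 88.283],
}


def reduction_percent(before: float, after: float) -> float:
    """Relative reduction in time per operation, in percent."""
    return (before - after) / before * 100.0


def reductions(name: str) -> list:
    """Per-parameter-set reduction for one benchmark, main vs. split."""
    return [reduction_percent(b, a) for b, a in zip(MAIN[name], SPLIT[name])]


for name in MAIN:
    print(name, " ".join(f"{r:.1f}%" for r in reductions(name)))
```

Each printed row lists the reduction for the four parameter sets in the same order as the tables.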

Related Issues

Resolves #12468
Resolves #10645

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…bute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>

github-actions bot commented Feb 6, 2024

Compatibility status:

Checks if related components are compatible with change 8244c6d

Incompatible components

Skipped components

Compatible components

github-actions bot commented Feb 6, 2024

❌ Gradle check result for 6bf7bc9: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for d0875f9: FAILURE

❌ Gradle check result for f6a2431: FAILURE

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
@shiv0408 shiv0408 force-pushed the cluster_state_split branch from f3e853b to 5d6a0ad Compare February 20, 2024 09:33

❌ Gradle check result for f3e853b: FAILURE

❌ Gradle check result for 5d6a0ad: FAILURE

@soosinha:

Looks good at a high level. Can you move it out of draft?

Signed-off-by: Shivansh Arora <hishiv@amazon.com>

❌ Gradle check result for 279dbbe: FAILURE

@shiv0408 shiv0408 changed the title Split the cluster state remote global metadata file to metadata attri… Split the remote global metadata file to metadata attribute files Feb 26, 2024
@shiv0408 shiv0408 marked this pull request as ready for review February 26, 2024 08:32

❌ Gradle check result for 52d5f81: FAILURE

@shiv0408 shiv0408 force-pushed the cluster_state_split branch from 52d5f81 to 4f8a64e Compare May 16, 2024 01:30

❌ Gradle check result for 4f8a64e: FAILURE

✅ Gradle check result for 4f8a64e: SUCCESS

@shwetathareja shwetathareja left a comment

Approving this change. Let's refactor RemoteClusterStateService.java before making any further changes to this class.

@shwetathareja shwetathareja merged commit da3ab92 into opensearch-project:main May 16, 2024
28 checks passed
@shwetathareja shwetathareja added the backport 2.x Backport to 2.x branch label May 16, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request May 16, 2024
…2190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
(cherry picked from commit da3ab92)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
shiv0408 added a commit to shiv0408/OpenSearch that referenced this pull request May 16, 2024
…ensearch-project#12190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
(cherry picked from commit da3ab92)
shiv0408 added a commit to shiv0408/OpenSearch that referenced this pull request May 17, 2024
…ensearch-project#12190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
(cherry picked from commit da3ab92)
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request May 17, 2024
…ensearch-project#12190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
andrross pushed a commit to shiv0408/OpenSearch that referenced this pull request Jun 4, 2024
…ensearch-project#12190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
(cherry picked from commit da3ab92)
shiv0408 added a commit to shiv0408/OpenSearch that referenced this pull request Jun 5, 2024
…ensearch-project#12190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <31575408+shiv0408@users.noreply.github.com>
(cherry picked from commit da3ab92)
shiv0408 added a commit to shiv0408/OpenSearch that referenced this pull request Jun 5, 2024
…ensearch-project#12190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
(cherry picked from commit da3ab92)
shwetathareja pushed a commit that referenced this pull request Jun 5, 2024
…ibute files (#13703)

* Split the remote global metadata file to metadata attribute files (#12190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
(cherry picked from commit da3ab92)

* Remove conflicting static method from Metadata.Custom interface

Signed-off-by: Shivansh Arora <hishiv@amazon.com>

---------

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
parv0201 pushed a commit to parv0201/OpenSearch that referenced this pull request Jun 10, 2024
…ensearch-project#12190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
@shiv0408 shiv0408 deleted the cluster_state_split branch June 18, 2024 10:46
kkewwei pushed a commit to kkewwei/OpenSearch that referenced this pull request Jul 24, 2024
…ibute files (opensearch-project#13703)

* Split the remote global metadata file to metadata attribute files (opensearch-project#12190)

* Split the cluster state remote global metadata file to metadata attribute files

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
(cherry picked from commit da3ab92)

* Remove conflicting static method from Metadata.Custom interface

Signed-off-by: Shivansh Arora <hishiv@amazon.com>

---------

Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Signed-off-by: kkewwei <kkewwei@163.com>
@shiv0408 shiv0408 self-assigned this Aug 5, 2024
Labels
backport 2.x Backport to 2.x branch Cluster Manager ClusterManager:RemoteState enhancement Enhancement or improvement to existing feature or request skip-changelog Storage:Durability Issues and PRs related to the durability framework Storage:Remote Storage Issues and PRs relating to data and metadata storage v2.15.0 Issues and PRs related to version 2.15.0
Projects
Status: ✅ Done
Status: ✅ Done
8 participants