[Star Tree] Lucene Abstractions for Star Tree File Formats #15278

@sarthakaggarwal97 sarthakaggarwal97 commented Aug 16, 2024

Description

Coming here from #14809.

Star Tree File Formats are responsible for writing the star tree data, metadata, and doc values directly into the segment.
For this to happen seamlessly, Star Tree depends on Lucene Producers and Consumers, which are not public or extensible from OpenSearch.

Related Issues

Resolves #15279

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@sarthakaggarwal97 sarthakaggarwal97 changed the title [Startree] Lucene Consumer and Producer Abstractions for Star Tree File Formats [Star Tree] Lucene Consumer and Producer Abstractions for Star Tree File Formats Aug 16, 2024
@github-actions github-actions bot added enhancement Enhancement or improvement to existing feature or request Indexing:Performance labels Aug 16, 2024
@sarthakaggarwal97 sarthakaggarwal97 changed the title [Star Tree] Lucene Consumer and Producer Abstractions for Star Tree File Formats [Star Tree] Lucene Abstractions for Star Tree File Formats Aug 16, 2024
@sarthakaggarwal97 sarthakaggarwal97 marked this pull request as ready for review August 16, 2024 11:39

❌ Gradle check result for 310503c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


❌ Gradle check result for 23bc749: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@sarthakaggarwal97 sarthakaggarwal97 force-pushed the startree-fileformat-bases branch from 23bc749 to f31fe90 Compare August 21, 2024 08:46

✅ Gradle check result for f31fe90: SUCCESS


@ajaymovva ajaymovva left a comment

LGTM

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
@sarthakaggarwal97 sarthakaggarwal97 force-pushed the startree-fileformat-bases branch from f31fe90 to 3d167ff Compare August 23, 2024 04:45

❕ Gradle check result for 3d167ff: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@sachinpkale sachinpkale merged commit 9e5604b into opensearch-project:main Aug 23, 2024
33 of 34 checks passed
@bharath-techie bharath-techie added the backport 2.x Backport to 2.x branch label Aug 27, 2024
opensearch-trigger-bot bot pushed a commit that referenced this pull request Aug 27, 2024
---------
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
(cherry picked from commit 9e5604b)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
linuxpi pushed a commit that referenced this pull request Aug 27, 2024
…15436)

---------
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
(cherry picked from commit 9e5604b)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
shiv0408 added a commit to shiv0408/OpenSearch that referenced this pull request Sep 2, 2024
* Optimize global ordinal includes/excludes for prefix matching (opensearch-project#14371)

* Optimize global ordinal includes/excludes for prefix matching

If an aggregation specifies includes or excludes based on a regular
expression, and the regular expression has a finite expansion followed
by .*, then we can optimize the global ordinal filter.

Specifically, in this case, we can expand the matching prefixes, then
include/exclude the range of global ordinals that start with each
prefix.

Signed-off-by: Michael Froh <froh@amazon.com>

* Add unit test

Signed-off-by: Michael Froh <froh@amazon.com>

* Add changelog entry

Signed-off-by: Michael Froh <froh@amazon.com>

* Improve test coverage

Updated the unit test to be functionally equivalent, but it covers
more of the regex logic.

Signed-off-by: Michael Froh <froh@amazon.com>

* Improve test coverage

Signed-off-by: Michael Froh <froh@amazon.com>

* Fix bug in exclude-only case with no doc values in segment

Signed-off-by: Michael Froh <froh@amazon.com>

* Address comments from @mch2

Signed-off-by: Michael Froh <froh@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>

* Adding access to noSubMatches and noOverlappingMatches in Hyphenation… (opensearch-project#13895)

* Adding access to noSubMatches and noOverlappingMatches in HyphenationCompoundWordTokenFilter

Signed-off-by: Evan Kielley <evankielley@gmail.com>

* Add Changelog Entry

Signed-off-by: Mohammad Hasnain Mohsin Rajan <hasnain2808@gmail.com>

* test: add hyphenation decompounder tests

Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>

* test: refactor tests

Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>

* test: reformat test files

Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>

* chore: add changelog entry for 2.X

Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>

* chore: remove 3.x changelog

Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>

* chore: commonify settingsarr

Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>

* chore: commonify settingsarr

Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>

* chore: linting

Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>

---------

Signed-off-by: Evan Kielley <evankielley@gmail.com>
Signed-off-by: Mohammad Hasnain Mohsin Rajan <hasnain2808@gmail.com>
Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
Co-authored-by: Evan Kielley <evankielley@gmail.com>

* Add Settings related to Workload Management feature (opensearch-project#15028)

* add QueryGroup Service tests
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* add PR to changelog
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* change the test directory
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* modify comments to be more specific
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* add test coverage
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* remove QUERY_GROUP_RUN_INTERVAL_SETTING as we'll define it in QueryGroupService
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* address comments
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* Update affiliation for @nknize. (opensearch-project#15322)

Signed-off-by: dblock <dblock@amazon.com>

* Add log when download completes with file size (opensearch-project#15224)

Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>

* Support Filtering on Large List encoded by Bitmap (version update) (opensearch-project#15352)

Signed-off-by: Andriy Redko <andriy.redko@aiven.io>

* Add support for index level slice count setting (opensearch-project#15336)

Signed-off-by: Ganesh Ramadurai <gramadur@amazon.com>

* Adding allowlist setting for ingest-useragent and ingest-geoip processors (opensearch-project#15325)

* Adding allowlist setting for user-agent, geo-ip and updated tests for ingest-common.

Signed-off-by: Sarat Vemulapalli <vemulapallisarat@gmail.com>

* Remove duplicate test in ingest-common

Signed-off-by: Sarat Vemulapalli <vemulapallisarat@gmail.com>

* Adding changelog

Signed-off-by: Sarat Vemulapalli <vemulapallisarat@gmail.com>

---------

Signed-off-by: Sarat Vemulapalli <vemulapallisarat@gmail.com>

* Add Delete QueryGroup API Logic (opensearch-project#14735)

* Add Delete QueryGroup API Logic
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* modify changelog
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* include comments from create pr
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* remove delete all
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* rebase and address comments
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* rebase
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* address comments
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* address comments
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* address comments
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* add UT coverage
Signed-off-by: Ruirui Zhang <mariazrr@amazon.com>

* [Star Tree] Lucene Abstractions for Star Tree File Formats  (opensearch-project#15278)

---------
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>

* [Star tree] Changes to handle derived metrics such as avg as part of star tree mapping (opensearch-project#15152)

---------
Signed-off-by: Bharathwaj G <bharath78910@gmail.com>

* relaxing the join validation for nodes which have only store disabled but only publication enabled

* relaxing the join validation for nodes which have only store disabled but only publication enabled

Signed-off-by: Rajiv Kumar Vaidyanathan <rajivkv@amazon.com>

---------

Signed-off-by: Michael Froh <froh@amazon.com>
Signed-off-by: Evan Kielley <evankielley@gmail.com>
Signed-off-by: Mohammad Hasnain Mohsin Rajan <hasnain2808@gmail.com>
Signed-off-by: Mohammad Hasnain <hasnain2808@gmail.com>
Signed-off-by: dblock <dblock@amazon.com>
Signed-off-by: Gaurav Bafna <gbbafna@amazon.com>
Signed-off-by: Andriy Redko <andriy.redko@aiven.io>
Signed-off-by: Ganesh Ramadurai <gramadur@amazon.com>
Signed-off-by: Sarat Vemulapalli <vemulapallisarat@gmail.com>
Signed-off-by: Rajiv Kumar Vaidyanathan <rajivkv@amazon.com>
Co-authored-by: Michael Froh <froh@amazon.com>
Co-authored-by: Mohammad Hasnain Mohsin Rajan <hasnain2808@gmail.com>
Co-authored-by: Evan Kielley <evankielley@gmail.com>
Co-authored-by: Ruirui Zhang <mariazrr@amazon.com>
Co-authored-by: Daniel (dB.) Doubrovkine <dblock@amazon.com>
Co-authored-by: Gaurav Bafna <85113518+gbbafna@users.noreply.github.com>
Co-authored-by: Andriy Redko <andriy.redko@aiven.io>
Co-authored-by: Ganesh Krishna Ramadurai <gramadur@icloud.com>
Co-authored-by: Sarat Vemulapalli <vemulapallisarat@gmail.com>
Co-authored-by: Sarthak Aggarwal <sarthagg@amazon.com>
Co-authored-by: Bharathwaj G <bharath78910@gmail.com>
Co-authored-by: Rajiv Kumar Vaidyanathan <rajivkv@amazon.com>
akolarkunnu pushed a commit to akolarkunnu/OpenSearch that referenced this pull request Sep 10, 2024
…h-project#15278)

---------
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 17, 2024
…h-project#15278)

---------
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
dk2k pushed a commit to dk2k/OpenSearch that referenced this pull request Oct 21, 2024
…h-project#15278)

---------
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Labels
backport 2.x Backport to 2.x branch enhancement Enhancement or improvement to existing feature or request Indexing:Performance skip-changelog
Development

Successfully merging this pull request may close these issues.

[Feature Request] Lucene Abstractions for Star Tree
6 participants