Skip to content

Commit

Permalink
Merge branch 'main' of github.com:opensearch-project/OpenSearch into …
Browse files Browse the repository at this point in the history
…doc_count_mapper_change
  • Loading branch information
bharath-techie committed Aug 21, 2024
2 parents 8200abf + 01184de commit d5efcde
Show file tree
Hide file tree
Showing 94 changed files with 4,323 additions and 197 deletions.
2 changes: 1 addition & 1 deletion .github/CODEOWNERS
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
# 3. Use the command palette to run the CODEOWNERS: Show owners of current file command, which will display all code owners for the current file.

# Default ownership for all repo files
* @anasalkouz @andrross @ashking94 @Bukhtawar @CEHENKLE @dblock @dbwiddis @gbbafna @kotwanikunal @linuxpi @mch2 @msfroh @nknize @owaiskazi19 @reta @Rishikesh1159 @sachinpkale @saratvemulapalli @shwetathareja @sohami @VachaShah
* @anasalkouz @andrross @ashking94 @Bukhtawar @CEHENKLE @dblock @dbwiddis @gbbafna @jainankitk @kotwanikunal @linuxpi @mch2 @msfroh @nknize @owaiskazi19 @reta @Rishikesh1159 @sachinpkale @saratvemulapalli @shwetathareja @sohami @VachaShah

/modules/lang-painless/ @anasalkouz @andrross @ashking94 @Bukhtawar @CEHENKLE @dblock @dbwiddis @gbbafna @jed326 @kotwanikunal @mch2 @msfroh @nknize @owaiskazi19 @reta @Rishikesh1159 @sachinpkale @saratvemulapalli @shwetathareja @sohami @VachaShah
/modules/parent-join/ @anasalkouz @andrross @ashking94 @Bukhtawar @CEHENKLE @dblock @dbwiddis @gbbafna @jed326 @kotwanikunal @mch2 @msfroh @nknize @owaiskazi19 @reta @Rishikesh1159 @sachinpkale @saratvemulapalli @shwetathareja @sohami @VachaShah
Expand Down
7 changes: 7 additions & 0 deletions .github/workflows/dependabot_pr.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,13 @@ jobs:
with:
token: ${{ steps.github_app_token.outputs.token }}

# See please https://docs.gradle.org/8.10/userguide/upgrading_version_8.html#minimum_daemon_jvm_version
- name: Set up JDK 17
uses: actions/setup-java@v4
with:
java-version: 17
distribution: temurin

- name: Update Gradle SHAs
run: |
./gradlew updateSHAs
Expand Down
14 changes: 13 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,12 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- [Concurrent Segment Search] Support composite aggregations with scripting ([#15072](https://github.com/opensearch-project/OpenSearch/pull/15072))
- Add `rangeQuery` and `regexpQuery` for `constant_keyword` field type ([#14711](https://github.com/opensearch-project/OpenSearch/pull/14711))
- Add took time to request nodes stats ([#15054](https://github.com/opensearch-project/OpenSearch/pull/15054))
- [Workload Management] Add Get QueryGroup API Logic ([14709](https://github.com/opensearch-project/OpenSearch/pull/14709))
- [Workload Management] Add Settings for Workload Management feature ([#15028](https://github.com/opensearch-project/OpenSearch/pull/15028))
- [Workload Management] QueryGroup resource tracking framework changes ([#13897](https://github.com/opensearch-project/OpenSearch/pull/13897))
- Support filtering on a large list encoded by bitmap ([#14774](https://github.com/opensearch-project/OpenSearch/pull/14774))
- Add slice execution listeners to SearchOperationListener interface ([#15153](https://github.com/opensearch-project/OpenSearch/pull/15153))
- Adding access to noSubMatches and noOverlappingMatches in Hyphenation ([#13895](https://github.com/opensearch-project/OpenSearch/pull/13895))

### Dependencies
- Bump `netty` from 4.1.111.Final to 4.1.112.Final ([#15081](https://github.com/opensearch-project/OpenSearch/pull/15081))
Expand All @@ -29,13 +33,20 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Bump `org.tukaani:xz` from 1.9 to 1.10 ([#15110](https://github.com/opensearch-project/OpenSearch/pull/15110))
- Bump `actions/setup-java` from 1 to 4 ([#15104](https://github.com/opensearch-project/OpenSearch/pull/15104))
- Bump `org.apache.avro:avro` from 1.11.3 to 1.12.0 in /plugins/repository-hdfs ([#15119](https://github.com/opensearch-project/OpenSearch/pull/15119))
- Bump `org.bouncycastle:bcpg-fips` from 1.0.7.1 to 2.0.8 and `org.bouncycastle:bc-fips` from 1.0.2.5 to 2.0.0 in /distribution/tools/plugin-cli ([#15103](https://github.com/opensearch-project/OpenSearch/pull/15103))
- Bump `org.bouncycastle:bcpg-fips` from 1.0.7.1 to 2.0.9 ([#15103](https://github.com/opensearch-project/OpenSearch/pull/15103), [#15299](https://github.com/opensearch-project/OpenSearch/pull/15299))
- Bump `com.azure:azure-core` from 1.49.1 to 1.51.0 ([#15111](https://github.com/opensearch-project/OpenSearch/pull/15111))
- Bump `org.xerial.snappy:snappy-java` from 1.1.10.5 to 1.1.10.6 ([#15207](https://github.com/opensearch-project/OpenSearch/pull/15207))
- Bump `com.azure:azure-xml` from 1.0.0 to 1.1.0 ([#15206](https://github.com/opensearch-project/OpenSearch/pull/15206))
- Bump `reactor` from 3.5.19 to 3.5.20 ([#15262](https://github.com/opensearch-project/OpenSearch/pull/15262))
- Bump `reactor-netty` from 1.1.21 to 1.1.22 ([#15262](https://github.com/opensearch-project/OpenSearch/pull/15262))
- Bump `org.apache.kerby:kerb-admin` from 2.0.3 to 2.1.0 ([#15301](https://github.com/opensearch-project/OpenSearch/pull/15301))
- Bump `com.azure:azure-core-http-netty` from 1.15.1 to 1.15.3 ([#15300](https://github.com/opensearch-project/OpenSearch/pull/15300))
- Bump `com.gradle.develocity` from 3.17.6 to 3.18 ([#15297](https://github.com/opensearch-project/OpenSearch/pull/15297))
- Bump `commons-cli:commons-cli` from 1.8.0 to 1.9.0 ([#15298](https://github.com/opensearch-project/OpenSearch/pull/15298))

### Changed
- Add lower limit for primary and replica batch allocators timeout ([#14979](https://github.com/opensearch-project/OpenSearch/pull/14979))
- Optimize regexp-based include/exclude on aggregations when pattern matches prefixes ([#14371](https://github.com/opensearch-project/OpenSearch/pull/14371))
- Replace and block usages of org.apache.logging.log4j.util.Strings ([#15238](https://github.com/opensearch-project/OpenSearch/pull/15238))

### Deprecated
Expand All @@ -50,6 +61,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
- Fix array_index_out_of_bounds_exception when indexing documents with field name containing only dot ([#15126](https://github.com/opensearch-project/OpenSearch/pull/15126))
- Fixed array field name omission in flat_object function for nested JSON ([#13620](https://github.com/opensearch-project/OpenSearch/pull/13620))
- Fix range aggregation optimization ignoring top level queries ([#15194](https://github.com/opensearch-project/OpenSearch/pull/15194))
- Fix incorrect parameter names in MinHash token filter configuration handling ([#15233](https://github.com/opensearch-project/OpenSearch/pull/15233))

### Security

Expand Down
1 change: 1 addition & 0 deletions MAINTAINERS.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ This document contains a list of maintainers in this repo. See [opensearch-proje
| Anas Alkouz | [anasalkouz](https://github.com/anasalkouz) | Amazon |
| Andrew Ross | [andrross](https://github.com/andrross) | Amazon |
| Andriy Redko | [reta](https://github.com/reta) | Aiven |
| Ankit Jain | [jainankitk](https://github.com/jainankitk) | Amazon |
| Ashish Singh | [ashking94](https://github.com/ashking94) | Amazon |
| Bukhtawar Khan | [Bukhtawar](https://github.com/Bukhtawar) | Amazon |
| Charlotte Henkle | [CEHENKLE](https://github.com/CEHENKLE) | Amazon |
Expand Down
4 changes: 2 additions & 2 deletions buildSrc/version.properties
Original file line number Diff line number Diff line change
Expand Up @@ -33,8 +33,8 @@ netty = 4.1.112.Final
joda = 2.12.7

# project reactor
reactor_netty = 1.1.21
reactor = 3.5.19
reactor_netty = 1.1.22
reactor = 3.5.20

# client dependencies
httpclient5 = 5.2.3
Expand Down
1 change: 0 additions & 1 deletion client/rest/licenses/reactor-core-3.5.19.jar.sha1

This file was deleted.

1 change: 1 addition & 0 deletions client/rest/licenses/reactor-core-3.5.20.jar.sha1
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
1fc0f91e2b93778a974339d2c24363d7f34f90b4
2 changes: 1 addition & 1 deletion distribution/tools/plugin-cli/build.gradle
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ base {
dependencies {
compileOnly project(":server")
compileOnly project(":libs:opensearch-cli")
api "org.bouncycastle:bcpg-fips:2.0.8"
api "org.bouncycastle:bcpg-fips:2.0.9"
api "org.bouncycastle:bc-fips:2.0.0"
testImplementation project(":test:framework")
testImplementation 'com.google.jimfs:jimfs:1.3.0'
Expand Down

This file was deleted.

Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
f69719ef8dbf34d5f906ce480496446b2fd2ae27
Original file line number Diff line number Diff line change
Expand Up @@ -54,11 +54,16 @@
*/
public class HyphenationCompoundWordTokenFilterFactory extends AbstractCompoundWordTokenFilterFactory {

private final boolean noSubMatches;
private final boolean noOverlappingMatches;
private final HyphenationTree hyphenationTree;

HyphenationCompoundWordTokenFilterFactory(IndexSettings indexSettings, Environment env, String name, Settings settings) {
super(indexSettings, env, name, settings);

noSubMatches = settings.getAsBoolean("no_sub_matches", false);
noOverlappingMatches = settings.getAsBoolean("no_overlapping_matches", false);

String hyphenationPatternsPath = settings.get("hyphenation_patterns_path", null);
if (hyphenationPatternsPath == null) {
throw new IllegalArgumentException("hyphenation_patterns_path is a required setting.");
Expand All @@ -85,7 +90,9 @@ public TokenStream create(TokenStream tokenStream) {
minWordSize,
minSubwordSize,
maxSubwordSize,
onlyLongestMatch
onlyLongestMatch,
noSubMatches,
noOverlappingMatches
);
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -65,10 +65,10 @@ private Map<String, String> convertSettings(Settings settings) {
if (settings.hasValue("hash_count")) {
settingMap.put("hashCount", settings.get("hash_count"));
}
if (settings.hasValue("bucketCount")) {
if (settings.hasValue("bucket_count")) {
settingMap.put("bucketCount", settings.get("bucket_count"));
}
if (settings.hasValue("hashSetSize")) {
if (settings.hasValue("hash_set_size")) {
settingMap.put("hashSetSize", settings.get("hash_set_size"));
}
if (settings.hasValue("with_rotation")) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -50,8 +50,12 @@
import org.opensearch.test.IndexSettingsModule;
import org.opensearch.test.OpenSearchTestCase;
import org.hamcrest.MatcherAssert;
import org.junit.Before;

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
Expand All @@ -63,17 +67,27 @@
import static org.hamcrest.Matchers.instanceOf;

public class CompoundAnalysisTests extends OpenSearchTestCase {

Settings[] settingsArr;

@Before
public void initialize() throws IOException {
final Path home = createTempDir();
copyHyphenationPatternsFile(home);
this.settingsArr = new Settings[] { getJsonSettings(home), getYamlSettings(home) };
}

public void testDefaultsCompoundAnalysis() throws Exception {
Settings settings = getJsonSettings();
IndexSettings idxSettings = IndexSettingsModule.newIndexSettings("test", settings);
AnalysisModule analysisModule = createAnalysisModule(settings);
TokenFilterFactory filterFactory = analysisModule.getAnalysisRegistry().buildTokenFilterFactories(idxSettings).get("dict_dec");
MatcherAssert.assertThat(filterFactory, instanceOf(DictionaryCompoundWordTokenFilterFactory.class));
for (Settings settings : this.settingsArr) {
IndexSettings idxSettings = IndexSettingsModule.newIndexSettings("test", settings);
AnalysisModule analysisModule = createAnalysisModule(settings);
TokenFilterFactory filterFactory = analysisModule.getAnalysisRegistry().buildTokenFilterFactories(idxSettings).get("dict_dec");
MatcherAssert.assertThat(filterFactory, instanceOf(DictionaryCompoundWordTokenFilterFactory.class));
}
}

public void testDictionaryDecompounder() throws Exception {
Settings[] settingsArr = new Settings[] { getJsonSettings(), getYamlSettings() };
for (Settings settings : settingsArr) {
for (Settings settings : this.settingsArr) {
List<String> terms = analyze(settings, "decompoundingAnalyzer", "donaudampfschiff spargelcremesuppe");
MatcherAssert.assertThat(terms.size(), equalTo(8));
MatcherAssert.assertThat(
Expand All @@ -83,6 +97,26 @@ public void testDictionaryDecompounder() throws Exception {
}
}

// Hyphenation Decompounder tests mimic the behavior of lucene tests
// lucene/analysis/common/src/test/org/apache/lucene/analysis/compound/TestHyphenationCompoundWordTokenFilterFactory.java
public void testHyphenationDecompounder() throws Exception {
for (Settings settings : this.settingsArr) {
List<String> terms = analyze(settings, "hyphenationAnalyzer", "min veninde som er lidt af en læsehest");
MatcherAssert.assertThat(terms.size(), equalTo(10));
MatcherAssert.assertThat(terms, hasItems("min", "veninde", "som", "er", "lidt", "af", "en", "læsehest", "læse", "hest"));
}
}

// Hyphenation Decompounder tests mimic the behavior of lucene tests
// lucene/analysis/common/src/test/org/apache/lucene/analysis/compound/TestHyphenationCompoundWordTokenFilterFactory.java
public void testHyphenationDecompounderNoSubMatches() throws Exception {
for (Settings settings : this.settingsArr) {
List<String> terms = analyze(settings, "hyphenationAnalyzerNoSubMatches", "basketballkurv");
MatcherAssert.assertThat(terms.size(), equalTo(3));
MatcherAssert.assertThat(terms, hasItems("basketballkurv", "basketball", "kurv"));
}
}

private List<String> analyze(Settings settings, String analyzerName, String text) throws IOException {
IndexSettings idxSettings = IndexSettingsModule.newIndexSettings("test", settings);
AnalysisModule analysisModule = createAnalysisModule(settings);
Expand Down Expand Up @@ -111,21 +145,28 @@ public Map<String, AnalysisProvider<TokenFilterFactory>> getTokenFilters() {
}));
}

private Settings getJsonSettings() throws IOException {
private void copyHyphenationPatternsFile(Path home) throws IOException {
InputStream hyphenation_patterns_path = getClass().getResourceAsStream("da_UTF8.xml");
Path config = home.resolve("config");
Files.createDirectory(config);
Files.copy(hyphenation_patterns_path, config.resolve("da_UTF8.xml"));
}

private Settings getJsonSettings(Path home) throws IOException {
String json = "/org/opensearch/analysis/common/test1.json";
return Settings.builder()
.loadFromStream(json, getClass().getResourceAsStream(json), false)
.put(IndexMetadata.SETTING_VERSION_CREATED, Version.CURRENT)
.put(Environment.PATH_HOME_SETTING.getKey(), createTempDir().toString())
.put(Environment.PATH_HOME_SETTING.getKey(), home.toString())
.build();
}

private Settings getYamlSettings() throws IOException {
private Settings getYamlSettings(Path home) throws IOException {
String yaml = "/org/opensearch/analysis/common/test1.yml";
return Settings.builder()
.loadFromStream(yaml, getClass().getResourceAsStream(yaml), false)
.put(IndexMetadata.SETTING_VERSION_CREATED, Version.CURRENT)
.put(Environment.PATH_HOME_SETTING.getKey(), createTempDir().toString())
.put(Environment.PATH_HOME_SETTING.getKey(), home.toString())
.build();
}
}
Loading

0 comments on commit d5efcde

Please sign in to comment.