Skip to content

Commit

Permalink
Enhance searchable snapshots to enable a read-only view of older snap…
Browse files Browse the repository at this point in the history
…shots (#5812)

* Enhance searchable snapshots to enable a read-only view of older snapshots (#5429)

* Enhance searchable snapshots to enable a read-only view of older snapshots

This change removes the guardrails around N-1 backward compatibility and uses Lucene's "expert" APIs to read snapshots (Lucene segments) older than N-1 on a best-effort basis. The functionality is gated by an additional feature flag, separate from the searchable snapshots flag. Note that the Lucene integration is rather inefficient because the necessary "expert" Lucene APIs are still package-private.

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Added some unit tests

This change also includes a test index ZIP file for the unit tests. The change also introduces a bug fix in the readAnySegmentsInfo method to close the reader before returning the SegmentInfos object - this avoids dangling/open file handles.

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Incorporating PR feedback

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Incorporate PR comments from andrross

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Remove use of IndexSetting for minimum version for snapshots backwards compatibility

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Moved ES 6.3.0 test data to a subdirectory

This change also includes an update to the file name to clarify that it is an ES index, and changing the associated markdown file name to just README.md. All tests that reference this ZIP file have corresponding changes to the path they reference.

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Update unit tests to use try-with-resources

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Added FeatureFlagSetter helper class

Also refactored unit test classes to use the helper class.

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Incorporating PR feedback from @mch2

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Fix IndexSettingsTests

Updated the asserts in IndexSettingsTests to account for the new defaulting behavior.

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Fixed compile issues after cherry-pick

Note that the unit tests are still failing at this commit since the Lucene 9 libraries no longer hold constants for Lucene 7 and below, so the fromId logic resolves the Lucene version to 8.

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Fix multiple aspects of version resolution

This change fixes resolution of the Lucene version for legacy versions since the Lucene 9 libraries no longer hold constants for Lucene 7 and below. The change also updates DECLARED_VERSIONS to derive from the Versions class rather than LegacyESVersions (thereby ignoring legacy versions). This in turn required a change to the minimumIndexCompatibleVersion logic for LegacyESVersion. Finally, the testMinimumIndexCompatibilityVersion unit test was updated to use accurate version identifiers.

All unit tests pass and the code compiles, but actual functionality is still broken because some backwards compatibility logic was removed in the current branch that is retained in 2.x

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Reintroducing backwards compatibility logic in certain classes

This reverts changes made in #4728 and #4702. These were only made in main and not backported to 2.x
This change also adds unit tests for IndexMetadataGenerations

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Declared LegacyESVersion constants for better readability

This commit also includes a correction to documentation, and removes the unnecessary "afterWriteSnapBlob" runnable from BlobStoreRepository

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Fixing PR reference in CHANGELOG

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Revert CHANGELOG update

This feature was released in 2.5.0 so it no longer needs to be listed in the changelog.

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

* Remove outdated comment

Signed-off-by: Kartik Ganesh <gkart@amazon.com>

Signed-off-by: Kartik Ganesh <gkart@amazon.com>
  • Loading branch information
kartg authored Jan 13, 2023
1 parent 32dd9e2 commit 095c222
Show file tree
Hide file tree
Showing 34 changed files with 697 additions and 100 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -120,4 +120,4 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
### Security

[Unreleased 3.0]: https://github.com/opensearch-project/OpenSearch/compare/2.4...HEAD
[Unreleased 2.x]: https://github.com/opensearch-project/OpenSearch/compare/2.4...2.x
[Unreleased 2.x]: https://github.com/opensearch-project/OpenSearch/compare/2.4...2.x
22 changes: 22 additions & 0 deletions server/src/main/java/org/opensearch/LegacyESVersion.java
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,10 @@
*/
public class LegacyESVersion extends Version {

public static final LegacyESVersion V_6_0_0 = new LegacyESVersion(6000099, org.apache.lucene.util.Version.fromBits(7, 0, 0));
public static final LegacyESVersion V_6_5_0 = new LegacyESVersion(6050099, org.apache.lucene.util.Version.fromBits(7, 0, 0));
public static final LegacyESVersion V_7_2_0 = new LegacyESVersion(7020099, org.apache.lucene.util.Version.LUCENE_8_0_0);

// todo move back to Version.java if retiring legacy version support
protected static final ImmutableOpenIntMap<Version> idToVersion;
protected static final ImmutableOpenMap<String, Version> stringToVersion;
Expand Down Expand Up @@ -237,4 +241,22 @@ public String toString() {
}
return sb.toString();
}

@Override
protected Version computeMinIndexCompatVersion() {
final int prevLuceneVersionMajor = this.luceneVersion.major - 1;
final int bwcMajor;
if (major == 5) {
bwcMajor = 2; // we jumped from 2 to 5
} else if (major == 7) {
return LegacyESVersion.fromId(6000026);
} else {
bwcMajor = major - 1;
}
final int bwcMinor = 0;
return new LegacyESVersion(
bwcMajor * 1000000 + bwcMinor * 10000 + 99,
org.apache.lucene.util.Version.fromBits(prevLuceneVersionMajor, 0, 0)
);
}
}
5 changes: 2 additions & 3 deletions server/src/main/java/org/opensearch/Version.java
Original file line number Diff line number Diff line change
Expand Up @@ -316,8 +316,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
* lazily once.
*/
static class DeclaredVersionsHolder {
// use LegacyESVersion.class since it inherits Version fields
protected static final List<Version> DECLARED_VERSIONS = Collections.unmodifiableList(getDeclaredVersions(LegacyESVersion.class));
protected static final List<Version> DECLARED_VERSIONS = Collections.unmodifiableList(getDeclaredVersions(Version.class));
}

// lazy initialized because we don't yet have the declared versions ready when instantiating the cached Version
Expand Down Expand Up @@ -394,7 +393,7 @@ public Version minimumIndexCompatibilityVersion() {
return res;
}

private Version computeMinIndexCompatVersion() {
protected Version computeMinIndexCompatVersion() {
final int bwcMajor;
if (major == 5) {
bwcMajor = 2; // we jumped from 2 to 5
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -401,8 +401,7 @@ public Builder addBlocks(IndexMetadata indexMetadata) {
if (IndexMetadata.INDEX_BLOCKS_READ_ONLY_ALLOW_DELETE_SETTING.get(indexMetadata.getSettings())) {
addIndexBlock(indexName, IndexMetadata.INDEX_READ_ONLY_ALLOW_DELETE_BLOCK);
}
if (IndexModule.Type.REMOTE_SNAPSHOT.getSettingsKey()
.equals(indexMetadata.getSettings().get(IndexModule.INDEX_STORE_TYPE_SETTING.getKey()))) {
if (IndexModule.Type.REMOTE_SNAPSHOT.match(indexMetadata.getSettings().get(IndexModule.INDEX_STORE_TYPE_SETTING.getKey()))) {
addIndexBlock(indexName, IndexMetadata.REMOTE_READ_ONLY_ALLOW_DELETE);
}
return this;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@
import com.carrotsearch.hppc.cursors.ObjectCursor;
import com.carrotsearch.hppc.cursors.ObjectObjectCursor;
import org.opensearch.Assertions;
import org.opensearch.LegacyESVersion;
import org.opensearch.Version;
import org.opensearch.action.admin.indices.rollover.RolloverInfo;
import org.opensearch.action.support.ActiveShardCount;
Expand Down Expand Up @@ -1840,13 +1841,17 @@ public static IndexMetadata fromXContent(XContentParser parser) throws IOExcepti
throw new IllegalArgumentException("Unexpected token " + token);
}
}
if (Assertions.ENABLED) {

final Version indexCreatedVersion = Version.indexCreated(builder.settings);
// Reference:
// https://github.com/opensearch-project/OpenSearch/blob/4dde0f2a3b445b2fc61dab29c5a2178967f4a3e3/server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java#L1620-L1628
if (Assertions.ENABLED && indexCreatedVersion.onOrAfter(LegacyESVersion.V_6_5_0)) {
assert mappingVersion : "mapping version should be present for indices";
}
if (Assertions.ENABLED) {
assert settingsVersion : "settings version should be present for indices";
}
if (Assertions.ENABLED) {
// Reference:
// https://github.com/opensearch-project/OpenSearch/blob/2e4b27b243d8bd2c515f66cf86c6d1d6a601307f/server/src/main/java/org/opensearch/cluster/metadata/IndexMetadata.java#L1824
if (Assertions.ENABLED && indexCreatedVersion.onOrAfter(LegacyESVersion.V_7_2_0)) {
assert aliasesVersion : "aliases version should be present for indices";
}
return builder.build();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ public static RoutingPool getShardPool(ShardRouting shard, RoutingAllocation all
*/
public static RoutingPool getIndexPool(IndexMetadata indexMetadata) {
Settings indexSettings = indexMetadata.getSettings();
if (IndexModule.Type.REMOTE_SNAPSHOT.getSettingsKey().equals(indexSettings.get(IndexModule.INDEX_STORE_TYPE_SETTING.getKey()))) {
if (IndexModule.Type.REMOTE_SNAPSHOT.match(indexSettings.get(IndexModule.INDEX_STORE_TYPE_SETTING.getKey()))) {
return REMOTE_CAPABLE;
}
return LOCAL_ONLY;
Expand Down
20 changes: 20 additions & 0 deletions server/src/main/java/org/opensearch/common/lucene/Lucene.java
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,7 @@
import org.apache.lucene.index.SegmentCommitInfo;
import org.apache.lucene.index.SegmentInfos;
import org.apache.lucene.index.SegmentReader;
import org.apache.lucene.index.StandardDirectoryReader;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.FieldDoc;
Expand Down Expand Up @@ -134,6 +135,25 @@ public static SegmentInfos readSegmentInfos(Directory directory) throws IOExcept
return SegmentInfos.readLatestCommit(directory);
}

/**
* A variant of {@link #readSegmentInfos(Directory)} that supports reading indices written by
* older major versions of Lucene. The underlying implementation is a workaround since the
* "expert" readLatestCommit API is currently package-private in Lucene. First, all commits in
* the given {@link Directory} are listed - this result includes older Lucene commits. Then,
* the latest index commit is opened via {@link DirectoryReader} by including a minimum supported
* Lucene major version based on the minimum compatibility of the given {@link org.opensearch.Version}.
*/
public static SegmentInfos readSegmentInfosExtendedCompatibility(Directory directory, org.opensearch.Version minimumVersion)
throws IOException {
// This list is sorted from oldest to latest
List<IndexCommit> indexCommits = DirectoryReader.listCommits(directory);
IndexCommit latestCommit = indexCommits.get(indexCommits.size() - 1);
final int minSupportedLuceneMajor = minimumVersion.minimumIndexCompatibilityVersion().luceneVersion.major;
try (StandardDirectoryReader reader = (StandardDirectoryReader) DirectoryReader.open(latestCommit, minSupportedLuceneMajor, null)) {
return reader.getSegmentInfos();
}
}

/**
* Returns an iterable that allows to iterate over all files in this segments info
*/
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,13 @@ public class FeatureFlags {
*/
public static final String SEARCHABLE_SNAPSHOT = "opensearch.experimental.feature.searchable_snapshot.enabled";

/**
* Gates the ability for Searchable Snapshots to read snapshots that are older than the
* guaranteed backward compatibility for OpenSearch (one prior major version) on a best effort basis.
*/
public static final String SEARCHABLE_SNAPSHOT_EXTENDED_COMPATIBILITY =
"opensearch.experimental.feature.searchable_snapshot.extended_compatibility.enabled";

/**
* Gates the functionality of extensions.
* Once the feature is ready for production release, this feature flag can be removed.
Expand Down
14 changes: 11 additions & 3 deletions server/src/main/java/org/opensearch/index/IndexModule.java
Original file line number Diff line number Diff line change
Expand Up @@ -461,11 +461,19 @@ public boolean match(String setting) {
}

/**
* Convenience method to check whether the given IndexSettings contains
* an {@link #INDEX_STORE_TYPE_SETTING} set to the value of this type.
* Convenience method to check whether the given {@link IndexSettings}
* object contains an {@link #INDEX_STORE_TYPE_SETTING} set to the value of this type.
*/
public boolean match(IndexSettings settings) {
return match(INDEX_STORE_TYPE_SETTING.get(settings.getSettings()));
return match(settings.getSettings());
}

/**
* Convenience method to check whether the given {@link Settings}
* object contains an {@link #INDEX_STORE_TYPE_SETTING} set to the value of this type.
*/
public boolean match(Settings settings) {
return match(INDEX_STORE_TYPE_SETTING.get(settings));
}
}

Expand Down
31 changes: 31 additions & 0 deletions server/src/main/java/org/opensearch/index/IndexSettings.java
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@
import org.opensearch.common.unit.ByteSizeUnit;
import org.opensearch.common.unit.ByteSizeValue;
import org.opensearch.common.unit.TimeValue;
import org.opensearch.common.util.FeatureFlags;
import org.opensearch.index.translog.Translog;
import org.opensearch.indices.replication.common.ReplicationType;
import org.opensearch.ingest.IngestService;
Expand All @@ -59,11 +60,13 @@
import java.util.function.Function;
import java.util.function.UnaryOperator;

import static org.opensearch.common.util.FeatureFlags.SEARCHABLE_SNAPSHOT_EXTENDED_COMPATIBILITY;
import static org.opensearch.index.mapper.MapperService.INDEX_MAPPING_DEPTH_LIMIT_SETTING;
import static org.opensearch.index.mapper.MapperService.INDEX_MAPPING_FIELD_NAME_LENGTH_LIMIT_SETTING;
import static org.opensearch.index.mapper.MapperService.INDEX_MAPPING_NESTED_DOCS_LIMIT_SETTING;
import static org.opensearch.index.mapper.MapperService.INDEX_MAPPING_NESTED_FIELDS_LIMIT_SETTING;
import static org.opensearch.index.mapper.MapperService.INDEX_MAPPING_TOTAL_FIELDS_LIMIT_SETTING;
import static org.opensearch.index.store.remote.directory.RemoteSnapshotDirectory.SEARCHABLE_SNAPSHOT_EXTENDED_COMPATIBILITY_MINIMUM_VERSION;

/**
* This class encapsulates all index level settings and handles settings updates.
Expand Down Expand Up @@ -585,6 +588,9 @@ public final class IndexSettings {
private final boolean isRemoteTranslogStoreEnabled;
private final String remoteStoreTranslogRepository;
private final String remoteStoreRepository;
private final boolean isRemoteSnapshot;
private Version extendedCompatibilitySnapshotVersion;

// volatile fields are updated via #updateIndexMetadata(IndexMetadata) under lock
private volatile Settings settings;
private volatile IndexMetadata indexMetadata;
Expand Down Expand Up @@ -748,6 +754,14 @@ public IndexSettings(final IndexMetadata indexMetadata, final Settings nodeSetti
isRemoteTranslogStoreEnabled = settings.getAsBoolean(IndexMetadata.SETTING_REMOTE_TRANSLOG_STORE_ENABLED, false);
remoteStoreTranslogRepository = settings.get(IndexMetadata.SETTING_REMOTE_TRANSLOG_STORE_REPOSITORY);
remoteStoreRepository = settings.get(IndexMetadata.SETTING_REMOTE_STORE_REPOSITORY);
isRemoteSnapshot = IndexModule.Type.REMOTE_SNAPSHOT.match(this.settings);

if (isRemoteSnapshot && FeatureFlags.isEnabled(SEARCHABLE_SNAPSHOT_EXTENDED_COMPATIBILITY)) {
extendedCompatibilitySnapshotVersion = SEARCHABLE_SNAPSHOT_EXTENDED_COMPATIBILITY_MINIMUM_VERSION;
} else {
extendedCompatibilitySnapshotVersion = Version.CURRENT.minimumIndexCompatibilityVersion();
}

this.searchThrottled = INDEX_SEARCH_THROTTLED.get(settings);
this.queryStringLenient = QUERY_STRING_LENIENT_SETTING.get(settings);
this.queryStringAnalyzeWildcard = QUERY_STRING_ANALYZE_WILDCARD.get(nodeSettings);
Expand Down Expand Up @@ -1017,6 +1031,23 @@ public String getRemoteStoreTranslogRepository() {
return remoteStoreTranslogRepository;
}

/**
* Returns true if this is remote/searchable snapshot
*/
public boolean isRemoteSnapshot() {
return isRemoteSnapshot;
}

/**
* If this is a remote snapshot and the extended compatibility
* feature flag is enabled, this returns the minimum {@link Version}
* supported. In all other cases, the return value is the
* {@link Version#minimumIndexCompatibilityVersion()} of {@link Version#CURRENT}.
*/
public Version getExtendedCompatibilitySnapshotVersion() {
return extendedCompatibilitySnapshotVersion;
}

/**
* Returns the node settings. The settings returned from {@link #getSettings()} are a merged version of the
* index settings and the node settings where node settings are overwritten by index settings.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@
import org.apache.lucene.search.ReferenceManager;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.Lock;
import org.opensearch.Version;
import org.opensearch.common.concurrent.GatedCloseable;
import org.opensearch.common.lucene.Lucene;
import org.opensearch.common.lucene.index.OpenSearchDirectoryReader;
Expand Down Expand Up @@ -89,6 +90,7 @@ public class ReadOnlyEngine extends Engine {
private final CompletionStatsCache completionStatsCache;
private final boolean requireCompleteHistory;
private final TranslogManager translogManager;
private final Version minimumSupportedVersion;

protected volatile TranslogStats translogStats;

Expand All @@ -115,6 +117,8 @@ public ReadOnlyEngine(
) {
super(config);
this.requireCompleteHistory = requireCompleteHistory;
// fetch the minimum Version for extended backward compatibility use-cases
this.minimumSupportedVersion = config.getIndexSettings().getExtendedCompatibilitySnapshotVersion();
try {
Store store = config.getStore();
store.incRef();
Expand All @@ -126,7 +130,11 @@ public ReadOnlyEngine(
// we obtain the IW lock even though we never modify the index.
// yet this makes sure nobody else does. including some testing tools that try to be messy
indexWriterLock = obtainLock ? directory.obtainLock(IndexWriter.WRITE_LOCK_NAME) : null;
this.lastCommittedSegmentInfos = Lucene.readSegmentInfos(directory);
if (isExtendedCompatibility()) {
this.lastCommittedSegmentInfos = Lucene.readSegmentInfosExtendedCompatibility(directory, this.minimumSupportedVersion);
} else {
this.lastCommittedSegmentInfos = Lucene.readSegmentInfos(directory);
}
if (seqNoStats == null) {
seqNoStats = buildSeqNoStats(config, lastCommittedSegmentInfos);
ensureMaxSeqNoEqualsToGlobalCheckpoint(seqNoStats);
Expand Down Expand Up @@ -215,7 +223,17 @@ protected final OpenSearchDirectoryReader wrapReader(

protected DirectoryReader open(IndexCommit commit) throws IOException {
assert Transports.assertNotTransportThread("opening index commit of a read-only engine");
return new SoftDeletesDirectoryReaderWrapper(DirectoryReader.open(commit), Lucene.SOFT_DELETES_FIELD);
DirectoryReader reader;
if (isExtendedCompatibility()) {
reader = DirectoryReader.open(commit, this.minimumSupportedVersion.luceneVersion.major, null);
} else {
reader = DirectoryReader.open(commit);
}
return new SoftDeletesDirectoryReaderWrapper(reader, Lucene.SOFT_DELETES_FIELD);
}

private boolean isExtendedCompatibility() {
return Version.CURRENT.minimumIndexCompatibilityVersion().onOrAfter(this.minimumSupportedVersion);
}

@Override
Expand Down
14 changes: 12 additions & 2 deletions server/src/main/java/org/opensearch/index/shard/IndexShard.java
Original file line number Diff line number Diff line change
Expand Up @@ -2067,7 +2067,7 @@ public void openEngineAndRecoverFromTranslog() throws IOException {
};

// Do not load the global checkpoint if this is a remote snapshot index
if (IndexModule.Type.REMOTE_SNAPSHOT.match(indexSettings) == false) {
if (indexSettings.isRemoteSnapshot() == false) {
loadGlobalCheckpointToReplicationTracker();
}

Expand Down Expand Up @@ -2126,7 +2126,7 @@ private void innerOpenEngineAndTranslog(LongSupplier globalCheckpointSupplier) t
}

private boolean assertSequenceNumbersInCommit() throws IOException {
final Map<String, String> userData = SegmentInfos.readLatestCommit(store.directory()).getUserData();
final Map<String, String> userData = fetchUserData();
assert userData.containsKey(SequenceNumbers.LOCAL_CHECKPOINT_KEY) : "commit point doesn't contains a local checkpoint";
assert userData.containsKey(MAX_SEQ_NO) : "commit point doesn't contains a maximum sequence number";
assert userData.containsKey(Engine.HISTORY_UUID_KEY) : "commit point doesn't contains a history uuid";
Expand All @@ -2141,6 +2141,16 @@ private boolean assertSequenceNumbersInCommit() throws IOException {
return true;
}

private Map<String, String> fetchUserData() throws IOException {
if (indexSettings.isRemoteSnapshot() && indexSettings.getExtendedCompatibilitySnapshotVersion() != null) {
// Inefficient method to support reading old Lucene indexes
return Lucene.readSegmentInfosExtendedCompatibility(store.directory(), indexSettings.getExtendedCompatibilitySnapshotVersion())
.getUserData();
} else {
return SegmentInfos.readLatestCommit(store.directory()).getUserData();
}
}

private void onNewEngine(Engine newEngine) {
assert Thread.holdsLock(engineMutex);
refreshListeners.setCurrentRefreshLocationSupplier(newEngine.translogManager()::getTranslogLastWriteLocation);
Expand Down
Loading

0 comments on commit 095c222

Please sign in to comment.