-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make typeless APIs usable with indices whose type name is different from _doc
#35790
Make typeless APIs usable with indices whose type name is different from _doc
#35790
Conversation
…rom `_doc`. This commit makes `document`, `update`, `explain`, `termvectors` and `mapping` typeless APIs work on indices that have a type whose name is not `_doc`. Unfortunately, this needs to be a bit of a hack since I didn't want calls with random type names to see documents with the type name that the user had chosen upon type creation. The `explain` and `termvectors` do not support being called without a type for now so the test is just using `_doc` as a type for now, we will need to fix tests later but this shouldn't require further changes server-side since passing `_doc` as a type name is what typeless APIs do internally anyway. Relates elastic#35190
Pinging @elastic/es-search |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jpountz for working on this! Some comments:
- I noticed that when there is an existing mapping type, a typeless 'put mapping' call can result in an assertion error around mapping versions. Here is a short reproduction: https://gist.github.com/jtibshirani/66bb9719a6398fe05d966bf2aba4c68a
- I think we also need to account for the types mismatch when checking whether routing is required. This applies to a few of the APIs, but here is an example call for explain requests: https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/action/explain/TransportExplainAction.java#L86
- Maybe we could add tests for 'bulk' requests and mtermvectors for completeness?
I also thought it would be good if someone from the distributed team could take a look at the PR, as these changes are a bit subtle, and there may be other special cases to consider.
@@ -3497,4 +3500,57 @@ public void testResetEngine() throws Exception { | |||
public Settings threadPoolSettings() { | |||
return Settings.builder().put(super.threadPoolSettings()).put("thread_pool.estimated_time_interval", "5ms").build(); | |||
} | |||
|
|||
public void testDeleteOnWrongType() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comment: maybe this test and the next should have consistent names (like testTypelessDelete
and testTypelessGet
)?
@@ -302,6 +302,11 @@ public Query buildFilteredQuery(Query query) { | |||
|
|||
private Query createTypeFilter(String[] types) { | |||
if (types != null && types.length >= 1) { | |||
if (Arrays.asList(types).contains(MapperService.SINGLE_MAPPING_NAME)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was added to deal with 'explain' requests, but for me this makes the overall search behavior a bit unintuitive. If a user explicitly adds a (deprecated) type filter on _doc
in a search, I think they would expect the filter to be applied instead of being ignored. From a user's perspective the search APIs differ from the document CRUD APIs, where the _doc
is instead seen as part of the endpoint (and not an explicit filter).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
* has no mappings. | ||
* @deprecated Use {@link #mapping()} instead now that indices have a single type | ||
*/ | ||
@Deprecated | ||
public ImmutableOpenMap<String, MappingMetaData> getMappings() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that we should remove this one too in the future but decided to keep it for a follow-up PR as this PR would otherwise be much larger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Thanks for the comments @jtibshirani, I pushed more changes to address the issues that you found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, just a couple small comments. Do you think someone else should look it over as well, who's very familiar with some of these core classes? This PR also reminded me that we need to remove support for the _default_
mapping type.
* has no mappings. | ||
* @deprecated Use {@link #mapping()} instead now that indices have a single type | ||
*/ | ||
@Deprecated | ||
public ImmutableOpenMap<String, MappingMetaData> getMappings() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -0,0 +1,33 @@ | |||
--- | |||
"mtermvectors without types on an index that has types": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comment: inconsistent spacing in this test (extra newline in mtermvectors.body.docs
, spaces before :
).
@@ -104,15 +107,19 @@ protected void asyncShardOperation(ExplainRequest request, ShardId shardId, | |||
|
|||
@Override | |||
protected ExplainResponse shardOperation(ExplainRequest request, ShardId shardId) throws IOException { | |||
String[] types; | |||
if (MapperService.SINGLE_MAPPING_NAME.equals(request.type())) { // typeless explain call | |||
types = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comment, but I think an empty array would be more consistent with the default value of types
in ShardSearchRequest
, QueryShardContext
, etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
The elasticsearch/server/src/main/java/org/elasticsearch/index/mapper/MapperService.java Lines 414 to 416 in be75b40
|
@javanna Would you mind having a look at this change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jpountz, I had missed the _default_
mapping removal.
@javanna just a friendly ping on this (I have a couple PRs that are dependent on this change). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some questions, nothing major though I think.
body: { foo: bar } | ||
|
||
- do: | ||
catch: bad_request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get where this comes from, yet the error is a bit weird, specifically because it mentions mapping update while we are deleting a document, which may confuse users:
Rejecting mapping update to [index] as the final mapping would have more than 1 type: [not_doc, some_random_type]
Not sure whether this was previously discussed and/or can be improved. I was even wondering if this should have been a 404 - doc not found or 400 - request is wrong as the type is not there...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit unrelated to this PR, the behavior was already like this before. I agree it's not great but at the same time it's a bit tricky given that DELETE calls are write calls and that there are expectations regarding versioning if you delete a non-existing document and create a new one soon after with the same id.
indices.refresh: {} | ||
|
||
- do: | ||
catch: missing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes more sense to me compared to the 400 above. Yet I think - besides which error code is better - the two scenarios (explain and delete) should return the same error code when referring to a doc belonging to a type that's not there?
delete: | ||
index: index | ||
type: some_random_type | ||
id: 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add some assert for the error message that comes back too?
- match: { _type: "_doc" } | ||
- match: { _id: "1"} | ||
- match: { _version: 1} | ||
- match: { _source: { foo: bar }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we also need to check whether the doc comes back when specifying not_doc as a type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't add a test for it because this is already tested by get/11_basic_with_types.yml
?
- match: { _index: "index" } | ||
- match: { _type: "_doc" } | ||
- match: { _id: "1"} | ||
- match: { _version: 2} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we also need to check whether the doc gets deleted when specifying not_doc as a type? Or maybe that is already covered by existing tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed this is tested by delete/13_basic_with_types.yml
- match: { _type: "_doc" } | ||
- match: { _id: "1"} | ||
- is_true: matched | ||
- match: { explanation.value: 1 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we also need to check whether the doc is found when specifying not_doc as a type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have one already
type: some_random_type | ||
id: 1 | ||
|
||
- is_false: found |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we also need to check what happens when providing not_doc as a type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have one already
if (uidTerm == null) { | ||
return new ExplainResponse(shardId.getIndexName(), request.type(), request.id(), false); | ||
} | ||
// No need to check the type, IndexShard#get does it for is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for is ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for us :) thanks for catching I'll fix
If I have a mapping with custom type |
@markharwood This is done consciously: by always returning |
@elasticmachine test this please |
Are search responses planning to do this too then? |
@markharwood I'm not 100% sure about this one, the |
- deprecate type-based constructors of IndexRequest - update tests to use typeless IndexRequest constructors - no yaml tests as they have been already added in elastic#35790 Relates to elastic#35190
We introduce a typeless API in elastic#35790 where we translate the default docType "_doc" to the user-defined docType. However, we still use docType from the source rather the translated type (i.e, type of the Mapper) when parsing a document. This leads to a situation where we have two translog operations for the same document with different types: - prvOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='_doc', seqNo=1, primaryTerm=1, version=1, autoGeneratedIdTimestamp=1545125562123}] - newOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='not_doc', seqNo=1, primaryTerm=1, version=1, autoGeneratedIdTimestamp=-1}] Closes elastic#36769
We introduce a typeless API in #35790 where we translate the default docType "_doc" to the user-defined docType. However, we do not rewrite the SourceToParse with the resolved docType. This leads to a situation where we have two translog operations for the same document with different types: - prvOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='_doc', seqNo=1, primaryTerm=1, version=1, autoGeneratedIdTimestamp=1545125562123}] - newOp [Index{id='9LCpwGcBkJN7eZxaB54L', type='not_doc', seqNo=1, primaryTerm=1, version=1, autoGeneratedIdTimestamp=-1}] Closes #36769
This commit makes
document
,update
,explain
,termvectors
andmapping
typeless APIs work on indices that have a type whose name is not
_doc
.Unfortunately, this needs to be a bit of a hack since I didn't want calls with
random type names to see documents with the type name that the user had chosen
upon type creation.
The
explain
andtermvectors
do not support being called without a type fornow so the test is just using
_doc
as a type for now, we will need to fixtests later but this shouldn't require further changes server-side since passing
_doc
as a type name is what typeless APIs do internally anyway.Relates #35190