-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for VERSION field type in SQL and EQL #85502
Add support for VERSION field type in SQL and EQL #85502
Conversation
Pinging @elastic/es-ql (Team:QL) |
Hi @luigidellaquila, I've created a changelog YAML for you. |
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/type/DataTypeConverter.java
Outdated
Show resolved
Hide resolved
…sion_data_type_in_ql' into enhancement/support_version_data_type_in_ql
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, LGTM!
/** | ||
* Does the provided {@code version} support the version type (PR#85502)? | ||
*/ | ||
public static boolean supportsVersionFieldType(Version version) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
supportsVersionType()
?
@@ -160,6 +163,26 @@ public void testUnsignedLongFiltering() { | |||
} | |||
} | |||
|
|||
public void testVersionFieldFiltering() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testVersionTypeFiltering()
?
@@ -80,7 +80,12 @@ public static ScriptTemplate nullSafeFilter(ScriptTemplate script) { | |||
} | |||
|
|||
public static ScriptTemplate nullSafeSort(ScriptTemplate script) { | |||
String methodName = script.outputType().isNumeric() ? "nullSafeSortNumeric" : "nullSafeSortString"; | |||
String methodName; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iis this change needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a leftover, fixed
@@ -32,6 +32,7 @@ public enum EsType implements SQLType { | |||
TIME(Types.TIME), | |||
DATETIME(Types.TIMESTAMP), | |||
IP(Types.VARCHAR), | |||
VERSION(Types.VARCHAR), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not appending into this enum might raise some bwc concerns (see #65145 (comment))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, fixed
@@ -133,6 +134,9 @@ protected Object unwrapCustomValue(Object values) { | |||
// since its later processing will be type dependent. (ex.: negation of UL is only "safe" for 0 values) | |||
return convert(values, UNSIGNED_LONG); | |||
} | |||
if (dataType == VERSION && values instanceof String) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type check is safe, but wondering if necessary, since the Version converter will eventually do the same check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 also seems redundant to me
// some higher versions | ||
for (int i = 0; i < randomInt(10); i++) { | ||
index("test", "" + (docId++), builder -> { | ||
String versionVal = (2 + randomInt(50)) + "." + randomInt(50) + "." + randomInt(50); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious if there's any reason not to allow a major of 0
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing very specific, I just needed a random dataset that did not collide with my WHERE conditions.
Looking at it now, probably it's a bit overkill to test a result set, but still it validates that the query returns correct results in a fairly randomized environment.
// bad version value | ||
query = "SELECT name, version from test where version = 'foo'"; | ||
doWithQuery(query, results -> { | ||
results.next(); | ||
assertEquals("version foo", results.getString("name")); | ||
assertEquals("foo", results.getString("version")); | ||
assertFalse(results.next()); | ||
}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this second test add anything to the first one? JDBC has no Version type knowledge so it can only convert to string, so the content doesn't really matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there is this distinction between valid and invalid versions, I thought it would be good to have both in the result set test. Anyway, I agree that knowing the implementation details this test is probably not crucial. I'll refactor it a bit to make it shorter.
@@ -349,6 +351,45 @@ public void testUnsignedLongVersionCompatibility() { | |||
} | |||
} | |||
|
|||
public void testVersionFieldVersionCompatibility() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testVersionTypeVersionCompatibilty()
? SELECT '1.2.3'::version'...
makes use of no field to test.
@@ -170,6 +179,30 @@ public void testUnsignedLongFiltering() { | |||
} | |||
} | |||
|
|||
public void testVersionFieldFiltering() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testVersionTypeFiltering()
?
@@ -44,6 +44,8 @@ public final class DataTypes { | |||
public static final DataType DATETIME = new DataType("DATETIME", "date", Long.BYTES, false, false, true); | |||
// ip | |||
public static final DataType IP = new DataType("ip", 45, false, false, true); | |||
// version | |||
public static final DataType VERSION = new DataType("version", Integer.MAX_VALUE, false, false, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Giving VERSION a size of Integer.MAX_VALUE seems wrong, though I guess that's the theoretical max. Not sure if ES type enforces any limit? It seems there's only a recommendation, but no actual limit: https://semver.org/#does-semver-have-a-size-limit-on-the-version-string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, any string can be indexed as a Version, so the limit is the same as a KEYWORD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
[[queries]] | ||
name = "sequenceWithVersionConcat" | ||
query = ''' | ||
sequence by transID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can a version also be the join key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, adding a specific test case
return compare(lN, rN); | ||
} | ||
|
||
// automatic conversion for versions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this still needed? Shouldn't by now the runtime type always be Version
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on how tolerant we want to be with automatic casts in situations like local folding (e.g WHERE '1.2.0' < '1.11.0'::version
) and function evaluation (eg. IIF(version > '1.1', 1, 0)
).
In general, we do not do it for local folding (eg. for numbers '2' > 1
returns false
) but we do it for field queries (eg. id > '3'
will evaluate to true
even if id
is numeric, same with version > 1.2
).
IMHO this is an inconsistency: the same operation should have the same behavior, locally and in _search.
Since local folding is just an optimization or a fallback in most of the cases, so I tend to consider the automatic cast as the expected behavior.
So from my point of view we should leave it as it is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would tend to not make an exception for Version
. Since we have this distinction between local execution and queries it should at least be consistent across types.
@@ -133,6 +134,9 @@ protected Object unwrapCustomValue(Object values) { | |||
// since its later processing will be type dependent. (ex.: negation of UL is only "safe" for 0 values) | |||
return convert(values, UNSIGNED_LONG); | |||
} | |||
if (dataType == VERSION && values instanceof String) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 also seems redundant to me
@elasticmachine update branch |
@@ -35,7 +35,7 @@ | |||
* lexically in ASCII sort order. Numeric identifiers always have lower precedence than non-numeric identifiers. | |||
* </ul> | |||
*/ | |||
class VersionEncoder { | |||
public class VersionEncoder { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the code touches on code outside QL, please find one of the authors/team responsible for the code to review these changes.
If only constants are being used, it's fine to copy them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, actually these changes are a leftover from the previous attempts to support Version without impacting on the Search/Painless implementation, but since we went with a more complete solution (see #85990), we don't need them anymore.
@@ -62,6 +64,15 @@ public String esType() { | |||
return esType; | |||
} | |||
|
|||
public ScriptSortBuilder.ScriptSortType scriptSortType() { | |||
if (isNumeric()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: return isNumeric() ? ScriptSortType.NUMBER : this == DataTypes.VERSION ? ScriptSortType.VERSION : ScriptSortType.STRING
@@ -274,7 +274,9 @@ private static XContentBuilder toXContent(ColumnInfo info, XContentBuilder build | |||
* Serializes the provided value in SQL-compatible way based on the client mode | |||
*/ | |||
public static XContentBuilder value(XContentBuilder builder, Mode mode, SqlVersion sqlVersion, Object value) throws IOException { | |||
if (value instanceof ZonedDateTime zdt) { | |||
if (value instanceof org.elasticsearch.xpack.versionfield.Version) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Version is not a widely used type hence why it should be the else if
not at the main if
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requesting changes regarding the impact on mapper-version project.
…sion_data_type_in_ql' into enhancement/support_version_data_type_in_ql
@elasticmachine update branch |
…sion_data_type_in_ql' into enhancement/support_version_data_type_in_ql
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reviewing the feedback. LGTM!
@elasticmachine update branch |
This reverts commit 79b0078.
Had to revert due to the following:
The problem seems to be related to how Version objects are translated to strings to be serialized, so it should be deterministic, but strangely it's not (before merging the build was green and I didn't notice this error before during the tests) |
Fixes #83375