Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QL: add unsigned_long type support #65145

Merged
merged 56 commits into from
Feb 3, 2022
Merged

Conversation

bpintea
Copy link
Contributor

@bpintea bpintea commented Nov 17, 2020

This PR introduces the UNSIGNED_LONG type to QL following its availability in ES (#60050).

The type is mapped to a BigInteger whose value is checked against the UL bounds.

The SQL will now support the type as literal and in the arithmetic functions; the non-arithmetic functions however are unchanged (i.e. they still require a long / int parameter where that is the case).

The type is version-gated: for the driver SQL clients (only) the server checks their version and in case this is lower than the one introducing the UL support, it fails the request, for queries, or simply hidden in catalog functions (similar to how UNSUPPORTED is currently treated in the similar case)

The JDBC tests are adjusted to read the (bwc) version of the driver they are run against and selectively disable part of the tests accordingly.

Closes #63312

This commit introduces the UNSIGNED_LONG type to QL, following it's
availabilty in ES. Just like in ES, the type is encoded as a BigInteger.

JDBC is now also UNSIGNED_LONG enabled.

In SQL, the type is exposed only to driver clients of a version equal or
greater to 7.11.0 (and all non-driver clients, irrespective of their
version).
Remove unused imports.
Disable UNSIGNED_LONG testing for those bwc tests with drivers older
than 7.11 - first target release to support them.
Also, undo change of BigInteger.valueOf(randomNonNegativeLong()) to
randomBigInteger() for long-requiring tests.
Move the condition checking wheatehr unsigned_long fields should be
exposed to the clients from the IndexResolver to the Verifier,
SysColumns and SysTypes commands.
Vars renaming, dead code removal.
@@ -231,6 +235,18 @@ public Verifier(Metrics metrics) {

failures.addAll(localFailures);
});

if (config.version() != null) {
List<LogicalPlan> projects = plan.collectFirstChildren(x -> x instanceof Project);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version check added to the Verifier since ideally we want only the "top" project be inspected: if the query specs an unsigned_long in a subselect, that would lead to no bwc-compatibility issues with the client.

@bpintea bpintea marked this pull request as ready for review November 23, 2020 09:19
@elasticmachine elasticmachine added the Team:QL (Deprecated) Meta label for query languages team label Nov 23, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a first set of reviews for almost half of the changes.

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of, thanks for looking into this. It's a far reaching feature.
Second, it is a big PR making it hard not to produce but also review. I realize it's tricky however in the future please use a feature branch where the functionality can be built in stages (introducing the data type, checking the math operations, adding the type verification, adding the tests).
It's a tiny bit of overhead as it requires planning and waiting for reviews but it's much more digestable and easier to follow.

P.S. In fact since it took a while to get to it, I stopped at the test suite to give a first set of comments.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I think you need to add a test in FieldExtractorTestCase as well.
  2. I would look into adding a field type to test_emp_copy index and have some integration tests with this field type. Look at DataLoader.loadEmpDatasetWithExtraIntoEs where this index is created with some "extra" fields in it. There is a small section there where a constant_keyword and a wildcard field are added to the mapping. And further down in the method, there are values assigned to those fields. I think the easiest way of adding an unsigned_long is to put in it the salary value and then add some minimal tests to the csv-spec/sql-spec set. In DataLoader you'd do something like this:
            for (int f = 0; f < fields.size(); f++) {
                // an empty value in the csv file is treated as 'null', thus skipping it in the bulk request
                if (fields.get(f).trim().length() > 0) {
                    if (hadLastItem) {
                        bulk.append(",");
                    }
                    hadLastItem = true;
                    bulk.append('"').append(titles.get(f)).append("\":\"").append(fields.get(f)).append('"');
                    if (titles.get(f).equals("gender") && extraFields) {
                        bulk.append(",\"extra_gender\":\"Female\"");
                    } else if (titles.get(f).equals("salary") && extraFields) {
                        bulk.append(",\"ul\":\"" + fields.get(f) + "\"");
                    }
                }

2.1. some ideas here
2.2. from the unsigned_long documentation I see a red-flag when it comes to using this field type in Scripts, as they seem they are unsupported. We need to decide if we add something to the documentation or handle this field type "nicely" and output a pretty error to the user in case the query they are building will end up using scripts (kind of hard to do I think). At the moment not even a simple SELECT CAST(u_l AS INTEGER) won't work for unsigned_long.
3. Do you think the tests we have so far answer the question "is running SELECT unsigned_long FROM test using an old JDBC driver version handled correctly both server and driver side?"

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left also some first comments, but need to check more.

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left another round of comments.

- refactoring of the QA JDBC tests;
- introduce csv/sql-spec files;
- list of punctual changes as suggested.
Take SqlVersion -> Version out of a loop.
Set a minimum shard version at UL-version for searches.
@elasticsearchmachine
Copy link
Collaborator

Hi @bpintea, I've created a changelog YAML for you.

@bpintea
Copy link
Contributor Author

bpintea commented Jan 31, 2022

When multiplying intervals with ulongs, the result overflows instead of failing with an overflow error

That's true, although not UL specific. I've opened #83336.

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left yet another round.

This extracts the unsigned_long replacement with UnsupportedEsField into
own class, `IndexCompatibility`, in case backwards compatibility needs
to apply. This leaves the responsability to ensure BWC to
IndexResolver's callers.
@bpintea
Copy link
Contributor Author

bpintea commented Feb 2, 2022

@elasticmachine run elasticsearch-ci/packaging-tests-unix-sample

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes - left minor polishing comments.

Move the compatibility filtering out of the Analyzer, which now receives
the IndexResolution already filtered; this now happens in the
SqlSession.
Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from my side.

@mark-vieira mark-vieira added v8.2.0 and removed v8.1.0 labels Feb 2, 2022
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@Luegg Luegg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm 🚀

Rename loadMapping to loadIndexResolution.
Bump INTRODUCING_UNSIGNED_LONG version.
@bpintea
Copy link
Contributor Author

bpintea commented Feb 3, 2022

@elasticmachine run elasticsearch-ci/docs

@bpintea
Copy link
Contributor Author

bpintea commented Feb 3, 2022

@elasticmachine update branch

@jrodewig
Copy link
Contributor

jrodewig commented Feb 3, 2022

@elasticmachine run elasticsearch-ci/docs

2 similar comments
@bpintea
Copy link
Contributor Author

bpintea commented Feb 3, 2022

@elasticmachine run elasticsearch-ci/docs

@jrodewig
Copy link
Contributor

jrodewig commented Feb 3, 2022

@elasticmachine run elasticsearch-ci/docs

@bpintea bpintea merged commit 67c98e9 into elastic:master Feb 3, 2022
@bpintea bpintea deleted the feat/unsigned_long branch February 3, 2022 19:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/EQL EQL querying :Analytics/SQL SQL querying >enhancement Team:QL (Deprecated) Meta label for query languages team v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

QL: Add support for unsigned_long field type