This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Added following string functions: regex, substr, substring, ltrim, rtrim, trim, upper, lower, concat, concat_ws, length, strcmp #750

Merged
Changes from 86 commits
Commits
88 commits
fb2ed91
Bug fix, support long type for aggregation (#522)
penghuo Jun 17, 2020
254f2e0
Opendistro Release 1.9.0 (#532)
joshuali925 Jun 24, 2020
33c6d3e
Rename release notes to use 4 digit versions (#547)
joshuali925 Jul 6, 2020
09132da
Revert changes ahead of develop branch in master (#551)
joshuali925 Jul 9, 2020
893cd18
Merge develop branch to master (#553)
joshuali925 Jul 9, 2020
923c96d
Merge all SQL repos and adjust workflows (#549) (#554)
joshuali925 Jul 9, 2020
4b33a2f
add date and time support (#560)
penghuo Jul 13, 2020
af74293
Revert "add date and time support (#560)" (#567)
penghuo Jul 13, 2020
baac103
resolve conflict
joshuali925 Jul 29, 2020
35e37a3
Merge develop to master for ODFE 1.9.0.1 release (#633)
joshuali925 Jul 29, 2020
0ed6594
Merge fixes for github release actions from develop to master (#638)
joshuali925 Jul 29, 2020
e4981e3
Fix odbc win32 release workflow for master (#642)
joshuali925 Jul 30, 2020
c11125d
add error details for all server communication errors (#645)
jordanw-bq Jul 31, 2020
34b979e
Revert "add error details for all server communication errors (#645)"…
chloe-zh Aug 4, 2020
5ab1bc8
Merge pull request #698 from opendistro-for-elasticsearch/develop
chloe-zh Aug 20, 2020
8735723
Merge develop branch into master for od1.10 release (#701)
chloe-zh Aug 20, 2020
332ee9c
Merge branch 'develop' of github.com:opendistro-for-elasticsearch/sql
chloe-zh Aug 20, 2020
2939081
Merge workflow fix to master for od1.10 release (#704)
chloe-zh Aug 20, 2020
9e82a91
Fix download link in package description (#729)
gaiksaya Sep 4, 2020
89c6e05
Merge develop to master for ODFE 1.10.1.0 release (#733)
joshuali925 Sep 8, 2020
625a178
[1] Initial commit, checking if server build passes
lyndonbauto Sep 19, 2020
715e0c1
[1] Commiting expression documentation with REGEXP
lyndonbauto Sep 19, 2020
30e6768
[1] Failure with REGEXP doc
lyndonbauto Sep 19, 2020
6b52b2b
[1] Moved testing for regexp to binary predicates.
lyndonbauto Sep 19, 2020
baae0fc
[1] Updating parser for REGEX
lyndonbauto Sep 19, 2020
38a2bb9
[1] Parser update
lyndonbauto Sep 19, 2020
da44ff9
[1] Making REGEXP like LIKE
lyndonbauto Sep 19, 2020
04dcacb
[1] Reverting change to legacy
lyndonbauto Sep 19, 2020
0291223
[1] Checking if same without NOT
lyndonbauto Sep 19, 2020
bd93eaa
[1] testing adding to Ast Expr
lyndonbauto Sep 19, 2020
0f246c0
[1] Switching REGEX over to Integer.
lyndonbauto Sep 19, 2020
4bae1c3
[1] Reversion test
lyndonbauto Sep 19, 2020
6e88f42
[1] Add back test
lyndonbauto Sep 19, 2020
1dde4c5
[1] Fixing spacing
lyndonbauto Sep 19, 2020
4a9475d
[1] Regexp builder test.
lyndonbauto Sep 19, 2020
50dbd1a
-2
lyndonbauto Sep 19, 2020
437361e
-1
lyndonbauto Sep 19, 2020
944a8d9
[1] trying with semicolon
lyndonbauto Sep 19, 2020
300f25c
[1] Found the missing link >_<
lyndonbauto Sep 19, 2020
7028500
[1] Functions documentation
lyndonbauto Sep 19, 2020
c84b12b
[1] Fixing documentation mistake.
lyndonbauto Sep 19, 2020
077127b
[1] Retesting
lyndonbauto Sep 19, 2020
8dff4ff
[1] Trying to debug python
lyndonbauto Sep 19, 2020
accdc63
[1] more python debug info
lyndonbauto Sep 19, 2020
3364883
[1] Trying again.
lyndonbauto Sep 19, 2020
989bcfa
[1] MOre py inof
lyndonbauto Sep 19, 2020
0b1fab0
[1] Fixed except
lyndonbauto Sep 19, 2020
f02b695
[1] Simplified concat and concat_ws
lyndonbauto Sep 19, 2020
876fc11
[1] Added missing stuff to paraser
lyndonbauto Sep 19, 2020
7274aac
[1] Fixed some functions and removed some unused imports
lyndonbauto Sep 19, 2020
7c962ae
[1] Fixed STRCMP
lyndonbauto Sep 19, 2020
1236875
[1] Trying to fix aliasing issue with substring
lyndonbauto Sep 19, 2020
dd9dd32
[1] Fixed stringcompare and substring
lyndonbauto Sep 19, 2020
8238e42
[1] REmoving unused imorts
lyndonbauto Sep 19, 2020
3f1d7b2
[1] Fixed documentation
lyndonbauto Sep 19, 2020
97d524d
[1] REGEXP not supported by sqllite so removing these for now.
lyndonbauto Sep 19, 2020
28600d3
[1] Removed auto IT and added manual.
lyndonbauto Sep 19, 2020
dceb32e
[1] Fixed spacing
lyndonbauto Sep 19, 2020
571f25b
[1] Fixed type definitions
lyndonbauto Sep 19, 2020
8e369eb
[1] Fixed integer values and ltrim
lyndonbauto Sep 19, 2020
c19e084
[1] COrrecting ltrim again
lyndonbauto Sep 19, 2020
9cf9cac
[1] Changed patterns
lyndonbauto Sep 19, 2020
58553ed
[1] Fixed some minor issues
lyndonbauto Sep 19, 2020
9d2ee97
[1] reverting change i didnt make
lyndonbauto Sep 19, 2020
fdcd554
[1] Condensed logic
lyndonbauto Sep 22, 2020
99532e6
[1] Removed SUBSTRING FunctionName.
lyndonbauto Sep 22, 2020
334509b
[1] Reverted failure issues
lyndonbauto Sep 22, 2020
d3cef5b
[1] Combined substring and substr test.
lyndonbauto Sep 22, 2020
539fa2a
[1] Added ppl test and edited caps in textfunctiontest
lyndonbauto Sep 22, 2020
fc24427
[1] Testing without source
lyndonbauto Sep 22, 2020
e5e4d15
[1] Correcting format of string
lyndonbauto Sep 22, 2020
7fd259d
[1] Testing new queries
lyndonbauto Sep 22, 2020
6708826
[1] Adding resource and fixed tests
lyndonbauto Sep 22, 2020
e51aa81
[1] Added maapping and adjusted tests
lyndonbauto Sep 22, 2020
cba5aae
[1] minor corrections
lyndonbauto Sep 22, 2020
85a24e3
[1] Additional debug info
lyndonbauto Sep 22, 2020
018da57
[1] Removing unsuspported ppl functions.
lyndonbauto Sep 22, 2020
24066a9
[1] Added back unsupported
lyndonbauto Sep 22, 2020
10f143d
[1] Checking regex
lyndonbauto Sep 22, 2020
b918bad
[1] Removing printout
lyndonbauto Sep 22, 2020
3fa954c
[1] Trying to fix commit
lyndonbauto Sep 23, 2020
0ceb735
Revert "[1] Trying to fix commit"
lyndonbauto Sep 23, 2020
b03c6d6
[1] Pulling develop to fix conflicts
lyndonbauto Sep 23, 2020
3e87927
[1] Adding rest of commit
lyndonbauto Sep 23, 2020
8c67ee8
[1] Adding workflow files
lyndonbauto Sep 23, 2020
7342fb9
[1] fixed docs
lyndonbauto Sep 23, 2020
d85c258
[1] Updated based on PR comments.
lyndonbauto Sep 23, 2020
8dfd7a6
[1] Fixed code so tests pass
lyndonbauto Sep 23, 2020
@@ -273,6 +273,54 @@ public FunctionExpression module(Expression... expressions) {
return function(BuiltinFunctionName.MODULES, expressions);
}

public FunctionExpression substr(Expression... expressions) {
return function(BuiltinFunctionName.SUBSTR, expressions);
}

public FunctionExpression substring(Expression... expressions) {
return function(BuiltinFunctionName.SUBSTR, expressions);
}

public FunctionExpression ltrim(Expression... expressions) {
return function(BuiltinFunctionName.LTRIM, expressions);
}

public FunctionExpression rtrim(Expression... expressions) {
return function(BuiltinFunctionName.RTRIM, expressions);
}

public FunctionExpression trim(Expression... expressions) {
return function(BuiltinFunctionName.TRIM, expressions);
}

public FunctionExpression upper(Expression... expressions) {
return function(BuiltinFunctionName.UPPER, expressions);
}

public FunctionExpression lower(Expression... expressions) {
return function(BuiltinFunctionName.LOWER, expressions);
}

public FunctionExpression regexp(Expression... expressions) {
return function(BuiltinFunctionName.REGEXP, expressions);
}

public FunctionExpression concat(Expression... expressions) {
return function(BuiltinFunctionName.CONCAT, expressions);
}

public FunctionExpression concat_ws(Expression... expressions) {
return function(BuiltinFunctionName.CONCAT_WS, expressions);
}

public FunctionExpression length(Expression... expressions) {
return function(BuiltinFunctionName.LENGTH, expressions);
}

public FunctionExpression strcmp(Expression... expressions) {
return function(BuiltinFunctionName.STRCMP, expressions);
}

public FunctionExpression and(Expression... expressions) {
return function(BuiltinFunctionName.AND, expressions);
}
@@ -24,6 +24,7 @@
import com.amazon.opendistroforelasticsearch.sql.expression.operator.arthmetic.MathematicalFunction;
import com.amazon.opendistroforelasticsearch.sql.expression.operator.predicate.BinaryPredicateOperator;
import com.amazon.opendistroforelasticsearch.sql.expression.operator.predicate.UnaryPredicateOperator;
import com.amazon.opendistroforelasticsearch.sql.expression.text.TextFunction;
import java.util.HashMap;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@@ -47,6 +48,7 @@ public BuiltinFunctionRepository functionRepository() {
AggregatorFunction.register(builtinFunctionRepository);
DateTimeFunction.register(builtinFunctionRepository);
IntervalClause.register(builtinFunctionRepository);
TextFunction.register(builtinFunctionRepository);
return builtinFunctionRepository;
}

@@ -94,6 +94,22 @@ public enum BuiltinFunctionName {
SUM(FunctionName.of("sum")),
COUNT(FunctionName.of("count")),

/**
* Text Functions.
*/
SUBSTR(FunctionName.of("substr")),
SUBSTRING(FunctionName.of("substring")),
RTRIM(FunctionName.of("rtrim")),
LTRIM(FunctionName.of("ltrim")),
TRIM(FunctionName.of("trim")),
UPPER(FunctionName.of("upper")),
LOWER(FunctionName.of("lower")),
REGEXP(FunctionName.of("regexp")),
CONCAT(FunctionName.of("concat")),
CONCAT_WS(FunctionName.of("concat_ws")),
LENGTH(FunctionName.of("length")),
STRCMP(FunctionName.of("strcmp")),

/**
* NULL Test.
*/
@@ -20,6 +20,7 @@
import static com.amazon.opendistroforelasticsearch.sql.data.model.ExprValueUtils.LITERAL_NULL;
import static com.amazon.opendistroforelasticsearch.sql.data.model.ExprValueUtils.LITERAL_TRUE;
import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.BOOLEAN;
import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.INTEGER;
import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.STRING;

import com.amazon.opendistroforelasticsearch.sql.data.model.ExprBooleanValue;
@@ -61,6 +62,7 @@ public static void register(BuiltinFunctionRepository repository) {
repository.register(gte());
repository.register(like());
repository.register(notLike());
repository.register(regexp());
}

/**
@@ -245,6 +247,12 @@ private static FunctionResolver like() {
STRING));
}

private static FunctionResolver regexp() {
return FunctionDSL.define(BuiltinFunctionName.REGEXP.getName(), FunctionDSL
.impl(FunctionDSL.nullMissingHandling(OperatorUtils::matchesRegexp),
INTEGER, STRING, STRING));
}

private static FunctionResolver notLike() {
return FunctionDSL.define(BuiltinFunctionName.NOT_LIKE.getName(), FunctionDSL
.impl(FunctionDSL.nullMissingHandling(
@@ -0,0 +1,231 @@
/*
*
* Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* A copy of the License is located at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* or in the "license" file accompanying this file. This file is distributed
* on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
* express or implied. See the License for the specific language governing
* permissions and limitations under the License.
*
*/

package com.amazon.opendistroforelasticsearch.sql.expression.text;

import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.INTEGER;
import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.STRING;
import static com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionDSL.define;
import static com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionDSL.impl;
import static com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionDSL.nullMissingHandling;

import com.amazon.opendistroforelasticsearch.sql.data.model.ExprIntegerValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprStringValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprValue;
import com.amazon.opendistroforelasticsearch.sql.expression.function.BuiltinFunctionName;
import com.amazon.opendistroforelasticsearch.sql.expression.function.BuiltinFunctionRepository;
import com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionResolver;

import lombok.experimental.UtilityClass;


/**
* The definition of text functions.
* 1) have the clear interface for function define.
* 2) the implementation should rely on ExprValue.
*/
@UtilityClass
public class TextFunction {
private static String EMPTY_STRING = "";

/**
* Register String Functions.
*
* @param repository {@link BuiltinFunctionRepository}.
*/
public void register(BuiltinFunctionRepository repository) {
repository.register(substr());
repository.register(substring());
repository.register(ltrim());
repository.register(rtrim());
repository.register(trim());
repository.register(lower());
repository.register(upper());
repository.register(concat());
repository.register(concat_ws());
repository.register(length());
repository.register(strcmp());
}

/**
* Gets substring starting at given point, for optional given length.
* Form of this function using keywords instead of comma delimited variables is not supported.
* Supports following signatures:
* (STRING, INTEGER)/(STRING, INTEGER, INTEGER) -> STRING
*/
private FunctionResolver substring() {
return define(BuiltinFunctionName.SUBSTRING.getName(),
impl(nullMissingHandling(TextFunction::exprSubstrStart),
STRING, STRING, INTEGER),
impl(nullMissingHandling(TextFunction::exprSubstrStartLength),
STRING, STRING, INTEGER, INTEGER));
}

private FunctionResolver substr() {

Review comment (Contributor): Is this the same as substring?

Reply (Contributor Author): Yeah, they are synonyms.

Reply (Contributor): Could you merge them if they are the same?

return define(BuiltinFunctionName.SUBSTR.getName(),
impl(nullMissingHandling(TextFunction::exprSubstrStart),
STRING, STRING, INTEGER),
impl(nullMissingHandling(TextFunction::exprSubstrStartLength),
STRING, STRING, INTEGER, INTEGER));
}

/**
* Removes leading whitespace from string.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver ltrim() {
return define(BuiltinFunctionName.LTRIM.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue(v.stringValue().stripLeading())),
STRING, STRING));
}

/**
* Removes trailing whitespace from string.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver rtrim() {
return define(BuiltinFunctionName.RTRIM.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue(v.stringValue().stripTrailing())),
STRING, STRING));
}

/**
* Removes leading and trailing whitespace from string.
* Has option to specify a String to trim instead of whitespace but this is not yet supported.
* Supporting String specification requires finding keywords inside TRIM command.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver trim() {
return define(BuiltinFunctionName.TRIM.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue(v.stringValue().trim())),
STRING, STRING));
}

/**
* Converts String to lowercase.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver lower() {
return define(BuiltinFunctionName.LOWER.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue((v.stringValue().toLowerCase()))),
STRING, STRING)
);
}

/**
* Converts String to uppercase.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver upper() {
return define(BuiltinFunctionName.UPPER.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue((v.stringValue().toUpperCase()))),
STRING, STRING)
);
}

/**
* TODO: https://github.com/opendistro-for-elasticsearch/sql/issues/710
* Extend to accept variable argument amounts.
* Concatenates a list of Strings.
* Supports following signatures:
* (STRING, STRING) -> STRING
*/
private FunctionResolver concat() {
return define(BuiltinFunctionName.CONCAT.getName(),
impl(nullMissingHandling((str1, str2) ->
new ExprStringValue(str1.stringValue() + str2.stringValue())), STRING, STRING, STRING));
}

/**
* TODO: https://github.com/opendistro-for-elasticsearch/sql/issues/710
* Extend to accept variable argument amounts.
* Concatenates a list of Strings with a separator string.
* Supports following signatures:
* (STRING, STRING, STRING) -> STRING
*/
private FunctionResolver concat_ws() {
return define(BuiltinFunctionName.CONCAT_WS.getName(),
impl(nullMissingHandling((sep, str1, str2) ->
new ExprStringValue(str1.stringValue() + sep.stringValue() + str2.stringValue())),
STRING, STRING, STRING, STRING));
}

/**
* Calculates length of String in bytes.
* Supports following signatures:
* STRING -> INTEGER
*/
private FunctionResolver length() {
return define(BuiltinFunctionName.LENGTH.getName(),
impl(nullMissingHandling((str) ->
new ExprIntegerValue(str.stringValue().getBytes().length)), INTEGER, STRING));
}

/**
* Does String comparison of two Strings and returns Integer value.
* Supports following signatures:
* (STRING, STRING) -> INTEGER
*/
private FunctionResolver strcmp() {
return define(BuiltinFunctionName.STRCMP.getName(),
impl(nullMissingHandling((str1, str2) ->
new ExprIntegerValue(Integer.compare(
str1.stringValue().compareTo(str2.stringValue()), 0))),
INTEGER, STRING, STRING));
}

private static ExprValue exprSubstrStart(ExprValue exprValue, ExprValue start) {
int startIdx = start.integerValue();
if (startIdx == 0) {
return new ExprStringValue(EMPTY_STRING);
}
String str = exprValue.stringValue();
return exprSubStr(str, startIdx, 0);

Review comment (Contributor): Passing length=0 here is misleading. In the Java SDK, the definition looks like this:

    public String substring(int beginIndex) {
        return substring(beginIndex, length());
    }

}

private static ExprValue exprSubstrStartLength(

Review comment (Contributor): Consider merging this with exprSubstrStart.

Reply (Contributor Author): Merged together to use the function below.

ExprValue exprValue, ExprValue start, ExprValue length) {
int startIdx = start.integerValue();
int len = length.integerValue();
if ((startIdx == 0) || (len == 0)) {
return new ExprStringValue(EMPTY_STRING);
}
String str = exprValue.stringValue();
return exprSubStr(str, startIdx, len);
}

private static ExprValue exprSubStr(String str, int start, int len) {
// Correct negative start
start = (start > 0) ? (start - 1) : (str.length() + start);

// Length 0 is only given by exprSubstrStart, exprSubstrStartLength handles this explicitly.
if ((start + len > str.length()) || (len == 0)) {
// Start is after string, return empty.
if (start > str.length()) {
return new ExprStringValue(EMPTY_STRING);
}
return new ExprStringValue(str.substring(start));
}
return new ExprStringValue(str.substring(start, start + len));
}
}
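
The index handling in exprSubStr above (1-based positive start, negative start counted from the end, length 0 meaning "to the end of the string") can be tried in isolation. The following is a minimal standalone sketch rather than code from this PR: the class `SubstrSketch` and its method names are hypothetical, and plain `String` values stand in for the `ExprValue` wrappers.

```java
// Standalone sketch; SubstrSketch and its method names are hypothetical,
// and plain Strings stand in for the ExprValue wrappers used in the diff.
final class SubstrSketch {

    // Two-argument form: everything from the 1-based start position onward.
    static String substr(String str, int start) {
        if (start == 0) {
            return "";
        }
        return slice(str, start, 0);
    }

    // Three-argument form: at most len characters from the start position.
    static String substr(String str, int start, int len) {
        if (start == 0 || len == 0) {
            return "";
        }
        return slice(str, start, len);
    }

    // Same branching as exprSubStr in the diff; len == 0 means "to the end".
    private static String slice(String str, int start, int len) {
        // Positive positions are 1-based; negative positions count from the end.
        int begin = (start > 0) ? (start - 1) : (str.length() + start);
        if (begin + len > str.length() || len == 0) {
            if (begin > str.length()) {
                return "";
            }
            return str.substring(begin);
        }
        return str.substring(begin, begin + len);
    }

    public static void main(String[] args) {
        System.out.println(substr("helloworld", 5));     // oworld
        System.out.println(substr("helloworld", 5, 3));  // owo
        System.out.println(substr("helloworld", -3));    // rld
        System.out.println(substr("helloworld", 9, 5));  // ld
    }
}
```

One case the sketch inherits unchanged: a negative start whose magnitude exceeds the string length (for example -20 on a 10-character string) produces a negative begin index, and `String.substring` throws `StringIndexOutOfBoundsException` for that input.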

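The strcmp() and length() resolvers in the file above rely on two small JDK behaviours: `Integer.compare(x, 0)` collapses the arbitrary difference returned by `String.compareTo` into -1, 0 or 1, and a byte count can differ from a character count. A minimal sketch of both follows. The class `TextSketch` and its method names are hypothetical, and the sketch pins UTF-8 explicitly as an assumption, whereas the diff calls `getBytes()` with no argument and therefore uses the platform default charset.

```java
import java.nio.charset.StandardCharsets;

// Standalone sketch; TextSketch and its method names are hypothetical.
final class TextSketch {

    // Normalises the compareTo result to -1, 0 or 1, as strcmp() does in the diff.
    static int strcmp(String a, String b) {
        return Integer.compare(a.compareTo(b), 0);
    }

    // Byte length under UTF-8 (an assumption; the diff uses the default charset).
    static int lengthInBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(strcmp("hello", "world"));        // -1
        System.out.println(strcmp("world", "hello"));        // 1
        System.out.println(strcmp("hello", "hello"));        // 0
        System.out.println("h\u00e9llo".length());           // 5 characters
        System.out.println(lengthInBytes("h\u00e9llo"));     // 6 bytes in UTF-8
    }
}
```
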
@@ -16,6 +16,7 @@
package com.amazon.opendistroforelasticsearch.sql.utils;

import com.amazon.opendistroforelasticsearch.sql.data.model.ExprBooleanValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprIntegerValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprValue;
import java.util.regex.Pattern;
import lombok.experimental.UtilityClass;
@@ -35,6 +36,16 @@ public static ExprBooleanValue matches(ExprValue text, ExprValue pattern) {
.matches());
}

/**
* Checks if text matches regular expression pattern.
* @param pattern string pattern to match.
* @return if text matches pattern returns true; else return false.
*/
public static ExprIntegerValue matchesRegexp(ExprValue text, ExprValue pattern) {
return new ExprIntegerValue(Pattern.compile(pattern.stringValue()).matcher(text.stringValue())
.matches() ? 1 : 0);
}

private static final char DEFAULT_ESCAPE = '\\';

private static String patternToRegex(String patternString) {
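
A note on matchesRegexp in the hunk above: `Matcher.matches()` succeeds only when the whole input matches the pattern, and the method returns 1 or 0 as an integer rather than a boolean. The sketch below illustrates the difference from a substring search. It is not code from this PR: `RegexpSketch` and `findsRegexp` are hypothetical names, and plain `String` and `int` values stand in for `ExprValue` and `ExprIntegerValue`.

```java
import java.util.regex.Pattern;

// Standalone sketch; RegexpSketch and findsRegexp are hypothetical names.
final class RegexpSketch {

    // Mirrors the diff: 1 only when the entire text matches the pattern.
    static int matchesRegexp(String text, String pattern) {
        return Pattern.compile(pattern).matcher(text).matches() ? 1 : 0;
    }

    // Hypothetical alternative for comparison: 1 when any substring matches.
    static int findsRegexp(String text, String pattern) {
        return Pattern.compile(pattern).matcher(text).find() ? 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(matchesRegexp("hello world", "hello"));    // 0
        System.out.println(matchesRegexp("hello world", "hello.*"));  // 1
        System.out.println(findsRegexp("hello world", "hello"));      // 1
    }
}
```

With the full-match behaviour, a caller who wants a "contains" test has to write the surrounding `.*` into the pattern.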