This repository has been archived by the owner on Aug 2, 2022. It is now read-only.
Added following string functions: regex, substr, substring, ltrim, rtrim, trim, upper, lower, concat, concat_ws, length, strcmp #750
Merged
chloe-zh
merged 88 commits into
opendistro-for-elasticsearch:develop
from
lyndonbauto:lyndon/string-functions-new-engine
Sep 24, 2020
Merged
Changes from 86 commits
Commits
fb2ed91 (penghuo) Bug fix, support long type for aggregation (#522)
254f2e0 (joshuali925) Opendistro Release 1.9.0 (#532)
33c6d3e (joshuali925) Rename release notes to use 4 digit versions (#547)
09132da (joshuali925) Revert changes ahead of develop branch in master (#551)
893cd18 (joshuali925) Merge develop branch to master (#553)
923c96d (joshuali925) Merge all SQL repos and adjust workflows (#549) (#554)
4b33a2f (penghuo) add date and time support (#560)
af74293 (penghuo) Revert "add date and time support (#560)" (#567)
baac103 (joshuali925) resolve conflict
35e37a3 (joshuali925) Merge develop to master for ODFE 1.9.0.1 release (#633)
0ed6594 (joshuali925) Merge fixes for github release actions from develop to master (#638)
e4981e3 (joshuali925) Fix odbc win32 release workflow for master (#642)
c11125d (jordanw-bq) add error details for all server communication errors (#645)
34b979e (chloe-zh) Revert "add error details for all server communication errors (#645)"…
5ab1bc8 (chloe-zh) Merge pull request #698 from opendistro-for-elasticsearch/develop
8735723 (chloe-zh) Merge develop branch into master for od1.10 release (#701)
332ee9c (chloe-zh) Merge branch 'develop' of github.com:opendistro-for-elasticsearch/sql
2939081 (chloe-zh) Merge workflow fix to master for od1.10 release (#704)
9e82a91 (gaiksaya) Fix download link in package description (#729)
89c6e05 (joshuali925) Merge develop to master for ODFE 1.10.1.0 release (#733)
625a178 (lyndonbauto) [1] Initial commit, checking if server build passes
715e0c1 (lyndonbauto) [1] Commiting expression documentation with REGEXP
30e6768 (lyndonbauto) [1] Failure with REGEXP doc
6b52b2b (lyndonbauto) [1] Moved testing for regexp to binary predicates.
baae0fc (lyndonbauto) [1] Updating parser for REGEX
38a2bb9 (lyndonbauto) [1] Parser update
da44ff9 (lyndonbauto) [1] Making REGEXP like LIKE
04dcacb (lyndonbauto) [1] Reverting change to legacy
0291223 (lyndonbauto) [1] Checking if same without NOT
bd93eaa (lyndonbauto) [1] testing adding to Ast Expr
0f246c0 (lyndonbauto) [1] Switching REGEX over to Integer.
4bae1c3 (lyndonbauto) [1] Reversion test
6e88f42 (lyndonbauto) [1] Add back test
1dde4c5 (lyndonbauto) [1] Fixing spacing
4a9475d (lyndonbauto) [1] Regexp builder test.
50dbd1a (lyndonbauto) -2
437361e (lyndonbauto) -1
944a8d9 (lyndonbauto) [1] trying with semicolon
300f25c (lyndonbauto) [1] Found the missing link >_<
7028500 (lyndonbauto) [1] Functions documentation
c84b12b (lyndonbauto) [1] Fixing documentation mistake.
077127b (lyndonbauto) [1] Retesting
8dff4ff (lyndonbauto) [1] Trying to debug python
accdc63 (lyndonbauto) [1] more python debug info
3364883 (lyndonbauto) [1] Trying again.
989bcfa (lyndonbauto) [1] MOre py inof
0b1fab0 (lyndonbauto) [1] Fixed except
f02b695 (lyndonbauto) [1] Simplified concat and concat_ws
876fc11 (lyndonbauto) [1] Added missing stuff to paraser
7274aac (lyndonbauto) [1] Fixed some functions and removed some unused imports
7c962ae (lyndonbauto) [1] Fixed STRCMP
1236875 (lyndonbauto) [1] Trying to fix aliasing issue with substring
dd9dd32 (lyndonbauto) [1] Fixed stringcompare and substring
8238e42 (lyndonbauto) [1] REmoving unused imorts
3f1d7b2 (lyndonbauto) [1] Fixed documentation
97d524d (lyndonbauto) [1] REGEXP not supported by sqllite so removing these for now.
28600d3 (lyndonbauto) [1] Removed auto IT and added manual.
dceb32e (lyndonbauto) [1] Fixed spacing
571f25b (lyndonbauto) [1] Fixed type definitions
8e369eb (lyndonbauto) [1] Fixed integer values and ltrim
c19e084 (lyndonbauto) [1] COrrecting ltrim again
9cf9cac (lyndonbauto) [1] Changed patterns
58553ed (lyndonbauto) [1] Fixed some minor issues
9d2ee97 (lyndonbauto) [1] reverting change i didnt make
fdcd554 (lyndonbauto) [1] Condensed logic
99532e6 (lyndonbauto) [1] Removed SUBSTRING FunctionName.
334509b (lyndonbauto) [1] Reverted failure issues
d3cef5b (lyndonbauto) [1] Combined substring and substr test.
539fa2a (lyndonbauto) [1] Added ppl test and edited caps in textfunctiontest
fc24427 (lyndonbauto) [1] Testing without source
e5e4d15 (lyndonbauto) [1] Correcting format of string
7fd259d (lyndonbauto) [1] Testing new queries
6708826 (lyndonbauto) [1] Adding resource and fixed tests
e51aa81 (lyndonbauto) [1] Added maapping and adjusted tests
cba5aae (lyndonbauto) [1] minor corrections
85a24e3 (lyndonbauto) [1] Additional debug info
018da57 (lyndonbauto) [1] Removing unsuspported ppl functions.
24066a9 (lyndonbauto) [1] Added back unsupported
10f143d (lyndonbauto) [1] Checking regex
b918bad (lyndonbauto) [1] Removing printout
3fa954c (lyndonbauto) [1] Trying to fix commit
0ceb735 (lyndonbauto) Revert "[1] Trying to fix commit"
b03c6d6 (lyndonbauto) [1] Pulling develop to fix conflicts
3e87927 (lyndonbauto) [1] Adding rest of commit
8c67ee8 (lyndonbauto) [1] Adding workflow files
7342fb9 (lyndonbauto) [1] fixed docs
d85c258 (lyndonbauto) [1] Updated based on PR comments.
8dfd7a6 (lyndonbauto) [1] Fixed code so tests pass
231 changes: 231 additions & 0 deletions
...src/main/java/com/amazon/opendistroforelasticsearch/sql/expression/text/TextFunction.java
@@ -0,0 +1,231 @@
/*
 *
 * Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License").
 * You may not use this file except in compliance with the License.
 * A copy of the License is located at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * or in the "license" file accompanying this file. This file is distributed
 * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
 * express or implied. See the License for the specific language governing
 * permissions and limitations under the License.
 *
 */

package com.amazon.opendistroforelasticsearch.sql.expression.text;

import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.INTEGER;
import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.STRING;
import static com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionDSL.define;
import static com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionDSL.impl;
import static com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionDSL.nullMissingHandling;

import com.amazon.opendistroforelasticsearch.sql.data.model.ExprIntegerValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprStringValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprValue;
import com.amazon.opendistroforelasticsearch.sql.expression.function.BuiltinFunctionName;
import com.amazon.opendistroforelasticsearch.sql.expression.function.BuiltinFunctionRepository;
import com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionResolver;

import lombok.experimental.UtilityClass;


/**
 * The definition of text functions.
 * 1) have the clear interface for function define.
 * 2) the implementation should rely on ExprValue.
 */
@UtilityClass
public class TextFunction {
  private static String EMPTY_STRING = "";

  /**
   * Register String Functions.
   *
   * @param repository {@link BuiltinFunctionRepository}.
   */
  public void register(BuiltinFunctionRepository repository) {
    repository.register(substr());
    repository.register(substring());
    repository.register(ltrim());
    repository.register(rtrim());
    repository.register(trim());
    repository.register(lower());
    repository.register(upper());
    repository.register(concat());
    repository.register(concat_ws());
    repository.register(length());
    repository.register(strcmp());
  }

  /**
   * Gets substring starting at given point, for optional given length.
   * Form of this function using keywords instead of comma delimited variables is not supported.
   * Supports following signatures:
   * (STRING, INTEGER)/(STRING, INTEGER, INTEGER) -> STRING
   */
  private FunctionResolver substring() {
    return define(BuiltinFunctionName.SUBSTRING.getName(),
        impl(nullMissingHandling(TextFunction::exprSubstrStart),
            STRING, STRING, INTEGER),
        impl(nullMissingHandling(TextFunction::exprSubstrStartLength),
            STRING, STRING, INTEGER, INTEGER));
  }

  private FunctionResolver substr() {
    return define(BuiltinFunctionName.SUBSTR.getName(),
        impl(nullMissingHandling(TextFunction::exprSubstrStart),
            STRING, STRING, INTEGER),
        impl(nullMissingHandling(TextFunction::exprSubstrStartLength),
            STRING, STRING, INTEGER, INTEGER));
  }

  /**
   * Removes leading whitespace from string.
   * Supports following signatures:
   * STRING -> STRING
   */
  private FunctionResolver ltrim() {
    return define(BuiltinFunctionName.LTRIM.getName(),
        impl(nullMissingHandling((v) -> new ExprStringValue(v.stringValue().stripLeading())),
            STRING, STRING));
  }

  /**
   * Removes trailing whitespace from string.
   * Supports following signatures:
   * STRING -> STRING
   */
  private FunctionResolver rtrim() {
    return define(BuiltinFunctionName.RTRIM.getName(),
        impl(nullMissingHandling((v) -> new ExprStringValue(v.stringValue().stripTrailing())),
            STRING, STRING));
  }

  /**
   * Removes leading and trailing whitespace from string.
   * Has option to specify a String to trim instead of whitespace but this is not yet supported.
   * Supporting String specification requires finding keywords inside TRIM command.
   * Supports following signatures:
   * STRING -> STRING
   */
  private FunctionResolver trim() {
    return define(BuiltinFunctionName.TRIM.getName(),
        impl(nullMissingHandling((v) -> new ExprStringValue(v.stringValue().trim())),
            STRING, STRING));
  }

  /**
   * Converts String to lowercase.
   * Supports following signatures:
   * STRING -> STRING
   */
  private FunctionResolver lower() {
    return define(BuiltinFunctionName.LOWER.getName(),
        impl(nullMissingHandling((v) -> new ExprStringValue((v.stringValue().toLowerCase()))),
            STRING, STRING)
    );
  }

  /**
   * Converts String to uppercase.
   * Supports following signatures:
   * STRING -> STRING
   */
  private FunctionResolver upper() {
    return define(BuiltinFunctionName.UPPER.getName(),
        impl(nullMissingHandling((v) -> new ExprStringValue((v.stringValue().toUpperCase()))),
            STRING, STRING)
    );
  }

  /**
   * TODO: https://github.com/opendistro-for-elasticsearch/sql/issues/710
   * Extend to accept variable argument amounts.
   * Concatenates a list of Strings.
   * Supports following signatures:
   * (STRING, STRING) -> STRING
   */
  private FunctionResolver concat() {
    return define(BuiltinFunctionName.CONCAT.getName(),
        impl(nullMissingHandling((str1, str2) ->
            new ExprStringValue(str1.stringValue() + str2.stringValue())), STRING, STRING, STRING));
  }

  /**
   * TODO: https://github.com/opendistro-for-elasticsearch/sql/issues/710
   * Extend to accept variable argument amounts.
   * Concatenates a list of Strings with a separator string.
   * Supports following signatures:
   * (STRING, STRING, STRING) -> STRING
   */
  private FunctionResolver concat_ws() {
    return define(BuiltinFunctionName.CONCAT_WS.getName(),
        impl(nullMissingHandling((sep, str1, str2) ->
            new ExprStringValue(str1.stringValue() + sep.stringValue() + str2.stringValue())),
            STRING, STRING, STRING, STRING));
  }

  /**
   * Calculates length of String in bytes.
   * Supports following signatures:
   * STRING -> INTEGER
   */
  private FunctionResolver length() {
    return define(BuiltinFunctionName.LENGTH.getName(),
        impl(nullMissingHandling((str) ->
            new ExprIntegerValue(str.stringValue().getBytes().length)), INTEGER, STRING));
  }

  /**
   * Does String comparison of two Strings and returns Integer value.
   * Supports following signatures:
   * (STRING, STRING) -> INTEGER
   */
  private FunctionResolver strcmp() {
    return define(BuiltinFunctionName.STRCMP.getName(),
        impl(nullMissingHandling((str1, str2) ->
            new ExprIntegerValue(Integer.compare(
                str1.stringValue().compareTo(str2.stringValue()), 0))),
            INTEGER, STRING, STRING));
  }

  private static ExprValue exprSubstrStart(ExprValue exprValue, ExprValue start) {
    int startIdx = start.integerValue();
    if (startIdx == 0) {
      return new ExprStringValue(EMPTY_STRING);
    }
    String str = exprValue.stringValue();
    return exprSubStr(str, startIdx, 0);

Review comment: Passing length=0 here is misleading; in the Java SDK, the definition is like below.

  }

  private static ExprValue exprSubstrStartLength(

Review comment: Consider merging with exprSubstrStart.
Reply: Merged together to use the function below.

      ExprValue exprValue, ExprValue start, ExprValue length) {
    int startIdx = start.integerValue();
    int len = length.integerValue();
    if ((startIdx == 0) || (len == 0)) {
      return new ExprStringValue(EMPTY_STRING);
    }
    String str = exprValue.stringValue();
    return exprSubStr(str, startIdx, len);
  }

  private static ExprValue exprSubStr(String str, int start, int len) {
    // Correct negative start
    start = (start > 0) ? (start - 1) : (str.length() + start);

    // Length 0 is only given by exprSubstrStart, exprSubstrStartLength handles this explicitly.
    if ((start + len > str.length()) || (len == 0)) {
      // Start is after string, return empty.
      if (start > str.length()) {
        return new ExprStringValue(EMPTY_STRING);
      }
      return new ExprStringValue(str.substring(start));
    }
    return new ExprStringValue(str.substring(start, start + len));
  }
}
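The index arithmetic in exprSubStr can be hard to follow inside the diff, so here is a minimal standalone sketch of the same idea using plain String values. The class and method names (SubstrSketch, substr) are hypothetical and not part of the PR. As in the diff, start is 1-based, a negative start counts back from the end, and len == 0 stands for "rest of the string". One deliberate difference, which is an assumption of this sketch: a negative start that reaches past the beginning of the string returns an empty string, whereas the code in the diff would pass a negative index to String.substring.

```java
// Hypothetical sketch; mirrors the start/length handling of exprSubStr in the diff.
final class SubstrSketch {

  // start is 1-based; negative start counts from the end of the string.
  // len == 0 means "take everything from start to the end".
  static String substr(String str, int start, int len) {
    // Convert the SQL-style position into a 0-based Java index.
    int begin = (start > 0) ? (start - 1) : (str.length() + start);

    // Assumption of this sketch: out-of-range positions yield "" instead of throwing.
    if (begin < 0 || begin > str.length()) {
      return "";
    }
    // No length given, or the requested length runs past the end: take the tail.
    if (len == 0 || begin + len > str.length()) {
      return str.substring(begin);
    }
    return str.substring(begin, begin + len);
  }

  public static void main(String[] args) {
    System.out.println(substr("helloworld", 5, 0));  // oworld
    System.out.println(substr("helloworld", 5, 3));  // owo
    System.out.println(substr("helloworld", -3, 0)); // rld
    System.out.println(substr("helloworld", -3, 2)); // rl
  }
}
```

Using a separate overload or an explicit "no length" marker instead of len == 0 would address the reviewer's concern that a zero length is easy to misread.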
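The LENGTH implementation in the diff counts bytes with String.getBytes() and no argument, which uses the JVM's default charset, so the result can differ between environments. The sketch below is an illustration only: the names LengthSketch and byteLength are hypothetical, and UTF-8 is an assumed choice of encoding, not something the PR specifies. It shows how the byte count and the character count diverge for non-ASCII text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not part of the PR.
final class LengthSketch {

  // Byte length under an explicit charset (UTF-8 is an assumption here),
  // so the result does not depend on the JVM's default charset.
  static int byteLength(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    String ascii = "hello";
    String accented = "h\u00e9llo"; // U+00E9 is encoded as two bytes in UTF-8

    System.out.println(byteLength(ascii));    // 5
    System.out.println(accented.length());    // 5 (UTF-16 code units)
    System.out.println(byteLength(accented)); // 6
  }
}
```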
Review comment: Is this the same as substring?
Reply: Yeah, they are synonyms.
Review comment: Could you merge them if they are the same?
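One way to read the request to merge the two synonyms is to keep a single implementation and register it under both names. The sketch below shows that idea with a plain map as a stand-in registry. Everything in it (AliasSketch, REGISTRY, registerAliases, substrFrom, call) is hypothetical and is not the project's FunctionDSL or BuiltinFunctionRepository API; it also handles only a positive 1-based start, to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical sketch of alias registration; not the project's API.
final class AliasSketch {

  // Stand-in for a function repository: name -> implementation.
  static final Map<String, BiFunction<String, Integer, String>> REGISTRY = new HashMap<>();

  // Single shared implementation: 1-based start, rest of the string.
  // Simplification for this sketch: non-positive or out-of-range starts yield "".
  static String substrFrom(String str, int start) {
    if (start < 1 || start > str.length()) {
      return "";
    }
    return str.substring(start - 1);
  }

  // Registers the same implementation object under every given name.
  static void registerAliases(BiFunction<String, Integer, String> impl, String... names) {
    for (String name : names) {
      REGISTRY.put(name, impl);
    }
  }

  static String call(String name, String str, int start) {
    return REGISTRY.get(name).apply(str, start);
  }

  public static void main(String[] args) {
    registerAliases(AliasSketch::substrFrom, "substr", "substring");
    System.out.println(call("substr", "helloworld", 6));    // world
    System.out.println(call("substring", "helloworld", 6)); // world
  }
}
```

In the PR's own terms, the equivalent would presumably be a shared helper that takes the function name and returns the resolver, so that substr() and substring() differ only in the name they pass.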