#963 Unimplemented PPL Sort Syntax #994

Merged: 12 commits, Jan 9, 2025
1 change: 1 addition & 0 deletions docs/ppl-lang/PPL-Example-Commands.md
@@ -275,6 +275,7 @@ source = table | where ispresent(a) |
- `source=accounts | parse email '.+@(?<host>.+)' | eval eval_result=1 | fields host, eval_result`
- `source=accounts | parse email '.+@(?<host>.+)' | where age > 45 | sort - age | fields age, email, host`
- `source=accounts | parse address '(?<streetNumber>\d+) (?<street>.+)' | where streetNumber > 500 | sort num(streetNumber) | fields streetNumber, street`
- **Note**: The `sort num` syntax is deprecated. To sort numerically, cast to a numerical data type - e.g. `sort cast(streetNumber as integer)`. See [#963](https://github.com/opensearch-project/opensearch-spark/issues/963) for more details.
- Limitation: [see limitations](ppl-parse-command.md#limitations)

#### **Grok**
2 changes: 2 additions & 0 deletions docs/ppl-lang/ppl-parse-command.md
@@ -68,6 +68,8 @@ PPL query:
| 880 | Holmes Lane |
+----------------+----------------+

**Note**: The `sort num` syntax is deprecated. To sort numerically, cast to a numerical data type - e.g. `sort cast(streetNumber as integer)`. See [#963](https://github.com/opensearch-project/opensearch-spark/issues/963) for more details.
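
To illustrate why the cast matters, here is a minimal Java sketch that is not part of this PR. It assumes the parsed `streetNumber` values are strings (as regex captures typically are) and uses illustrative values; the class and method names are hypothetical. Sorting the values as text compares them character by character, while converting them to integers first gives the numeric order that `sort cast(streetNumber as integer)` is meant to produce.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortOrderSketch {
    // Orders the values as text, comparing character by character.
    static List<String> sortAsStrings(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    // Orders the values by their integer value, analogous to sorting on
    // cast(streetNumber as integer) instead of on the raw string.
    static List<String> sortAsIntegers(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparingInt(Integer::parseInt));
        return sorted;
    }

    public static void main(String[] args) {
        // Illustrative street numbers, not taken from a real index.
        List<String> streetNumbers = List.of("880", "1000", "671");
        System.out.println(sortAsStrings(streetNumbers));
        System.out.println(sortAsIntegers(streetNumbers));
    }
}
```

Running `main` prints `[1000, 671, 880]` for the text ordering and `[671, 880, 1000]` for the integer ordering.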

### Limitations

There are a few limitations with parse command:
4 changes: 2 additions & 2 deletions ppl-spark-integration/src/main/antlr4/OpenSearchPPLLexer.g4
@@ -82,13 +82,13 @@ DATASOURCES: 'DATASOURCES';
USING: 'USING';
WITH: 'WITH';

// FIELD KEYWORDS
// SORT FIELD KEYWORDS
// TODO #963: Remove unimplemented sort syntax
AUTO: 'AUTO';
STR: 'STR';
IP: 'IP';
NUM: 'NUM';


// FIELDSUMMARY keywords
FIELDSUMMARY: 'FIELDSUMMARY';
INCLUDEFIELDS: 'INCLUDEFIELDS';
12 changes: 8 additions & 4 deletions ppl-spark-integration/src/main/antlr4/OpenSearchPPLParser.g4
@@ -520,6 +520,8 @@ sortField

sortFieldExpression
: fieldExpression

// TODO #963: Remove unimplemented sort syntax
| AUTO LT_PRTHS fieldExpression RT_PRTHS
| STR LT_PRTHS fieldExpression RT_PRTHS
| IP LT_PRTHS fieldExpression RT_PRTHS
@@ -1095,10 +1097,6 @@ keywordsCanBeId
| INDEX
| DESC
| DATASOURCES
| AUTO
| STR
| IP
| NUM
| FROM
| PATTERN
| NEW_FIELD
@@ -1181,4 +1179,10 @@ keywordsCanBeId
| BETWEEN
| CIDRMATCH
| trendlineType
// SORT FIELD KEYWORDS
// TODO #963: Remove unimplemented sort syntax
| AUTO
| STR
| IP
| NUM
;
@@ -45,8 +45,6 @@
import org.opensearch.sql.ast.expression.subquery.ExistsSubquery;
import org.opensearch.sql.ast.expression.subquery.InSubquery;
import org.opensearch.sql.ast.expression.subquery.ScalarSubquery;
import org.opensearch.sql.ast.tree.Trendline;
import org.opensearch.sql.common.antlr.SyntaxCheckException;
import org.opensearch.sql.common.utils.StringUtils;
import org.opensearch.sql.ppl.utils.ArgumentFactory;

@@ -58,7 +56,6 @@
import java.util.stream.IntStream;
import java.util.stream.Stream;

import static org.opensearch.sql.expression.function.BuiltinFunctionName.EQUAL;
import static org.opensearch.sql.expression.function.BuiltinFunctionName.IS_NOT_NULL;
import static org.opensearch.sql.expression.function.BuiltinFunctionName.IS_NULL;
import static org.opensearch.sql.expression.function.BuiltinFunctionName.LENGTH;
@@ -80,7 +77,7 @@ public class AstExpressionBuilder extends OpenSearchPPLParserBaseVisitor<Unresol
.put("isnotnull", IS_NOT_NULL.getName().getFunctionName())
.put("ispresent", IS_NOT_NULL.getName().getFunctionName())
.build();
private AstBuilder astBuilder;
private final AstBuilder astBuilder;

public AstExpressionBuilder(AstBuilder astBuilder) {
this.astBuilder = astBuilder;
@@ -183,6 +180,8 @@ public UnresolvedExpression visitWcFieldExpression(OpenSearchPPLParser.WcFieldEx

@Override
public UnresolvedExpression visitSortField(OpenSearchPPLParser.SortFieldContext ctx) {

// TODO #963: Remove unimplemented sort syntax
return new Field((QualifiedName)
visit(ctx.sortFieldExpression().fieldExpression().qualifiedName()),
ArgumentFactory.getArgumentList(ctx));
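
As a rough illustration of the TODO above, and not code from this PR, the following Java sketch models how the visitor reads only the inner `fieldExpression` of a `sortFieldExpression`. `SortFieldSketch`, its nested `SortFieldExpression` class, and `resolveFieldName` are hypothetical stand-ins for the ANTLR context and AST classes. The sketch covers the field name only; it does not model the argument list built by `ArgumentFactory.getArgumentList`, whose contents are not shown in this diff.

```java
public class SortFieldSketch {
    // Hypothetical stand-in for a parsed sortFieldExpression: an optional
    // wrapper keyword (AUTO, STR, IP or NUM) around a field name.
    static final class SortFieldExpression {
        final String wrapperKeyword; // null when the field is not wrapped
        final String fieldName;

        SortFieldExpression(String wrapperKeyword, String fieldName) {
            this.wrapperKeyword = wrapperKeyword;
            this.fieldName = fieldName;
        }
    }

    // Mirrors the shape of visitSortField in the diff: only the inner field
    // name is read, so the wrapper keyword does not affect the resolved name.
    static String resolveFieldName(SortFieldExpression expr) {
        return expr.fieldName;
    }

    public static void main(String[] args) {
        String plain = resolveFieldName(new SortFieldExpression(null, "streetNumber"));
        String wrapped = resolveFieldName(new SortFieldExpression("NUM", "streetNumber"));
        // Both forms resolve to the same field name.
        System.out.println(plain.equals(wrapped));
    }
}
```

Under these assumptions, `sort streetNumber` and `sort num(streetNumber)` resolve to the same field name, which is consistent with the docs note recommending an explicit cast for numeric ordering.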
@@ -263,7 +262,6 @@ public UnresolvedExpression visitCaseExpr(OpenSearchPPLParser.CaseExprContext ct
public UnresolvedExpression visitIsEmptyExpression(OpenSearchPPLParser.IsEmptyExpressionContext ctx) {
Function trimFunction = new Function(TRIM.getName().getFunctionName(), Collections.singletonList(this.visitFunctionArg(ctx.functionArg())));
Function lengthFunction = new Function(LENGTH.getName().getFunctionName(), Collections.singletonList(trimFunction));
Compare lengthEqualsZero = new Compare(EQUAL.getName().getFunctionName(), lengthFunction, new Literal(0, DataType.INTEGER));
Literal whenCompareValue = new Literal(0, DataType.INTEGER);
Literal isEmptyFalse = new Literal(false, DataType.BOOLEAN);
Literal isEmptyTrue = new Literal(true, DataType.BOOLEAN);
@@ -452,12 +450,10 @@ public UnresolvedExpression visitLambda(OpenSearchPPLParser.LambdaContext ctx) {

private List<UnresolvedExpression> timestampFunctionArguments(
OpenSearchPPLParser.TimestampFunctionCallContext ctx) {
List<UnresolvedExpression> args =
Arrays.asList(
return Arrays.asList(
new Literal(ctx.timestampFunction().simpleDateTimePart().getText(), DataType.STRING),
visitFunctionArg(ctx.timestampFunction().firstArg),
visitFunctionArg(ctx.timestampFunction().secondArg));
return args;
}

private QualifiedName visitIdentifiers(List<? extends ParserRuleContext> ctx) {
@@ -122,12 +122,12 @@ class PPLLogicalPlanParseTranslatorTestSuite

test("test parse email & host expressions including cast and sort commands") {
YANG-DB (Member) commented on Jan 4, 2025:

@currantw please add the corresponding IT (placeholder?) for a concrete index:

    val query =
      "source=t" +
        " | parse address '(?<streetNumber>\\d+) (?<street>.+)'" +
        " | eval streetNumberInt = cast(streetNumber as integer)" +
        " | where streetNumberInt > 500" +
        " | sort streetNumberInt" +
        " | fields streetNumber, street"

val context = new CatalystPlanContext
val logPlan =
planTransformer.visit(
plan(
pplParser,
"source=t | parse address '(?<streetNumber>\\d+) (?<street>.+)' | where streetNumber > 500 | sort num(streetNumber) | fields streetNumber, street"),
context)

// TODO #963: Remove unimplemented sort syntax
val query =
"source=t | parse address '(?<streetNumber>\\d+) (?<street>.+)' | where streetNumber > 500 | sort num(streetNumber) | fields streetNumber, street"

val logPlan = planTransformer.visit(plan(pplParser, query), context)

val addressAttribute = UnresolvedAttribute("address")
val streetNumberAttribute = UnresolvedAttribute("streetNumber")