Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bag constructor parsing #1500

Merged
merged 2 commits into from
Jul 3, 2024
Merged

Fix bag constructor parsing #1500

merged 2 commits into from
Jul 3, 2024

Conversation

alancai98
Copy link
Member

Relevant Issues

Description

  • Previous behavior on v1 allowed for inserting whitespace and comments between the angle brackets of a bag constructor. This PR corrects the behavior and gives an error for these scenarios.
  • This PR follows what the Definitive ANTLR 4 Reference describes in Chapter 12 regarding double right angle brackets:

To address this, either we can add semantic predicates to the grammar or
we can check the parse tree afterward using a listener or visitor to ensure that
the > token column numbers are adjacent for shift operators. It’d be inefficient
to use predicates during the parse, so it’s better to check the right shift operators
after the parse. Most language applications need to walk the parse tree anyway.

I looked at some other alternatives such as

  • pushing the typing to a separate mode in the lexer (similar to what Ion lexer rules we currently have)
    • this had some drawbacks in some duplicated lexer rules
    • we could always revisit this approach in the future
  • semantic predicates
    • potential slowdown in the lexer step

but decided to keep the recommendation of the ANTLR reference book.

Other Information

  • Updated Unreleased Section in CHANGELOG: [NO]

    • No, on v1 branch
  • Any backward-incompatible changes? [NO]

  • Any new external dependencies? [NO]

  • Do your changes comply with the Contributing Guidelines
    and Code Style Guidelines? [YES]

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@alancai98 alancai98 requested a review from yliuuuu July 2, 2024 23:22
@alancai98 alancai98 self-assigned this Jul 2, 2024
Copy link

github-actions bot commented Jul 2, 2024

Conformance comparison report-Cross Engine

Base (legacy) eval +/-
% Passing 92.59% 87.05% -5.53%
✅ Passing 5420 5097 -323
❌ Failing 434 758 324
🔶 Ignored 0 0 0
Total Tests 5854 5855 1
Number passing in both: 4894

Number failing in both: 231

Number passing in legacy engine but fail in eval engine: 527

Number failing in legacy engine but pass in eval engine: 203
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️
The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.
203 test(s) were failing in legacy but now pass in eval. Before merging, confirm they are intended to pass.
The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Conformance comparison report-Cross Commit-LEGACY

Base (4dd0972) f8e102a +/-
% Passing 92.59% 92.59% 0.00%
✅ Passing 5420 5420 0
❌ Failing 434 434 0
🔶 Ignored 0 0 0
Total Tests 5854 5854 0
Number passing in both: 5420

Number failing in both: 434

Number passing in Base (4dd0972) but now fail: 0

Number failing in Base (4dd0972) but now pass: 0

Conformance comparison report-Cross Commit-EVAL

Base (4dd0972) f8e102a +/-
% Passing 87.05% 87.05% 0.00%
✅ Passing 5097 5097 0
❌ Failing 758 758 0
🔶 Ignored 0 0 0
Total Tests 5855 5855 0
Number passing in both: 5097

Number failing in both: 758

Number passing in Base (4dd0972) but now fail: 1

Number failing in Base (4dd0972) but now pass: 1
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️. The following test(s) were previously passing but now fail:

Click here to see
  • Example 6 — Value Coercion, compileOption: LEGACY
The following test(s) were previously failing but now pass. Before merging, confirm they are intended to pass:
Click here to see
  • Example 6 — Value Coercion, compileOption: LEGACY

val startTokenIndex = ctx.start.tokenIndex
val endTokenIndex = ctx.stop.tokenIndex
if (tokens.getHiddenTokensToRight(startTokenIndex, PartiQLTokens.HIDDEN) != null || tokens.getHiddenTokensToLeft(endTokenIndex, PartiQLTokens.HIDDEN) != null) {
throw ParserException("Invalid bag expression", ErrorCode.PARSE_INVALID_QUERY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a little weird that two error code does not match:
PARSE_INVALID_QUERY and PARSE_UNEXPECTED_TOKEN below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (assertContext && (this@forEachTarget == ParserTarget.EXPERIMENTAL)) {
checkErrorAndErrorContext(errorCode, ex, expectErrorContextValues)
}

Here if I remembered correctly the error code has not been implemented for the experimental parser and hence it should be a not equal operation.

Copy link
Member Author

@alancai98 alancai98 Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit tripped up by this testing method's (checkInputThrowingParserException) logic. For clarity,

  • ParserTarget.DEFAULT -> refers to parser for PIG AST
  • ParserTarget.EXPERIMENTAL -> refers to parser for partiql-ast

The targets parameter of checkInputThrowingParserException is unused --

targets: Array<ParserTarget> = arrayOf(ParserTarget.DEFAULT),
. Instead, the targets property defined for the abstract parser test class is used . PartiQLParserTests sets targets to both ParserTarget.DEFAULT and ParserTarget.EXPERIMENTAL.

This function currently just does two things:

  • for ParserTarget.EXPERIMENTAL, check for an error and validate the error code + error context
  • for ParserTarget.DEFAULT, check for an error and do NO further validations.

That is why the error codes differ between the two error tests and why the error tests use the ParserTarget.EXPERIMENTAL error code, PARSE_UNEXPECTED_TOKEN.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're going to get rid of ParserTarget.DEFAULT for v1, I didn't want to alter the testing infrastructure further. I'll leave the tests as they are and add a couple comments for clarity.

Comment on lines +2027 to +2031
val startTokenIndex = ctx.start.tokenIndex
val endTokenIndex = ctx.stop.tokenIndex
if (tokens.getHiddenTokensToRight(startTokenIndex, GeneratedLexer.HIDDEN) != null || tokens.getHiddenTokensToLeft(endTokenIndex, GeneratedLexer.HIDDEN) != null) {
throw error(ctx, "Invalid bag expression")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also port the fix to the some of the binary ops.

GeneratedParser.ANGLE_LEFT -> {
if (ctx.stop.type == GeneratedParser.ANGLE_RIGHT) Expr.Binary.Op.NE
else Expr.Binary.Op.LT
}
GeneratedParser.LT_EQ -> Expr.Binary.Op.LTE
GeneratedParser.ANGLE_RIGHT -> Expr.Binary.Op.GT
GeneratedParser.GT_EQ -> Expr.Binary.Op.GTE
GeneratedParser.BANG -> Expr.Binary.Op.NE
GeneratedParser.EQ -> Expr.Binary.Op.EQ

But feel free to leave this out for now.

Copy link
Member Author

@alancai98 alancai98 Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good catch, for <> and != this is also an issue. In the operator PR #1499, I'll fix the issues in that PR since there is a refactoring of the special symbol parsing.

@alancai98 alancai98 requested a review from yliuuuu July 3, 2024 19:42
@alancai98 alancai98 merged commit 5b86afc into v1 Jul 3, 2024
14 checks passed
@alancai98 alancai98 deleted the fix-bag-constr-parsing branch July 3, 2024 20:33
@alancai98 alancai98 linked an issue Jul 3, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[v1] bag constructor allows for whitespace between angle brackets
2 participants