Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bag constructor parsing #1500

Merged
merged 2 commits into from
Jul 3, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@ internal class PartiQLPigParser(val customTypes: List<CustomType> = listOf()) :
val tokenStream = createTokenStream(queryStream)
val parser = parserInit(tokenStream)
val tree = parser.root()
val visitor = PartiQLPigVisitor(customTypes, tokenStream.parameterIndexes)
val visitor = PartiQLPigVisitor(tokenStream, customTypes, tokenStream.parameterIndexes)
return visitor.visit(tree) as PartiqlAst.Statement
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ import com.amazon.ionelement.api.ionNull
import com.amazon.ionelement.api.ionString
import com.amazon.ionelement.api.ionSymbol
import com.amazon.ionelement.api.loadSingleElement
import org.antlr.v4.runtime.CommonTokenStream
import org.antlr.v4.runtime.ParserRuleContext
import org.antlr.v4.runtime.Token
import org.antlr.v4.runtime.tree.TerminalNode
Expand Down Expand Up @@ -62,6 +63,7 @@ import org.partiql.lang.util.getPrecisionFromTimeString
import org.partiql.lang.util.unaryMinus
import org.partiql.parser.internal.antlr.PartiQLParser
import org.partiql.parser.internal.antlr.PartiQLParserBaseVisitor
import org.partiql.parser.internal.antlr.PartiQLTokens
import org.partiql.pig.runtime.SymbolPrimitive
import org.partiql.value.datetime.DateTimeException
import org.partiql.value.datetime.TimeZone
Expand Down Expand Up @@ -116,6 +118,7 @@ import java.time.format.DateTimeParseException
* There could be clever ways of exploiting this, to avoid the dispatch via `visit()`.
*/
internal class PartiQLPigVisitor(
private val tokens: CommonTokenStream,
val customTypes: List<CustomType> = listOf(),
private val parameterIndexes: Map<Int, Int> = mapOf(),
) :
Expand Down Expand Up @@ -1507,6 +1510,12 @@ internal class PartiQLPigVisitor(
*/

override fun visitBag(ctx: PartiQLParser.BagContext) = PartiqlAst.build {
// Prohibit hidden characters between angle brackets
val startTokenIndex = ctx.start.tokenIndex
val endTokenIndex = ctx.stop.tokenIndex
if (tokens.getHiddenTokensToRight(startTokenIndex, PartiQLTokens.HIDDEN) != null || tokens.getHiddenTokensToLeft(endTokenIndex, PartiQLTokens.HIDDEN) != null) {
throw ParserException("Invalid bag expression", ErrorCode.PARSE_INVALID_QUERY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a little weird that two error code does not match:
PARSE_INVALID_QUERY and PARSE_UNEXPECTED_TOKEN below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (assertContext && (this@forEachTarget == ParserTarget.EXPERIMENTAL)) {
checkErrorAndErrorContext(errorCode, ex, expectErrorContextValues)
}

Here if I remembered correctly the error code has not been implemented for the experimental parser and hence it should be a not equal operation.

Copy link
Member Author

@alancai98 alancai98 Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit tripped up by this testing method's (checkInputThrowingParserException) logic. For clarity,

  • ParserTarget.DEFAULT -> refers to parser for PIG AST
  • ParserTarget.EXPERIMENTAL -> refers to parser for partiql-ast

The targets parameter of checkInputThrowingParserException is unused --

targets: Array<ParserTarget> = arrayOf(ParserTarget.DEFAULT),
. Instead, the targets property defined for the abstract parser test class is used . PartiQLParserTests sets targets to both ParserTarget.DEFAULT and ParserTarget.EXPERIMENTAL.

This function currently just does two things:

  • for ParserTarget.EXPERIMENTAL, check for an error and validate the error code + error context
  • for ParserTarget.DEFAULT, check for an error and do NO further validations.

That is why the error codes differ between the two error tests and why the error tests use the ParserTarget.EXPERIMENTAL error code, PARSE_UNEXPECTED_TOKEN.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're going to get rid of ParserTarget.DEFAULT for v1, I didn't want to alter the testing infrastructure further. I'll leave the tests as they are and add a couple comments for clarity.

}
val exprList = ctx.expr().map { visitExpr(it) }
bag(exprList, ctx.ANGLE_LEFT(0).getSourceMetaContainer())
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5019,4 +5019,38 @@ class PartiQLParserTest : PartiQLParserTestBase() {
lit(ionInt(1))
)
}

// regression tests for bag constructor angle bracket
@Test
fun testBagConstructor() = assertExpression("<<<<1>>>>") {
bag(
bag(
lit(ionInt(1))
)
)
}

@Test
fun testSpacesInBagConstructor() = checkInputThrowingParserException(
"< < < < 1 > > > >",
ErrorCode.PARSE_UNEXPECTED_TOKEN, // partiql-ast parser ErrorCode
expectErrorContextValues = mapOf(
Property.LINE_NUMBER to 1L,
Property.COLUMN_NUMBER to 1L,
Property.TOKEN_DESCRIPTION to PartiQLParser.ANGLE_LEFT.getAntlrDisplayString(),
Property.TOKEN_VALUE to ION.newSymbol("<")
)
)

@Test
fun testCommentsInBagConstructor() = checkInputThrowingParserException(
"</* some comment */<<<1>>>>",
ErrorCode.PARSE_UNEXPECTED_TOKEN, // partiql-ast parser ErrorCode
expectErrorContextValues = mapOf(
Property.LINE_NUMBER to 1L,
Property.COLUMN_NUMBER to 1L,
Property.TOKEN_DESCRIPTION to PartiQLParser.ANGLE_LEFT.getAntlrDisplayString(),
Property.TOKEN_VALUE to ION.newSymbol("<")
)
)
}
Original file line number Diff line number Diff line change
Expand Up @@ -152,15 +152,15 @@ abstract class PartiQLParserTestBase : TestBase() {
input: String,
errorCode: ErrorCode,
expectErrorContextValues: Map<Property, Any>,
targets: Array<ParserTarget> = arrayOf(ParserTarget.DEFAULT),
assertContext: Boolean = true,
): Unit = forEachTarget {
softAssert {
try {
parser.parseAstStatement(input)
fail("Expected ParserException but there was no Exception")
} catch (ex: ParserException) {
// split parser target does not use ErrorCode
// NOTE: only perform error code and error context checks for `ParserTarget.EXPERIMENTAL` (partiql-ast
// parser).
if (assertContext && (this@forEachTarget == ParserTarget.EXPERIMENTAL)) {
checkErrorAndErrorContext(errorCode, ex, expectErrorContextValues)
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -425,6 +425,7 @@ internal class PartiQLParserDefault : PartiQLParser {
*/
@OptIn(PartiQLValueExperimental::class)
private class Visitor(
private val tokens: CommonTokenStream,
private val locations: SourceLocations.Mutable,
private val parameters: Map<Int, Int> = mapOf(),
) : PartiQLParserBaseVisitor<AstNode>() {
Expand All @@ -442,7 +443,7 @@ internal class PartiQLParserDefault : PartiQLParser {
tree: GeneratedParser.RootContext,
): PartiQLParser.Result {
val locations = SourceLocations.Mutable()
val visitor = Visitor(locations, tokens.parameterIndexes)
val visitor = Visitor(tokens, locations, tokens.parameterIndexes)
val root = visitor.visitAs<AstNode>(tree) as Statement
return PartiQLParser.Result(
source = source,
Expand Down Expand Up @@ -2022,6 +2023,12 @@ internal class PartiQLParserDefault : PartiQLParser {
*/

override fun visitBag(ctx: GeneratedParser.BagContext) = translate(ctx) {
// Prohibit hidden characters between angle brackets
val startTokenIndex = ctx.start.tokenIndex
val endTokenIndex = ctx.stop.tokenIndex
if (tokens.getHiddenTokensToRight(startTokenIndex, GeneratedLexer.HIDDEN) != null || tokens.getHiddenTokensToLeft(endTokenIndex, GeneratedLexer.HIDDEN) != null) {
throw error(ctx, "Invalid bag expression")
}
Comment on lines +2027 to +2031
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also port the fix to the some of the binary ops.

GeneratedParser.ANGLE_LEFT -> {
if (ctx.stop.type == GeneratedParser.ANGLE_RIGHT) Expr.Binary.Op.NE
else Expr.Binary.Op.LT
}
GeneratedParser.LT_EQ -> Expr.Binary.Op.LTE
GeneratedParser.ANGLE_RIGHT -> Expr.Binary.Op.GT
GeneratedParser.GT_EQ -> Expr.Binary.Op.GTE
GeneratedParser.BANG -> Expr.Binary.Op.NE
GeneratedParser.EQ -> Expr.Binary.Op.EQ

But feel free to leave this out for now.

Copy link
Member Author

@alancai98 alancai98 Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah good catch, for <> and != this is also an issue. In the operator PR #1499, I'll fix the issues in that PR since there is a refactoring of the special symbol parsing.

val expressions = visitOrEmpty<Expr>(ctx.expr())
exprCollection(Expr.Collection.Type.BAG, expressions)
}
Expand Down
Loading