Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-17832][SQL] TableIdentifier.quotedString creates un-parseable names when name contains a backtick #15403

Closed
wants to merge 3 commits into from

Conversation

jiangxb1987
Copy link
Contributor

What changes were proposed in this pull request?

The quotedString method in TableIdentifier and FunctionIdentifier produce an illegal (un-parseable) name when the name contains a backtick. For example:

import org.apache.spark.sql.catalyst.parser.CatalystSqlParser._
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
val complexName = TableIdentifier("`weird`table`name", Some("`d`b`1"))
parseTableIdentifier(complexName.unquotedString) // Does not work
parseTableIdentifier(complexName.quotedString) // Does not work
parseExpression(complexName.unquotedString) // Does not work
parseExpression(complexName.quotedString) // Does not work

We should handle the backtick properly to make quotedString parseable.

How was this patch tested?

Add new testcases in TableIdentifierParserSuite and ExpressionParserSuite.

@jiangxb1987
Copy link
Contributor Author

jiangxb1987 commented Oct 8, 2016

Besides, we have similar problem in getFunction as well as getTable if one of databasetablefunction contains backtick, should we resolve the problem in this PR too?

@SparkQA
Copy link

SparkQA commented Oct 8, 2016

Test build #66576 has finished for PR 15403 at commit b123082.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}

def unquotedString: String = {
if (database.isDefined) s"${database.get}.$identifier" else identifier
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep unquotedString this as it is. That gives some a more readable (less parseable) version of the identifier. I just added an example with unquotedIdentifier to show that there was no way to get a parseable string from the table identifier.


test("SPARK-17832 table identifier - contains backtick") {
val complexName = TableIdentifier("`weird`table`name", Some("`d`b`1"))
assert(TableIdentifier("`weird`table`name", Some("`d`b`1")) ===
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use complexName? Why create the identifier twice?

@hvanhovell
Copy link
Contributor

@jiangxb1987 This looks pretty good.

What is the problem getTable and getFunction? Could you give an example? assume you are talking about the org.apache.spark.sql.catalog.Catalog here?

@jiangxb1987
Copy link
Contributor Author

@hvanhovell nvm about the catalog.getTable issue, it turns out to be my mistake. Sorry about that...

@SparkQA
Copy link

SparkQA commented Oct 9, 2016

Test build #66596 has finished for PR 15403 at commit 59a1f0e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Copy link
Contributor

LGTM - merging to master/2.0. Thanks!

asfgit pushed a commit that referenced this pull request Oct 10, 2016
…names when name contains a backtick

## What changes were proposed in this pull request?

The `quotedString` method in `TableIdentifier` and `FunctionIdentifier` produce an illegal (un-parseable) name when the name contains a backtick. For example:
```
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser._
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
val complexName = TableIdentifier("`weird`table`name", Some("`d`b`1"))
parseTableIdentifier(complexName.unquotedString) // Does not work
parseTableIdentifier(complexName.quotedString) // Does not work
parseExpression(complexName.unquotedString) // Does not work
parseExpression(complexName.quotedString) // Does not work
```
We should handle the backtick properly to make `quotedString` parseable.

## How was this patch tested?
Add new testcases in `TableIdentifierParserSuite` and `ExpressionParserSuite`.

Author: jiangxingbo <jiangxb1987@gmail.com>

Closes #15403 from jiangxb1987/backtick.

(cherry picked from commit 26fbca4)
Signed-off-by: Herman van Hovell <hvanhovell@databricks.com>
@asfgit asfgit closed this in 26fbca4 Oct 10, 2016
@jiangxb1987 jiangxb1987 deleted the backtick branch October 10, 2016 06:17
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
…names when name contains a backtick

## What changes were proposed in this pull request?

The `quotedString` method in `TableIdentifier` and `FunctionIdentifier` produce an illegal (un-parseable) name when the name contains a backtick. For example:
```
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser._
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
val complexName = TableIdentifier("`weird`table`name", Some("`d`b`1"))
parseTableIdentifier(complexName.unquotedString) // Does not work
parseTableIdentifier(complexName.quotedString) // Does not work
parseExpression(complexName.unquotedString) // Does not work
parseExpression(complexName.quotedString) // Does not work
```
We should handle the backtick properly to make `quotedString` parseable.

## How was this patch tested?
Add new testcases in `TableIdentifierParserSuite` and `ExpressionParserSuite`.

Author: jiangxingbo <jiangxb1987@gmail.com>

Closes apache#15403 from jiangxb1987/backtick.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants