Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ensure null values cast to varchar/string remain null #5769

Merged
merged 4 commits into from
Jul 14, 2020

Conversation

stevenpyzhang
Copy link
Member

Description

Fixes #4892
String.valueOf() will return the string "null" when a null value is passed in. Using Object.toString(obj, defaultValue) allows the null value to be returned when a null is passed in https://docs.oracle.com/javase/8/docs/api/java/util/Objects.html#toString-java.lang.Object-java.lang.String-

Testing done

Tested the scenario described in the ticket
Added new QTT case

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@stevenpyzhang stevenpyzhang requested a review from a team as a code owner July 6, 2020 23:55
Copy link
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix LGTM, but I think it's backwards incompatible.

@@ -932,7 +933,7 @@ private CastVisitor() {
final int scale = decimal.getScale();
exprStr = String.format("DecimalUtil.format(%d, %d, %s)", precision, scale, expr.getLeft());
} else {
exprStr = "String.valueOf(" + expr.getLeft() + ")";
exprStr = "Objects.toString(" + expr.getLeft() + ", null)";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at the plan.json that was generated from the test, I'm not sure this is backwards compatible - unless I'm missing something, we don't add the generated java code to the plan that we enqueue (cc @rodesai to comment on whether or not including that might be a good idea). That would mean that we would be changing any existing query to have this behavior.

we should consider whether this backwards incompatible change is acceptable or if we should guard this with a config that defaults to disabled, but any new query would have it enabled

@stevenpyzhang stevenpyzhang force-pushed the cast-null-bug branch 2 times, most recently from e8a5cbe to 2a4c54d Compare July 13, 2020 19:42
@stevenpyzhang
Copy link
Member Author

Added a config for preserving old query behavior.
Testing:

-- On master
SET 'auto.offset.reset'='earliest';
CREATE STREAM HAS_NULL_VALUES (USER VARCHAR, VIEWTIME INT) WITH (KAFKA_TOPIC='test', VALUE_FORMAT='DELIMITED');

CREATE TABLE COUNT_WITH_CAST_NULL_TOGGLE_OFF AS
SELECT USER, COUNT(CAST(VIEWTIME AS VARCHAR)) FROM HAS_NULL_VALUES GROUP BY USER;

-- On branch with fix
SET 'auto.offset.reset'='earliest';
CREATE TABLE COUNT_WITH_CAST_NULL_TOGGLE_ON AS
SELECT USER, COUNT(CAST(VIEWTIME AS VARCHAR)) FROM HAS_NULL_VALUES GROUP BY USER;

-- insert some data into it with null values
INSERT INTO HAS_NULL_VALUES (USER) VALUES ('amy');
INSERT INTO HAS_NULL_VALUES (USER, VIEWTIME) VALUES ('amy', 123);

SELECT * FROM COUNT_WITH_CAST_NULL_TOGGLE_OFF EMIT CHANGES LIMIT 1;
+---------------------+---------------------+---------------------+---------------------+
|ROWTIME              |ROWKEY               |USER                 |KSQL_COL_1           |
+---------------------+---------------------+---------------------+---------------------+
|1585140737589        |amy                  |amy                  |2                    |

SELECT * FROM COUNT_WITH_CAST_NULL_TOGGLE_ON EMIT CHANGES LIMIT 1;

+---------------------+---------------------+---------------------+---------------------+
|ROWTIME              |ROWKEY               |USER                 |KSQL_COL_1           |
+---------------------+---------------------+---------------------+---------------------+
|1585140737589        |amy                  |amy                  |1                   |

Copy link
Contributor

@agavra agavra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Only nit is that I don't love the config name, it's a little cryptic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CAST(col, VARCHAR) null values become the string "null"
2 participants