fix: JSON format to set correct scale of decimals #6295

Merged

@big-andy-coates big-andy-coates commented Sep 24, 2020

Description

It is the responsibility of the format to ensure the data returned matches the required schema. This includes the scale of decimals. The JSON format was not correctly setting the scale of decimals when deserializing. For example, given:

CREATE STREAM S (ID INT KEY, PRICE DECIMAL(10,2)) WITH (kafka_topic='S', value_format='JSON');

The above creates a stream with a single value column PRICE which should have a scale of 2, i.e. two decimal places.
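To make the notion of scale concrete, here is a minimal Java sketch using `java.math.BigDecimal`. The class name `ScaleVersusValue` is made up for illustration and is not part of this change. It shows that `12` and `12.00` are numerically equal but carry different scales, which is why the scale a deserializer returns matters.

```java
import java.math.BigDecimal;

public final class ScaleVersusValue {
    public static void main(String[] args) {
        BigDecimal unscaled = new BigDecimal("12");
        BigDecimal scaled = new BigDecimal("12.00");

        // Scale is the number of digits to the right of the decimal point.
        System.out.println(unscaled.scale()); // 0
        System.out.println(scaled.scale());   // 2

        // Numerically the two values compare as equal...
        System.out.println(unscaled.compareTo(scaled) == 0); // true

        // ...but equals() also compares scale, so they are not interchangeable.
        System.out.println(unscaled.equals(scaled)); // false
    }
}
```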

If the data in the Kafka record's value has too small a scale, e.g.

{
   "price": 12
}

or

{
   "price": 12.1
}

Then the deserializer was returning the decimal as provided, i.e. 12 or 12.1. This is incorrect: the schema of the column states it has a scale of two, so all values for the column should have their scale set to two, i.e. the above examples should deserialize to 12.00 and 12.10. With this change they now do.
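The corrected behaviour can be sketched with plain `java.math.BigDecimal`. This is an illustration only, not the code in this PR: `DecimalScaleSketch` and `ensureScale` are hypothetical names, and using `RoundingMode.UNNECESSARY` (which rejects values with too many decimal places rather than rounding them) is an assumption about how to treat a scale that is too large, a case the description above does not cover.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public final class DecimalScaleSketch {

    // Hypothetical helper: returns the value with the scale the column's schema requires.
    public static BigDecimal ensureScale(BigDecimal value, int scale) {
        // UNNECESSARY means setScale only succeeds when no rounding is needed,
        // i.e. it pads with zeros and throws ArithmeticException if digits would be lost.
        return value.setScale(scale, RoundingMode.UNNECESSARY);
    }

    public static void main(String[] args) {
        System.out.println(ensureScale(new BigDecimal("12"), 2));   // 12.00
        System.out.println(ensureScale(new BigDecimal("12.1"), 2)); // 12.10
    }
}
```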

Testing done

Usual

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@big-andy-coates big-andy-coates requested a review from a team as a code owner September 24, 2020 15:20
@big-andy-coates big-andy-coates changed the title from "Json decimal deser" to "fix: JSON format to set correct scale of decimals" Sep 24, 2020

@vcrfxia vcrfxia left a comment

LGTM!

@big-andy-coates big-andy-coates merged commit 57b7b2e into confluentinc:master Sep 28, 2020
@big-andy-coates big-andy-coates deleted the json_decimal_deser branch September 28, 2020 16:22
agavra pushed a commit that referenced this pull request Oct 9, 2020
* fix: JSON format should correct scale of decimals when deserializing

Co-authored-by: Andy Coates <big-andy-coates@users.noreply.github.com>