Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coercion feature #344

Closed
Navgeet opened this issue Nov 13, 2020 · 5 comments
Closed

Coercion feature #344

Navgeet opened this issue Nov 13, 2020 · 5 comments

Comments

@Navgeet
Copy link

Navgeet commented Nov 13, 2020

Hi, my company is using this library and we have a requirement of coercing types and also perform other types of data correction. I see that this feature is available in some other libraries too. See https://ajv.js.org/coercion.html.

Rather than implement this for ourselves, I think it would be better to add this feature here. That is, if you would like to integrate coercion into this library. If this sounds good, I can come up with a plan (and perhaps a poc) to demonstrate how this would work. I would also like to make it configurable by adding an interface which user can implement and add their own coercion implementations.

@stevehu
Copy link
Contributor

stevehu commented Nov 15, 2020

@Navgeet That is a fantastic idea. In fact, we have done some basic work for the OpenAPI specification already as all number fields are represented in string in the spec. There is a config flag to enable it or not at this line.

https://github.com/networknt/json-schema-validator/blob/master/src/main/java/com/networknt/schema/SchemaValidatorsConfig.java#L28

The solution is ad-hoc and only works with OpenAPI specifications. If we can add an additional module and make it configurable for users, it would be a great feature for users with different use cases.

Let me know if you need any help during the implementation. Thanks.

@hkupty
Copy link
Contributor

hkupty commented Feb 24, 2021

I'm willing to take a stab at this, if that's ok. I believe there are three possible types of coercions:

  • Widening: Taking a narrow type (like Integer) and making it fit in a wider type (like Number). I guess this one already happens implicitly
  • Narrowing: Taking a wider type (like Number) and making it fit in a narrow type (like Integer). That can either be lossless (1.0 -> 1) or lossy (1.3 -> 1).
  • Duck typing: Taking data from one shape and trying to make it fit in another shape, i.e. [] -> false, [] -> "", 1 -> [1].
    That is sometimes happening (like single elements to arrays with typeLoose) but not all of those happen.

To be fair, what I need is only lossless narrowing. I guess lossy narrowing is potentially dangerous, so maybe it shouldn't be implemented (instead, explicitly handled by the caller, maybe?). I'm not really sure about duck typing though (which seems to be the case for this issue), though.. Any thoughts?

@stevehu
Copy link
Contributor

stevehu commented Feb 24, 2021

@Navgeet Could you please review the change?

@Navgeet
Copy link
Author

Navgeet commented Feb 25, 2021

Hi, we never ended up picking coercion at my workplace, so I didn't get around to implementing this. That said, my original vision was not only to implement common and sensible types of coercions, but also to allow the user to implement their own coercers.

For eg, ajv only provides type coercions, but at my work we wanted to "fix" bad enum values too. So if a field is defined as an enum with Foo and Bar as the permissible values, the EnumCoercer would covert values like foo, foO etc to Foo. IMO this is also a good coercion usecase, and would greatly help to validate and clean data in the same step.

Point being that coercion usecases can be more than we're willing to implement ourselves, and there is value in allowing the user to provide their own implementations. So I think we should start by creating a Coercer interface, and lossless narrowing should implement that. In fact, lossless narrowing is just another case of type coercions, so perhaps their should be a TypeCoercer class that provides options to choose various types of type coercions desired. To start with we can just implement lossless narrowing in TypeCoercer, and later extend it.

To start with I think it's ok to "bake-in" TypeCoercer, and later add methods to add user-defined coercers.

@hkupty
Copy link
Contributor

hkupty commented Feb 26, 2021

I think that makes sense in general. I tried to take an approach that doesn't diverge from the current implementation much - that's why I went with the simple config + if case. I don't know how the library is managed and what are the future plans, but I suspect such change would be better suited for a 2.x kind of change since it would end up touching existing code much more than this one.

@stevehu stevehu closed this as completed Dec 2, 2021
stevehu pushed a commit that referenced this issue Jan 14, 2022
* Improve type validation of integrals

Instead of doing string comparison, use new jackson method to determine if a number is an integral.

The `javaSemantics` config option was added in PR #343 which partially addressed issue #334. In the notes for this PR:
> Once jackson-databind 2.12.0 is out, I'll replace my solution with a call to canConvertToExactIntegral

jackson-databind has been updated to 2.12.1 so this is available but the change has not yet been made.

PR #450 which addressed #446 missed this location which is used when calling `JsonSchemaFactory.getSchema`.

Issue #344 requested coercion of various types but the only type implemented in PR #379 was lossless narrowing, set with configuration option `losslessNarrowing`. I believe that setting is unnecessary now as this implementation of `javaSemantics` addresses the original issue, but left it for backwards compatibility. It also allows you to set `javaSemantics=false` and `losslessNarrowing=true` to achieve only this specific case rather than anything that `javaSemantics` is used for in the future. At this time, these properties do exactly the same thing.

- Change from string comparison to `canConvertToExactIntegral` for `javaSemantics` and `losslessNarrowing`
- Add missing documentation around `losslessNarrowing`
- Add more test cases around integrals

* Update changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants